PaddleX is a PaddlePaddle-based framework for building, fine-tuning, and deploying AI model pipelines, with pre-built support for computer vision, OCR, document analysis, and time series tasks. It offers ready-to-use pipelines for image classification, object detection, segmentation, and pose estimation, alongside an end-to-end OCR document analysis pipeline that extracts text, tables, formulas, and layout information. It also includes time series pipelines that analyze historical data to predict future values, detect anomalies, and classify patterns.
The framework is built on a modular, pipeline-based architecture in which complex vision and language tasks are composed as chains of modules, with a unified interface available from Python, the command line, and REST API endpoints. It supports multiple inference backends, including Paddle Inference, TensorRT, OpenVINO, ONNX Runtime, and Ascend OM, and its hardware-agnostic device switching lets users move between GPU, NPU, XPU, and MLU accelerators by changing a single parameter. Pipelines are configured through declarative YAML files, and individual sub-modules can be retrained on custom data and swapped in without rebuilding the entire pipeline.
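To make the architecture concrete, here is a minimal, self-contained sketch of the idea: a declarative config lists the modules, a single `device` argument is passed to every module, and one module can be swapped without rebuilding the rest. This is an illustration only and does not use PaddleX's API. The names `Pipeline`, `REGISTRY`, `CONFIG`, and `replace_module` are hypothetical, the config dict only stands in for what a YAML file might contain, and the "models" are toy string functions.

```python
from typing import Any, Callable, Dict, List, Tuple

Stage = Callable[[Any], Any]


def make_detector(model: str, device: str) -> Stage:
    """Toy 'text detection': split a page into non-empty lines (regions)."""
    def run(page: str) -> List[str]:
        return [line for line in page.splitlines() if line.strip()]
    return run


def make_recognizer(model: str, device: str) -> Stage:
    """Toy 'text recognition': upper-case each detected region."""
    def run(regions: List[str]) -> List[str]:
        return [region.upper() for region in regions]
    return run


# Hypothetical registry mapping module names to factories (not PaddleX's).
REGISTRY: Dict[str, Callable[[str, str], Stage]] = {
    "text_detection": make_detector,
    "text_recognition": make_recognizer,
}

# Stand-in for a declarative YAML pipeline config; the schema is an assumption.
CONFIG = {
    "pipeline_name": "toy_ocr",
    "modules": [
        {"name": "text_detection", "model": "det_v1"},
        {"name": "text_recognition", "model": "rec_v1"},
    ],
}


class Pipeline:
    """Chains modules in config order; one device string applies to all."""

    def __init__(self, config: dict, device: str = "cpu") -> None:
        self.device = device
        self.stages: List[Tuple[str, Stage]] = [
            (spec["name"], REGISTRY[spec["name"]](spec["model"], device))
            for spec in config["modules"]
        ]

    def replace_module(self, name: str, stage: Stage) -> None:
        """Swap one module (e.g. a retrained one) and keep the others."""
        self.stages = [(n, stage if n == name else s) for n, s in self.stages]

    def predict(self, data: Any) -> Any:
        # Feed each module's output into the next module in the chain.
        for _, stage in self.stages:
            data = stage(data)
        return data


# Changing hardware is a one-argument change in this sketch.
pipe = Pipeline(CONFIG, device="gpu:0")
result = pipe.predict("hello\n\nworld")
```

In this sketch, calling `pipe.replace_module("text_recognition", my_stage)` with any callable replaces only the recognition step, which mirrors the description of swapping in a retrained sub-module while the rest of the pipeline stays unchanged.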
Vision capabilities include image classification, object detection with open-vocabulary and rotated variants, instance and semantic segmentation, human keypoint detection, face detection and feature extraction, pedestrian and vehicle attribute detection, and 3D multi-modal object detection from fused camera and LiDAR data. Document processing features include text region detection and recognition, table structure recognition and content extraction, mathematical formula recognition, seal text extraction, document layout parsing, and document image question answering. Time series analysis supports forecasting, anomaly detection, and classification, while video understanding includes action detection and video classification.
Pipelines can be deployed as production-ready HTTP APIs, containerized services, or edge-device binaries for Android and other platforms. High-performance inference acceleration and backend-specific parameter tuning are available to optimize runtime performance.