# rapidai/rapidocr

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/rapidai-rapidocr).**

5,968 stars · 586 forks · Python · apache-2.0

## Links

- GitHub: https://github.com/RapidAI/RapidOCR
- Homepage: https://rapidai.github.io/RapidOCRDocs
- awesome-repositories: https://awesome-repositories.com/repository/rapidai-rapidocr.md

## Topics

`chineseocr` `crnn` `dbnet` `easyocr` `ocr` `onnxocr` `onnxruntime` `openvino` `paddleocr` `rapidocr` `rapidocronnxruntime`

## Description

RapidOCR is an offline deep-learning OCR engine that detects and recognizes text in images using ONNX Runtime, operating entirely without an internet connection. It provides a unified inference pipeline that runs across multiple platforms including Windows, Linux, macOS, Android, and Raspberry Pi, with programming language bindings for Python, C++, Java, and C#.

The engine separates text detection and recognition into independent modules that can be swapped or fine-tuned individually. The inference backend sits behind a unified interface, so users can switch between ONNX Runtime, OpenVINO, PaddlePaddle, PyTorch, MNN, and TensorRT. It supports over 80 languages by pairing language-specific recognition models with a shared text detection backbone, and offers both lightweight mobile-optimized and higher-accuracy server-grade model variants, selected at runtime.
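The modular design described above can be illustrated with a minimal, self-contained Python sketch. It is not RapidOCR's actual code or API: the names `Detector`, `Recognizer`, `OcrPipeline`, `BACKENDS`, and `build_pipeline` are hypothetical, and the stub classes return canned values instead of running a model. It only shows how detection and recognition can stay independent while the backend is chosen by name at runtime.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2).
Box = Tuple[int, int, int, int]


class Detector(Protocol):
    def detect(self, image: bytes) -> List[Box]: ...


class Recognizer(Protocol):
    def recognize(self, image: bytes, box: Box) -> Tuple[str, float]: ...


@dataclass
class OcrPipeline:
    """Chains any detector with any recognizer; neither depends on the other."""

    detector: Detector
    recognizer: Recognizer

    def run(self, image: bytes) -> List[Tuple[Box, str, float]]:
        results = []
        for box in self.detector.detect(image):
            text, score = self.recognizer.recognize(image, box)
            results.append((box, text, score))
        return results


class StubDetector:
    """Stand-in detector that always reports one fixed region."""

    def detect(self, image: bytes) -> List[Box]:
        return [(0, 0, 10, 5)]


class StubRecognizer:
    """Stand-in recognizer that only records which backend it represents."""

    def __init__(self, backend: str) -> None:
        self.backend = backend

    def recognize(self, image: bytes, box: Box) -> Tuple[str, float]:
        return (f"text via {self.backend}", 0.9)


# Registry mapping a backend name to a factory (illustrative entries only).
BACKENDS: Dict[str, Callable[[], Recognizer]] = {
    "onnxruntime": lambda: StubRecognizer("onnxruntime"),
    "openvino": lambda: StubRecognizer("openvino"),
}


def build_pipeline(backend: str) -> OcrPipeline:
    """Builds a pipeline whose recognizer comes from the named backend."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    return OcrPipeline(StubDetector(), BACKENDS[backend]())
```

In this sketch, switching backends is a one-argument change (`build_pipeline("openvino")` instead of `build_pipeline("onnxruntime")`), and the calling code that consumes `run()` stays the same.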

The project includes a command-line tool that extracts text from images and URLs along with bounding boxes and confidence scores. Its programmatic output is structured, with separate fields for bounding boxes, recognized text, and confidence scores. It can also classify text line orientation before recognition to improve accuracy, and visualize results by drawing detected text regions onto the original image.
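A result object with separate fields for boxes, text, and scores might look like the following sketch. The class `OcrOutput`, its field names, and the `above` helper are assumptions made for illustration, not RapidOCR's actual output type; the sample values are invented. The point is that parallel fields make it easy to filter or post-process one aspect (here, confidence) while keeping the three lists aligned.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class OcrOutput:
    """Hypothetical container: index i of each list describes one text line."""

    boxes: List[List[Point]]  # four corner points per text line
    txts: List[str]           # recognized text per line
    scores: List[float]       # recognition confidence per line

    def __post_init__(self) -> None:
        if not (len(self.boxes) == len(self.txts) == len(self.scores)):
            raise ValueError("boxes, txts and scores must have the same length")

    def above(self, min_score: float) -> "OcrOutput":
        """Returns a new output keeping only lines with score >= min_score."""
        keep = [i for i, s in enumerate(self.scores) if s >= min_score]
        return OcrOutput(
            boxes=[self.boxes[i] for i in keep],
            txts=[self.txts[i] for i in keep],
            scores=[self.scores[i] for i in keep],
        )


# Invented sample data, for illustration only.
sample = OcrOutput(
    boxes=[
        [(0, 0), (50, 0), (50, 10), (0, 10)],
        [(0, 20), (40, 20), (40, 30), (0, 30)],
    ],
    txts=["Invoice", "T0tal"],
    scores=[0.98, 0.41],
)
confident = sample.above(0.5)
```

Here `confident.txts` holds only the first line, since the second falls below the 0.5 threshold.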

For deployment, the OCR engine can be packaged into a Docker container for consistent environments across platforms, or bundled into a standalone executable with PyInstaller so that the target machine does not need a separate Python installation. The project also includes utilities for converting PaddleOCR models to ONNX format and for fine-tuning them on custom data for specialized text recognition scenarios.

## Tags

### Artificial Intelligence & ML

- [OCR Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-inference-serving/inference-engines/onnx-runtime-inference/ocr-pipelines.md) — Runs the entire OCR pipeline through ONNX Runtime for offline, cross-platform text recognition.
- [ONNX Runtime Inference](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-inference-serving/inference-engines/onnx-runtime-inference.md) — Runs the entire OCR pipeline through ONNX Runtime for cross-platform deployment without external framework dependencies.
- [Offline Deployments](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-fine-tuning/pre-trained-model-zoos/model-deployment/offline-deployments.md) — Runs text recognition entirely on-device using converted ONNX models without internet access. ([source](https://cdn.jsdelivr.net/gh/rapidai/rapidocr@main/README.md))
- [OCR Language Support](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/language-tools/ocr-language-support.md) — Recognizes text in over 80 languages including Chinese, English, Japanese, and Arabic without internet access.
- [Multi-Language Recognition Models](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/language-tools/ocr-language-support/multi-language-recognition-models.md) — Combines over 80 language-specific recognition models with a shared detection backbone for multilingual text extraction.
- [OCR Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/ocr-engines.md) — Provides a complete offline OCR engine that runs across Windows, Linux, macOS, and mobile platforms. ([source](https://cdn.jsdelivr.net/gh/rapidai/rapidocr@main/README.md))
- [OCR API Bindings](https://awesome-repositories.com/f/artificial-intelligence-ml/optical-character-recognition/ocr-api-bindings.md) — Provides Python, C++, Java, and C# bindings for programmatic OCR access. ([source](https://rapidai.github.io/RapidOCRDocs/latest/))
- [OCR Command Line Interfaces](https://awesome-repositories.com/f/artificial-intelligence-ml/optical-character-recognition/ocr-command-line-interfaces.md) — Provides a command-line tool for extracting text from images and URLs with bounding boxes and confidence scores.
- [Cross-Platform Offline OCR](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-assistants/offline/cross-platform-offline-ocr.md) — Executes text recognition entirely offline on Windows, Linux, macOS, Android, and embedded devices using ONNX Runtime. ([source](https://rapidai.github.io/RapidOCRDocs/latest/))
- [OCR Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-assistants/offline/ocr-pipelines.md) — Provides a complete offline OCR pipeline supporting multiple languages and platforms through a unified ONNX-based engine.
- [Pre-Converted OCR Model Selectors](https://awesome-repositories.com/f/artificial-intelligence-ml/face-detection/model-selection/pre-converted-ocr-model-selectors.md) — Picks a pre-converted PaddleOCR model in ONNX, MNN, or PyTorch format and downloads it automatically. ([source](https://rapidai.github.io/RapidOCRDocs/main/model_list/))
- [Multi-Backend Abstractions](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/multi-backend-abstractions.md) — Abstracts ONNX, OpenVINO, PaddlePaddle, and TensorRT behind a unified inference interface.
- [OCR Deployments](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-deployment-and-serving/local-and-on-device-inference/edge-ai-model-deployment/cross-platform-deployments/ocr-deployments.md) — Runs OCR inference on Windows, Linux, Android, Web, and Raspberry Pi using multiple language bindings. ([source](https://rapidai.github.io/RapidOCRDocs))
- [PaddleOCR to ONNX Converters](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-deployment-and-serving/serialization-and-export-formats/onnx-model-exporters/paddleocr-to-onnx-converters.md) — Converts PaddleOCR models to ONNX format for faster, cross-platform inference without external dependencies. ([source](https://rapidai.github.io/RapidOCRDocs))
- [OCR Model Customizers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-customization/ocr-model-customizers.md) — Fine-tunes base PaddleOCR models with custom data for specialized text recognition scenarios. ([source](https://cdn.jsdelivr.net/gh/rapidai/rapidocr@main/README.md))
- [Custom Data OCR Fine-Tuners](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-fine-tuning/custom-data-ocr-fine-tuners.md) — Retrains PaddleOCR models with custom labeled data for specialized text recognition scenarios. ([source](https://rapidai.github.io/RapidOCRDocs/latest/))
- [OCR Model Conversion and Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-fine-tuning/ocr-model-conversion-and-fine-tuning.md) — Converts PaddleOCR models to ONNX and retrains them on custom data for specialized text recognition.
- [OCR Model Fine-Tuners](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-fine-tuning/ocr-model-fine-tuners.md) — Retrains PaddleOCR models on custom data for specialized text recognition scenarios. ([source](https://rapidai.github.io/RapidOCRDocs/))
- [OCR Deployments](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-fine-tuning/pre-trained-model-zoos/model-deployment/offline-deployments/ocr-deployments.md) — Runs OCR fully offline on Windows, Linux, Android, Web, and Raspberry Pi using multiple language bindings. ([source](https://rapidai.github.io/RapidOCRDocs/))
- [OCR](https://awesome-repositories.com/f/artificial-intelligence-ml/model-format-converters/ocr.md) — Provides a utility to convert PaddleOCR models to ONNX format for cross-platform inference.
- [Backend Selectors](https://awesome-repositories.com/f/artificial-intelligence-ml/ocr-engines/backend-selectors.md) — Lets users switch between ONNX Runtime, OpenVINO, PaddlePaddle, PyTorch, MNN, and TensorRT backends. ([source](https://rapidai.github.io/RapidOCRDocs/main/model_list/))
- [Preconfigured Defaults](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-models/vision-language-models/ocr/preconfigured-defaults.md) — Runs text recognition out of the box with pre-configured models and no manual setup. ([source](https://rapidai.github.io/RapidOCRDocs/main/model_list/))
- [Structured OCR Outputs](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-models/vision-language-models/ocr/structured-ocr-outputs.md) — Returns structured OCR results with bounding boxes, text, and confidence scores as separate fields. ([source](https://rapidai.github.io/RapidOCRDocs/main/quickstart/))

### Part of an Awesome List

- [Text Extraction and OCR](https://awesome-repositories.com/f/awesome-lists/more/text-extraction-and-ocr.md) — Extracts text from images with bounding boxes and confidence scores via a command-line tool. ([source](https://rapidai.github.io/RapidOCRDocs/main/quickstart/))
- [OCR](https://awesome-repositories.com/f/awesome-lists/ai/model-variants/ocr.md) — Offers both mobile-optimized and server-grade OCR model variants selected at runtime.
- [Modular Detection and Recognition](https://awesome-repositories.com/f/awesome-lists/ai/text-recognition/modular-detection-and-recognition.md) — Separates text detection and recognition into independent, swappable modules for individual fine-tuning.

### Software Engineering & Architecture

- [Multi-Language Support](https://awesome-repositories.com/f/software-engineering-architecture/infrastructure-configuration-languages/multi-language-support.md) — Detects and recognizes text in Chinese, English, Japanese, Korean, Arabic, and other languages from images.

### Web Development

- [OCR Libraries](https://awesome-repositories.com/f/web-development/cross-platform-libraries/ocr-libraries.md) — Ships a text recognition library with Python, C++, Java, and C# bindings using a unified ONNX pipeline.

### DevOps & Infrastructure

- [OCR Deployments](https://awesome-repositories.com/f/devops-infrastructure/cross-platform-deployment-targets/ocr-deployments.md) — Integrates OCR into Python, C++, Java, or C# apps with a single pipeline across Windows, Linux, macOS, and Android.
- [PaddleOCR Model Converters](https://awesome-repositories.com/f/devops-infrastructure/model-conversion/paddleocr-model-converters.md) — Converts PaddleOCR-trained models into ONNX format for offline deployment and cross-language integration.

### Mobile Development

- [Mobile and Server OCR Models](https://awesome-repositories.com/f/mobile-development/mobile-model-deployment/mobile-and-server-ocr-models.md) — Switches between lightweight mobile-optimized and higher-accuracy server-grade OCR models. ([source](https://rapidai.github.io/RapidOCRDocs/main/model_list/))
