# hiroi-sora/Umi-OCR

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/hiroi-sora-umi-ocr).**

42,159 stars · 4,158 forks · Python · MIT

## Links

- GitHub: https://github.com/hiroi-sora/Umi-OCR
- awesome-repositories: https://awesome-repositories.com/repository/hiroi-sora-umi-ocr.md

## Topics

`ocr` `ocr-python` `paddleocr` `qml` `qt` `screenshot` `umi-ocr`

## Description

Umi-OCR is an optical character recognition (OCR) engine that converts text in images and documents into machine-readable characters. It is a local-first toolkit: all visual data is processed on the host machine using embedded neural network models, which keeps data private and allows offline use.

The project is distinguished by its focus on automated document digitization and its built-in barcode and QR code decoding. A modular, Python-based orchestration layer lets users turn static image files and multi-page documents into searchable text. For high-volume work, it uses asynchronous task queueing to maintain throughput during batch processing.
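The idea of asynchronous task queueing for batch work can be illustrated with a minimal sketch. This is not Umi-OCR's own code: `recognize`, `worker`, and `run_batch` are hypothetical names, and `recognize` is a placeholder that returns dummy text instead of calling a real OCR model.

```python
import asyncio
from pathlib import Path


async def recognize(path: Path) -> str:
    """Hypothetical stand-in for an OCR call; returns placeholder text."""
    await asyncio.sleep(0)  # yield control, as a real non-blocking call would
    return f"text from {path.name}"


async def worker(queue: asyncio.Queue, results: dict) -> None:
    """Pull paths off the queue until cancelled, storing each result."""
    while True:
        path = await queue.get()
        try:
            results[path.name] = await recognize(path)
        finally:
            queue.task_done()  # mark the item finished even if recognition fails


async def run_batch(paths: list, workers: int = 2) -> dict:
    """Queue every path, process them with a small worker pool, return results."""
    queue: asyncio.Queue = asyncio.Queue()
    results: dict = {}
    for path in paths:
        queue.put_nowait(Path(path))
    tasks = [asyncio.create_task(worker(queue, results)) for _ in range(workers)]
    await queue.join()  # wait until every queued item is marked done
    for task in tasks:
        task.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

A batch could then be run with `asyncio.run(run_batch(["a.png", "b.png"]))`. The queue decouples submitting files from processing them, so new work can be added without waiting for earlier items to finish.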

Beyond its core recognition capabilities, the software provides a command-line interface for automating repetitive extraction workflows. The interface exposes internal processing functions to external scripts, so batch recognition tasks can run without manual intervention. Cross-platform native integration keeps functionality consistent across operating systems.
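As a rough illustration of how an external script might hand an image to a locally running instance, the sketch below prepares a JSON request for the HTTP interface that the tag sources on this page link to. The URL, port, and the `base64` field name are assumptions that should be checked against the project's HTTP documentation; `build_ocr_request` and `send` are hypothetical helper names, not part of Umi-OCR.

```python
import base64
import json
import urllib.request

# Assumed default address and path; verify against the project's HTTP docs.
API_URL = "http://127.0.0.1:1224/api/ocr"


def build_ocr_request(image_bytes: bytes, url: str = API_URL) -> urllib.request.Request:
    """Wrap raw image bytes in a JSON POST request (field name is assumed)."""
    body = json.dumps({"base64": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON reply; needs a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request is kept separate from sending it, so the payload can be inspected or tested without a server running.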

## Tags

### Artificial Intelligence & ML

- [Optical Character Recognition](https://awesome-repositories.com/f/artificial-intelligence-ml/optical-character-recognition.md) — Performs optical character recognition on image files to extract text and associated metadata. ([source](https://github.com/hiroi-sora/Umi-OCR/tree/main/docs/http))
- [Local Inference Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/local-inference-engines.md) — Processes visual data entirely on the host machine using embedded neural network models for privacy and offline use.
- [Document Analysis Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/document-analysis-tools.md) — Converts document pages into readable text by analyzing page layouts and returning identified character strings. ([source](https://github.com/hiroi-sora/Umi-OCR/tree/main/docs/http))
- [Barcode Decoders](https://awesome-repositories.com/f/artificial-intelligence-ml/barcode-decoders.md) — Decodes barcodes and QR codes found in image files and returns the text they encode. ([source](https://github.com/hiroi-sora/Umi-OCR/tree/main/docs/http))

### Content Management & Publishing

- [Digitization Systems](https://awesome-repositories.com/f/content-management-publishing/digitization-systems.md) — Converts large volumes of scanned documents or images into searchable text files automatically.

### Development Tools & Productivity

- [Orchestration Frameworks](https://awesome-repositories.com/f/development-tools-productivity/orchestration-frameworks.md) — Coordinates image processing pipelines and document parsing tasks through a modular script-based architecture.
- [Workflow Automation](https://awesome-repositories.com/f/development-tools-productivity/workflow-automation.md) — Executes repetitive text extraction tasks across multiple files using command line tools to improve efficiency.
- [Automation Scripts](https://awesome-repositories.com/f/development-tools-productivity/automation-scripts.md) — Executes batch text recognition processes through command line scripts to handle multiple files without manual intervention. ([source](https://github.com/hiroi-sora/Umi-OCR/tree/main/docs/http))

### Software Engineering & Architecture

- [Task Queues](https://awesome-repositories.com/f/software-engineering-architecture/task-queues.md) — Distributes document and image analysis jobs across a non-blocking execution pipeline for improved throughput.