awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Optical Character Recognition · Awesome GitHub Repositories

Technologies that convert images of printed or handwritten text into machine-readable digital data.

Explore 4 awesome GitHub repositories in Artificial Intelligence & ML · Optical Character Recognition.

Awesome Optical Character Recognition GitHub Repositories

  • microsoft/PowerToys

    129,929 · GitHub

    PowerToys is a collection of background-resident system utilities designed to extend native operating system functionality and streamline desktop workflows. It operates as a modular toolkit, utilizing a central plugin-based host architecture that allows users to dynamically enable or disable specific features for syste

    C# · advanced-paste · color-picker · command-palette
  • microsoft/markitdown

    87,305 · GitHub

    This project is an AI-powered document processing engine designed to transform diverse file formats into structured Markdown. By leveraging multimodal language models, it performs complex layout analysis and semantic text extraction, allowing for the conversion of both unstructured files and scanned images into machine

    Python · autogen · autogen-extension · langchain
  • tesseract-ocr/tesseract

    72,460 · GitHub

    Tesseract is a neural network-based optical character recognition engine designed to convert scanned images and digital documents into machine-readable, searchable text. It functions as both a command-line utility for automating large-scale digitization workflows and a cross-platform library that can be embedded into d

    C++ · hacktoberfest · lstm · machine-learning
  • PaddlePaddle/PaddleOCR

    70,931 · GitHub

    PaddleOCR is a comprehensive optical character recognition framework designed for detecting and transcribing text from images and documents into structured, machine-readable formats. It provides a modular computer vision pipeline that decouples image preprocessing, text detection, and character recognition into indepen

    Python · ai4science · chineseocr · document-parsing

Explore sub-tags

  • Mobile OCR Integrations: SDKs and wrappers for performing OCR on mobile platforms.
  • Multilingual OCR Support: Capabilities for configuring OCR engines to recognize multiple languages and scripts.
  • Multilingual Text Recognition: Models and algorithms capable of identifying and transcribing text across diverse languages and character sets.
  • OCR API Bindings: Language-specific interfaces for programmatic access to document recognition and pattern matching engines.
  • OCR Command Line Interfaces: Tools for executing optical character recognition tasks via terminal commands.
  • OCR Configuration Plugins: Plugins that allow integration of external language models or services into document text extraction workflows.
  • OCR Data Export Formats: Capabilities for exporting recognized text into structured machine-readable formats.
  • OCR Engines: Core engines for performing optical character recognition.
  • Page Segmentation Optimizers: Tools for configuring and optimizing document layout analysis and segmentation modes to improve OCR accuracy.
  • Screen Text Extractors: Utilities that perform OCR on arbitrary screen regions to capture non-selectable text.