awesome-repositories.com
Automated Data Extraction · Awesome GitHub Repositories

1 repo

Systems for converting unstructured media into structured digital formats.

Distinguishing note: Focuses on the conversion process rather than the underlying recognition engine.

Explore 1 awesome GitHub repository in Data & Databases · Automated Data Extraction.

Awesome Automated Data Extraction GitHub Repositories

  • naptha/tesseract.js

    37,866 · View on GitHub↗

    Tesseract.js is a JavaScript library that provides optical character recognition capabilities directly within web browsers and Node.js environments. It functions as a client-side engine, enabling the conversion of images containing printed text into machine-readable strings without the need for external APIs or server-side infrastructure. The library distinguishes itself by running the original C++ optical character recognition engine within the browser through WebAssembly modules. To maintain interface responsiveness during intensive computation, it utilizes background threads for parallel p

    Converts scanned documents or photographs into structured digital data to streamline workflows like form processing and information retrieval.

    JavaScript · deep-learning, javascript, ocr