awesome-repositories.com
Structured Data Extractors · Awesome GitHub Repositories

2 repos

Tools that identify and transform unstructured document content into standardized, machine-readable formats.

Explore 2 awesome GitHub repositories matching Data & Databases · Structured Data Extractors.

Awesome Structured Data Extractors GitHub Repositories

  • opendatalab/MinerU

    54,523 · GitHub

    MinerU is a document parsing pipeline designed to transform unstructured files into machine-readable, structured data. It utilizes deep learning models to perform layout analysis, identifying document regions and extracting complex content such as mathematical expressions. By combining these neural network inferences w

    Python · ai4science · document-analysis · extract-data
  • docling-project/docling

    53,584 · GitHub

    Docling is a modular framework designed for document parsing, layout analysis, and structured data extraction. It transforms unstructured files and web content into a unified, hierarchical data model that preserves the spatial and semantic relationships between text, tables, images, and layout elements. By normalizing

    Python · ai · convert · document-parser