awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Structured Data Extraction · Awesome GitHub Repositories

3 repos

Tools that convert unstructured web or document content into clean, typed, and organized data formats.

Explore 3 awesome GitHub repositories in Data & Databases · Structured Data Extraction.

Awesome Structured Data Extraction GitHub Repositories

  • browser-use/browser-use

    78,576 · GitHub

    Browser-use is a framework for building autonomous agents that navigate, interact with, and extract data from web interfaces using natural language instructions. By acting as an orchestration layer between large language models and browser automation protocols, it enables the execution of complex, multi-step workflows

    Python · ai-agents · ai-tools · browser-automation
  • unclecode/crawl4ai

    60,452 · GitHub

    Crawl4AI is an AI-powered web crawling and data extraction engine designed to transform complex web content into structured formats. It functions as a headless browser orchestrator, enabling the navigation of dynamic websites, the execution of custom scripts, and the capture of visual assets like screenshots and PDFs.

    Python
  • docling-project/docling

    53,584 · GitHub

    Docling is a modular framework designed for document parsing, layout analysis, and structured data extraction. It transforms unstructured files and web content into a unified, hierarchical data model that preserves the spatial and semantic relationships between text, tables, images, and layout elements. By normalizing

    Python · ai · convert · document-parser