awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Document Ingestion Pipelines · Awesome GitHub Repositories

2 repos

Workflows that parse raw files into structured text chunks and metadata to facilitate semantic search and data retrieval.
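As a rough illustration of what such a workflow does, here is a minimal sketch that splits already-extracted text into overlapping chunks and attaches metadata to each one. The names `Chunk` and `chunk_document`, the metadata keys, and the fixed-size character splitting are assumptions made for this example; they are not taken from either repository listed below, which may chunk and index documents quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One piece of a document plus the metadata needed to trace it back."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_document(text, source, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only: real pipelines often split
    on sentences or tokens rather than raw characters.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    chunks = []
    step = chunk_size - overlap  # how far the window advances each time
    for index, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + chunk_size]
        chunks.append(
            Chunk(
                text=piece,
                metadata={
                    "source": source,
                    "chunk_index": index,
                    "start_char": start,
                },
            )
        )
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

In a full pipeline, each chunk's text would then typically be embedded and stored alongside its metadata so that a semantic search hit can be traced back to its source file and position.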

Explore 2 GitHub repositories in Data & Databases · Document Ingestion Pipelines.

Awesome Document Ingestion Pipelines GitHub Repositories

  • zylon-ai/private-gpt

    57,116 · GitHub

    This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov

    Parses raw files into structured text chunks and metadata to facilitate semantic search and data retrieval.

    Python
  • Mintplex-Labs/anything-llm

    54,751 · GitHub

    This platform serves as a comprehensive environment for managing private language models, document knowledge bases, and automated agent workflows within secure local infrastructure. It functions as a document-aware workspace that enables users to ingest diverse file formats into searchable repositories, ensuring that a

    Automates the ingestion of raw documents into structured text and metadata to enable efficient semantic search.

    JavaScript · ai-agents · custom-ai-agents · deepseek