Workflows that parse raw files into structured text chunks and metadata to facilitate semantic search and data retrieval.
Explore 2 GitHub repositories matching data & databases · Document Ingestion Pipelines.
This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov
Parses raw files into structured text chunks and metadata to facilitate semantic search and data retrieval.
This platform serves as a comprehensive environment for managing private language models, document knowledge bases, and automated agent workflows within secure local infrastructure. It functions as a document-aware workspace that enables users to ingest diverse file formats into searchable repositories, ensuring that a
Automates the ingestion of raw documents into structured text and metadata to enable efficient semantic search.
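The descriptions above share one core step: turning raw document text into chunks that carry metadata. A minimal sketch of that step follows, assuming fixed-size character windows with overlap. It is not taken from either listed repository; the `Chunk` fields, the `chunk_document` name, and the default sizes are all hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # hypothetical metadata field: where the chunk came from
    index: int   # position of the chunk within its source document
    start: int   # character offset of the chunk in the original text


def chunk_document(text: str, source: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into overlapping fixed-size character windows."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each window advances
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(text[start:start + size], source, index, start))
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

Each resulting chunk could then be embedded and stored with its metadata so that a search hit can be traced back to its source file and offset. Real pipelines often split on sentence or section boundaries instead of raw character counts.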