awesome-repositories.com
Data Ingestion · Awesome GitHub Repositories

5 repos

Processes and services that receive, clean, and prepare raw data for entry into a storage system.

Explore 5 awesome GitHub repositories in Data & Databases · Data Ingestion.

Awesome Data Ingestion GitHub Repositories

  • infiniflow/ragflow

    73,425 · GitHub

    This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasonin

    Python · agent · agentic · agentic-ai
  • apache/superset

    70,587 · GitHub

    Superset is a web-based business intelligence platform designed for data exploration, visualization, and interactive dashboarding. It functions as a query-driven analytics engine that connects to various SQL databases, allowing users to perform ad-hoc analysis, define virtual metrics, and build complex data visualizati

    TypeScript · analytics · apache · apache-superset
  • zylon-ai/private-gpt

    57,116 · GitHub

    This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov

    Python
  • deepfakes/faceswap

    54,974 · GitHub

    Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users

    Python · deep-face-swap · deep-learning · deep-neural-networks
  • Mintplex-Labs/anything-llm

    54,751 · GitHub

    This platform serves as a comprehensive environment for managing private language models, document knowledge bases, and automated agent workflows within secure local infrastructure. It functions as a document-aware workspace that enables users to ingest diverse file formats into searchable repositories, ensuring that a

    JavaScript · ai-agents · custom-ai-agents · deepseek

Explore sub-tags

  • Data Cleanup Utilities: Commands and scripts to purge or reset local data stores.
  • Document Parsing Pipelines: Automated routines that parse diverse file formats into structured text chunks for downstream processing and analysis.
  • File Ingestion Services: Services that facilitate the ingestion of files into databases by extracting text and metadata for searchable context.
  • Image Data Loaders: Utilities for importing image sets and associated metadata into processing environments.
  • Ingestion Performance Optimizers: Settings and configurations to tune resource usage during data import.
  • Local Document Ingestion: Capabilities for importing and monitoring local file systems for document processing.