awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Extraction and Ingestion Workflows · Awesome GitHub Repositories

2 repos

Specialized frameworks for the initial acquisition and structured parsing of raw data, focusing on plugin orchestration and state management rather than general transformation.

Explore 2 awesome GitHub repositories matching Data & Databases · Extraction and Ingestion Workflows.

  • scrapy/scrapy

    59,824 · GitHub

    Scrapy is a comprehensive framework designed for automated web data extraction and large-scale crawling. It operates on an asynchronous, event-driven engine that manages non-blocking network requests and data processing tasks, allowing for the efficient retrieval of structured information from web documents using path-

    Python · crawler · crawling · framework
  • deepfakes/faceswap

    54,974 · GitHub

    Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users

    Python · deep-face-swap · deep-learning · deep-neural-networks

Explore sub-tags

  • Extraction Data Structures: Specialized formats and schemas used to organize data during the initial extraction phase of a pipeline.
  • Extraction Pipeline Execution: Runtime environments that manage the scheduling, execution, and monitoring of data extraction tasks.
  • Item Pipelines: Modular components that process individual data items as they move through an extraction or transformation pipeline.
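
To make the "Item Pipelines" sub-tag concrete, here is a minimal sketch in plain Python of items passing through a chain of small processing stages. It does not use Scrapy or any other library listed here. The names `DropItem`, `strip_whitespace`, `require_title`, and `run_pipeline` are hypothetical and exist only for this illustration, as does the sample data.

```python
# A stdlib-only sketch of the item-pipeline idea: each stage takes one item,
# returns a (possibly modified) item, or raises DropItem to discard it.
# All names below are illustrative assumptions, not any framework's API.


class DropItem(Exception):
    """Raised by a stage to signal that the current item should be discarded."""


def strip_whitespace(item):
    """Trim surrounding whitespace from every string field."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in item.items()}


def require_title(item):
    """Discard items whose 'title' field is missing or empty."""
    if not item.get("title"):
        raise DropItem("missing title")
    return item


def run_pipeline(items, stages):
    """Pass each item through the stages in order, keeping the survivors."""
    kept = []
    for item in items:
        try:
            for stage in stages:
                item = stage(item)
        except DropItem:
            continue  # a stage rejected this item; move on to the next one
        kept.append(item)
    return kept


# Hypothetical sample data: the second record has a whitespace-only title.
raw = [{"title": "  Scrapy  ", "stars": 1}, {"title": "   ", "stars": 2}]
cleaned = run_pipeline(raw, [strip_whitespace, require_title])
```

Stage order matters in this sketch: trimming runs before validation, so a whitespace-only title is reduced to an empty string and then dropped.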