awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Data Processing Configurations · Awesome GitHub Repositories

2 repos

Settings and strategies for handling data ingestion, including chunking and constraint management.

Distinguishing note: Focuses on the configuration of data ingestion pipelines rather than raw storage or database management.

Explore 2 awesome GitHub repositories matching data & databases · Data Processing Configurations. Refine with filters or upvote what's useful.

Awesome Data Processing Configurations GitHub Repositories

  • crewAIInc/crewAI

    44,318 · View on GitHub ↗

    CrewAI is a multi-agent orchestration framework designed for building autonomous systems that execute complex, multi-step workflows. It provides a development platform where specialized agents are defined with specific roles, goals, and tool sets to perform tasks collaboratively. By leveraging a declarative workflow engine, the system manages task dependencies, state transitions, and execution logic, allowing for the creation of structured, stateful sequences of operations. The framework distinguishes itself through its hierarchical management capabilities, which utilize manager agents to coo

    CrewAI manages how files are processed when they exceed provider constraints by selecting modes like strict, auto, or chunking.

    Python · agents · ai · ai-agents
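The mode selection described above can be illustrated with a minimal sketch. This is not CrewAI's API: `ProviderLimit`, `prepare_file`, and the behavior assigned to each mode are hypothetical, and the sketch assumes that `auto` falls back to chunking when a file exceeds the limit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderLimit:
    """Hypothetical provider constraint: maximum payload size in bytes."""
    max_bytes: int


def prepare_file(data: bytes, limit: ProviderLimit, mode: str = "auto") -> List[bytes]:
    """Return the payload(s) to send, depending on the handling mode.

    Assumed semantics (illustrative only):
      strict   -> reject files that exceed the limit
      chunking -> split oversized files into pieces of at most max_bytes
      auto     -> send as-is when the file fits, otherwise chunk
    """
    if mode not in ("strict", "auto", "chunking"):
        raise ValueError(f"unknown mode: {mode!r}")
    if len(data) <= limit.max_bytes:
        return [data]
    if mode == "strict":
        raise ValueError(
            f"file is {len(data)} bytes, limit is {limit.max_bytes} bytes"
        )
    # "chunking" and (by assumption) "auto" both split the oversized file.
    step = limit.max_bytes
    return [data[i:i + step] for i in range(0, len(data), step)]
```

For example, `prepare_file(b"abcdef", ProviderLimit(max_bytes=4), mode="chunking")` splits the payload into two pieces, while `mode="strict"` raises instead.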
  • ray-project/ray

    41,400 · View on GitHub ↗

    Ray is a distributed computing framework designed to scale Python and Java applications across clusters by abstracting task scheduling and resource management. It functions as a resource-aware execution engine that manages task dependencies, placement, and fault tolerance across networked compute nodes. At its core, the system provides a stateful actor model, allowing developers to define classes that run in dedicated processes to maintain and mutate internal state across remote method calls. The framework distinguishes itself through a robust cross-language interoperability layer, enabling f

    Ray sets global parameters for block sizes and shuffle strategies to control data operations across the cluster.

    Python · data-science · deep-learning · deployment
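The idea of a global configuration for block sizes and shuffle strategies can be sketched in plain Python. This does not use Ray's API: `GlobalDataConfig`, `get_config`, `set_shuffle_strategy`, `num_blocks`, the default block size, and the strategy names are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass

# Hypothetical strategy names; not taken from Ray.
ALLOWED_SHUFFLE_STRATEGIES = ("pull", "push")


@dataclass
class GlobalDataConfig:
    """Hypothetical process-wide settings for data operations."""
    target_max_block_size: int = 128 * 1024 * 1024  # bytes, assumed default
    shuffle_strategy: str = "pull"


# One shared instance, so every operation reads the same settings.
_CONFIG = GlobalDataConfig()


def get_config() -> GlobalDataConfig:
    """Return the shared configuration object."""
    return _CONFIG


def set_shuffle_strategy(name: str) -> None:
    """Validate and store the shuffle strategy on the shared config."""
    if name not in ALLOWED_SHUFFLE_STRATEGIES:
        raise ValueError(f"unknown shuffle strategy: {name!r}")
    _CONFIG.shuffle_strategy = name


def num_blocks(total_bytes: int) -> int:
    """Number of blocks needed so that no block exceeds the target size."""
    return max(1, math.ceil(total_bytes / get_config().target_max_block_size))
```

Changing `get_config().target_max_block_size` once affects every later call to `num_blocks`, which is the point of keeping these parameters global rather than passing them to each operation.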