1 repo
Frameworks for filtering, cleaning, and modifying data nodes before they are used in downstream processing or model generation.
Distinguishing note: Focuses on the transformation logic applied to retrieved data nodes, distinct from general data ETL processes.
Category: Artificial Intelligence & ML · Data Transformation Pipelines (1 repository).
LlamaIndex is a comprehensive development framework designed to connect private or external data sources to large language models. It functions as a data-centric toolkit that enables the construction of retrieval-augmented generation systems, allowing developers to build applications that provide context-aware answers based on specific organizational information. The project distinguishes itself through a robust agentic orchestration engine that supports the creation of autonomous agents capable of multi-step reasoning, memory management, and complex tool execution. Beyond simple retrieval, i
LlamaIndex supports filtering and transforming data nodes through custom processing classes, which modify retrieved information before it reaches the final response generation stage.
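The pattern described above can be illustrated with a minimal, library-free sketch. It does not import LlamaIndex or reproduce its API: `Node`, `ScoredNode`, `ScoreCutoff`, `WhitespaceCleaner`, and `run_pipeline` are hypothetical names chosen for illustration. It only shows the general idea of chaining small processing classes over retrieved nodes before they are handed to response generation.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace
from typing import Protocol


# Hypothetical stand-ins for a retrieved data node and its relevance score.
@dataclass
class Node:
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class ScoredNode:
    node: Node
    score: float


class NodeProcessor(Protocol):
    """Anything with a process() method can take part in the pipeline."""

    def process(self, nodes: list[ScoredNode], query: str) -> list[ScoredNode]: ...


class ScoreCutoff:
    """Filtering step: drop nodes whose score is below a threshold."""

    def __init__(self, min_score: float) -> None:
        self.min_score = min_score

    def process(self, nodes: list[ScoredNode], query: str) -> list[ScoredNode]:
        return [n for n in nodes if n.score >= self.min_score]


class WhitespaceCleaner:
    """Cleaning step: collapse runs of whitespace without mutating the input."""

    def process(self, nodes: list[ScoredNode], query: str) -> list[ScoredNode]:
        return [
            replace(n, node=replace(n.node, text=" ".join(n.node.text.split())))
            for n in nodes
        ]


def run_pipeline(
    nodes: list[ScoredNode], query: str, processors: list[NodeProcessor]
) -> list[ScoredNode]:
    """Apply each processor in order; the result feeds response generation."""
    for processor in processors:
        nodes = processor.process(nodes, query)
    return nodes


if __name__ == "__main__":
    retrieved = [
        ScoredNode(Node("  LlamaIndex   connects data \n to LLMs "), 0.9),
        ScoredNode(Node("unrelated"), 0.2),
    ]
    for item in run_pipeline(retrieved, "what is it?", [ScoreCutoff(0.5), WhitespaceCleaner()]):
        print(item.score, item.node.text)
```

Keeping each step in its own small class means filters and cleaners can be reordered or swapped independently. In a real project, the equivalent classes would subclass whatever base type the framework provides rather than these stand-ins.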