This project is a data processing engine and AI application platform designed for building production-grade machine learning workflows. It provides a unified programming model that handles both historical batch data and live stream ingestion, enabling the development of real-time ETL pipelines and scalable data transformation workflows. The framework distinguishes itself through differential dataflow execution, which propagates only changes through a pipeline rather than recomputing entire datasets. It supports distributed state management across worker nodes and utilizes incremental stream p
Vanna is a Python framework designed to build conversational interfaces that translate natural language into executable database queries. It functions as an enterprise-grade toolkit that connects language models to relational databases, allowing users to retrieve information through conversational prompts rather than manual code. The system maintains context across interactions by utilizing vector databases to store historical query patterns and schema metadata. The framework distinguishes itself through a focus on security and schema-aware generation. It incorporates granular access control,
This repository is a collection of guides, notebooks, and recipes for implementing advanced prompting techniques and workflow patterns with large language models. It serves as a prompt engineering guide, an evaluation suite for scoring prompt quality, and a framework for orchestrating agents and integrating external tools. The project provides implementation patterns for building applications with Claude, specifically focusing on coordinating multiple models to split complex tasks between high-reasoning and high-efficiency agents. It includes technical demonstrations for multimodal data proce
The open-source RAG platform: built-in citations, deep research, 22+ file formats, partitions, MCP server, and more.