This project is a Python framework that acts as a generative AI agent for programmatic data analysis. Users query structured data sources with natural language prompts, and the framework translates each request into executable code for analysis, data cleaning, and visualization. Because it keeps conversational context across turns, users can explore data iteratively, with each follow-up question building on earlier results.
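As a rough illustration of how conversational context might be carried between turns, the sketch below replays earlier question and code pairs into the next prompt. The `Conversation` class and its methods are hypothetical names chosen for this example, not the framework's actual API, and no language model is called.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical container for multi-turn context (not the project's API)."""

    turns: list[tuple[str, str]] = field(default_factory=list)

    def build_prompt(self, question: str) -> str:
        # Replay earlier question/code pairs so a follow-up can refer back to them.
        history = "\n".join(
            f"User: {q}\nGenerated code: {code}" for q, code in self.turns
        )
        return f"{history}\nUser: {question}".strip()

    def record(self, question: str, generated_code: str) -> None:
        # Store the turn once an answer has been produced for it.
        self.turns.append((question, generated_code))


chat = Conversation()
chat.record("Total sales by region?", "df.groupby('region')['sales'].sum()")
follow_up = chat.build_prompt("Now show only the top three.")
print(follow_up)
```

Keeping the generated code in the history, rather than only the questions, is one plausible way to let a follow-up such as "show only the top three" resolve against the earlier result.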
Two features set the framework apart: a semantic layer and a sandboxed execution model. The semantic layer maps raw datasets to descriptive metadata and relationships, which improves the accuracy of natural language interpretation. For security, all generated data processing code runs in isolated, sandboxed environments. Users can further tailor the system's behavior by registering custom skills, defining semantic schemas, and integrating external vector databases that supply domain-specific context and few-shot examples.
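To make the idea of a semantic layer more concrete, here is a minimal sketch of a schema that attaches descriptions and relationships to a table and renders them as prompt context. The `Column` and `TableSchema` classes, the `to_prompt_context` method, and the example `orders` table are all assumptions made for illustration; the framework's real schema format may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    dtype: str
    description: str


@dataclass
class TableSchema:
    """Hypothetical semantic schema (not the project's actual format)."""

    name: str
    description: str
    columns: list[Column] = field(default_factory=list)
    # Each relation is (local column, "other_table.column"); an assumed convention.
    relations: list[tuple[str, str]] = field(default_factory=list)

    def to_prompt_context(self) -> str:
        # Render the metadata as plain text that can be prepended to a prompt.
        lines = [f"Table {self.name}: {self.description}"]
        lines += [f"- {c.name} ({c.dtype}): {c.description}" for c in self.columns]
        lines += [f"- {src} references {dst}" for src, dst in self.relations]
        return "\n".join(lines)


orders = TableSchema(
    name="orders",
    description="One row per customer order",
    columns=[
        Column("order_id", "integer", "Unique order identifier"),
        Column("customer_id", "integer", "Customer who placed the order"),
        Column("amount", "float", "Order total in USD"),
    ],
    relations=[("customer_id", "customers.id")],
)
context = orders.to_prompt_context()
print(context)
```

With descriptions like these in the prompt, a request such as "average order value per customer" has a better chance of being mapped to the `amount` and `customer_id` columns than it would from bare column names alone.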
The platform supports data operations that include cross-source integration, automated transformation, and feature engineering. It provides a unified interface for connecting to multiple language model providers and to data sources such as local files and relational databases. Users can audit the code the system generates, configure deterministic outputs for reproducibility, and export visualizations to local storage.
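One way to picture auditing and reproducibility together is to keep the generated code alongside the settings that produced it and compare fingerprints across runs. The sketch below assumes hypothetical `RunConfig` and `RunRecord` types and illustrative `temperature` and `seed` settings; which options a given language model provider actually exposes, and how strictly they enforce determinism, varies.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    # Hypothetical settings; real providers expose different options.
    temperature: float = 0.0
    seed: int = 42


@dataclass(frozen=True)
class RunRecord:
    question: str
    generated_code: str
    config: RunConfig

    @property
    def code_fingerprint(self) -> str:
        # Hash the generated code so two runs can be compared at a glance.
        return hashlib.sha256(self.generated_code.encode("utf-8")).hexdigest()


def same_code(a: RunRecord, b: RunRecord) -> bool:
    # Two runs are treated as reproducible if they produced identical code.
    return a.code_fingerprint == b.code_fingerprint


config = RunConfig()
first = RunRecord("Average amount?", "df['amount'].mean()", config)
second = RunRecord("Average amount?", "df['amount'].mean()", config)
print(first.generated_code)      # the code a user would review when auditing
print(same_code(first, second))
```

Storing the code itself, not just its result, is what makes the audit possible: a reviewer can read exactly what ran, and a mismatch in fingerprints flags a run whose logic changed.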