# business-science/ai-data-science-team

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/business-science-ai-data-science-team).**

4,805 stars · 828 forks · Python · MIT

## Links

- GitHub: https://github.com/business-science/ai-data-science-team
- awesome-repositories: https://awesome-repositories.com/repository/business-science-ai-data-science-team.md

## Topics

`agents` `ai` `ai-engineer` `ai-engineering` `copilot` `data-science` `data-scientist` `generative-ai` `gpt` `machine-learning` `ml-engineer` `ml-engineering` `openai`

## Description

This project is a platform that orchestrates multiple AI agents to automate data science workflows, covering data loading, cleaning, feature engineering, modeling, and querying. It also serves as a natural language database query interface that converts plain English questions into SQL, and as a visual data pipeline builder.

Custom agents are generated on demand by filling prompt templates for tasks like data cleaning and feature engineering. Pipelines incorporate human-in-the-loop checkpoints that pause execution for review and approval. Intermediate results are saved as versioned files, enabling reuse and debugging. The visual pipeline editor combines manual steps with AI-assisted tasks, tracks lineage, and caches results for efficiency.
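
As a rough illustration of two ideas in this paragraph, the sketch below fills a prompt template to define a task-specific agent, then runs pipeline steps behind an approval checkpoint. Every name here (`PROMPT_TEMPLATE`, `build_agent_prompt`, `run_with_checkpoint`, the two step functions) is hypothetical and not taken from the project's API, and the LLM call itself is left out.

```python
from string import Template
from typing import Callable

# Hypothetical template; the project's real prompts are not shown in this listing.
PROMPT_TEMPLATE = Template(
    "You are a $role agent. Task: $task. Dataset columns: $columns."
)


def build_agent_prompt(role: str, task: str, columns: list[str]) -> str:
    """Fill the template to produce the prompt for a task-specific agent."""
    return PROMPT_TEMPLATE.substitute(
        role=role, task=task, columns=", ".join(columns)
    )


def run_with_checkpoint(
    steps: list[Callable[[dict], dict]],
    state: dict,
    approve: Callable[[str, dict], bool],
) -> dict:
    """Run steps in order, asking `approve` before each result is accepted."""
    for step in steps:
        proposed = step(state)
        if not approve(step.__name__, proposed):
            # Keep the last approved state and record where the run stopped.
            return {**state, "halted_at": step.__name__}
        state = proposed
    return state


def drop_missing(state: dict) -> dict:
    """Example cleaning step: drop rows that contain a missing value."""
    rows = [r for r in state["rows"] if None not in r.values()]
    return {**state, "rows": rows}


def add_total(state: dict) -> dict:
    """Example feature engineering step: add a derived column."""
    rows = [{**r, "total": r["price"] * r["qty"]} for r in state["rows"]]
    return {**state, "rows": rows}


prompt = build_agent_prompt("data cleaning", "drop incomplete rows", ["price", "qty"])
initial = {"rows": [{"price": 2.0, "qty": 3}, {"price": None, "qty": 1}]}
# A real reviewer would inspect the proposed state; here every step is approved.
result = run_with_checkpoint([drop_missing, add_total], initial, lambda name, s: True)
```

Passing the reviewer in as a callback keeps the checkpoint logic separate from how approval is collected, which could be a console prompt, a UI button, or an automated policy.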

The system also provides automated data exploration, generating summary reports, visualizations, and filtered tables from uploaded datasets. An interactive mode lets users explore data with AI-generated insights.
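
To make the summary-report idea concrete, here is a minimal sketch using only the Python standard library. It counts missing values per column and computes basic statistics for numeric columns. The function name `summarize` and the sample data are assumptions for illustration; the project's own reports are AI-generated and likely richer than this.

```python
import csv
import io
import statistics


def summarize(csv_text: str) -> dict:
    """Return per-column missing counts, plus basic stats for numeric columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    report = {}
    for column in reader.fieldnames or []:
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        entry = {"missing": len(values) - len(present)}
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = []  # non-numeric column: report missing counts only
        if numbers:
            entry.update(
                mean=statistics.mean(numbers), min=min(numbers), max=max(numbers)
            )
        report[column] = entry
    return report


# Hypothetical uploaded dataset with one missing temperature.
SAMPLE = "city,temp\nOslo,4\nCairo,30\nLima,\n"
report = summarize(SAMPLE)
```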

## Tags

### Artificial Intelligence & ML

- [Data Science Automation Platforms](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-coding-agent-platforms/data-science-automation-platforms.md) — Orchestrates multiple AI agents to automate end-to-end data science workflows from loading to querying.
- [Agent Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/agent-generators.md) — Creates specialized AI agents for data tasks like cleaning, wrangling, and feature engineering from plain-language requests.
- [Agent Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-systems-frameworks/conversational-voice-interaction/conversational-ai-agents/agent-generators.md) — Generates custom AI agents for data cleaning, wrangling, and feature engineering from plain-language requests. ([source](https://github.com/business-science/ai-data-science-team/tree/master/examples))
- [Data Agent Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-systems-frameworks/integration-deployment/agent-frameworks/configuration-and-specifications/agent-prompt-templates/data-agent-generators.md) — Generates specialized data agents for cleaning, wrangling, and feature engineering via prompt templates.
- [Agentic Workflow Automation](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-workflow-automation.md) — Chains multiple AI agents together, passing data and coordinating to complete complex workflows. ([source](https://github.com/business-science/ai-data-science-team/tree/master/examples))
- [Human-in-the-loop Controls](https://awesome-repositories.com/f/artificial-intelligence-ml/human-in-the-loop-controls.md) — Inserts pauses in the pipeline for human review of intermediate results before proceeding. ([source](https://github.com/business-science/ai-data-science-team/tree/master/examples))
- [Multi-Agent Orchestrators](https://awesome-repositories.com/f/artificial-intelligence-ml/multi-agent-orchestrators.md) — Orchestrates a chain of specialized AI agents that pass structured data between steps for data science workflows.
- [Natural Language Query Interfaces](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-query-interfaces.md) — Converts natural language questions into SQL queries and returns results from connected databases.
- [Human-in-the-Loop Approvals](https://awesome-repositories.com/f/artificial-intelligence-ml/step-based-schedulers/step-execution-engines/execution-step-controllers/human-in-the-loop-approvals.md) — Pauses automated pipeline execution for human review and approval before proceeding.
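
The natural language query interface tagged above follows a translate, execute, return loop. The sketch below shows that loop against an in-memory SQLite database. The translation step is a hypothetical stub (`stub_translate`) standing in for the LLM, and the `orders` table is invented for the example; none of this reflects the project's actual API.

```python
import sqlite3
from typing import Callable


def stub_translate(question: str) -> str:
    """Stand-in for the LLM step: maps one known question to SQL."""
    known = {
        "how many orders per customer?": (
            "SELECT customer, COUNT(*) FROM orders "
            "GROUP BY customer ORDER BY customer"
        ),
    }
    return known[question.strip().lower()]


def answer(
    conn: sqlite3.Connection,
    question: str,
    translate: Callable[[str], str],
) -> list[tuple]:
    """Translate a question to SQL, run it, and return the rows."""
    sql = translate(question)
    # Generated SQL is untrusted, so this sketch refuses anything but SELECT.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are executed in this sketch")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 10.0), ("bo", 5.0), ("ana", 7.5)],
)
rows = answer(conn, "How many orders per customer?", stub_translate)
```

The prefix check is only a minimal guard; a production setup would more likely rely on a read-only database connection or restricted credentials.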

### Data & Databases

- [AI Data Analysis Tools](https://awesome-repositories.com/f/data-databases/ai-data-analysis-tools.md) — Produces AI-generated summaries, visualizations, and filtered tables from uploaded datasets. ([source](https://github.com/business-science/ai-data-science-team/tree/master/apps))
- [LLM-Powered Exploration Tools](https://awesome-repositories.com/f/data-databases/data-exploration-tools/llm-powered-exploration-tools.md) — Uses large language models to automatically generate summaries, plots, and filtered tables from uploaded datasets.
- [Natural Language to SQL](https://awesome-repositories.com/f/data-databases/data-visualization-charts/natural-language-querying/natural-language-to-sql.md) — Converts natural language questions into SQL queries and returns results from databases.
- [Automated Exploration Interfaces](https://awesome-repositories.com/f/data-databases/dataset-explorers/automated-exploration-interfaces.md) — Generates AI summaries, plots, and filtered tables from uploaded datasets for quick exploration.
- [Automated Exploration Reporters](https://awesome-repositories.com/f/data-databases/data-exploration-tools/automated-exploration-reporters.md) — Automatically generates reports on missing values, correlations, and key summaries from datasets. ([source](https://github.com/business-science/ai-data-science-team/tree/master/apps))
- [Pipeline Intermediate Storages](https://awesome-repositories.com/f/data-databases/file-based-storage-systems/pipeline-intermediate-storages.md) — Stores intermediate agent outputs in versioned file directories for reuse and debugging.
- [Interactive Data Exploration Tools](https://awesome-repositories.com/f/data-databases/interactive-data-exploration-tools.md) — Launches an interactive interface where AI generates visualizations and insights from data. ([source](https://github.com/business-science/ai-data-science-team#readme))
- [Pipeline Intermediate Storages](https://awesome-repositories.com/f/data-databases/pipeline-intermediate-storages.md) — Stores intermediate pipeline results in structured, versioned folders for reuse and debugging. ([source](https://github.com/business-science/ai-data-science-team/tree/master/examples))
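
One simple way to picture the versioned intermediate storage described above is a directory per pipeline step with numbered files inside it. The layout (`<root>/<step>/vNNN.json`) and the helper names below are assumptions for illustration, not the project's actual on-disk format.

```python
import json
import tempfile
from pathlib import Path


def save_step_output(root: Path, step: str, data: dict) -> Path:
    """Write `data` to root/step/vNNN.json, incrementing the version each call."""
    step_dir = root / step
    step_dir.mkdir(parents=True, exist_ok=True)
    version = len(list(step_dir.glob("v*.json"))) + 1
    path = step_dir / f"v{version:03d}.json"
    path.write_text(json.dumps(data, indent=2))
    return path


def load_latest(root: Path, step: str) -> dict:
    """Read the highest-numbered version saved for a step."""
    latest = sorted((root / step).glob("v*.json"))[-1]
    return json.loads(latest.read_text())


root = Path(tempfile.mkdtemp())
first = save_step_output(root, "cleaning", {"rows_kept": 90})
second = save_step_output(root, "cleaning", {"rows_kept": 95})
latest = load_latest(root, "cleaning")
```

Because earlier versions are never overwritten, a failed or surprising run can be compared against the previous output of the same step.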

### User Interface & Experience

- [Visual Data Pipeline Builders](https://awesome-repositories.com/f/user-interface-experience/font-configurations/font-configurators/custom-font-build-generators/build-pipelines/visual-data-pipeline-builders.md) — Provides a visual workspace for creating data pipelines that combine manual steps and AI agent tasks with lineage tracking.
- [Visual Pipeline Builders](https://awesome-repositories.com/f/user-interface-experience/visual-pipeline-builders.md) — Provides a visual workspace editor for constructing data pipelines with AI-assisted and manual steps and lineage tracking. ([source](https://github.com/business-science/ai-data-science-team/blob/master/README.md))

### Programming Languages & Runtimes

- [Visual Pipeline DAG Executors](https://awesome-repositories.com/f/programming-languages-runtimes/runtime-execution-environments/runtime-environments/runtimes/graph-symbolic-execution-engines/directed-acyclic-graph-execution-engines/visual-pipeline-dag-executors.md) — Executes a directed acyclic graph of AI and manual steps with visual lineage tracking and caching.
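
The DAG execution with caching described above can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The step names, the `run_dag` helper, and the name-keyed cache are assumptions made for this example; the project's executor and its lineage tracking are not shown in this listing.

```python
from graphlib import TopologicalSorter
from typing import Callable


def run_dag(
    steps: dict[str, Callable],
    deps: dict[str, tuple[str, ...]],
    cache: dict[str, object],
) -> list[str]:
    """Run steps in dependency order; return the names that actually ran."""
    executed = []
    for name in TopologicalSorter(deps).static_order():
        if name in cache:
            continue  # reuse the cached result instead of recomputing
        inputs = [cache[d] for d in deps.get(name, ())]
        cache[name] = steps[name](*inputs)
        executed.append(name)
    return executed


# Hypothetical three-step pipeline: each step lists the steps it depends on.
deps = {"load": (), "clean": ("load",), "model": ("clean",)}
steps = {
    "load": lambda: [3, None, 5],
    "clean": lambda rows: [r for r in rows if r is not None],
    "model": lambda rows: sum(rows) / len(rows),
}

cache: dict[str, object] = {}
first_run = run_dag(steps, deps, cache)
second_run = run_dag(steps, deps, cache)  # everything is cached now
```

Note that a cache keyed only by step name never invalidates when an upstream step changes; a fuller version would key on a hash of the step's code and inputs.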

### Part of an Awesome List

- [Data Science](https://awesome-repositories.com/f/awesome-lists/ai/data-science.md) — AI-powered team for common data science tasks.
- [Real World Applications](https://awesome-repositories.com/f/awesome-lists/ai/real-world-applications.md) — AI-powered agents for automating common data science workflows.
