Why is ottermind/chat2db a recommended GitHub repository for Automated Exploratory Analysis?

Provides AI-driven analysis of spreadsheet files to extract patterns and insights using natural language processing.

Why is data-centric-ai-community/ydata-profiling a recommended GitHub repository for Automated Exploratory Analysis?

Automates the statistical summary and visualization of tabular datasets to identify patterns and quality issues.

Why is ydataai/pandas-profiling a recommended GitHub repository for Automated Exploratory Analysis?

Provides a framework that automatically generates statistical summaries and visual insights from tabular datasets.

Why is data-centric-ai-community/fg-data-profiling a recommended GitHub repository for Automated Exploratory Analysis?

Automatically generates statistical summaries and visual insights to discover patterns and anomalies in new datasets.

Why is pandas-profiling/pandas-profiling a recommended GitHub repository for Automated Exploratory Analysis?

Automatically generates statistical summaries and visual insights to facilitate the initial investigation of datasets.

Why is ydataai/ydata-profiling a recommended GitHub repository for Automated Exploratory Analysis?

Provides an automated framework for discovering data distributions, correlations, and quality issues within large datasets.

Why is microsoft/vscode-copilot-chat a recommended GitHub repository for Automated Exploratory Analysis?

Automatically generates a complete data analysis workflow, including notebook scaffolding and visualization code.

Why is growthbook/growthbook a recommended GitHub repository for Automated Exploratory Analysis?

Generates automated drill-down analyses for a single metric across multiple dimensions.

Why is j3ssie/osmedeus a recommended GitHub repository for Automated Exploratory Analysis?

Sends prompts to language models and exports generated analysis for use in subsequent workflow steps.

Why is lux-org/lux a recommended GitHub repository for Automated Exploratory Analysis?

Automates the exploratory data analysis process by recommending optimal chart types and axis mappings based on dataset attributes.

16 repositories

Awesome GitHub Repositories: Automated Exploratory Analysis

Frameworks that automatically generate statistical summaries and visual insights from raw datasets.

Distinct from general data analysis: focuses on automating the exploratory phase rather than on strategic or manual analysis.
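
As a rough illustration of what "automatically generate statistical summaries" can mean in practice, here is a minimal sketch that uses only the Python standard library. It is not taken from any of the repositories listed here; the function names `profile_column` and `profile_table` and the sample rows are hypothetical, and real profiling tools compute far more (correlations, distributions, charts).

```python
import statistics
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, missing count, distinct count, basic stats."""
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Treat the column as numeric only if every non-missing value is an int or float.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if present and len(numeric) == len(present):
        summary["type"] = "numeric"
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
        summary["mean"] = statistics.fmean(numeric)
    else:
        summary["type"] = "categorical"
        # Most frequent value, or None when the column is entirely missing.
        summary["top"] = Counter(present).most_common(1)[0][0] if present else None
    return summary


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample data, for illustration only.
rows = [
    {"city": "Lima", "temp_c": 19.5},
    {"city": "Oslo", "temp_c": 4.0},
    {"city": "Lima", "temp_c": None},
]
report = profile_table(rows)
for column, summary in report.items():
    print(column, summary)
```

The sketch infers a type for each column and then reports missing values, distinct counts, and either numeric statistics or the most frequent category, which is the kind of per-column overview the tools in this category produce without manual setup.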

Explore 16 awesome GitHub repositories matching data & databases · Automated Exploratory Analysis. Refine with filters or upvote what's useful.


OtterMind/Chat2DB
25,784 stars
Chat2DB is an AI-powered SQL client and multi-database GUI manager designed for managing various relational and NoSQL database systems. It serves as a visual database management tool and a natural language to SQL interface, allowing users to convert plain text descriptions into executable and optimized queries. The platform distinguishes itself through automated business intelligence capabilities, which include the generation of real-time data visualization dashboards and AI-driven data analysis from spreadsheets. To ensure data privacy, it supports secure local AI deployment, enabling large
Provides AI-driven analysis of spreadsheet files to extract patterns and insights using natural language processing.
Java · ai · bi · chatgpt

Data-Centric-AI-Community/ydata-profiling
13,618 stars
This library provides a diagnostic toolkit for automated data profiling and exploratory analysis. It generates comprehensive statistical summaries and visual reports for tabular datasets, enabling users to identify distribution patterns, missing values, and quality anomalies through a unified interface. The project distinguishes itself by offering differential analysis, which allows for the comparison of two dataset versions to track structural and statistical changes over time. It supports large-scale data processing through lazy evaluation and provides interactive widgets that embed directl
Automates the statistical summary and visualization of tabular datasets to identify patterns and quality issues.
Python

ydataai/pandas-profiling
13,610 stars
This project is an exploratory data analysis framework and profiling tool designed to generate comprehensive statistical reports from Pandas and Spark DataFrames. It functions as a data quality profiler that identifies missing values, duplicates, and high correlations within tabular datasets. The tool distinguishes itself through specialized capabilities for time-series analysis, extracting temporal statistics, seasonality, and auto-correlation plots. It also includes a dataset comparison utility to identify structural or content changes between different versions of a dataset. The analysis
Provides a framework that automatically generates statistical summaries and visual insights from tabular datasets.
Python

Data-Centric-AI-Community/fg-data-profiling
13,609 stars
This project is a data profiling and exploratory data analysis tool designed to generate automated quality reports for Pandas and Spark dataframes. It serves as a system for computing descriptive statistics, identifying correlations, and analyzing univariate and multivariate data patterns. The tool provides specialized capabilities for comparing different versions of datasets to identify changes in data quality and distributions. It includes a dedicated profiler for time-dependent data to extract statistical information such as seasonality and auto-correlation. The software covers a broad an
Automatically generates statistical summaries and visual insights to discover patterns and anomalies in new datasets.
Python

pandas-profiling/pandas-profiling
13,609 stars
This project is an exploratory data analysis library and profiling tool for Pandas and Spark DataFrames. It automates the initial investigation of datasets by generating comprehensive descriptive analysis reports, statistical summaries, and data quality warnings. The system functions as a data quality profiler to detect missing values, duplicate rows, and type inconsistencies. It includes a dataset comparison tool for identifying structural and content shifts between different versions of the same data, as well as specialized tools for time-series analysis to calculate auto-correlation and se
Automatically generates statistical summaries and visual insights to facilitate the initial investigation of datasets.
Python

ydataai/ydata-profiling
13,388 stars
Ydata-profiling is an automated exploratory data analysis framework designed to generate comprehensive statistical reports and visual summaries from dataframes. It functions as a diagnostic tool for assessing data quality, identifying missing values, duplicates, and outliers, while providing a scalable engine for profiling massive datasets across distributed enterprise environments. The project distinguishes itself through its ability to handle large-scale data through distributed task orchestration and lazy stream processing, which minimizes memory overhead during complex computations. It in
Provides an automated framework for discovering data distributions, correlations, and quality issues within large datasets.
Python · big-data-analytics · data-analysis · data-exploration

microsoft/vscode-copilot-chat
9,493 stars
This project is an AI-powered IDE extension and LLM coding assistant that provides a conversational interface for generating, refactoring, and debugging code. It functions as an AI agent framework and a Model Context Protocol client, connecting AI models to external data sources and tools to automate complex development tasks. The system is distinguished by its use of autonomous AI agents capable of multi-step task execution, including the ability to read files, modify code, and run terminal commands iteratively. It supports recursive agent orchestration through subagent delegation and employ
Automatically generates a complete data analysis workflow, including notebook scaffolding and visualization code.
TypeScript

growthbook/growthbook
7,351 stars
GrowthBook is a feature flagging and experimentation platform that utilizes a warehouse-native approach to data analysis. It serves as a system for managing feature rollouts and conducting A/B tests by executing SQL queries directly against existing data warehouses to calculate experiment results. The platform is distinguished by its integration of a Model Context Protocol server, which allows AI coding assistants and IDEs to manage flags and query analytics using natural language. It also provides specialized capabilities for AI model optimization, enabling the testing of prompts and models
Generates automated drill-down analyses for a single metric across multiple dimensions.
TypeScript · ab-testing · abtest · abtesting

j3ssie/Osmedeus
6,425 stars
Osmedeus is a security workflow orchestration engine that coordinates AI agents, shell commands, and scanning tools through declarative YAML pipelines. It functions as a distributed security scanner, a declarative workflow automator, and an AI agent framework for security, enabling automated multi-step security analysis with conditional branching, parallel execution, and distributed workers. The engine distinguishes itself through a hybrid runner model that executes workflow steps on the local host, inside Docker containers, or over SSH to remote machines, selected per step or module. It supp
Sends prompts to language models and exports generated analysis for use in subsequent workflow steps.
Go

lux-org/lux
5,380 stars
Lux is an automated exploratory data analysis tool designed to generate intelligent visual representations of pandas dataframes. It identifies patterns and trends by recommending optimal chart types and axis mappings based on the statistical attributes of a dataset. The tool functions as an interactive data profiling layer that lets users browse and query collections of charts using filters and wildcards. It also serves as a visualization code generator, translating automatically produced charts into programmatic code or HTML for manual refinement in external libraries. The system covers a wide range of exploratory analysis capabilities, including automated chart encoding, guided discovery through step recommendations, and the ability to export visual configurations as declarative specifications. This project integrates directly into pandas to override the default dataframe display with interactive visualization components.
Automates the exploratory data analysis process by recommending optimal chart types and axis mappings based on dataset attributes.
Python

ObservedObserver/visual-insights
4,653 stars
Visual Insights is an automated exploratory data analysis platform and causal inference tool designed to discover patterns and cause-and-effect relationships within datasets. It functions as an interactive data visualization library that uses a grammar-of-graphics approach to generate multidimensional charts and dashboards. The project is distinguished by a natural language interface that translates plain-text questions into answers and data visualizations by means of a language model. It provides a specialized framework for causal discovery and inference, allowing users to identify links between variables through interactive causal graphs and to run what-if analyses to validate hypotheses. The platform covers a wide range of capabilities, including visual data cleaning, statistical profiling, and automated dataset transformation. It supports integrating diverse data from local files and remote databases, and has a high-performance processing engine for handling large datasets locally. In addition, the system allows interactive analysis components to be embedded in web applications and notebooks.
Discovers patterns and trends in unfamiliar datasets using automated agents to generate multi-dimensional visualizations.
TypeScript

x0rz/EQGRP
4,201 stars
EQGRP is a remote access trojan framework and post-exploitation toolkit. It provides centralized command-and-control infrastructure for deploying persistent implants and managing remote agents across various operating systems. The project includes tools for digital forensic evasion, such as modifying system logs and filesystem timestamps to remove traces of execution. It has a network interception system for capturing and reconstructing data streams via hooks in the system root, as well as exploits designed for kernel privilege escalation to elevate process permissions to administrative root. The toolkit covers a wide range of capabilities, including remote code execution, shellcode packaging for signature evasion, and the exfiltration and analysis of mobile device logs and telecommunications records. It also provides utilities for binding network ports and browsing decrypted files.
Parses telecommunications call detail records to extract structured data for analysis.
Perl

Show-Me-the-Code/python
4,226 stars
This project is a curated library of Python code examples, educational resources, and programming tutorials. It functions as an educational repository designed to teach Python language fundamentals through practical implementation tasks, real-world exercises, and functional code snippets. The collection covers a diverse range of implementation examples, including the development of interactive websites and message boards using web frameworks. It also features scripts for audio speech processing, automated media processing for images, and the extraction of data from web content. Additional ca
Parses call detail records from spreadsheets and computes total talk time per month.
HTML

posit-dev/positron
3,969 stars
Positron is a data science integrated development environment and AI-powered code editor designed for polyglot development, specifically supporting Python and R. It functions as a remote compute workspace that separates the user interface from the execution kernel via SSH or container integration. The environment features a deep integration of large language models that provide context-aware suggestions and automated data analysis by accessing real-time interpreter state, in-memory objects, and plot outputs. It distinguishes itself through a polyglot runtime bridge that enables cross-language
Automatically generates and executes statistical summaries and visualizations to uncover insights from datasets.
TypeScript

ironcalc/IronCalc
3,750 stars
IronCalc is an XLSX spreadsheet engine and formula evaluator designed to compute numerical expressions and manage workbook structures. It utilizes a logic engine compatible with industry standards to evaluate formulas and manage cell dependencies. The project provides a comprehensive suite of specialized toolkits, including a financial calculation library for bond pricing and net present value, and an engineering math toolkit for complex number arithmetic and Bessel functions. It also features a web-based spreadsheet interface for creating and formatting workbooks. The engine covers a broad
Enables the creation of automated workflows to filter, sort, and aggregate large datasets using database-style criteria.
Rust · react · rust · self-hosted

Deepractice/PromptX
3,526 stars
PromptX is an LLM agent orchestration framework designed to execute multi-step workflows using autonomous agents. It features a sandboxed tool execution environment for secure filesystem operations and external API integrations, alongside a persona management system that defines professional roles and domain expertise to control agent behavior. The system implements a semantic memory network for persistent knowledge storage, utilizing graph-based memory and engrams to retain information across sessions. This cognitive memory includes specialized tools for knowledge graph visualization, allowi
Processes Excel files to generate insights, automate reports, and create data visualizations.
JavaScript

