Why is donnemartin/data-science-ipython-notebooks a recommended Exploratory Data Analysis repository?

Provides techniques for cleaning and manipulating tabular data to visualize trends and extract statistical insights.

Why is saulpw/visidata a recommended Exploratory Data Analysis repository?

Provides tools for generating summary statistics, pivot tables, and frequency distributions to identify patterns in datasets.

Why is tidyverse/ggplot2 a recommended Exploratory Data Analysis repository?

Enables discovery of patterns and statistical insights through the creation of layered plots and faceted grids.

Why is hadley/ggplot2 a recommended Exploratory Data Analysis repository?

Facilitates the rapid generation of various plots to discover patterns and statistical insights in datasets.

Why is observablehq/plot a recommended Exploratory Data Analysis repository?

Provides an API for rapidly transforming tabular data into charts to discover patterns and statistical insights.

Why is man-group/dtale a recommended Exploratory Data Analysis repository?

Provides a visual interface for identifying patterns, outliers, and missing values in datasets.

Why is hadley/r4ds a recommended Exploratory Data Analysis repository?

Teaches the iterative process of manipulating and visualizing datasets to discover statistical patterns and insights.

Why is javascriptdata/danfojs a recommended Exploratory Data Analysis repository?

Provides tools for calculating descriptive statistics and generating charts to discover patterns in datasets.

13 repositories

Awesome GitHub Repositories · Exploratory Data Analysis

The process of cleaning and manipulating datasets to discover patterns and statistical insights.

Distinct from Automated Exploratory Analysis: focuses on the manual exploratory process using pandas/NumPy rather than on automated analysis frameworks.
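
As a rough illustration of what the manual process with pandas/NumPy can look like, here is a minimal sketch. The dataset, column names, and the choice to fill gaps with the median are all assumptions made for the example and do not come from any repository listed here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; the column names are illustrative only.
raw = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Lima", "Lima", "Lima", None],
    "temp_c": [3.0, np.nan, 19.0, 21.0, 20.0, 15.0],
})

# Cleaning: drop rows without a city, then fill missing temperatures
# with the median of the remaining values (one of many possible choices).
clean = raw.dropna(subset=["city"]).copy()
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].median())

# Manipulation and summary: per-city mean and row count.
summary = clean.groupby("city")["temp_c"].agg(["mean", "count"])
print(summary)
```

Each step is written and inspected by hand, which is the contrast with automated analysis frameworks that generate a full report in one call.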

Explore 13 GitHub repositories matching Data & Databases · Exploratory Data Analysis.


donnemartin/data-science-ipython-notebooks
29,166
This project is a collection of interactive Python notebooks and educational resources designed for mastering data science, machine learning, and numerical computing. It provides a series of practical guides and tutorials covering deep learning, big data processing, and statistical analysis. The repository features specialized instructional suites for implementing classical machine learning algorithms, building deep learning model architectures, and managing AWS cloud infrastructure. It includes dedicated notebooks for data visualization and numerical computing exercises. The project covers
Provides techniques for cleaning and manipulating tabular data to visualize trends and extract statistical insights.
Python · aws · big-data · caffe
saulpw/visidata
8,834
VisiData is a terminal-based interactive data analysis tool and browser designed for exploring, filtering, and sorting large tabular datasets. It functions as a structured data inspector that loads and flattens complex formats like JSON, XML, and PCAP into interactive sheets, as well as a terminal file manager for navigating directories and performing staged filesystem operations. The project distinguishes itself by rendering data visualizations, such as scatter plots and histograms, directly in the terminal using Unicode Braille characters. It provides a Python-based data wrangling environme
Provides tools for generating summary statistics, pivot tables, and frequency distributions to identify patterns in datasets.
Python · cli · csv · datajournalism
jvns/pandas-cookbook
7,086
This project is a pandas data analysis cookbook and Python data science guide. It provides a collection of programmatic recipes and examples for cleaning, manipulating, and analyzing structured data. The project focuses on providing a containerized analysis environment to ensure a consistent workspace and reproducible dependencies when executing data processing scripts. It covers a broad range of data science capabilities, including data ingestion from external sources, raw data cleaning, and exploratory data analysis. The recipes demonstrate how to perform structured data analysis through techniques such as filtering, aggregating grouped data, and processing text data.
Uses pandas for cleaning and manipulating datasets to discover patterns and statistical insights.
Jupyter Notebook
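
The three techniques named in the pandas-cookbook description (text processing, filtering, and grouped aggregation) can be sketched roughly as follows. The records and column names are hypothetical and are not taken from the cookbook itself.

```python
import pandas as pd

# Hypothetical complaint records; not data from the cookbook.
df = pd.DataFrame({
    "borough": ["Bronx", "Queens", "Bronx", "Queens", "Bronx"],
    "complaint": [" Noise - Street", "noise - park", "Heating",
                  "NOISE - Street ", "Heating"],
    "hours_open": [2, 5, 30, 1, 12],
})

# Text processing: normalize stray whitespace and inconsistent casing.
df["complaint"] = df["complaint"].str.strip().str.lower()

# Filtering: keep only the noise complaints.
noise = df[df["complaint"].str.startswith("noise")]

# Grouped aggregation: mean hours open per borough.
noise_by_borough = noise.groupby("borough")["hours_open"].mean()
print(noise_by_borough)
```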
tidyverse/ggplot2
6,948
ggplot2 is a data visualization library for R based on a formal grammar of graphics. It provides a declarative plotting framework that allows users to create complex graphics by combining geometric objects, statistical summaries, and coordinate systems. The system is distinguished by a layered approach to composition, where visualizations are built incrementally by stacking independent geometric, statistical, and coordinate layers. It utilizes a hierarchical styling engine to manage non-data elements such as backgrounds, fonts, and margins, and includes a multi-panel faceting tool for splitti
Enables discovery of patterns and statistical insights through the creation of layered plots and faceted grids.
R
hadley/ggplot2
6,948
ggplot2 is an R data visualization library and statistical graphics engine. It implements a grammar of graphics that functions as a declarative plotting framework, allowing users to specify what a plot should contain rather than how to draw it. The system builds visualizations by mapping data variables to visual aesthetics through a structured set of layering rules. This approach enables the composition of complex graphics by stacking independent components, such as geometric objects and scales, on top of a shared coordinate system. The framework supports scientific plotting and exploratory
Facilitates the rapid generation of various plots to discover patterns and statistical insights in datasets.
R
WillKoehrsen/Data-Analysis
5,543
This project is a Python data analysis library and exploratory data analysis framework designed to process raw datasets. It provides a suite of tools for examining data, identifying anomalies, and applying statistical methods to uncover patterns. The repository serves as a machine learning modeling toolkit and a statistical data modeling suite. It includes predictive algorithms and mathematical models used to analyze relationships between data variables and derive insights from complex datasets. The project covers a broad range of capabilities, including data science, machine learning modeling, and exploratory data analysis. These are implemented through data manipulation, numerical computation, and data visualization.
Provides a framework for cleaning and manipulating datasets to discover patterns and identify statistical anomalies.
Jupyter Notebook
observablehq/plot
5,305
This is a grammar of graphics visualization library used to build charts by mapping tabular data to visual marks. It functions as an SVG data visualization tool and an exploratory data analysis API, allowing users to render complex visualizations and geographic maps. The library features a GeoJSON map renderer that projects spherical coordinates into two-dimensional pixel space and an Apache Arrow visualization interface for high-efficiency data processing. Its capability surface covers data transformation through binning and grouping, visual encoding via automatic scale inference and color
Provides an API for rapidly transforming tabular data into charts to discover patterns and statistical insights.
HTML · charts · d3 · data-visualization
man-group/dtale
5,170
dtale is a web-based interactive grid and visualizer for pandas dataframes, designed as an exploratory data analysis tool. It provides a browser-based interface for analyzing tabular data structures, allowing users to calculate statistics, detect outliers, and compute correlations without writing manual code. The project functions as an embedded data viewer that can be integrated into web applications via iframes or custom routes, with specific support for Django, Flask, and Streamlit. It enables the exploration of datasets through a combination of an interactive data grid and a data visualiz
Provides a visual interface for identifying patterns, outliers, and missing values in datasets.
TypeScript · data-analysis · data-science · data-visualization
hadley/r4ds
5,070
r4ds is a data science curriculum and educational resource designed for mastering the R programming language. It provides a structured learning path for the end-to-end process of importing, tidying, transforming, and visualizing data. The project emphasizes a reproducible data science guide and a comprehensive curriculum for data wrangling. It includes specialized tutorials on the grammar of graphics for layered data visualization, along with technical publications built with Quarto that blend executable code with narrative prose. The material covers a broad range of analytical capabilities, including data ingestion from diverse sources, relational data joining, and the management of categorical variables. It also addresses data cleaning, mathematical modeling, and the creation of multi-format professional reports and presentations. The curriculum focuses on the practical application of functional programming and tidy data principles to build transparent and repeatable analyses.
Teaches the iterative process of manipulating and visualizing datasets to discover statistical patterns and insights.
R
javascriptdata/danfojs
5,050
Danfo.js is a data analysis and preprocessing library for JavaScript that provides high-performance labeled data structures. It implements data frames and series to enable complex data analysis, statistical computing, and manipulation of structured tabular data. The project functions as a machine learning preprocessing library, providing utilities for categorical label encoding, one-hot encoding, and numeric feature scaling and standardization. In particular, it supports converting labeled data structures into tensors for model training and evaluation. The library covers a broad set of capabilities, including descriptive statistics, relational operations such as merging and joining, and time-series processing.
Provides tools for calculating descriptive statistics and generating charts to discover patterns in datasets.
TypeScript · danfojs · data-analysis · data-analytics
Nyandwi/machine_learning_complete
4,983
This is an interactive notebook-based course that teaches machine learning from Python fundamentals through deep learning and natural language processing. It uses real datasets and multiple frameworks within a structured, hands-on curriculum that combines concise explanations with executable code cells, built-in datasets, and embedded exercise checkpoints. Learning progresses through data preparation and exploration, classical machine learning workflows, computer vision with convolutional neural networks, and natural language processing with deep learning, all delivered as a cohesive progressi
Guides users through cleaning and manipulating datasets to discover patterns and optimize features for modeling.
Jupyter Notebook · computer-vision · data-analysis · data-science
ResidentMario/missingno
4,209
missingno is a Python library for the visualization and analysis of missing data patterns. It provides a set of tools to profile dataset completeness, map data gaps, and quantify the volume of null values across variables. The library differentiates itself through a nullity correlation analyzer and a hierarchical data clustering tool. These components allow for the detection of systemic dependencies and trends by measuring how the absence of one variable relates to the absence of another. The toolset covers broader data quality auditing and exploratory analysis capabilities. It includes feat
Enables exploratory data analysis by visualizing the distribution and volume of null values.
Python · data-analysis · data-visualization · missing-data
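
The idea behind nullity correlation in the missingno description (how the absence of one variable relates to the absence of another) can be approximated in plain pandas. The sketch below deliberately does not use missingno's own API, and the toy columns are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "a" and "b" are missing together, "c" is missing
# exactly where "a" is present.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [5.0, np.nan, 7.0, np.nan],
    "c": [np.nan, 2.0, np.nan, 4.0],
})

# Volume of null values per column.
null_counts = df.isnull().sum()

# Nullity correlation: correlate the 0/1 "is missing" indicators.
nullity_corr = df.isnull().astype(int).corr()
print(null_counts)
print(nullity_corr)
```

A value near 1 means two columns tend to be missing in the same rows, and a value near -1 means one tends to be present when the other is missing.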
IBM/mcp-context-forge
3,310
mcp-context-forge is a Model Context Protocol federation gateway that unifies diverse AI tool servers and APIs into a single consistent interface for discovery and execution. It acts as a centralized proxy that aggregates multiple servers and APIs, allowing AI agents to access and invoke a unified set of tools, prompts, and resources. The project distinguishes itself through a multi-protocol translation bridge that converts communication between standard I/O, SSE, gRPC, and REST to enable interoperability between disparate tool servers. It includes a comprehensive LLM evaluation framework for
Performs descriptive statistical analysis to identify data distributions and correlations.
Python · agents · ai · api-gateway

