# pair-code/facets

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/pair-code-facets).**

7,340 stars · 887 forks · Jupyter Notebook · Apache-2.0 · archived

## Links

- GitHub: https://github.com/PAIR-code/facets
- Homepage: https://pair-code.github.io/facets/
- awesome-repositories: https://awesome-repositories.com/repository/pair-code-facets.md

## Description

Facets is a set of interactive software tools for the statistical analysis, distribution visualization, and multidimensional exploration of machine learning datasets. It provides a visual interface for finding outliers and missing values in numeric and string data, and is designed for auditing dataset quality and spotting skews between training and validation sets.

The system uses facet-based visualization and interactive bucketing to lay out individual data points along multiple feature axes. Synchronized view filtering and animated transitions between dimensions preserve visual context while navigating large datasets, which helps surface systematic classifier failures and other model errors.

The toolkit covers two tasks: auditing dataset distributions through feature-by-feature statistical analysis, and exploring individual data points through coordinated filtering. Distribution skews are detected from statistics aggregated client-side.
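
To make the feature-by-feature analysis concrete, here is a minimal sketch in plain Python of the kind of summary such a tool might compute: missing-value counts, basic statistics, and a crude train/validation comparison. It does not use the Facets API. The function names `summarize` and `mean_shift`, the sample rows, and the choice of a standardized mean difference as the skew measure are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def summarize(rows, feature):
    """Summarize one numeric feature; None marks a missing value."""
    values = [row.get(feature) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "mean": mean(present) if present else None,
        "std": pstdev(present) if present else None,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def mean_shift(train, validation, feature):
    """Hypothetical skew measure: mean difference in units of the train std."""
    t = summarize(train, feature)
    v = summarize(validation, feature)
    if t["mean"] is None or v["mean"] is None or not t["std"]:
        return None  # not enough data, or zero spread in the training set
    return abs(t["mean"] - v["mean"]) / t["std"]


# Illustrative data only, not taken from any real dataset.
train = [{"age": 20}, {"age": 30}, {"age": 40}, {"age": None}]
validation = [{"age": 50}, {"age": 60}, {"age": 70}]

train_stats = summarize(train, "age")
shift = mean_shift(train, validation, "age")
```

A large `shift` value suggests the validation set is drawn from a different range than the training set for that feature, which is the sort of discrepancy a distribution view is meant to expose.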

## Tags

### Graphics & Multimedia

- [View Layering & Faceting](https://awesome-repositories.com/f/graphics-multimedia/view-layering-faceting.md) — Renders multidimensional data distributions across multiple feature axes using synchronized faceted views and interactive filtering.

### Artificial Intelligence & ML

- [Distribution Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-analysis-tools/distribution-analysis.md) — Provides visualizations of value distributions across features to identify skews between training and validation sets. ([source](https://pair-code.github.io/facets/))
- [Dataset Analysis Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/machine-learning-datasets/dataset-analysis-tools.md) — Visualizes feature distributions and statistics to find missing data, outliers, and skews in ML datasets.
- [Visualizers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/machine-learning-datasets/visualizers.md) — Offers an interactive tool for exploring feature distributions and identifying data skews in ML datasets.
- [Statistical Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/statistical-analysis.md) — Produces visual feature-by-feature statistical analysis to identify outliers, missing values, and distribution skews. ([source](https://github.com/pair-code/facets#readme))
- [Classification Error Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/prediction-visualization/accuracy-calculators/error-metrics/classification-error-analysis.md) — Identifies systematic classifier failures by exploring how specific data points behave across different feature dimensions.

### Data & Databases

- [Visual Data Explorers](https://awesome-repositories.com/f/data-databases/big-data-processing/visual-data-explorers.md) — Enables mapping of individual data points across multiple dimensions using interactive bucketing for error detection. ([source](https://pair-code.github.io/facets/))
- [Visualization Coordinate Mapping](https://awesome-repositories.com/f/data-databases/data-mapping/coordinate-system-mapping/visualization-coordinate-mapping.md) — Maps individual dataset samples to visual coordinates across multiple dimensions to facilitate outlier and failure detection.
- [Interactive Data Binning](https://awesome-repositories.com/f/data-databases/interactive-data-binning.md) — Groups continuous feature values into interactive discrete bins to enable efficient navigation of large datasets.
- [Statistical Aggregators](https://awesome-repositories.com/f/data-databases/data-analysis-visualization/analytical-platforms-engines/advanced-analytics-functions/statistical-aggregators.md) — Provides client-side utilities that compute summary statistics and distribution skews directly in the browser.
- [Faceted Plotting Systems](https://awesome-repositories.com/f/data-databases/data-analysis-visualization/visualization-frameworks-libraries/statistical-plotting-libraries/faceted-plotting-systems.md) — Visualizes thousands of data points in faceted plots with interactive zooming and animation. ([source](https://github.com/pair-code/facets#readme))
- [Dataset Comparators](https://awesome-repositories.com/f/data-databases/data-collections-datasets/dataset-comparators.md) — Compares numeric and string data distributions to evaluate statistical drift and ensure dataset quality.
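
The binning and faceting ideas in the list above can be sketched in a few lines of plain Python. This is an illustration of the general technique, not Facets code: the helpers `bucket_index` and `facet`, the equal-width binning strategy, and the sample rows are assumptions.

```python
from collections import defaultdict


def bucket_index(value, low, high, n_bins):
    """Map a numeric value to one of n_bins equal-width bins over [low, high]."""
    if high == low:
        return 0  # every value is identical, so there is only one bin
    idx = int((value - low) / (high - low) * n_bins)
    # Clamp so that value == high falls into the last bin.
    return min(max(idx, 0), n_bins - 1)


def facet(rows, numeric_feature, categorical_feature, n_bins):
    """Group rows into a grid keyed by (category, bin index)."""
    values = [row[numeric_feature] for row in rows]
    low, high = min(values), max(values)
    grid = defaultdict(list)
    for row in rows:
        key = (
            row[categorical_feature],
            bucket_index(row[numeric_feature], low, high, n_bins),
        )
        grid[key].append(row)
    return dict(grid)


# Illustrative rows only.
rows = [
    {"age": 18, "label": "cat"},
    {"age": 35, "label": "dog"},
    {"age": 52, "label": "cat"},
    {"age": 70, "label": "dog"},
]
grid = facet(rows, "age", "label", n_bins=2)
```

Each key in `grid` corresponds to one cell of a faceted view, so changing `n_bins` or the chosen features regroups the same points without altering the underlying data.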

### User Interface & Experience

- [Data Explorers](https://awesome-repositories.com/f/user-interface-experience/data-explorers.md) — Provides a visual interface for mapping data points across multiple dimensions using bucketing and filtering.
- [Synchronized View Filtering](https://awesome-repositories.com/f/user-interface-experience/synchronized-view-filtering.md) — Updates all active visualization panels simultaneously when a specific data subset is selected in any single view.
- [Anomaly Distribution Plots](https://awesome-repositories.com/f/user-interface-experience/data-visualization-tools/data-visualization/charting-frameworks/immediate-mode-plotting-libraries/statistical-distribution-visualizers/anomaly-distribution-plots.md) — Uses faceting and animation to reveal patterns and anomalies within large-scale dataset distributions.
- [View Transition Animations](https://awesome-repositories.com/f/user-interface-experience/view-transition-animations.md) — Implements animated transitions that interpolate point positions when switching between feature projections to maintain visual context.
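
As a rough illustration of the animated transitions described above, the sketch below linearly interpolates point positions between two layouts. It assumes both layouts map the same point ids to `(x, y)` coordinates. The names `lerp` and `interpolate_layout` are hypothetical, and linear easing is an assumption rather than a description of how Facets animates.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def interpolate_layout(start, end, t):
    """Blend two layouts, each a dict of point id -> (x, y)."""
    t = min(max(t, 0.0), 1.0)  # keep the blend factor within [0, 1]
    return {
        pid: (
            lerp(start[pid][0], end[pid][0], t),
            lerp(start[pid][1], end[pid][1], t),
        )
        for pid in start
    }


# Two hypothetical projections of the same two points.
start = {"a": (0.0, 0.0), "b": (10.0, 0.0)}
end = {"a": (4.0, 8.0), "b": (10.0, 2.0)}

# Five frames from the first layout to the second.
frames = [interpolate_layout(start, end, step / 4) for step in range(5)]
```

Because every point keeps its id across frames, a viewer can follow an individual point as it moves from one projection to the next, which is what preserves visual context.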

### Part of an Awesome List

- [Data Visualization](https://awesome-repositories.com/f/awesome-lists/data/data-visualization.md) — Visualizations for analyzing ML datasets.
