# fivethirtyeight/data

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/fivethirtyeight-data).**

17,285 stars · 11,123 forks · Jupyter Notebook · cc-by-4.0

## Links

- GitHub: https://github.com/fivethirtyeight/data
- Homepage: https://data.fivethirtyeight.com/
- awesome-repositories: https://awesome-repositories.com/repository/fivethirtyeight-data.md

## Topics

`data`

## Description

This repository is a public archive of the raw datasets and analytical code behind FiveThirtyEight's published reporting. It supports reproducible research: readers can use the same materials to verify published findings and run their own statistical analyses.

The collection is kept under version control, so changes to both the data and the processing scripts are tracked over time. A structured directory hierarchy maps each journalistic project to its corresponding inputs and outputs, which keeps the methodology behind reported conclusions transparent and accessible.
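As a rough illustration of the project-to-files mapping described above, the following sketch groups the files in a local clone by their top-level folder. It assumes that each top-level folder corresponds to one project, which is an assumption made here rather than a documented guarantee; the function name `group_files_by_project` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_files_by_project(repo_root):
    """Map each top-level folder name to the sorted relative paths of its files.

    Assumption: one top-level folder per project. Hypothetical helper,
    not part of the repository.
    """
    root = Path(repo_root)
    projects = defaultdict(list)
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip files at the repository root and anything under hidden
        # folders such as .git.
        if len(rel.parts) < 2 or any(part.startswith(".") for part in rel.parts):
            continue
        projects[rel.parts[0]].append(rel.as_posix())
    return {name: sorted(files) for name, files in sorted(projects.items())}
```

Calling `group_files_by_project("path/to/clone")` on a local checkout returns a plain dictionary, which makes it easy to see which data files and scripts sit together.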

All datasets are distributed in lightweight, human-readable formats such as CSV, so they can be opened in most analytical environments without special tooling. The repository also includes the source code used to clean and process these files, so users can recreate analytical results and run secondary investigations with the same logic applied in the original reporting.
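Because the files are plain text, they can be read with standard tooling alone. The sketch below parses CSV text with Python's built-in `csv` module and averages one numeric column. The sample data, the column names, and the helper names `read_csv_text` and `column_mean` are made up for illustration and are not taken from the repository.

```python
import csv
import io


def read_csv_text(text):
    """Parse CSV text with a header row into a list of dicts.

    All values are kept as strings; conversion is left to the caller.
    """
    return list(csv.DictReader(io.StringIO(text)))


def column_mean(rows, column):
    """Average a numeric column, skipping blank or missing cells."""
    values = []
    for row in rows:
        cell = (row.get(column) or "").strip()
        if cell:
            values.append(float(cell))
    return sum(values) / len(values) if values else None


# Made-up sample data for illustration only.
SAMPLE = "name,score\nalpha,1.5\nbeta,\ngamma,2.5\n"

rows = read_csv_text(SAMPLE)
print(column_mean(rows, "score"))
```

For a file from a local clone, the same helpers could be applied to the result of `Path(...).read_text()`; blank cells are skipped rather than treated as zero, which is one possible choice among several.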

## Tags

### DevOps & Infrastructure

- [Git-Based Repositories](https://awesome-repositories.com/f/devops-infrastructure/infrastructure/version-control-systems/git-based-repositories.md) — Maintains a collection of versioned datasets and analytical scripts to support journalistic reporting and verification.

### Software Engineering & Architecture

- [Data Repositories](https://awesome-repositories.com/f/software-engineering-architecture/project-management-governance/repository-maintenance/project-organization/repository-structures/data-repositories.md) — Provides a collection of structured datasets and analytical code used to support published reporting.
- [Analytical Reproducibility](https://awesome-repositories.com/f/software-engineering-architecture/reproducibility-verifiers/analytical-reproducibility.md) — Shares the specific code and data processing steps required for others to recreate analytical results and validate conclusions.
- [Version-Controlled Datasets](https://awesome-repositories.com/f/software-engineering-architecture/version-controlled-datasets.md) — Tracks historical changes to datasets and analytical scripts using distributed version control to keep an auditable change history.

### Data & Databases

- [Data Archiving Systems](https://awesome-repositories.com/f/data-databases/data-archiving-systems.md) — Acts as a public repository for raw data files and processing scripts that allow users to reproduce analysis.
- [Data Access and Querying](https://awesome-repositories.com/f/data-databases/data-access-querying.md) — Provides access to structured data files used in published reporting for independent verification. ([source](https://data.fivethirtyeight.com/))
- [Data Analysis & Visualization](https://awesome-repositories.com/f/data-databases/data-analysis-visualization.md) — Enables independent research and secondary statistical analysis using structured datasets from public interest reporting.
- [Flat-File Data Stores](https://awesome-repositories.com/f/data-databases/data-engineering-infrastructure/data-persistence-storage/data-storage-architectures/flat-file-data-stores.md) — Distributes structured information in lightweight, human-readable formats like CSV to ensure maximum compatibility.
- [CSV Data Loaders](https://awesome-repositories.com/f/data-databases/tabular-data-frameworks/csv-data-loaders.md) — Delivers structured information in lightweight, human-readable CSV formats for broad analytical compatibility.
