# scverse/scanpy

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/scverse-scanpy).**

2,493 stars · 750 forks · Python · BSD-3-Clause

## Links

- GitHub: https://github.com/scverse/scanpy
- Homepage: https://scanpy.readthedocs.io
- awesome-repositories: https://awesome-repositories.com/repository/scverse-scanpy.md

## Topics

`anndata` `bioinformatics` `data-science` `machine-learning` `python` `scanpy` `scverse` `transcriptomics` `visualize-data`

## Description

Scanpy is a Python library for preprocessing, visualizing, and analyzing large-scale single-cell gene expression datasets. It is a toolkit for single-cell RNA sequencing (scRNA-seq) analysis that processes expression data from individual cells to identify marker genes and cell types.

The library includes a scalable data processing pipeline for cleaning and preparing genomic data, a clustering framework for grouping cells with similar expression profiles, and a system for modeling transitions between cell states to reconstruct biological development and differentiation processes. It also provides a suite of tools for generating graphical representations of high-dimensional cell populations and gene expression patterns.

The toolkit also covers differential gene expression testing, which identifies the marker genes that characterize each group of cells, along with a range of preprocessing operations for expression data.
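
The preprocessing steps described above can be illustrated with a minimal NumPy sketch. This is not Scanpy's implementation: the helper functions, the toy Poisson count matrix, and the parameter values are assumptions made for illustration. In Scanpy itself, comparable steps are exposed as `sc.pp.normalize_total`, `sc.pp.log1p`, `sc.pp.highly_variable_genes`, and `sc.pp.pca`, whose defaults and algorithms differ from this simplified version (for example, the variance ranking here is a plain per-gene variance).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy count matrix (assumed data): 200 cells x 50 genes, rows are cells.
counts = rng.poisson(lam=2.0, size=(200, 50)).astype(np.float64)


def normalize_total(counts, target_sum=1e4):
    """Scale each cell (row) so that its counts sum to target_sum."""
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid dividing by zero for empty cells
    return counts / totals * target_sum


def top_variable_genes(x, n_top):
    """Return the column indices of the n_top genes with the highest variance."""
    return np.argsort(x.var(axis=0))[::-1][:n_top]


def pca(x, n_comps):
    """Project cells onto the first n_comps principal components via SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_comps] * s[:n_comps]


normalized = normalize_total(counts)         # library-size normalization
logged = np.log1p(normalized)                # variance-stabilizing log(1 + x)
hv_idx = top_variable_genes(logged, n_top=20)
embedding = pca(logged[:, hv_idx], n_comps=5)

print(embedding.shape)
```

The resulting low-dimensional `embedding` is the kind of representation that neighbor-graph construction and clustering would typically start from.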

## Tags

### Scientific & Mathematical Computing

- [Single-Cell Analysis](https://awesome-repositories.com/f/scientific-mathematical-computing/single-cell-analysis.md) — Provides a comprehensive toolkit for preprocessing, visualizing, and analyzing large-scale single-cell gene expression datasets.
- [Biological Trajectory Inferences](https://awesome-repositories.com/f/scientific-mathematical-computing/biological-trajectory-inferences.md) — Models developmental transitions between cell clusters to reconstruct biological differentiation and development paths.
- [Cellular Trajectory Inference Tools](https://awesome-repositories.com/f/scientific-mathematical-computing/cellular-trajectory-inference-tools.md) — Models transitions between cell states to reconstruct biological development and differentiation processes.
- [Differential Gene Expression Tests](https://awesome-repositories.com/f/scientific-mathematical-computing/differential-gene-expression-tests.md) — Provides statistical testing to compare gene expression levels between cell groups to identify characteristic biological markers. ([source](https://cdn.jsdelivr.net/gh/scverse/scanpy@main/README.md))
- [Sparse Linear Algebra Routines](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/linear-algebra/sparse-linear-algebra-routines.md) — Performs large-scale linear algebra using compressed sparse row formats for high-dimensional genomic data.
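
As a rough illustration of the differential expression idea listed above, the following sketch ranks genes by a Welch t-statistic between two groups of cells. It is a simplified NumPy example under stated assumptions, not Scanpy's code: the data are synthetic, the planted marker at column 7 is hypothetical, and `welch_t` is a helper defined here. Scanpy provides this kind of group comparison through `sc.tl.rank_genes_groups`.

```python
import numpy as np

rng = np.random.default_rng(1)

n_genes = 30
# Synthetic log-expression values (assumed data) for two groups of cells.
group_a = rng.normal(loc=1.0, scale=1.0, size=(100, n_genes))
group_b = rng.normal(loc=1.0, scale=1.0, size=(120, n_genes))
group_a[:, 7] += 3.0  # plant a hypothetical marker gene in group A


def welch_t(a, b):
    """Per-gene Welch t-statistic comparing the rows of a against the rows of b."""
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    var_a, var_b = a.var(axis=0, ddof=1), b.var(axis=0, ddof=1)
    standard_error = np.sqrt(var_a / a.shape[0] + var_b / b.shape[0])
    return (mean_a - mean_b) / standard_error


scores = welch_t(group_a, group_b)
ranking = np.argsort(scores)[::-1]  # genes most up-regulated in group A first
top_gene = int(ranking[0])

print(top_gene, round(float(scores[top_gene]), 2))
```

A real analysis would also convert the statistics to p-values and correct them for multiple testing across genes.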

### Artificial Intelligence & ML

- [Cellular State Trajectories](https://awesome-repositories.com/f/artificial-intelligence-ml/data-preparation/rl-trajectory/cellular-state-trajectories.md) — Models transitions between cell states over time to reconstruct biological development and differentiation processes. ([source](https://cdn.jsdelivr.net/gh/scverse/scanpy@main/README.md))

### Part of an Awesome List

- [Biological Data Visualization](https://awesome-repositories.com/f/awesome-lists/ai/biological-data-visualization.md) — Generates graphical representations of cell populations and gene expression patterns to identify biological trends. ([source](https://scanpy.readthedocs.io))
- [Genomic Preprocessing Pipelines](https://awesome-repositories.com/f/awesome-lists/data/genomic-data-analysis/genomic-preprocessing-pipelines.md) — Provides memory-efficient pipelines for cleaning and preparing large-scale single-cell datasets.
- [Genome Visualization](https://awesome-repositories.com/f/awesome-lists/media/genome-visualization.md) — Provides a suite of plotting tools for rendering and exploring single-cell gene expression data.
- [Python Bioinformatics Modules](https://awesome-repositories.com/f/awesome-lists/devtools/python-bioinformatics-modules.md) — Toolkit for single-cell gene expression analysis.

### Data & Databases

- [Leiden Community Detection](https://awesome-repositories.com/f/data-databases/anomaly-detection/graph-community-detection/leiden-community-detection.md) — Implements the Leiden algorithm to identify distinct cell clusters within gene expression graphs.
- [Cellular State Projections](https://awesome-repositories.com/f/data-databases/data-analysis-visualization/visualization-frameworks-libraries/data-visualization/three-dimensional-visualizations/dimensionality-projection-plots/cellular-state-projections.md) — Implements graph-based manifold learning to project high-dimensional cell states into low-dimensional visual spaces.
- [Genomic Data Cleaning](https://awesome-repositories.com/f/data-databases/data-preprocessing-pipelines/genomic-data-cleaning.md) — Provides vectorized preprocessing pipelines using NumPy and SciPy for high-throughput normalization and scaling of cell data.
- [Genomic Expression Arrays](https://awesome-repositories.com/f/data-databases/sparse-arrays/genomic-expression-arrays.md) — Provides memory-efficient storage of high-dimensional gene expression matrices and cell metadata using sparse arrays.
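
To make the sparse-storage point concrete, here is a small sketch using SciPy's compressed sparse row (CSR) format, which is a common container for expression matrices in the AnnData ecosystem. The matrix size, the Poisson rate, and the variable names are assumptions chosen for illustration. The example compares memory use against a dense array and scales each cell by its total count without densifying the matrix.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(2)

# Mostly-zero toy count matrix (assumed data): 1000 cells x 500 genes.
dense = rng.poisson(lam=0.1, size=(1000, 500)).astype(np.float32)
x = sparse.csr_matrix(dense)

# CSR keeps only the non-zero values plus two index arrays.
sparse_bytes = x.data.nbytes + x.indices.nbytes + x.indptr.nbytes
dense_bytes = dense.nbytes

# Per-cell scaling that stays sparse: left-multiply by a diagonal matrix
# holding the reciprocal of each row total.
totals = np.asarray(x.sum(axis=1)).ravel()
totals[totals == 0] = 1.0  # guard against empty cells
x_scaled = (sparse.diags(1.0 / totals) @ x).tocsr()

print(f"dense: {dense_bytes} bytes, sparse: {sparse_bytes} bytes")
```

Because single-cell count matrices are typically dominated by zeros, keeping operations in sparse form like this is what lets normalization scale to large datasets.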
