# huggingface/sentence-transformers

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/huggingface-sentence-transformers).**

18,817 stars · 2,806 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/huggingface/sentence-transformers
- Homepage: https://www.sbert.net
- awesome-repositories: https://awesome-repositories.com/repository/huggingface-sentence-transformers.md

## Description

Sentence Transformers is a transformer-based framework for generating dense and sparse vector embeddings of text and multimodal data. It also serves as a library for fine-tuning models for semantic similarity, retrieval, and reranking.

It supports several model architectures, including bi-encoders for fast similarity search and cross-encoders for high-precision reranking. It also provides pipelines for multimodal embeddings that map text and images into a shared vector space, and it implements knowledge distillation to compress large models into smaller, lower-latency ones.
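
The split between bi-encoders and cross-encoders can be illustrated with a minimal two-stage sketch. It does not use the library's API: the embeddings are hand-written toy vectors, and `toy_pair_scorer` is a hypothetical stand-in for a cross-encoder that simply measures word overlap. The point is only the shape of the pipeline, where a cheap vector comparison narrows the candidates and a more expensive pairwise scorer reorders them.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs, top_k):
    """Stage 1 (bi-encoder style): rank precomputed document vectors by cosine similarity."""
    scored = sorted(((cosine(query_vec, v), i) for i, v in enumerate(doc_vecs)), reverse=True)
    return [i for _, i in scored[:top_k]]

def toy_pair_scorer(query, doc):
    """Hypothetical stand-in for a cross-encoder: fraction of query words present in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words)

def rerank(query, docs, candidates, pair_scorer):
    """Stage 2 (cross-encoder style): score each (query, document) pair jointly and reorder."""
    return sorted(candidates, key=lambda i: pair_scorer(query, docs[i]), reverse=True)

docs = [
    "how to train a sentence embedding model",
    "recipes for baking sourdough bread",
    "fine-tuning a reranker for search",
]
# Toy embeddings written by hand for illustration; a real model would produce these.
doc_vecs = [[0.9, 0.1, 0.3], [0.0, 1.0, 0.1], [0.7, 0.0, 0.7]]
query = "train a model"
query_vec = [0.8, 0.0, 0.6]

candidates = retrieve(query_vec, doc_vecs, top_k=2)        # coarse shortlist
ranking = rerank(query, docs, candidates, toy_pair_scorer)  # refined order
print(candidates, ranking)
```

In this toy data the first stage shortlists documents 2 and 0, and the pairwise scorer then moves document 0 to the top, which is the kind of correction a reranking stage is meant to provide.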

The framework covers a broad range of capabilities including model training and optimization, semantic search execution, and text analysis. It includes tools for contrastive-loss training, negative mining, and multilingual model extensions, as well as utilities for semantic clustering, paraphrase identification, and extractive summarization.
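
As a rough illustration of contrastive-loss training, the sketch below implements the classic pairwise contrastive loss on two embeddings. It is written in plain Python rather than with the library's loss classes, and the Euclidean distance and the `margin=1.0` default are assumptions chosen for the example, not settings taken from the project.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, label, margin=1.0):
    """Classic pairwise contrastive loss.

    label = 1: the pair is similar, so any distance is penalised.
    label = 0: the pair is dissimilar, so only distances below the margin are penalised.
    """
    d = euclidean(emb_a, emb_b)
    positive_term = label * d ** 2
    negative_term = (1 - label) * max(0.0, margin - d) ** 2
    return 0.5 * (positive_term + negative_term)

# A similar pair that is far apart yields a large loss.
loss_far_positive = contrastive_loss([0.0, 0.0], [3.0, 4.0], label=1)
# A dissimilar pair already beyond the margin yields no loss.
loss_far_negative = contrastive_loss([0.0, 0.0], [3.0, 4.0], label=0)
# A dissimilar pair inside the margin is pushed apart.
loss_close_negative = contrastive_loss([0.0, 0.0], [0.6, 0.0], label=0)
print(loss_far_positive, loss_far_negative, loss_close_negative)
```

Minimising this quantity over many labelled pairs pulls similar pairs together and pushes dissimilar pairs apart until they are at least `margin` away from each other.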

Users can publish trained weights and configurations to a central model hub for versioning and sharing.

## Tags

### Artificial Intelligence & ML

- [Semantic Search](https://awesome-repositories.com/f/artificial-intelligence-ml/semantic-search.md) — Enables retrieving the most semantically similar content from large collections based on the meaning of a query. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/applications))
- [Semantic Search Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/semantic-search-engines.md) — Provides a comprehensive framework for semantic search by converting text and images into vector embeddings.
- [Cross-Encoder Rerankers](https://awesome-repositories.com/f/artificial-intelligence-ml/document-rerankers/cross-encoder-rerankers.md) — Scores the relevance between queries and documents using cross-encoders to refine search result precision. ([source](https://github.com/huggingface/sentence-transformers/blob/main/README.md))
- [Embedding Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/embedding-frameworks.md) — Provides a complete framework for training and deploying dense and sparse vector embedding models.
- [Embedding Model Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/embedding-model-fine-tuning.md) — Fine-tunes transformer networks using various loss functions to improve performance for tasks like paraphrase mining or clustering. ([source](https://github.com/huggingface/sentence-transformers#readme))
- [Cross-Encoders](https://awesome-repositories.com/f/artificial-intelligence-ml/evaluation-metrics/scoring-pipelines/feature-cross-scoring/cross-encoders.md) — Provides cross-encoder models that process input pairs simultaneously for high-accuracy reranking.
- [Fine-Tuning Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/fine-tuning-libraries.md) — Provides a library for training bi-encoders and cross-encoders using custom datasets to optimize retrieval performance.
- [Embedding Model Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/fine-tuning-and-alignment/fine-tuning-frameworks/transfer-learning-techniques/embedding-model-fine-tuning.md) — Allows users to train or fine-tune embedding and reranker models for specific domain use cases. ([source](https://github.com/huggingface/sentence-transformers/blob/main/index.rst))
- [Embedding Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/model-fine-tuning-adaptation/language-model-training/embedding-model-training.md) — Fine-tunes models using semantic similarity or natural language inference to produce dense vector representations. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/training))
- [Dense Retrieval Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/model-fine-tuning-adaptation/language-model-training/retrieval-model-pre-training/dense-retrieval-training.md) — Fine-tunes bi-encoders and rerankers using custom datasets and negative mining to optimize retrieval accuracy. ([source](https://github.com/huggingface/sentence-transformers/tree/main/skills))
- [Bi-Encoder Training](https://awesome-repositories.com/f/artificial-intelligence-ml/model-training/bi-encoder-training.md) — Enables the generation of fixed-dimension dense vectors from text for similarity search and clustering. ([source](https://github.com/huggingface/sentence-transformers/blob/main/skills))
- [Multimodal Embedding Models](https://awesome-repositories.com/f/artificial-intelligence-ml/multimodal-embedding-models.md) — Provides pipelines for mapping text and images into a shared vector space for cross-modal semantic retrieval.
- [Siamese Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-architectures/siamese-networks.md) — Utilizes siamese-network architectures with shared weights to map inputs into a common vector space.
- [Reranking Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/reranking-pipelines.md) — Implements multi-stage systems combining fast bi-encoders for retrieval and cross-encoders for high-precision reranking. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/applications))
- [Result Reranking](https://awesome-repositories.com/f/artificial-intelligence-ml/result-reranking.md) — Implements neural reranking workflows using cross-encoders to refine search results for higher precision.
- [Semantic Similarity Calculation](https://awesome-repositories.com/f/artificial-intelligence-ml/semantic-analysis-tools/semantic-similarity-calculation.md) — Computes a similarity score between embeddings, such as cosine similarity, to measure how closely two pieces of content are related in meaning. ([source](https://github.com/huggingface/sentence-transformers/blob/main/README.md))
- [Vector Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings.md) — Transforms text into dense vector representations for mathematical comparisons of semantic meaning. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/applications))
- [Bi-Encoders](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/bi-encoders.md) — Implements bi-encoder vectorization for fast similarity search using cosine similarity.
- [Dense Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/dense-embeddings.md) — Generates dense vector representations for text, images, and other modalities to enable semantic analysis. ([source](https://github.com/huggingface/sentence-transformers#readme))
- [Hybrid Sparse-Dense Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/hybrid-sparse-dense-embeddings.md) — Combines learned dense vectors with vocabulary-based sparse representations for balanced semantic and keyword matching.
- [Sparse Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/sparse-embeddings.md) — Produces sparse vector representations of text to support efficient keyword-aware hybrid retrieval. ([source](https://github.com/huggingface/sentence-transformers#readme))
- [Contrastive Learning Models](https://awesome-repositories.com/f/artificial-intelligence-ml/contrastive-learning-models.md) — Uses contrastive loss to optimize vector spaces by pulling similar pairs together and pushing dissimilar pairs apart.
- [Cross-Lingual Alignment](https://awesome-repositories.com/f/artificial-intelligence-ml/cross-lingual-alignment.md) — Supports multilingual model extensions and the identification of translated sentence pairs.
- [Cross-Modal Representations](https://awesome-repositories.com/f/artificial-intelligence-ml/cross-modal-representations.md) — Maps text and images into a shared embedding space to enable cross-modal retrieval.
- [Text-to-Image Retrieval](https://awesome-repositories.com/f/artificial-intelligence-ml/image-retrieval-systems/text-to-image-retrieval.md) — Implements a shared vector space mapping that allows retrieving images using natural language text queries. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/applications))
- [Knowledge Distillation](https://awesome-repositories.com/f/artificial-intelligence-ml/knowledge-distillation.md) — Implements knowledge distillation to compress large teacher models into smaller, low-latency student models.
- [Model Evaluation Metrics](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-evaluation-and-validation/model-evaluation-metrics.md) — Includes evaluators to measure the quality and accuracy of embeddings and reranking scores. ([source](https://github.com/huggingface/sentence-transformers/blob/main/skills))
- [Model Distillation Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/model-distillation-tools.md) — Provides utilities for transferring knowledge from large teacher models to smaller student models to reduce inference latency. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sparse_encoder/training))
- [Model Compression Suites](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/compression-techniques/model-pruning/model-compression-suites.md) — Offers tools for reducing model size and increasing inference speed via distillation and compression.
- [Model Performance Optimization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/profiling-and-benchmarking/model-performance-optimization.md) — Reduces model size and increases inference speed through distillation and adaptive layer removal. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/training))
- [Multilingual Extensions](https://awesome-repositories.com/f/artificial-intelligence-ml/multilingual-models/multilingual-extensions.md) — Adapts monolingual embedding models to support additional languages for multilingual applications. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/training))
- [Hard Negative Mining](https://awesome-repositories.com/f/artificial-intelligence-ml/sampling-strategies/negative/hard-negative-mining.md) — Includes negative mining strategies to identify hard-to-distinguish dissimilar examples during training.
- [Sentence Pair Scoring](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/sentence-embeddings/sentence-pair-scoring.md) — Processes sentence pairs through a network to derive precise similarity scores or labels. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/applications))
- [Sparse Embedding Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/sparse-embeddings/sparse-embedding-fine-tuning.md) — Optimizes sparse embeddings for specific retrieval tasks by adjusting model parameters with diverse datasets. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sparse_encoder/training))
- [Sparse Encoder Training](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/sparse-embeddings/sparse-encoder-training.md) — Creates learned-sparse vectors over a vocabulary to support efficient sparse retrieval. ([source](https://github.com/huggingface/sentence-transformers/blob/main/skills))
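
The Hard Negative Mining entry above can be illustrated with a small sketch. It is not the library's mining utility: `mine_hard_negatives` here is a hypothetical helper over hand-written toy vectors, and it assumes a plain dot product is an adequate similarity score. The idea is to pick the non-relevant candidates that score highest against the query, since those are the examples a model is most likely to confuse with true positives.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def mine_hard_negatives(query_vec, candidate_vecs, positive_ids, num_negatives=2):
    """Return indices of the highest-scoring candidates that are not known positives."""
    scored = [
        (dot(query_vec, vec), idx)
        for idx, vec in enumerate(candidate_vecs)
        if idx not in positive_ids
    ]
    scored.sort(reverse=True)  # most similar (hardest) negatives first
    return [idx for _, idx in scored[:num_negatives]]

# Toy data: candidate 0 is the labelled positive for this query.
query_vec = [1.0, 0.0]
candidate_vecs = [
    [0.9, 0.1],   # 0: positive
    [0.8, 0.2],   # 1: close to the query, but not relevant
    [0.1, 0.9],   # 2: clearly unrelated
    [0.6, 0.4],   # 3: moderately close
    [-0.5, 0.5],  # 4: points away from the query
]

hard_negatives = mine_hard_negatives(query_vec, candidate_vecs, positive_ids={0})
print(hard_negatives)
```

The selected indices can then be paired with the query as negatives in a contrastive or triplet-style training set.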

### Data & Databases

- [Semantic Search Engines](https://awesome-repositories.com/f/data-databases/search-indexing-technologies/search-indexing/search-information-retrieval/semantic-search-engines.md) — Implements a comprehensive toolset for retrieving and reranking documents based on vector embedding similarity.

### Part of an Awesome List

- [Text Clustering](https://awesome-repositories.com/f/awesome-lists/ai/text-clustering.md) — Includes utilities for semantic text clustering and paraphrase identification based on embedding distance.
- [Semantic](https://awesome-repositories.com/f/awesome-lists/ai/text-clustering/semantic.md) — Groups sentences together based on semantic similarity to identify common themes or patterns. ([source](https://github.com/huggingface/sentence-transformers/tree/main/examples/sentence_transformer/applications))
- [Training and Fine-Tuning](https://awesome-repositories.com/f/awesome-lists/ai/training-and-fine-tuning.md) — Library for training embedding and reranker models.
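
Clustering by embedding distance, as described in the Text Clustering and Semantic entries above, can be sketched with a greedy threshold scheme. This is an illustrative stand-in rather than the library's clustering utilities: the vectors are toy values, and both the single-pass strategy and the `threshold=0.8` default are assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def threshold_clusters(vectors, threshold=0.8):
    """Greedy single-pass clustering.

    Each vector joins the first cluster whose seed (first member) is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of indices, seed first
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Toy sentence embeddings: two tight groups and one vector in between.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95], [0.7, 0.7]]
clusters = threshold_clusters(vectors)
print(clusters)
```

The result depends on input order and on the threshold, so a sketch like this is best treated as a quick way to surface groups of near-paraphrases rather than a full clustering method.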

### Development Tools & Productivity

- [AI-Based Relevance Ranking](https://awesome-repositories.com/f/development-tools-productivity/search-ranking-algorithms/ai-based-relevance-ranking.md) — Orders search results by how similar their embeddings are to the embedding of a given query. ([source](https://github.com/huggingface/sentence-transformers/blob/main/index.rst))