Similarity

Features

Metric Learning Libraries - Provides a Python framework for training models to learn vector representations and perform similarity lookups.
Vector Similarity Search - Implements optimized nearest neighbor lookups on indexed embeddings using configurable distance metrics for similarity-based retrieval.
Metric Learning Training - Builds neural networks that learn to map data into vector spaces where similar items are grouped closely together.
Embedding Similarity Loss Functions - Implements loss functions that minimize distance between similar items and maximize distance between dissimilar items in vector space.
Specialized Model Training - Provides specialized training workflows for learning high-dimensional vector representations using contrastive and triplet-based loss functions.
Contrastive Learning Wrappers - Learns general data representations by comparing multiple augmented views of the same input without requiring manual class labels.
Approximate Nearest Neighbor Search - Provides algorithms for indexing high-dimensional embeddings to enable rapid approximate nearest neighbor search.
Similarity Search Engines - Creates systems that index high-dimensional embeddings to perform fast and accurate nearest neighbor lookups for large datasets.
Balanced Class Samplers - Constructs training batches with specific class distributions to ensure stable convergence for metric learning.
Embedding Visualizations - Analyzes and inspects learned vector representations through interactive projections to understand how models cluster data.
Embedding Accuracy Evaluators - Calculates retrieval and classification metrics like Recall at K and NDCG to assess model accuracy and embedding space quality.
Model Persistence - Saves, loads, and exports trained models and indices for production deployment and real-time similarity lookups.
Self-Supervised Learning - Trains models on unlabeled data using contrastive methods to learn meaningful features that improve performance on downstream tasks.
Dimensionality Projection Plots - Maps high-dimensional learned representations into lower-dimensional manifolds for visual inspection and cluster analysis.
Vector Embedding Indexes - Indexes high-dimensional embeddings and metadata to enable efficient storage and fast approximate nearest neighbor retrieval.
Similarity Thresholds - Includes utilities for calibrating similarity matching thresholds to ensure accurate and relevant retrieval outcomes.
Vector Distance Kernels - Calculates mathematical separation between embedding vectors using high-speed linear algebra operations for efficient similarity comparisons.
AI Model Production Deployment - Exports trained models and search indices to production environments to enable real-time retrieval and similarity matching.
Similarity Learning Libraries - Provides a collection of tools for contrastive and triplet-based training to group similar items in high-dimensional embedding spaces.
Vector Distance Metrics - Provides mathematical implementations for calculating distances between high-dimensional vectors.
Regression and Classification - TensorFlow-based library for metric learning and similarity search.
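
Several of the features above (distance metrics, triplet-based losses, Recall at K) can be illustrated with a short, library-free sketch. The function names, the cosine distance choice, and the default margin below are assumptions made for illustration; they are not taken from Similarity's actual API.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0 when vectors point the same way, 1 when orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (illustrative; margin value is an assumption).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def recall_at_k(queries, index, k=1):
    """Fraction of queries whose k nearest indexed items share their label.

    `queries` and `index` are lists of (vector, label) pairs. The lookup is
    an exact brute-force scan, not an approximate nearest neighbor search.
    """
    hits = 0
    for q_vec, q_label in queries:
        ranked = sorted(index, key=lambda item: cosine_distance(q_vec, item[0]))
        if any(label == q_label for _, label in ranked[:k]):
            hits += 1
    return hits / len(queries)
```

For example, with an index of `[([1.0, 0.0], "cat"), ([0.0, 1.0], "dog")]`, a query `([0.9, 0.1], "cat")` counts as a hit at k=1 because its nearest indexed vector carries the same label.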

Open-source alternatives to Similarity

Similar open-source projects, ranked by how many features they share with Similarity.

KevinMusgrave/pytorch-metric-learning
6,328 stars
PyTorch Metric Learning is an open-source library for training neural networks to produce similarity-preserving embedding spaces. It provides a modular framework where interchangeable loss functions, mining strategies, and evaluation tools can be composed to learn representations that map similar items to nearby points and dissimilar items to distant points in the embedding space. The library distinguishes itself through a highly configurable architecture that separates concerns across several interchangeable components. Users can assemble custom loss functions from pluggable distance metrics
Python, computer-vision, contrastive-learning, deep-learning
unum-cloud/USearch
3,888 stars
USearch is a high-performance vector similarity search engine and approximate nearest neighbor index designed for dense embeddings. It functions as a low-level vector database core and high-dimensional vector indexer, providing the primitives necessary to store and retrieve vectors across massive datasets. The engine distinguishes itself through hardware-level SIMD acceleration for distance kernels and a proximity-graph indexing system that enables fast retrieval across billions of vectors. It supports multi-precision vector quantization to balance memory usage and accuracy, and utilizes memo
C++, approximate-nearest-neighbor-search, clustering, database
spotify/annoy
14,157 stars
Annoy is a C++ library designed for approximate nearest neighbor search in high-dimensional vector spaces. It functions as a vector similarity search engine that constructs static, disk-based data structures to facilitate fast lookups. By mapping identifiers to vector data and persisting these structures to disk, the library enables efficient, memory-mapped access to large datasets. The project distinguishes itself through the use of random projection trees and distance-metric-based partitioning, which organize data into hierarchical binary trees to balance search precision against computatio
C++, approximate-nearest-neighbor-search, c-plus-plus, golang
lancedb/lancedb
9,031 stars
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
HTML, approximate-nearest-neighbor-search, image-search, nearest-neighbor-search
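
The Annoy entry above mentions random projection trees that partition data into hierarchical binary trees. The sketch below shows the general idea under simplifying assumptions: each split uses the hyperplane equidistant from two randomly sampled points. The function names and the `leaf_size` parameter are hypothetical, and this is not Annoy's actual implementation.

```python
import random


def split_by_random_hyperplane(points, rng):
    """Split points by the hyperplane equidistant from two sampled points."""
    p, q = rng.sample(points, 2)
    normal = [a - b for a, b in zip(p, q)]
    midpoint = [(a + b) / 2 for a, b in zip(p, q)]
    offset = sum(n * m for n, m in zip(normal, midpoint))
    left, right = [], []
    for point in points:
        # Positive side means the point is closer to p than to q.
        side = sum(n * x for n, x in zip(normal, point)) - offset
        (left if side > 0 else right).append(point)
    return left, right


def build_tree(points, rng, leaf_size=2):
    """Recursively split until each leaf holds at most `leaf_size` points.

    Leaves are lists of points; internal nodes are (left, right) tuples.
    """
    if len(points) <= leaf_size:
        return points
    left, right = split_by_random_hyperplane(points, rng)
    if not left or not right:
        # Degenerate split (e.g. duplicate points): stop and keep a leaf.
        return points
    return (build_tree(left, rng, leaf_size), build_tree(right, rng, leaf_size))
```

A query would then descend the tree by testing which side of each hyperplane it falls on, so only the points in the reached leaf need an exact distance comparison. Building several such trees with different random seeds is one common way to trade extra memory for better recall.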

See all 30 alternatives to Similarity

tensorflow/similarity (archived)
