30 open-source projects similar to rapidfuzz/rapidfuzz, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best RapidFuzz alternative.
FuzzyWuzzy is a Python library and text processing utility designed to calculate similarity scores between strings. It functions as a text similarity scoring engine and an approximate string matching tool used to identify the closest textual matches within a list of candidate strings. The library provides a suite of tools for measuring the degree of similarity between pieces of text, accounting for typos and formatting differences. These capabilities include extracting the best match from a candidate list and performing fuzzy string matching through various scoring methods. The toolset cover
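As a rough illustration of "extracting the best match from a candidate list", the sketch below scores each candidate with the standard library's `difflib.SequenceMatcher` and returns the highest scorer. It is not the library's own API: the names `similarity` and `best_match`, the 0-100 scale, and the lowercasing step are assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score, ignoring case and outer whitespace."""
    a, b = a.strip().lower(), b.strip().lower()
    return round(100 * SequenceMatcher(None, a, b).ratio())


def best_match(query: str, choices: list[str]) -> tuple[str, int]:
    """Return (choice, score) for the highest-scoring candidate.

    Hypothetical helper: raises ValueError when `choices` is empty.
    """
    scored = ((choice, similarity(query, choice)) for choice in choices)
    return max(scored, key=lambda pair: pair[1])


teams = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(best_match("new york jets", teams))
```

On ties, `max` keeps the first candidate it saw, so the order of `choices` acts as the tie-breaker in this sketch.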
This is a Python fuzzy string matching library used for calculating string similarity and edit distances. It serves as a collection of string distance algorithms, a sequence alignment tool, and an approximate string search engine to measure text similarity. The library provides a wide array of metrics to quantify string closeness, including Levenshtein, Jaro-Winkler, Hamming, and Damerau-Levenshtein distances. It supports similarity analysis through longest common subsequence calculations, token-based comparisons, and weighted scoring to account for differences in content and word order. Bey
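One way to make a comparison insensitive to word order, as the token-based comparisons above are described, is to sort the tokens of each string before scoring. The sketch below does this with `difflib.SequenceMatcher`; the function name `token_sort_ratio` and the whitespace tokenisation are assumptions for illustration, not this library's implementation.

```python
from difflib import SequenceMatcher


def token_sort_ratio(a: str, b: str) -> int:
    """Score two strings 0-100 after lowercasing and sorting their tokens."""
    # Splitting on whitespace and sorting removes the effect of word order.
    norm_a = " ".join(sorted(a.lower().split()))
    norm_b = " ".join(sorted(b.lower().split()))
    return round(100 * SequenceMatcher(None, norm_a, norm_b).ratio())


print(token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```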
This is a text diffing and patching library used for computing differences between text blocks, calculating edit distances, and applying patches to synchronize document versions. It includes a fuzzy text matching engine to locate strings by balancing accuracy with location, and a Levenshtein distance calculator to measure the number of character insertions, deletions, and substitutions between two strings. The library features a semantic diff optimizer that refines raw text differences to align with human-readable word and phrase boundaries. It provides utilities for generating and parsing se
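The Levenshtein distance mentioned above counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. A minimal sketch of the textbook dynamic-programming formulation follows; it is a generic implementation, not this library's code, and the function name is chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (unit cost per edit)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Keeping only two rows of the table holds memory at O(min(len(a), len(b))) while the running time stays O(len(a) * len(b)).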
Skim is a cross-platform interactive fuzzy finder that runs as a terminal application, a Rust library, a Vim and Neovim plugin, and a shell integration tool. It provides real-time filtering and selection from lists of items, supporting keyboard and mouse navigation, live preview panes, and multi-select functionality across Linux, macOS, and Windows. The tool distinguishes itself through a composable query expression tree that supports fuzzy, exact, inverse, prefix, suffix, and logical AND/OR operators, combined with a Smith-Waterman scoring engine that penalizes typos and gaps for natural rel
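To give a sense of how a Smith-Waterman style scorer penalizes typos and gaps, the sketch below computes a local alignment score between a query and a candidate line. The weights (`match=2`, `mismatch=-1`, `gap=-1`) and the function name are assumptions for illustration; the tool's actual scoring parameters are not stated here.

```python
def smith_waterman(query: str, text: str,
                   match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score of query against text."""
    best = 0
    prev = [0] * (len(text) + 1)
    for qc in query:
        cur = [0]
        for j, tc in enumerate(text, start=1):
            diag = prev[j - 1] + (match if qc == tc else mismatch)
            # A cell never drops below 0, so an alignment can restart anywhere.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


print(smith_waterman("abc", "xxabcxx"))
print(smith_waterman("abc", "a_b_c"))
```

With these weights a contiguous match outscores the same characters spread across gaps, which is the property that makes such rankings feel natural.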
Mailcheck is an email domain suggestion library and validation utility designed to identify misspelled email addresses. It functions as a string similarity tool that calculates the distance between typed domains and known correct extensions to provide automated correction suggestions. The library allows for the use of custom domain suggestion lists and the implementation of custom similarity and string distance logic. These mechanisms enable the replacement of default matching thresholds and distance algorithms with user-defined functions. The tool covers domain validation and correction thr
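The domain-suggestion idea can be sketched in a few lines of Python using the standard library's `difflib.get_close_matches`. This swaps in a similarity ratio where the description above speaks of a string distance, and the domain list, the `cutoff` value, and the function name `suggest` are all assumptions made for the example.

```python
from difflib import get_close_matches

# Hypothetical list of known-good domains; a real deployment would supply its own.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest(email: str, domains=KNOWN_DOMAINS, cutoff: float = 0.8):
    """Return a corrected address, or None if no suggestion applies."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return None  # not shaped like local@domain
    domain = domain.lower()
    if domain in domains:
        return None  # already a known domain
    matches = get_close_matches(domain, domains, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None


print(suggest("alice@gmial.com"))
```

Raising `cutoff` makes suggestions more conservative; passing a different `domains` list plays the role of the custom suggestion lists mentioned above.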
Dedupe is a machine learning tool for entity resolution that identifies and merges duplicate records in structured datasets. It uses active learning to train a matching model from human-labeled examples, learning which field-level similarities are most important for detecting duplicates without requiring manual rule writing. The system uses fingerprint-based blocking to reduce pairwise comparisons, which enables efficient matching on large datasets, and groups scored record pairs into clusters using a configurable similarity threshold. The tool provides multiple interfaces for different workfl
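Blocking can be illustrated with a small sketch: records are grouped by a cheap fingerprint, and only records that share a fingerprint are paired for detailed comparison. The fingerprint used here (first three letters of the name plus the postal code) and the field names `name` and `zip` are hypothetical; the tool described above learns its blocking rules rather than hard-coding them.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record: dict) -> tuple:
    """Hypothetical fingerprint: name prefix plus postal code."""
    return (record["name"].strip().lower()[:3], record["zip"])


def candidate_pairs(records: list[dict]) -> list[tuple[int, int]]:
    """Return index pairs of records that fall into the same block."""
    blocks = defaultdict(list)
    for idx, record in enumerate(records):
        blocks[blocking_key(record)].append(idx)
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(ids, 2))  # compare only within a block
    return pairs


records = [
    {"name": "Acme Corp", "zip": "60601"},
    {"name": "ACME Corporation", "zip": "60601"},
    {"name": "Acme Corp", "zip": "10001"},
    {"name": "Globex", "zip": "60601"},
]
print(candidate_pairs(records))
```

Four records would give six pairs under exhaustive comparison; with this fingerprint only one pair remains to be scored.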
list.js is a JavaScript search and sort library used to add real-time filtering, sorting, and pagination to HTML lists and tables without backend dependencies. It functions as a DOM data indexer and template-driven HTML renderer, allowing developers to manage how data is displayed and discovered on the client side. The library distinguishes itself through a fuzzy string matching engine that handles approximate matches and typos, and through indexing that extracts values directly from HTML data attributes to build a searchable internal index. It uses a template-driven rendering system to gen
fuzzysort is a JavaScript library for performing approximate string matching and ranking results. It functions as a string matching engine and weighted search utility designed to identify approximate matches within text and object lists. The library features a pre-indexed search implementation that processes target strings into an optimized format to accelerate repeated lookups. It supports weighted object retrieval, allowing users to search through lists of objects by matching multiple keys and applying custom weights to prioritize specific fields. The engine provides capabilities for searc
Fuse is a JavaScript fuzzy search library and client-side search engine designed to index and query JSON data. It provides utilities for approximate string matching and ranking results by relevance, allowing applications to perform fast filtering and searching of datasets without a dedicated backend. The library distinguishes itself through a token-based search implementation that supports word-order independence and relevance weighting. It utilizes edit-distance scoring to handle typos and insertions, and employs a system of field weighting to prioritize matches in high-value data keys. The
This repository is a curated guide and implementation library of coding patterns used to solve data structures and algorithms problems. It serves as a technical interview study resource, providing a comprehensive set of strategies and computational logic examples for optimizing time and space complexity. The project focuses on standardized algorithmic patterns, including sliding windows, two pointers, and dynamic programming. It features specific implementations for a wide range of challenges, such as LeetCode problem solutions and specialized techniques like cyclic sort and bitwise XOR opera
This project is a comprehensive collection of common computer science algorithms and data structures implemented in Swift. It serves as an educational reference and library for studying computational complexity, algorithmic logic, and data structure engineering through practical code examples. The repository provides a wide suite of data structure implementations, including various types of linked lists, heaps, hash tables, and an extensive range of hierarchical trees such as Red-Black, B-Tree, and Splay trees. It also covers diverse sorting and searching techniques, from basic bubble sort to
USearch is a high-performance vector similarity search engine and approximate nearest neighbor index designed for dense embeddings. It functions as a low-level vector database core and high-dimensional vector indexer, providing the primitives necessary to store and retrieve vectors across massive datasets. The engine distinguishes itself through hardware-level SIMD acceleration for distance kernels and a proximity-graph indexing system that enables fast retrieval across billions of vectors. It supports multi-precision vector quantization to balance memory usage and accuracy, and utilizes memo
This project is a comprehensive resource directory for web data extraction, providing a curated collection of tools and libraries for parsing data, automating browsers, and managing network operations. It serves as a guide for extracting structured information from HTML, XML, JSON, and PDF formats. The toolkit focuses on advanced data collection strategies, including headless browser automation to interact with JavaScript and a suite of network utilities for DNS resolution and WebSocket connections. It specifically covers methods for bypassing bot protections through proxy pool management, us
Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without requiring data migration. By mapping remote metadata and structures to a unified relational namespace, it enables seamless cross-platform analysis through a standard SQL interface. The engine distinguishes itself through a pluggable connector architecture and a shared-nothing distributed processing
Skim is an interactive text filter and terminal selection tool written in Rust. It functions as a command line interface utility that processes input streams to isolate specific entries through real-time user queries and sorting. The tool differentiates itself through ANSI compatibility, allowing it to parse color codes and maintain text formatting during the search process. It supports multiple matching strategies, including configurable fuzzy matching algorithms and regular expression integration. The application covers a broad range of capabilities including field-specific filtering, resu
Ulauncher is a keyboard-driven application launcher and extensible command palette for Linux desktop environments. It provides a searchable interface for launching installed software and navigating local files. The system features a Python-based plugin architecture that allows the integration of third-party extensions and custom functionality. It includes a themable interface that supports custom color schemes and visual styles to match the system environment. Core capabilities include fuzzy string matching for software and file retrieval, an integrated mathematical calculator for instant ev
ctrlp.vim is a fuzzy file navigation tool for the Vim editor. It enables the location and opening of files, buffers, tags, and recently used items through approximate string matching and regular expressions. The project identifies project roots by scanning for version control markers and configuration files. It allows for the creation of new files and their required parent directories directly from the search interface, and can open multiple files simultaneously. Broad capabilities include text editor resource management and workflow automation, such as executing specific commands immediatel
Sensitive-lexicon is a sensitive word detection service and content moderation tool designed to identify prohibited text. It utilizes a curated lexicon of thousands of categorized terms and a fuzzy matching text scanner to detect restricted words and phrases. The project features specialized filters for Chinese language content across political, social, and adult domains. It supports approximate string matching to identify terms that use noise characters or whitespace to evade standard keyword filters. The system includes a network interface for hosting the detection service, allowing for re
lunr.js is a JavaScript full-text search library and client-side search engine. It creates in-memory search indexes for fast keyword retrieval and ranked document matching within browser or Node.js environments. The library utilizes a JSON serializable search index, allowing the search structure to be converted to and from JSON for storage and distribution of pre-built search data. This enables search functionality for static websites by indexing content into portable files. The system supports advanced querying capabilities, including fuzzy text matching to account for typos, field-scoped i
Zed is a terminal-based code editor built in Rust that provides a full-featured editing experience with familiar keybindings, mouse support, and multiple cursors. It runs entirely in the terminal while offering capabilities typically found in graphical editors, including split panes, a command palette, and integrated language server protocol support for real-time diagnostics, completions, go-to-definition, and code actions across multiple languages. The editor distinguishes itself through a plugin system that runs sandboxed TypeScript plugins in a QuickJS runtime, with an asynchronous bridge
match-sorter is a JavaScript string matching and array filtering utility designed to rank and sort lists based on search string relevance. It functions as a deterministic best-match sorting library and fuzzy search engine for filtering object arrays. The tool prioritizes results using weighted match heuristics that favor exact matches, acronyms, and string containment. It employs a deterministic ranking system to ensure consistent ordering and supports diacritic-insensitive normalization to match characters regardless of accents. The library covers match criteria specification via key-path p
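The tiered, deterministic ranking described above can be approximated in Python as follows. The three tiers (exact, acronym, containment) come from the description; their relative order, the numeric rank values, the alphabetical tie-break, and the function names are assumptions made for this sketch.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks so that 'Crème' compares equal to 'Creme'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def rank(item: str, query: str) -> int:
    """Return 3 for exact, 2 for acronym, 1 for containment, 0 for no match."""
    item_n = strip_accents(item).casefold()
    query_n = strip_accents(query).casefold()
    if item_n == query_n:
        return 3
    if "".join(word[0] for word in item_n.split()) == query_n:
        return 2
    return 1 if query_n in item_n else 0


def match_sort(items: list[str], query: str) -> list[str]:
    """Filter out non-matches; sort by rank, then alphabetically for stability."""
    scored = [(rank(item, query), item) for item in items]
    return [item for r, item in sorted(scored, key=lambda p: (-p[0], p[1])) if r > 0]


print(match_sort(["North Carolina", "Concord", "nc", "Tennessee"], "nc"))
```

Because ties are broken alphabetically, the same input always produces the same ordering.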
Saws is an interactive shell wrapper and resource manager for the AWS CLI. It provides a command-line environment designed to enhance the execution of AWS commands through predictive text, resource suggestions, and improved navigation. The tool implements fuzzy searching and case-insensitive autocomplete to accelerate the discovery and selection of cloud resources. It reduces manual entry via a system of command shortcuts and aliases that map short strings to complex commands. The interface includes local caching of resource data to minimize API requests, persistent command history, and the
Peco is an interactive text filter and fuzzy finder for the terminal. It serves as a terminal user interface selection tool that filters standard input in real-time using fuzzy matching and regular expressions. The tool preserves and renders ANSI color escape sequences from piped input streams while performing matching logic on plain-text versions. It supports multi-stage filtering, allowing users to freeze result sets to create a new base for subsequent refinements. Capability areas include advanced search filtering with negative matching, multi-item selection, and the ability to pipe selec
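Matching on a plain-text version while preserving the colored original can be sketched as below. The regular expression covers only SGR color sequences (`ESC [ ... m`), and plain substring matching stands in for the tool's fuzzy and regex matchers; both simplifications, and the name `filter_lines`, are assumptions for the example.

```python
import re

# Matches SGR (color/style) escape sequences such as "\x1b[31m" or "\x1b[0m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def filter_lines(lines: list[str], query: str) -> list[str]:
    """Return the original (still colored) lines whose plain text contains query."""
    needle = query.lower()
    return [line for line in lines if needle in ANSI_SGR.sub("", line).lower()]


lines = [
    "\x1b[31merror\x1b[0m: disk full",
    "ok",
    "\x1b[1mer\x1b[0mror again",  # escape codes split the word "error"
]
print(filter_lines(lines, "error"))
```

The third line matches only because the escape codes are stripped before comparison; the returned strings keep their color codes for display.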
mini.nvim is a comprehensive library of independent modules designed to extend Neovim with a wide array of navigation, user interface, and text manipulation tools. It serves as a modular plugin collection, a UI toolkit for creating custom statuslines and notifications, and a package manager for installing and pinning external plugins from Git. The project provides a specialized fuzzy picker framework for filtering files and symbols, an LSP completion engine with interactive snippet expansion, and a dedicated plugin test framework that uses headless editor instances and remote procedure calls
toolong is a terminal log viewer and TUI log manager designed for monitoring live log streams and navigating large log files. It functions as a log aggregator and JSONL formatter, capable of merging multiple log files into a single chronological view by automatically detecting timestamps. The application supports the visualization of structured data by pretty printing JSONL files and applying syntax highlighting to common web server log patterns. It handles large-scale data efficiently through virtual-sized scrollable views, allowing users to open compressed logs or files of any size without
This project is a self-hosted system for discovering, browsing, and receiving personalized recommendations from academic papers on arXiv. It combines an arXiv API client that downloads paper metadata and PDFs with a TF-IDF document similarity engine and an SVM-based recommendation system that trains a classifier per user based on their preferences. The system provides a web interface for browsing, searching, and filtering recent arXiv submissions, alongside personalized paper recommendations generated from individual user signals. It also includes a Twitter mention tracker that periodically p
Hyperscan is a high-performance regular expression matching library that scans large volumes of data against thousands of patterns simultaneously. It accepts PCRE-compatible regular expressions and supports multi-pattern matching in a single pass, approximate matching within a configurable edit distance, and streaming mode for processing data that arrives in blocks. The library is designed for throughput-oriented scanning across block, streaming, and vectored inputs. What distinguishes Hyperscan is its hybrid automata engine, which combines deterministic and nondeterministic finite automata t