# seatgeek/fuzzywuzzy

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/seatgeek-fuzzywuzzy).**

9,258 stars · 862 forks · Python · GPL-2.0 · archived

## Links

- GitHub: https://github.com/seatgeek/fuzzywuzzy
- Homepage: http://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
- awesome-repositories: https://awesome-repositories.com/repository/seatgeek-fuzzywuzzy.md

## Description

Fuzzywuzzy is a Python library for approximate (fuzzy) string matching. It computes similarity scores between strings and uses them to find the closest matches for a query within a list of candidate strings.

The library measures how similar two pieces of text are while tolerating typos and formatting differences. It offers several scoring methods and can extract the best match from a candidate list.

The library also handles text normalization and preprocessing, including removing non-alphanumeric characters and standardizing whitespace. Its scoring algorithms cover edit-distance calculation, token-based set matching, and sequence matching.
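
The normalization step described above can be sketched with the standard library alone. This is a minimal illustration, not the library's own code: the `normalize` function and its `force_ascii` parameter are hypothetical names chosen for this example.

```python
import re


def normalize(text: str, force_ascii: bool = False) -> str:
    """Hypothetical helper: clean a string before fuzzy comparison."""
    if force_ascii:
        # Drop any character that cannot be represented in ASCII.
        text = text.encode("ascii", "ignore").decode("ascii")
    # Replace each run of non-alphanumeric characters (including "_")
    # with a single space, which also collapses repeated whitespace.
    text = re.sub(r"[\W_]+", " ", text)
    # Lowercase and trim so case and padding do not affect the score.
    return text.lower().strip()


print(normalize("  Hello,   WORLD!! "))
print(normalize("new_york-mets"))
```

Normalizing both the query and the candidates in the same way keeps punctuation, case, and spacing from lowering a similarity score.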

## Tags

### Software Engineering & Architecture

- [Approximate String Searching](https://awesome-repositories.com/f/software-engineering-architecture/string-matching-algorithms/approximate-string-searching.md) — Provides the primary capability of finding the closest textual matches for a query string within candidate lists.
- [Approximate Matching Tools](https://awesome-repositories.com/f/software-engineering-architecture/string-matching-algorithms/approximate-matching-tools.md) — Serves as a comprehensive toolset for identifying the closest textual matches within lists of candidate strings.
- [Best Match Extraction](https://awesome-repositories.com/f/software-engineering-architecture/string-matching-algorithms/best-match-extraction.md) — Implements logic to extract the closest matching string from a candidate list based on a provided query. ([source](https://github.com/seatgeek/fuzzywuzzy/blob/master/test_fuzzywuzzy_pytest.py))
- [Partial String Matching](https://awesome-repositories.com/f/software-engineering-architecture/string-matching-algorithms/partial-string-matching.md) — Includes partial ratio matching to identify the best embedded substrings within longer text sequences.
- [Token Set Matching](https://awesome-repositories.com/f/software-engineering-architecture/string-tokenization/token-set-matching.md) — Splits strings into individual words and compares them as sets to ignore differences in word order.
- [String Validation and Normalization](https://awesome-repositories.com/f/software-engineering-architecture/string-validation-and-normalization.md) — Provides utilities to remove non-essential characters and standardize whitespace for consistent string matching.
- [Custom Text Normalizers](https://awesome-repositories.com/f/software-engineering-architecture/string-validation-and-normalization/speech-to-text-normalizers/custom-text-normalizers.md) — Provides utilities for cleaning strings by removing non-alphanumeric characters to improve matching accuracy. ([source](https://github.com/seatgeek/fuzzywuzzy/blob/master/test_fuzzywuzzy.py))
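
As a rough sketch of how best-match extraction and partial matching fit together, the example below uses `difflib.SequenceMatcher` from the standard library as a stand-in scorer. The functions `ratio`, `partial_ratio`, and `extract_one` are written for this illustration and are simplified assumptions; in particular, the sliding-window partial match is a brute-force approach and may not mirror how the library selects substrings.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> int:
    """Case-insensitive similarity between two strings on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def partial_ratio(a: str, b: str) -> int:
    """Best score of the shorter string against any same-length window of the longer."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0
    n = len(shorter)
    return max(ratio(shorter, longer[i:i + n]) for i in range(len(longer) - n + 1))


def extract_one(query, choices, scorer=ratio):
    """Return the (choice, score) pair with the highest score, or None if choices is empty."""
    scored = ((choice, scorer(query, choice)) for choice in choices)
    return max(scored, key=lambda pair: pair[1], default=None)


teams = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(extract_one("new york jets", teams))
print(partial_ratio("yankees", "new york yankees"))
```

Passing `scorer=partial_ratio` to `extract_one` switches the ranking to substring-style matching without changing the extraction logic.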

### Artificial Intelligence & ML

- [Text Similarity Scoring](https://awesome-repositories.com/f/artificial-intelligence-ml/semantic-analysis-tools/semantic-similarity-calculation/text-similarity-scoring.md) — Provides a scoring engine to measure the degree of similarity between texts using token sorting and normalization.

### Data & Databases

- [Fuzzy Matching](https://awesome-repositories.com/f/data-databases/fuzzy-matching.md) — Provides a library for calculating string similarity to find matches despite typos and formatting errors.
- [Text Preprocessing](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-transformation/text-nlp-preprocessing/text-preprocessing.md) — Normalizes strings by removing special characters and forcing ASCII encoding to optimize fuzzy comparisons. ([source](https://github.com/seatgeek/fuzzywuzzy/blob/master/test_fuzzywuzzy_hypothesis.py))

### Graphics & Multimedia

- [Token-Based Similarity Scoring](https://awesome-repositories.com/f/graphics-multimedia/perceptual-similarity-scoring/token-based-similarity-scoring.md) — Implements sorted-token scoring to treat strings with the same words in different orders as identical.
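
The token-based scoring mentioned in the tags can be sketched as follows. This is an illustrative approximation built on `difflib`, with hypothetical function names; the set-based variant compares the shared tokens against each string's full token set and keeps the best of the three pairings, which is one plausible way to make the score ignore both word order and repeated words.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> int:
    """Similarity between two strings on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def token_sort_ratio(a: str, b: str) -> int:
    """Sort the words of each string before comparing, so word order is ignored."""
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    return ratio(sorted_a, sorted_b)


def token_set_ratio(a: str, b: str) -> int:
    """Compare token sets, so word order and duplicate words are ignored."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    common = " ".join(sorted(tokens_a & tokens_b))
    only_a = " ".join(sorted(tokens_a - tokens_b))
    only_b = " ".join(sorted(tokens_b - tokens_a))
    # Shared tokens followed by the tokens unique to each side.
    combined_a = f"{common} {only_a}".strip()
    combined_b = f"{common} {only_b}".strip()
    return max(
        ratio(common, combined_a),
        ratio(common, combined_b),
        ratio(combined_a, combined_b),
    )


print(token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
print(token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear"))
```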

### Programming Languages & Runtimes

- [Edit Distance Calculators](https://awesome-repositories.com/f/programming-languages-runtimes/programming-utilities/string-utilities/string-manipulators/edit-distance-calculators.md) — Uses Levenshtein distance, the minimum number of single-character insertions, deletions, and substitutions required to transform one string into another.
- [String Similarity Metrics](https://awesome-repositories.com/f/programming-languages-runtimes/programming-utilities/string-utilities/string-manipulators/edit-distance-calculators/string-similarity-metrics.md) — Calculates numerical similarity metrics, such as Levenshtein distance, to identify matches despite typos. ([source](https://github.com/seatgeek/fuzzywuzzy/blob/master/.travis.yml))
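
For readers unfamiliar with Levenshtein distance, the textbook dynamic-programming formulation is sketched below. It is a generic implementation written for this page, not code taken from the library, and the `similarity_percent` normalization is only one common convention, assumed here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row buffer as short as possible
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> int:
    """Hypothetical normalization: 100 means identical, 0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100
    return round(100 * (1 - levenshtein(a, b) / longest))


print(levenshtein("kitten", "sitting"))
print(similarity_percent("kitten", "sitting"))
```

Keeping only two rows of the table brings memory use down to the length of the shorter string, while the running time stays proportional to the product of the two lengths.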

### Business & Productivity Software

- [Data Deduplication](https://awesome-repositories.com/f/business-productivity-software/data-deduplication.md) — Identifies duplicate records by matching strings that are spelled slightly differently.
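
A simplified picture of fuzzy deduplication: keep a record only if it does not score above a threshold against any record already kept. The `dedupe` function, its `threshold` default of 90, and the choice to keep the first-seen spelling are all assumptions made for this sketch; a real pipeline might prefer the longest or most frequent variant instead.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> int:
    """Case-insensitive similarity between two strings on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def dedupe(items, threshold=90):
    """Greedy pass: drop any item that closely matches one already kept."""
    kept = []
    for item in items:
        if all(ratio(item, seen) < threshold for seen in kept):
            kept.append(item)
    return kept


print(dedupe(["Apple Inc.", "apple inc", "Banana Co"]))
```

This greedy pass compares each item against every kept item, so its cost grows quadratically with the number of distinct records.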

### Development Tools & Productivity

- [Query Normalizers](https://awesome-repositories.com/f/development-tools-productivity/search-query-utilities/query-normalizers.md) — Standardizes raw text input to improve the accuracy of subsequent search results.
- [Text Processing Utilities](https://awesome-repositories.com/f/development-tools-productivity/text-processing-utilities.md) — Includes functions for cleaning and preprocessing strings to enhance the accuracy of fuzzy matching.

### Part of an Awesome List

- [Natural Language Processing](https://awesome-repositories.com/f/awesome-lists/ai/natural-language-processing.md) — Library for fuzzy string matching in Python.
