# hotoo/pinyin

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/hotoo-pinyin).**

7,821 stars · 853 forks · JavaScript · MIT

## Links

- GitHub: https://github.com/hotoo/pinyin
- Homepage: https://pinyin.js.org
- awesome-repositories: https://awesome-repositories.com/repository/hotoo-pinyin.md

## Topics

`chinese` `hanzi` `pinyin` `zhongwen`

## Description

hotoo/pinyin is a library that converts Chinese characters (hanzi) into pinyin, their phonetic romanization. It resolves the pronunciation of polyphonic characters (characters with more than one reading) by segmenting the text into words and using the surrounding context. It also provides a pinyin-based sorting utility for ordering Chinese strings alphabetically.
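To make the idea of segmentation-based disambiguation concrete, here is a minimal sketch. It does not use the library's API or data: the two small dictionaries and the `segment` and `toPinyin` helpers are hypothetical, and greedy forward maximum matching stands in for whatever segmenter is actually plugged in.

```javascript
// Hypothetical mini dictionaries, for illustration only.
// Single characters map to all known readings, most common first.
const CHAR_DICT = new Map([
  ["银", ["yín"]],
  ["行", ["xíng", "háng"]], // polyphonic character
  ["人", ["rén"]],
  ["走", ["zǒu"]],
]);

// Known words map to the one reading that is correct in that word.
const PHRASE_DICT = new Map([
  ["银行", ["yín", "háng"]],
  ["行人", ["xíng", "rén"]],
]);

// Greedy forward maximum matching: at each position take the longest
// known phrase, otherwise fall back to a single character.
function segment(text, maxLen = 4) {
  const chars = Array.from(text);
  const words = [];
  let i = 0;
  while (i < chars.length) {
    let size = 1;
    for (let len = Math.min(maxLen, chars.length - i); len > 1; len--) {
      if (PHRASE_DICT.has(chars.slice(i, i + len).join(""))) {
        size = len;
        break;
      }
    }
    words.push(chars.slice(i, i + size).join(""));
    i += size;
  }
  return words;
}

// Convert text to pinyin, preferring phrase-level readings.
function toPinyin(text) {
  return segment(text).flatMap((word) => {
    if (PHRASE_DICT.has(word)) return PHRASE_DICT.get(word);
    const readings = CHAR_DICT.get(word);
    // Unknown characters pass through unchanged.
    return [readings ? readings[0] : word];
  });
}

console.log(toPinyin("银行人")); // ["yín", "háng", "rén"]
console.log(toPinyin("走行"));   // ["zǒu", "xíng"]
```

The same character 行 comes out as `háng` inside the word 银行 and as `xíng` when no phrase matches, which is the effect word-level context is meant to achieve.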

A surname mode applies the special readings that some characters take when they are used as Chinese surnames. Word segmentation is pluggable, so users can choose a segmentation strategy that favors accuracy or speed. For strings that contain polyphonic characters, the library can generate every possible pinyin permutation, which is useful for search indexing. Output can group syllables by word boundaries instead of by individual characters, which reads more naturally, and tone marks can be converted to numeric tone notation.
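Generating all permutations for a string with polyphonic characters amounts to a Cartesian product over the per-character reading lists. The sketch below assumes the readings are already available as an array of arrays; the `permutations` helper and the sample input are hypothetical and are not taken from the library's source.

```javascript
// Build every combination by extending each partial result with each
// candidate reading of the next character (a Cartesian product).
function permutations(readings) {
  return readings.reduce(
    (acc, options) =>
      acc.flatMap((prefix) => options.map((option) => [...prefix, option])),
    [[]]
  );
}

// 重庆: the first character has two readings, the second has one.
const readings = [["zhong", "chong"], ["qing"]];

// Join each permutation into a key that a search index could store.
const indexKeys = permutations(readings).map((syllables) => syllables.join(""));

console.log(indexKeys); // ["zhongqing", "chongqing"]
```

Indexing every key lets a search for either spelling find the entry, at the cost of a result set that grows multiplicatively with the number of polyphonic characters.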

The output style is selectable: tone marks, numeric tone indicators, or initials only. Sorting works by converting characters to pinyin and comparing the resulting phonetic representations.
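The style conversions and the sorting step can be sketched without any dictionary by working on tone-marked syllables directly. This is an illustrative approach that relies on Unicode normalization, not the library's implementation; `splitTone`, `toNumeric`, `toInitial`, `byPinyin`, and the sample data are all hypothetical names introduced here.

```javascript
// Combining diacritics that carry the four tones after NFD decomposition.
const TONE_MARKS = new Map([
  ["\u0304", 1], // macron, first tone
  ["\u0301", 2], // acute, second tone
  ["\u030C", 3], // caron, third tone
  ["\u0300", 4], // grave, fourth tone
]);

// Consonant initials, two-letter ones first so "zh" wins over "z".
// "y" and "w" are deliberately left out (an assumption of this sketch).
const INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
  "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"];

// Separate a tone-marked syllable into its toneless base and tone number.
function splitTone(syllable) {
  let tone = 0; // 0 means neutral or unmarked
  let base = "";
  for (const ch of syllable.normalize("NFD")) {
    if (TONE_MARKS.has(ch)) tone = TONE_MARKS.get(ch);
    else base += ch;
  }
  // Recompose so that "ü" survives as a single character.
  return { base: base.normalize("NFC"), tone };
}

// Tone marks to numeric notation, with the digit at the end of the syllable.
function toNumeric(syllable) {
  const { base, tone } = splitTone(syllable);
  return tone ? base + tone : base;
}

// Initials-only style: the leading consonant, or "" if there is none.
function toInitial(syllable) {
  const { base } = splitTone(syllable);
  return INITIALS.find((initial) => base.startsWith(initial)) || "";
}

// Comparator for entries that already carry their pinyin syllables.
function byPinyin(a, b) {
  const ka = a.pinyin.map(toNumeric).join(" ");
  const kb = b.pinyin.map(toNumeric).join(" ");
  return ka < kb ? -1 : ka > kb ? 1 : 0;
}

const places = [
  { text: "上海", pinyin: ["shàng", "hǎi"] },
  { text: "北京", pinyin: ["běi", "jīng"] },
  { text: "广州", pinyin: ["guǎng", "zhōu"] },
];

console.log(toNumeric("zhōng")); // "zhong1"
console.log(toInitial("zhōng")); // "zh"
console.log(places.slice().sort(byPinyin).map((p) => p.text)); // ["北京", "广州", "上海"]
```

Comparing plain code units keeps the ordering deterministic across environments; a locale-aware collator would be an alternative if the runtime's ICU data can be relied on.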

## Tags

### Business & Productivity Software

- [Chinese Character Disambiguations](https://awesome-repositories.com/f/business-productivity-software/contextual-ambiguity-resolutions/chinese-character-disambiguations.md) — Converts Chinese characters into their phonetic pinyin representation with polyphonic character support.
- [Word-Segmentation-Based Disambiguators](https://awesome-repositories.com/f/business-productivity-software/contextual-ambiguity-resolutions/chinese-character-disambiguations/word-segmentation-based-disambiguators.md) — Uses word segmentation to resolve polyphonic character ambiguity by analyzing contextual word boundaries.
- [Polyphonic Character Handlers](https://awesome-repositories.com/f/business-productivity-software/polyphonic-character-handlers.md) — Resolves ambiguous pronunciations for multi-sound Chinese characters using word segmentation and context analysis.
- [Pinyin-Based Sorters](https://awesome-repositories.com/f/business-productivity-software/chinese-text-input/pinyin-based-sorters.md) — Orders Chinese characters or strings based on their pinyin representation for alphabetical sorting. ([source](https://pinyin.js.org/api/v4/index.html))
- [Polyphone Permutation Generators](https://awesome-repositories.com/f/business-productivity-software/polyphonic-character-handlers/polyphone-permutation-generators.md) — Generates all possible pinyin permutations for strings with polyphonic characters to support search indexing.

### Data & Databases

- [Character-to-Pinyin Converters](https://awesome-repositories.com/f/data-databases/pinyin-transliterations/character-to-pinyin-converters.md) — Provides the core character-to-pinyin conversion with polyphonic support and word-level disambiguation.
- [Chinese Language Segmenters](https://awesome-repositories.com/f/data-databases/text-processing-utilities/text-extraction/text-segmentation/chinese-language-segmenters.md) — Applies Chinese word segmentation before conversion to reduce polyphonic character ambiguity. ([source](https://pinyin.js.org/api/v4/index.html))
- [Chinese Character Matchers](https://awesome-repositories.com/f/data-databases/text-processing-utilities/text-extraction/text-segmentation/chinese-language-segmenters/traditional-chinese-support/chinese-character-simplifiers/chinese-character-matchers.md) — Converts Chinese characters into their phonetic pinyin representation with polyphonic character support. ([source](https://pinyin.js.org/api/v4/index.html))
- [Pre-Conversion Segmenters](https://awesome-repositories.com/f/data-databases/text-processing-utilities/text-extraction/text-segmentation/linguistic-text-segmenters/pre-conversion-segmenters.md) — Uses word segmentation to improve pinyin accuracy for polyphonic characters by analyzing context. ([source](https://cdn.jsdelivr.net/gh/hotoo/pinyin@master/README.md))
- [Pinyin Transliterations](https://awesome-repositories.com/f/data-databases/pinyin-transliterations.md) — Switches to surname mode to produce more accurate pinyin for Chinese names with special pronunciations. ([source](https://pinyin.js.org/api/v3/index.html))
- [Pinyin Sorters](https://awesome-repositories.com/f/data-databases/pinyin-transliterations/pinyin-sorters.md) — Ships a pinyin-based sorting utility for ordering Chinese text alphabetically.
- [Pinyin Sorting Engines](https://awesome-repositories.com/f/data-databases/pinyin-transliterations/pinyin-sorting-engines.md) — Implements a sorting engine that orders Chinese strings alphabetically by their pinyin representation.
- [Surname-Aware Converters](https://awesome-repositories.com/f/data-databases/pinyin-transliterations/surname-aware-converters.md) — Applies specialized pronunciation rules for Chinese surnames to produce accurate name-specific pinyin output.
- [Pluggable Segmenter Backends](https://awesome-repositories.com/f/data-databases/segmented-storage-architectures/pluggable-segmenter-backends.md) — Provides a pluggable backend architecture for choosing between different word segmentation strategies.

### Artificial Intelligence & ML

- [HMM Segmenters](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing/word-embeddings/hmm-segmenters.md) — Uses word segmentation to reduce polyphone ambiguity by analyzing context for improved pinyin accuracy. ([source](https://pinyin.js.org/api/v3/index.html))
- [Phonetic Pronunciation Overrides](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/phonetic-pronunciation-overrides.md) — Uses specialized pinyin rules for Chinese surnames, ensuring accurate pronunciation in name contexts. ([source](https://cdn.jsdelivr.net/gh/hotoo/pinyin@master/README.md))
- [Surname-Aware Pronunciation Overrides](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/phonetic-pronunciation-overrides/surname-aware-pronunciation-overrides.md) — Applies specialized phonetic rules for Chinese surnames with non-standard pronunciations in name contexts.
