# myshell-ai/melotts

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/myshell-ai-melotts).**

7,509 stars · 1,049 forks · Python · MIT

## Links

- GitHub: https://github.com/myshell-ai/MeloTTS
- awesome-repositories: https://awesome-repositories.com/repository/myshell-ai-melotts.md

## Topics

`chinese` `english` `french` `japanese` `korean` `multilingual` `spanish` `text-to-speech` `tts`

## Description

MeloTTS is an open-source text-to-speech library that generates natural-sounding speech across six languages and can mix Chinese and English within a single utterance. Its architecture combines a token-based text frontend with a language-agnostic acoustic model, enabling it to handle bilingual code-switching and produce streaming audio output in real time.

The system is designed to run efficiently on standard CPU hardware without requiring a dedicated GPU, using a lightweight neural network for real-time inference. It supports English, Spanish, French, Chinese, Japanese, and Korean, and can process mixed-language input such as Chinese and English within the same sentence by switching between language-specific acoustic models.

The library provides a freely available toolkit for developers to integrate speech synthesis into applications, with phoneme mapping that preserves language identity and prosodic boundaries across all supported languages.

## Tags

### Artificial Intelligence & ML

- [TTS Engine Optimizations](https://awesome-repositories.com/f/artificial-intelligence-ml/cpu-optimizations/tts-engine-optimizations.md) — A speech synthesis engine designed to run real-time inference on standard CPU hardware without requiring a dedicated GPU.
- [Bilingual Speech Synthesizers](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/speech-synthesis/bilingual-speech-synthesizers.md) — Handles text containing both Chinese and English within a single utterance for bilingual speech generation. ([source](https://cdn.jsdelivr.net/gh/myshell-ai/melotts@main/README.md))
- [CPU Inference Runtimes](https://awesome-repositories.com/f/artificial-intelligence-ml/inference-clients/on-device-inference/cpu-inference-runtimes.md) — Runs a lightweight neural network model designed for real-time speech synthesis on standard CPU hardware.
- [Speech Inference Runtimes](https://awesome-repositories.com/f/artificial-intelligence-ml/inference-clients/on-device-inference/cpu-inference-runtimes/speech-inference-runtimes.md) — Runs real-time text-to-speech synthesis on standard CPU hardware without requiring a dedicated GPU.
- [Shared Acoustic Backbones](https://awesome-repositories.com/f/artificial-intelligence-ml/language-agnostic-training-pipelines/shared-acoustic-backbones.md) — Uses a shared neural backbone trained on multilingual data to generate speech features independent of input language.
- [Prosody-Preserving Tokenizers](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing/text-tokenization/prosody-preserving-tokenizers.md) — Converts raw text into a tokenised representation that preserves language identity and prosodic boundaries for synthesis.
- [CPU-Based Synthesizers](https://awesome-repositories.com/f/artificial-intelligence-ml/real-time-speech-processing/real-time-speech-synthesis/cpu-based-synthesizers.md) — Runs real-time text-to-speech inference on standard CPU hardware without needing a dedicated GPU. ([source](https://cdn.jsdelivr.net/gh/myshell-ai/melotts@main/README.md))
- [Multilingual Text-to-Speech Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-and-text-conversion/text-to-speech-conversions/multilingual-text-to-speech-engines.md) — Converts written text into natural-sounding speech across multiple languages including English, Spanish, French, Chinese, Japanese, and Korean. ([source](https://cdn.jsdelivr.net/gh/myshell-ai/melotts@main/README.md))

### Part of an Awesome List

- [Bilingual Code-Switching](https://awesome-repositories.com/f/awesome-lists/devtools/switches/bilingual-code-switching.md) — Detects and processes mixed-language text within a single utterance by switching between language-specific acoustic models.
- [AI & Machine Learning](https://awesome-repositories.com/f/awesome-lists/ai/ai-machine-learning.md) — Multi-lingual text-to-speech library.

### Development Tools & Productivity

- [Open Source Toolkits](https://awesome-repositories.com/f/development-tools-productivity/open-source-toolkits.md) — A freely available toolkit for developers to integrate high-quality speech synthesis into applications with multi-language support.

### Software Engineering & Architecture

- [Grapheme-to-Phoneme Pipelines](https://awesome-repositories.com/f/software-engineering-architecture/infrastructure-configuration-languages/multi-language-support/multi-language-pipeline-orchestration/grapheme-to-phoneme-pipelines.md) — Maps input text to language-specific phoneme sequences using a unified phonetic representation across six languages.

### Graphics & Multimedia

- [Streaming Audio Generators](https://awesome-repositories.com/f/graphics-multimedia/audio-music/audio-streaming-engines/audio-playback-engines/chunked-audio-streaming/generative-audio-chunking/streaming-audio-generators.md) — Produces audio output in small chunks during inference to minimise latency and enable real-time playback.
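
The chunked streaming idea in the last tag can be sketched as a generator that synthesizes one sentence at a time and yields small blocks of samples as soon as they exist. This is a simplified illustration under stated assumptions, not MeloTTS's implementation: `stream_audio`, `fake_synthesize`, and the `chunk_size` parameter are hypothetical names, and the stand-in synthesizer only returns silence.

```python
import re
from typing import Callable, Iterator, List


def stream_audio(
    text: str,
    synthesize: Callable[[str], List[float]],
    chunk_size: int = 4,
) -> Iterator[List[float]]:
    """Yield audio samples in small chunks, one sentence at a time."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    for sentence in sentences:
        samples = synthesize(sentence)
        # Emit the sentence's samples in fixed-size chunks so playback
        # can start before later sentences are synthesized.
        for start in range(0, len(samples), chunk_size):
            yield samples[start:start + chunk_size]


def fake_synthesize(sentence: str) -> List[float]:
    """Stand-in for a real model: one silent sample per input character."""
    return [0.0] * len(sentence)


if __name__ == "__main__":
    for chunk in stream_audio("Hi. Bye now.", fake_synthesize):
        print(len(chunk))
```

Because the generator is lazy, a consumer can begin playing the first chunk while later sentences have not yet been processed, which is where the latency benefit comes from.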