# jianchang512/clone-voice

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/jianchang512-clone-voice).**

8,959 stars · 985 forks · Python · License: NOASSERTION · archived

## Links

- GitHub: https://github.com/jianchang512/clone-voice
- Homepage: https://pyvideotrans.com
- awesome-repositories: https://awesome-repositories.com/repository/jianchang512-clone-voice.md

## Topics

`clonevoice` `speech-analysis` `sts` `tts` `voice-assistant`

## Description

This project is a GPU-accelerated speech engine and AI voice cloning tool. It functions as a text-to-speech synthesizer and voice-to-voice converter that replicates specific human voices to generate synthetic speech.

The system creates digital voice profiles by analyzing short audio samples or capturing live microphone input. These profiles enable the transformation of existing audio recordings into a target speaker's voice or the synthesis of new audio from written text.

The engine supports subtitle-based speech generation for batch processing and automated dubbing workflows. A web-based audio interface provides a dashboard for recording voice samples and managing synthesis tasks.
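To illustrate what subtitle-based generation involves, here is a minimal sketch of the first step: turning an SRT file into timed text cues that a synthesis step could consume one at a time. This is not the project's code. The `Cue` class and the `parse_srt` function are hypothetical names chosen for the example, and it assumes standard SRT input (a numeric index, a `start --> end` timing line, then one or more text lines per block).

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One subtitle entry (hypothetical structure, for illustration only)."""
    index: int
    start: float  # seconds from the beginning of the track
    end: float    # seconds from the beginning of the track
    text: str


# Matches SRT timestamps such as 00:01:02,345 (a dot is tolerated too).
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    """Convert an HH:MM:SS,mmm timestamp to seconds."""
    match = _TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"Unrecognized timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues, skipping blocks that are malformed."""
    cues: List[Cue] = []
    normalized = content.replace("\r\n", "\n").strip()
    # SRT blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 3 or "-->" not in lines[1] or not lines[0].isdigit():
            continue
        start, end = lines[1].split("-->", 1)
        cues.append(
            Cue(
                index=int(lines[0]),
                start=_to_seconds(start),
                end=_to_seconds(end),
                # Multi-line subtitles are joined into a single utterance.
                text=" ".join(lines[2:]),
            )
        )
    return cues
```

In a dubbing workflow, each cue's `text` would be passed to the speech engine and the resulting clip placed at `start` on the output track. How the project itself handles clips that run longer than the cue window is not described here.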

## Tags

### Artificial Intelligence & ML

- [Voice Cloning Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/speech-synthesis/voice-cloning-tools.md) — Provides machine learning pipelines that generate high-quality synthetic speech from custom audio recordings.
- [Voice Profiling](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-systems-frameworks/conversational-voice-interaction/voice-agents/voice-profiling.md) — Extracts unique vocal characteristics from live microphone input to build personalized voice models.
- [Neural Text-to-Speech Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/speech-synthesis/neural-text-to-speech-engines.md) — Implements deep learning pipelines that generate synthetic speech by modeling cloned vocal characteristics.
- [GPU Acceleration](https://awesome-repositories.com/f/artificial-intelligence-ml/gpu-acceleration.md) — Uses GPU hardware acceleration to optimize the processing speed of voice cloning and synthesis models. ([source](https://github.com/jianchang512/clone-voice/blob/main/README.md))
- [Text-to-Speech Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech-synthesis.md) — Converts written text and subtitle files into spoken audio using artificial intelligence and cloned voices. ([source](https://github.com/jianchang512/clone-voice))
- [Voice Cloning](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning.md) — Replicates specific human vocal characteristics from audio samples to transform existing recordings. ([source](https://github.com/jianchang512/clone-voice/blob/main/README.md))
- [Microphone Sampling](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning/microphone-sampling.md) — Captures audio directly from a microphone to establish voice profiles for cloning and synthesis. ([source](https://github.com/jianchang512/clone-voice/blob/main/README.md))
- [Voice Identity Conversions](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning/voice-identity-conversions.md) — Transforms the vocal characteristics of a source audio signal to match a target speaker's identity.
- [Voice Profile Management](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-profile-management.md) — Registers and stores vocal characteristics to ensure consistency in synthetic speech generation. ([source](https://github.com/jianchang512/clone-voice))
- [Subtitle-Driven Dubbing](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-video-generators/ai-video-dubbing-tools/subtitle-driven-dubbing.md) — Utilizes external subtitle files as the primary driver for generating synchronized synthetic voiceovers.
- [Subtitle-Driven Audio Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/multilingual-transcription/subtitle-driven-audio-synthesis.md) — Generates dubbed audio tracks based on the timing and text of external subtitle files.
- [Speech Synthesis Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/phonetic-text-processors/speech-synthesis-generators.md) — Generates audible cloned speech based on the text and timing found in subtitle files. ([source](https://github.com/jianchang512/clone-voice))

### Part of an Awesome List

- [Voice Embedding Precomputations](https://awesome-repositories.com/f/awesome-lists/media/voice-processing/voice-embedding-precomputations.md) — Analyzes short audio samples to extract and store reusable vocal embeddings for voice cloning.

### Graphics & Multimedia

- [GPU-Accelerated TTS](https://awesome-repositories.com/f/graphics-multimedia/audio-music/speech-synthesis-tts/gpu-accelerated-tts.md) — Provides a speech synthesis engine specifically optimized for GPU execution to increase audio generation speed.
- [Text-to-Speech Synthesizers](https://awesome-repositories.com/f/graphics-multimedia/text-to-speech-synthesizers.md) — Converts written text or subtitle files into synthetic spoken audio using cloned voice profiles.
