# plachtaa/seed-vc

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/plachtaa-seed-vc).**

3,590 stars · 443 forks · Python · GPL-3.0 · archived

## Links

- GitHub: https://github.com/Plachtaa/seed-vc
- awesome-repositories: https://awesome-repositories.com/repository/plachtaa-seed-vc.md

## Topics

`singing-voice-conversion` `voice-conversion`

## Description

seed-vc is a voice conversion and voice cloning system that transforms the timbre, accent, and emotion of speech recordings. It replicates a specific speaker's identity or singing style from a short reference audio sample.

The project includes a voice fine-tuning framework for training models on custom audio datasets to increase the accuracy of voice clones. It also features speech anonymization tools that remove unique speaker traits to produce a generic average voice for identity protection.
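The "generic average voice" idea can be illustrated with a minimal sketch. It assumes each speaker is represented by a fixed-length embedding vector and that anonymization conditions the converter on the mean of many such vectors. The function name `average_speaker_embedding` and the toy two-dimensional vectors are hypothetical and are not taken from the seed-vc codebase.

```python
from math import sqrt


def average_speaker_embedding(embeddings):
    """Return the L2-normalised element-wise mean of speaker embeddings.

    Hypothetical helper for illustration only, not seed-vc's API.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must share one dimension")
    # Element-wise mean across speakers blurs individual traits.
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    # Normalise so the result has unit length, like a typical embedding.
    norm = sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm > 0 else mean


# Two toy "speakers" with orthogonal embeddings (assumed data).
speakers = [[1.0, 0.0], [0.0, 1.0]]
generic = average_speaker_embedding(speakers)
```

In this sketch the averaged vector sits equally far from both toy speakers, which is the property an anonymized target profile is meant to have.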

Capabilities include zero-shot voice conversion, talking pace control, and modification of emotional delivery and accent. Both spoken and singing voice conversion are supported, so the style of a target recording can be transferred onto a source recording.
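One way to picture talking pace control is as a change in the number of feature frames before synthesis. The sketch below assumes speech is represented as a list of per-frame feature vectors and stretches or compresses that list by linear interpolation. The function `adjust_length` and its `length_adjust` parameter are hypothetical names used for illustration, and the approach is an assumption rather than a description of how seed-vc implements the feature.

```python
def adjust_length(frames, length_adjust):
    """Resample a sequence of feature frames to round(len * length_adjust).

    Values above 1.0 slow the pace (more frames); values below 1.0 speed it up.
    Hypothetical helper for illustration only.
    """
    if length_adjust <= 0:
        raise ValueError("length_adjust must be positive")
    n = len(frames)
    if n == 0:
        return []
    target = max(1, round(n * length_adjust))
    if n == 1 or target == 1:
        # Nothing to interpolate between; repeat the first frame.
        return [list(frames[0]) for _ in range(target)]
    out = []
    for j in range(target):
        # Map output index j onto a fractional position in the input.
        pos = j * (n - 1) / (target - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


# Three one-dimensional toy frames, slowed to twice the duration.
frames = [[0.0], [1.0], [2.0]]
slowed = adjust_length(frames, 2.0)
```

Interpolating frames rather than raw samples changes duration without resampling the waveform itself, which is why this kind of adjustment does not shift pitch the way naive audio resampling would.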

## Tags

### Artificial Intelligence & ML

- [Zero-Shot Voice Cloning](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/speech-synthesis/zero-shot-voice-cloning.md) — Transforms source speech into a target speaker identity from short samples without requiring model retraining.
- [Custom Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/custom-model-training.md) — Fine-tunes machine learning models on specialized audio datasets to increase the likeness of cloned speakers.
- [Fine-Tuning Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/fine-tuning-frameworks.md) — Provides a framework for training models on custom audio datasets to improve the accuracy of voice clones.
- [Voice Model Trainers](https://awesome-repositories.com/f/artificial-intelligence-ml/language-model-trainers/voice-model-trainers.md) — Trains and fine-tunes voice models on custom audio datasets to increase speaker similarity. ([source](https://cdn.jsdelivr.net/gh/plachtaa/seed-vc@main/README.md))
- [Voice Cloning](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning.md) — Creates high-fidelity digital replicas of specific people's voices using short audio samples.
- [Voice Identity Conversions](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning/voice-identity-conversions.md) — Clones speaker voice timbre using reference audio samples to transform source recordings. ([source](https://plachtaa.github.io/seed-vc/))
- [Neural Pace Control](https://awesome-repositories.com/f/artificial-intelligence-ml/on-the-fly-training-transformations/audio-resampling-transformations/neural-pace-control.md) — Adjusts the temporal duration of speech waveforms to change talking pace while preserving audio quality.
- [Emotional Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-synthesis/emotional-synthesis.md) — Utilizes latent-space embeddings to incorporate controllable emotional states and accents into synthesized speech.
- [Speech Style Transfer](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/speech-to-speech-models/speech-style-transfer.md) — Changes the accent, emotion, and delivery of recordings while preserving or altering the original voice.
- [Vocal Identity Anonymization](https://awesome-repositories.com/f/artificial-intelligence-ml/vocal-identity-anonymization.md) — Implements averaging-based identity anonymization to protect speaker privacy by producing a generic vocal profile.
- [Emotion and Accent Transformation](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning/voice-identity-conversions/emotion-and-accent-transformation.md) — Modifies the accent and emotional delivery of source recordings while maintaining or altering speaker timbre. ([source](https://cdn.jsdelivr.net/gh/plachtaa/seed-vc@main/README.md))

### Part of an Awesome List

- [Model Fine-Tuning](https://awesome-repositories.com/f/awesome-lists/ai/model-training-and-fine-tuning/model-fine-tuning.md) — Provides a framework for optimizing pre-trained voice models on specific audio datasets to improve cloning accuracy.

### Graphics & Multimedia

- [Singing Style Transfer](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-synthesis/singing-voice-synthesis/singing-style-transfer.md) — Clones target singers' voices and singing styles using short reference samples to transform source recordings. ([source](https://cdn.jsdelivr.net/gh/plachtaa/seed-vc@main/README.md))
- [Vocal Timbre Extraction](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/media-manipulation/media-processing-workflows/audio-analysis-synthesis/audio-feature-extraction/audio-track-extraction/vocal-timbre-extraction.md) — Extracts vocal characteristics from short reference samples to reshape the spectral envelope of source recordings.

### Security & Cryptography

- [Audio Identity Anonymization](https://awesome-repositories.com/f/security-cryptography/audio-identity-anonymization.md) — Removes identifying vocal characteristics from recordings to produce a generic voice for identity protection.
- [Speech Anonymization](https://awesome-repositories.com/f/security-cryptography/speech-anonymization.md) — Provides tools that remove unique speaker traits from recordings to produce a generic average voice for identity protection. ([source](https://cdn.jsdelivr.net/gh/plachtaa/seed-vc@main/README.md))
- [Speech Anonymization Tools](https://awesome-repositories.com/f/security-cryptography/speech-anonymization-tools.md) — Removes individual speaker traits from audio recordings to produce a generic average voice.
