awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Speech Synthesis · Awesome GitHub Repositories

2 repos

Models and engines that convert text into natural-sounding human speech using advanced acoustic and alignment techniques.

Explore 2 awesome GitHub repositories matching Artificial Intelligence & ML · Speech Synthesis. Refine with filters or upvote what's useful.

Awesome Speech Synthesis GitHub Repositories

  • CorentinJ/Real-Time-Voice-Cloning

    59,355 · GitHub

    This project is a neural text-to-speech engine and voice cloning toolkit designed to generate synthetic speech that mimics the vocal characteristics of a target speaker. It functions as a real-time audio synthesizer, utilizing a deep learning pipeline to convert written text into high-fidelity speech output with minima

    Python · deep-learning, python, pytorch
  • RVC-Boss/GPT-SoVITS

    55,111 · GitHub

    GPT-SoVITS is a text-to-speech synthesis engine and voice cloning toolkit designed for generating natural-sounding human speech. It functions as a neural audio processing pipeline that maps input text to high-fidelity audio waveforms, utilizing conditional variational autoencoders and flow-based decoders to ensure expr

    Python · text-to-speech, tts, vits

Explore sub-tags

  • Acoustic Models: Neural network architectures that convert linguistic representations into audio features like mel-spectrograms.
  • Autoregressive Sequence Generators: Models that predict sequential audio frames based on previous outputs to form continuous acoustic representations.
  • Cross-Lingual Speech Generators: Generative models capable of producing speech in multiple languages while maintaining specific speaker identity characteristics.
  • Cross-Modal Alignment Models: Mechanisms that map linguistic features to speaker-specific voice embeddings.
  • Neural Text-to-Speech Engines: Deep learning pipelines that generate synthetic speech by modeling vocal characteristics.
  • Real-Time Voice Cloning: Systems capable of replicating vocal identities from short samples with low latency.
  • Voice Cloning Tools: Machine learning pipelines that generate high-quality synthetic speech by processing custom audio recordings or pre-trained voice models.