awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Speech Processing · Awesome GitHub Repositories

Tools and libraries for converting, analyzing, and interpreting human speech through computational methods.

Explore 6 awesome GitHub repositories in Artificial Intelligence & ML · Speech Processing.

Awesome Speech Processing GitHub Repositories

  • Significant-Gravitas/AutoGPT

    181,891 · GitHub

    AutoGPT is an orchestration platform designed for building, managing, and deploying autonomous agents. It provides a visual canvas-based environment where users can assemble agents by connecting modular blocks that represent actions, data flows, and conditional logic. The platform supports the entire agent lifecycle, i

    Python · ai · artificial-intelligence · autonomous-agents
  • huggingface/transformers

    156,730 · GitHub

    Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering

    Python · audio · deep-learning · deepseek
  • pytorch/pytorch

    97,601 · GitHub

    PyTorch is a machine learning framework centered on a GPU-ready tensor library that supports multi-dimensional array operations across both CPU and accelerator hardware. It provides a foundational infrastructure for mathematical computation and dynamic neural network construction, utilizing a tape-based automatic diffe

    Python · autograd · deep-learning · gpu
  • openai/whisper

    94,839 · GitHub

    This project is a speech recognition and translation engine that uses a sequence-to-sequence transformer architecture to convert audio into text. It is built on a weakly supervised learning framework, which leverages large-scale, weakly labelled audio-transcript data to create generalized speech representations capabl

    Python
  • CorentinJ/Real-Time-Voice-Cloning

    59,355 · GitHub

    This project is a neural text-to-speech engine and voice cloning toolkit designed to generate synthetic speech that mimics the vocal characteristics of a target speaker. It functions as a real-time audio synthesizer, utilizing a deep learning pipeline to convert written text into high-fidelity speech output with minima

    Python · deep-learning · python · pytorch
  • RVC-Boss/GPT-SoVITS

    55,111 · GitHub

    GPT-SoVITS is a text-to-speech synthesis engine and voice cloning toolkit designed for generating natural-sounding human speech. It functions as a neural audio processing pipeline that maps input text to high-fidelity audio waveforms, utilizing conditional variational autoencoders and flow-based decoders to ensure expr

    Python · text-to-speech · tts · vits

Explore sub-tags

  • Automatic Speech Recognition: Systems and pre-trained models that convert spoken audio recordings into text using large-scale speech recognition technology.
  • Multilingual Speech Translation: Capabilities for converting spoken audio from one language into text in another language.
  • Self-Supervised Speech Representations: Models that learn linguistic features from raw audio without explicit labels.
  • Sequence-to-Sequence Tasks (1 sub-tag): Models that transform one sequence of data, such as text or audio, into another sequence.
  • Speaker Embeddings: Models that map audio clips into fixed-dimensional vectors representing unique vocal characteristics.
  • Speech Datasets (2 sub-tags): Collections of audio recordings and transcriptions used to train and evaluate speech-based machine learning models.
  • Speech Recognition APIs: Programmatic interfaces for integrating speech-to-text capabilities into software applications.
  • Speech Recognition Libraries: Software libraries providing programmatic interfaces for converting spoken audio into text.
  • Speech Recognition Systems: Models and tools that convert spoken audio into written text or perform cross-lingual speech translation.
  • Speech Translation Systems: Models and tools that convert spoken audio in one language into text in another language.
  • Voice Synthesis (1 sub-tag): Services and technologies that convert text input into natural-sounding human speech.
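
As a rough illustration of the Speaker Embeddings sub-tag, the sketch below compares fixed-dimensional vectors with cosine similarity, a common way to judge whether two clips share a speaker. Everything here is an assumption for illustration: the 4-dimensional vectors are made up (real models emit far larger embeddings), and the helper names and the 0.75 threshold are hypothetical, not taken from any repository listed on this page.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a, b, threshold=0.75):
    """Hypothetical decision rule: treat clips as the same speaker
    when their embeddings are at least `threshold` similar."""
    return cosine_similarity(a, b) >= threshold


# Made-up embeddings standing in for the output of an embedding model.
clip_a = [0.9, 0.1, 0.3, 0.2]
clip_b = [0.8, 0.2, 0.4, 0.1]   # points in nearly the same direction as clip_a
clip_c = [-0.5, 0.9, -0.2, 0.7]  # points in a very different direction

print(same_speaker(clip_a, clip_b))
print(same_speaker(clip_a, clip_c))
```

In practice the threshold would be tuned on labelled pairs for the specific embedding model in use.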