What are the best open-source alternatives to Pocketsphinx?

30 open-source projects similar to cmusphinx/pocketsphinx, ranked by shared features. Top picks: wenet-e2e/wenet, argmaxinc/whisperkit, ufal/whisper_streaming, elevenlabs/elevenlabs-python, pluja/whishper, thewh1teagle/vibe, sevask/ecoute, opennmt/ctranslate2, leetcode-mafia/cheetah, davabase/whisper_real_time.

Is wenet-e2e/wenet a good alternative to Pocketsphinx?

WeNet is an end-to-end automatic speech recognition (ASR) toolkit designed for both Chinese and English, built around transformer-based models. It supports streaming and non-streaming inference out of the box, and is structured to be production-ready, with model export and deployment paths for serv…

Is argmaxinc/whisperkit a good alternative to Pocketsphinx?

argmaxinc/WhisperKit is an open-source alternative to Pocketsphinx written in Swift, with topics covering inference on iOS and macOS.

Is ufal/whisper_streaming a good alternative to Pocketsphinx?

Whisper streaming is an automatic speech recognition engine designed to convert live audio into text. It functions as a network-based transcription server that accepts raw audio data from remote clients and returns incremental text results in real time. The system distinguishes itself through its…

Is elevenlabs/elevenlabs-python a good alternative to Pocketsphinx?

This Python SDK provides a comprehensive toolkit for synthetic audio generation, voice cloning, and the development of conversational AI agents. It enables the creation of lifelike spoken audio from text, the replication of human voices through custom cloning, and the deployment of real-time voice…

Is pluja/whishper a good alternative to Pocketsphinx?

Whishper is a graphical user interface for transcribing audio and video files into text using the Whisper model. It serves as a speech-to-text tool and subtitle file generator that converts spoken content into editable text and timed subtitle formats. The project features an integrated transcripti…

Is thewh1teagle/vibe a good alternative to Pocketsphinx?

Vibe is a cross-platform transcription tool that converts spoken audio into text by running Whisper neural models directly on your device, with no cloud dependency. It can transcribe audio from files, microphones, system output, and network streams, and supports both batch processing of multiple fi…

Is sevask/ecoute a good alternative to Pocketsphinx?

Ecoute is a live transcription tool that shows real-time transcripts of both the user's microphone input (You) and the user's speaker output (Speaker) in a textbox.

Is opennmt/ctranslate2 a good alternative to Pocketsphinx?

CTranslate2 is a C++ inference engine and runtime for Transformer models, designed to execute models on both CPU and GPU with optimizations for speed and memory efficiency. It functions as a model format converter, quantization tool, and REST API server, enabling deployment of neural machine transl…

Is leetcode-mafia/cheetah a good alternative to Pocketsphinx?

Cheetah is an LLM technical interview assistant composed of a native macOS application and a browser extension. It provides real-time coding and answering suggestions during technical interviews by combining live audio transcription with web-based context extraction. The system functions as a real…

Is davabase/whisper_real_time a good alternative to Pocketsphinx?

Whisper Real-Time is a speech-to-text engine designed to convert continuous microphone input into written transcripts. It functions as a real-time audio processor that leverages the OpenAI Whisper model to generate immediate textual output from live spoken language. The system utilizes a transform…


Open-source alternatives to Pocketsphinx

30 open-source projects similar to cmusphinx/pocketsphinx, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Pocketsphinx alternative.

wenet-e2e/wenet
Stars: 5,035
WeNet is an end-to-end automatic speech recognition (ASR) toolkit designed for both Chinese and English, built around transformer-based models. It supports streaming and non-streaming inference out of the box, and is structured to be production-ready, with model export and deployment paths for servers and mobile devices. The toolkit distinguishes itself through a chunk-based streaming transformer architecture that processes audio in fixed-size segments for low latency while preserving context across chunks. It jointly trains models with both CTC and attention loss to combine alignment accurac…
Language: Python
Topics: asr, automatic-speech-recognition, conformer
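
The two ideas in the WeNet description above, fixed-size chunks that carry context forward and a loss that mixes CTC with attention, can be illustrated with a small sketch. This is not WeNet's code or API: `iter_chunks`, `joint_loss`, `chunk_size`, `left_context`, and `ctc_weight` are hypothetical names chosen for illustration, and the frames are plain numbers rather than acoustic features.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_chunks(
    frames: Sequence[float], chunk_size: int, left_context: int
) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield (context, chunk) pairs over a sequence of frames.

    Each chunk holds at most chunk_size frames; context holds up to
    left_context frames that immediately precede the chunk, so a model
    could look back without waiting for future audio.
    Parameter names are assumptions for this sketch, not WeNet settings.
    """
    if chunk_size <= 0 or left_context < 0:
        raise ValueError("chunk_size must be > 0 and left_context >= 0")
    for start in range(0, len(frames), chunk_size):
        context = list(frames[max(0, start - left_context):start])
        chunk = list(frames[start:start + chunk_size])
        yield context, chunk


def joint_loss(ctc_loss: float, attention_loss: float, ctc_weight: float) -> float:
    """Linearly interpolate a CTC loss and an attention loss."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


if __name__ == "__main__":
    for context, chunk in iter_chunks(list(range(10)), chunk_size=4, left_context=2):
        print("context:", context, "chunk:", chunk)
    print("joint loss:", joint_loss(2.0, 4.0, ctc_weight=0.5))
```

Smaller chunks lower latency but give the model less audio per step, which is why a sketch like this exposes the chunk size and the amount of carried context as separate knobs.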

argmaxinc/WhisperKit
Stars: 5,639
Language: Swift
Topics: inference, ios, macos

ufal/whisper_streaming
Stars: 3,642
Whisper streaming is an automatic speech recognition engine designed to convert live audio into text. It functions as a network-based transcription server that accepts raw audio data from remote clients and returns incremental text results in real time. The system distinguishes itself through its ability to process audio streams incrementally, allowing for immediate transcription and translation as speech is captured. It incorporates voice activity detection to isolate human speech from background noise and utilizes sliding-window buffering to manage incoming audio segments, ensuring that pro…
Language: Python
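
The combination described above, voice activity detection in front of a sliding-window buffer, can be sketched as follows. This is an illustration only and not code from whisper_streaming: the energy threshold is a crude stand-in for a real voice activity detector, and `is_speech`, `SlidingAudioBuffer`, `max_samples`, and `threshold` are hypothetical names.

```python
from collections import deque
from typing import Deque, List, Sequence


def is_speech(samples: Sequence[float], threshold: float = 0.01) -> bool:
    """Crude voice activity check: mean squared amplitude above a threshold.

    Assumption for this sketch; real systems use trained VAD models.
    """
    if not samples:
        return False
    energy = sum(s * s for s in samples) / len(samples)
    return energy > threshold


class SlidingAudioBuffer:
    """Keep only the most recent max_samples samples judged to be speech."""

    def __init__(self, max_samples: int) -> None:
        if max_samples <= 0:
            raise ValueError("max_samples must be positive")
        # A deque with maxlen drops the oldest samples automatically.
        self._samples: Deque[float] = deque(maxlen=max_samples)

    def push(self, segment: Sequence[float], threshold: float = 0.01) -> bool:
        """Append a segment if it looks like speech; return whether it was kept."""
        if not is_speech(segment, threshold):
            return False
        self._samples.extend(segment)
        return True

    def window(self) -> List[float]:
        """Return the current window, oldest sample first."""
        return list(self._samples)


if __name__ == "__main__":
    buf = SlidingAudioBuffer(max_samples=4)
    print(buf.push([0.0, 0.0]))       # silence is skipped
    print(buf.push([0.5, 0.5, 0.5]))  # speech is kept
    print(buf.push([1.0, 1.0]))       # oldest sample falls out of the window
    print(buf.window())
```

A transcriber would repeatedly run recognition over `window()` as new segments arrive, which is what makes incremental results possible.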

elevenlabs/elevenlabs-python
Stars: 2,873
This Python SDK provides a comprehensive toolkit for synthetic audio generation, voice cloning, and the development of conversational AI agents. It enables the creation of lifelike spoken audio from text, the replication of human voices through custom cloning, and the deployment of real-time voice agents capable of interacting with external large language models. The library distinguishes itself through deep integration of conversational AI capabilities, including the design of agent personas and the execution of real-time actions via APIs. It supports professional-grade audio production thro…
Language: Python
Topics: artificial-intelligence, conversational-ai, text-to-speech

All 30 open-source alternatives to Pocketsphinx

wenet-e2e/wenet

argmaxinc/WhisperKit

ufal/whisper_streaming

elevenlabs/elevenlabs-python

pluja/whishper

thewh1teagle/vibe

SevaSk/ecoute

OpenNMT/CTranslate2

leetcode-mafia/cheetah

davabase/whisper_real_time

jianchang512/stt

SamurAIGPT/AI-Youtube-Shorts-Generator

Kedreamix/Linly-Dubbing

facebookresearch/seamless_communication

GetStream/Vision-Agents

espnet/espnet

huggingface/speech-to-speech

BasedHardware/omi

vocodedev/vocode-core

ibttf/interview-coder

nl8590687/ASRT_SpeechRecognition

snakers4/silero-models

Blaizzy/mlx-audio

collabora/WhisperLive

steipete/summarize

RunanywhereAI/runanywhere-sdks

MahmoudAshraf97/whisper-diarization

snakers4/silero-vad

Const-me/Whisper

ggerganov/whisper.cpp