DeepSpeech is an open-source speech-to-text framework and machine learning engine designed to convert spoken audio into written text locally on a device. It provides on-device speech recognition that operates without requiring an internet connection to external servers.
The system supports real-time speech transcription across a variety of hardware platforms, ranging from single-board computers and edge devices to GPU servers. This allows for audio analysis and processing directly on the local hardware.