Whisper | Awesome Repository

Whisper is a high-performance speech-to-text inference engine that uses GPU shaders to accelerate the transcription of spoken audio into written text. It is an automatic speech recognition framework designed specifically to run Whisper models.

The system targets high-speed processing of both recorded audio files and live microphone streams. It uses voice activity detection to analyze raw audio in real time and triggers the inference engine only when human speech is detected.
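The gating idea described above can be sketched in a few lines of Python. This is a minimal illustration only, not the project's implementation: the source does not say which voice activity detection algorithm is used, so a simple RMS energy threshold stands in for it here. The names `frame_rms`, `gate_speech`, `threshold`, and `hangover`, as well as the `transcribe` callback, are hypothetical.

```python
import math
from typing import Callable, Iterable, List, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate_speech(
    frames: Iterable[Sequence[float]],
    transcribe: Callable[[List[float]], object],
    threshold: float = 0.02,  # assumed energy level that counts as speech
    hangover: int = 2,        # assumed number of silent frames that ends a segment
) -> list:
    """Call `transcribe` once per detected speech segment, never on pure silence.

    Loud frames are buffered into the current segment. Silent frames are
    dropped; once `hangover` of them arrive in a row, the buffered segment
    is handed to `transcribe` and a new segment starts.
    """
    results = []
    segment: List[float] = []
    silent_run = 0
    for frame in frames:
        if frame_rms(frame) >= threshold:
            segment.extend(frame)
            silent_run = 0
        elif segment:
            silent_run += 1
            if silent_run >= hangover:
                results.append(transcribe(segment))
                segment = []
                silent_run = 0
    if segment:
        # Flush speech that was still buffered when the stream ended.
        results.append(transcribe(segment))
    return results
```

In a real pipeline `transcribe` would hand the buffered samples to the GPU inference engine. A production detector would likely use spectral features or a trained model rather than a fixed energy threshold, but the control flow (detect speech, buffer it, run inference only on the buffered segment) stays the same.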

The engine's capabilities include real-time audio capture, GPGPU inference optimization, and compute performance profiling that measures the execution time of individual shaders.

Features

Automatic Speech Recognition - Implements a high-performance automatic speech recognition system using OpenAI Whisper to transcribe audio in multiple languages.
Audio Transcription - Converts recorded audio files into text transcripts using a GPU-accelerated speech recognition model.
Real-Time Transcription - Converts live microphone audio streams into text transcripts as the audio is captured.
GPU-Accelerated Inference - Uses the parallel processing power of GPUs to accelerate the inference phase of speech recognition.
