Whishper is a graphical user interface for transcribing audio and video files into text using the Whisper model. It works as both a speech-to-text tool and a subtitle generator, converting spoken content into editable text and timed subtitle files.
The project features an integrated transcription and translation interface, allowing users to refine automated results and translate the transcribed text into other languages. It includes a visual editor for correcting speech recognition errors, adjusting segment timecodes, and performing bilingual translation reviews.
The system handles the full transcription workflow, from retrieving media via remote URLs to exporting final data. Supported export formats include SRT, VTT, JSON, and plain text.