# tmoroney/auto-subs

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/tmoroney-auto-subs).**

2,851 stars · 160 forks · TypeScript · MIT

## Links

- GitHub: https://github.com/tmoroney/auto-subs
- Homepage: https://tom-moroney.com/auto-subs/
- awesome-repositories: https://awesome-repositories.com/repository/tmoroney-auto-subs.md

## Topics

`ai` `davinci` `davinci-resolve` `diarize` `linux` `mac` `openai` `pyannote` `resolve` `speaker` `speech-to-text` `subtitles` `subtitles-generator` `transcribe` `whisper` `windows`

## Description

Auto-subs is an AI transcription and captioning tool that converts speech in video files into synchronized subtitles. It works as both a subtitle generator and a transcription bridge, with automatic speaker identification and multi-language translation.

The software protects privacy by running AI inference on-device, so audio and video files are processed locally on the user's hardware. It also integrates with professional video editing workflows: users can export timing and transcription data directly into external editing software for precise caption alignment.

It supports speaker diarization, visual style customization for captions, and burning stylized overlays directly into video frames. Output options include standard SRT files and video exports with the subtitles rendered in.

Translation is available for both the spoken audio and the generated transcript text, across a wide range of languages.
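To make the SRT export concrete, here is a minimal sketch of how timed transcript segments can be serialized into SRT text. It is not taken from the auto-subs codebase: the `Cue` shape and the `formatTimestamp` and `toSrt` helpers are hypothetical names chosen for illustration, and prefixing the speaker label to the cue text is an assumption about how diarization output might be shown.

```typescript
// Hypothetical cue shape for illustration; not the project's actual data model.
interface Cue {
  start: number;    // start time in seconds
  end: number;      // end time in seconds
  text: string;     // caption text
  speaker?: string; // optional speaker label from diarization
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Convert seconds to the SRT timestamp form HH:MM:SS,mmm.
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Serialize cues as SRT: a 1-based index, a time range line, the text,
// and a blank line between consecutive cues.
function toSrt(cues: Cue[]): string {
  return cues
    .map((cue, i) => {
      const label = cue.speaker ? `${cue.speaker}: ` : "";
      const range = `${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}`;
      return `${i + 1}\n${range}\n${label}${cue.text}\n`;
    })
    .join("\n");
}
```

Calling `toSrt` on an array of cues returns a single string that can be written to a `.srt` file. Timestamps use a comma before the milliseconds, which is the SRT convention.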

## Tags

### Artificial Intelligence & ML

- [On-Device Inference](https://awesome-repositories.com/f/artificial-intelligence-ml/inference-clients/on-device-inference.md) — Runs transcription models locally on the user's hardware to process audio and video files privately.
- [Audio and Video File Transcription](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-and-video-file-transcription.md) — Extracts speech from media files offline using on-device processing to produce subtitles and timestamps. ([source](https://cdn.jsdelivr.net/gh/tmoroney/auto-subs@main/README.md))
- [Subtitle Translation](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/multilingual-transcription/subtitle-translation.md) — Translates spoken audio or existing transcriptions into different languages to reach global audiences.
- [Audio Transcriptions](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcriptions.md) — Converts spoken audio from video files into text with automatic speaker identification and translation support.
- [Local Speech-to-Text](https://awesome-repositories.com/f/artificial-intelligence-ml/local-speech-to-text.md) — Generates text from audio files using on-device processing to ensure sensitive data never leaves the machine.
- [Neural Machine Translation](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-machine-translation.md) — Uses pre-trained language models to convert transcribed text or spoken audio between different natural languages.
- [Speaker Diarization](https://awesome-repositories.com/f/artificial-intelligence-ml/speaker-diarization.md) — Implements speaker diarization to identify and separate different voices within a recording.
- [Automated Video Subtitling](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-transcription/automated-video-subtitling.md) — Combines AI transcription and translation to generate accurate time-stamped subtitles from video files.
- [Interactive Timing Adjustments](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/transcription-timing-synchronizers/interactive-timing-adjustments.md) — Allows for manual transcript modifications while automatically adjusting corresponding timestamps to maintain precise audio alignment. ([source](https://cdn.jsdelivr.net/gh/tmoroney/auto-subs@main/README.md))
- [Audio-Visual Translation](https://awesome-repositories.com/f/artificial-intelligence-ml/multilingual-content-translation/audio-visual-translation.md) — Translates spoken audio within video content into over 100 different target languages. ([source](https://cdn.jsdelivr.net/gh/tmoroney/auto-subs@main/README.md))
- [Burned-in Subtitle Exports](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-transcription/automated-video-subtitling/burned-in-subtitle-exports.md) — Embeds finalized transcription edits as professional subtitles directly into a video file for distribution. ([source](https://tom-moroney.com/auto-subs/))
- [Text Translation Services](https://awesome-repositories.com/f/artificial-intelligence-ml/text-translation-services.md) — Converts generated transcription text from one natural language to another using a wide library of supported languages. ([source](https://tom-moroney.com/auto-subs/))

### Content Management & Publishing

- [Automated Subtitle Generators](https://awesome-repositories.com/f/content-management-publishing/media-management/subtitle-management-systems/timestamped-subtitle-generators/automated-subtitle-generators.md) — Uses Whisper AI to automate the end-to-end process of transcribing audio and embedding subtitles locally.

### Graphics & Multimedia

- [Automatic On-Device Captioning](https://awesome-repositories.com/f/graphics-multimedia/automatic-on-device-captioning.md) — Generates synchronized text overlays from audio streams using local processing for privacy and offline use.
- [Transcript-Based Editing](https://awesome-repositories.com/f/graphics-multimedia/non-linear-video-editing/transcript-based-editing.md) — Enables alignment of captions with video cuts by exporting timing data into professional editing software.
- [Transcription Export Bridges](https://awesome-repositories.com/f/graphics-multimedia/non-linear-video-editing/transcript-based-editing/transcription-export-bridges.md) — Exports AI-generated timing and transcription data into professional video editing software.
- [Subtitle Styling](https://awesome-repositories.com/f/graphics-multimedia/subtitle-styling.md) — Enables customization of caption colors and effects to meet professional video production requirements. ([source](https://tom-moroney.com/auto-subs/))
- [Video Compositing](https://awesome-repositories.com/f/graphics-multimedia/video-compositing.md) — Bakes stylized subtitle overlays directly into the video frames during the rendering process.
- [Transcription Data Exports](https://awesome-repositories.com/f/graphics-multimedia/video-editor-plugin-integrations/transcription-data-exports.md) — Provides integration to export transcription and timing data into professional video editing software for precise caption alignment. ([source](https://cdn.jsdelivr.net/gh/tmoroney/auto-subs@main/README.md))
- [Subtitle Visual Customization](https://awesome-repositories.com/f/graphics-multimedia/video-frame-styling/subtitle-visual-customization.md) — Provides tools for customizing the visual appearance of subtitles and attributing text to different speakers.
