Abogen | Awesome Repository

Abogen is a text-to-speech audiobook generator that converts digital documents and subtitle files into audiobooks. It uses language models to normalize text, rewriting contractions and punctuation so that the synthesized speech sounds more natural.
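To make the idea of rewriting contractions concrete, here is a minimal rule-based sketch. It is a deliberately simpler stand-in for the language-model normalization described above, not Abogen's actual method: the `CONTRACTIONS` table and the `expand_contractions` function are hypothetical names chosen for illustration, and the table covers only a handful of cases.

```python
import re

# Hypothetical lookup table; a real normalizer would cover far more cases
# and would resolve ambiguous forms ("it's" can also mean "it has").
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "don't": "do not",
    "it's": "it is",
}

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b",
    re.IGNORECASE,
)


def expand_contractions(text: str) -> str:
    """Replace known contractions with their expanded forms."""

    def replace(match: re.Match) -> str:
        word = match.group(0)
        expanded = CONTRACTIONS[word.lower()]
        # Preserve a leading capital, e.g. "It's" -> "It is".
        if word[0].isupper():
            expanded = expanded[0].upper() + expanded[1:]
        return expanded

    # Treat typographic apostrophes like plain ones before matching.
    return _PATTERN.sub(replace, text.replace("\u2019", "'"))
```

A lookup table cannot tell "it is" from "it has", which is the kind of context-dependent choice a language model is better placed to make.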

The system features a voice profile mixer that blends multiple voice models using adjustable weight ratios to create personalized synthetic voices. It also includes an automated export system that sends completed audio files and metadata to a remote Audiobookshelf server via a web API.
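Blending with weight ratios can be sketched as a weighted average. The example below assumes each voice is represented as a fixed-length list of numbers (for instance, a style embedding); that representation, the `blend_voices` function, and the voice names are assumptions made for illustration and are not taken from Abogen's code.

```python
from typing import Dict, List


def blend_voices(
    voices: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Return the weighted average of the voice vectors named in `weights`."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    names = list(weights)
    dim = len(voices[names[0]])
    blended = [0.0] * dim
    for name in names:
        vector = voices[name]
        if len(vector) != dim:
            raise ValueError("all voice vectors must have the same length")
        share = weights[name] / total  # normalise so the shares sum to 1
        for i, value in enumerate(vector):
            blended[i] += share * value
    return blended
```

With weights of 3 and 1, the first voice contributes 75% of the result and the second 25%. Normalizing by the total means the ratios can be entered on any scale.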

The project manages the end-to-end audiobook production workflow, covering document-to-audio conversion, chapter segmentation, and the embedding of metadata such as titles, authors, and cover art. It supports batch processing through a queue-based model and can generate synchronized subtitle files that match the timing of the generated speech.
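One way subtitles can be kept in step with generated speech is to accumulate the duration of each synthesized segment and emit cues in the SubRip (SRT) format. The sketch below assumes the synthesizer reports a duration in seconds for every text segment; `format_timestamp` and `build_srt` are hypothetical helpers, not part of Abogen.

```python
from typing import List, Tuple


def format_timestamp(seconds: float) -> str:
    """Format a time offset as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(segments: List[Tuple[str, float]]) -> str:
    """Build SRT text from (text, duration_in_seconds) pairs played back to back."""
    entries = []
    start = 0.0
    for index, (text, duration) in enumerate(segments, start=1):
        end = start + duration
        entries.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
        start = end  # the next cue begins where this one ends
    return "\n".join(entries)
```

Calling `build_srt([("Hello.", 1.5), ("World.", 2.0)])` yields two cues, the second starting at 00:00:01,500, where the first ends.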

Features

Audiobook Converters - Transforms digital documents and subtitle files into narrated audiobooks using language models for natural speech synthesis.
Document-to-Audio Synthesis - Transforms digital documents and subtitle files into high-quality audiobooks with synchronized subtitle tracks.
Voice Identity Interpolators - Creates custom vocal profiles by interpolating multiple voice models using adjustable weight ratios.
