# denizsafak/abogen

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/denizsafak-abogen).**

4,135 stars · 254 forks · Python · MIT

## Links

- GitHub: https://github.com/denizsafak/abogen
- Homepage: https://pypi.org/project/abogen/
- awesome-repositories: https://awesome-repositories.com/repository/denizsafak-abogen.md

## Topics

`audiobook` `audiobooks` `content-creation` `content-creator` `ebook` `epub` `epub-converter` `kokoro` `kokoro-82m` `kokoro-tts` `llm` `media-generation` `narrator` `speech-synthesis` `subtitles` `text-to-audio` `text-to-speech` `tts` `voice-conversion` `voice-synthesis`

## Description

Abogen is a text-to-speech audiobook generator that converts digital documents and subtitle files into audiobooks. It uses language models to normalize text, expanding contractions and cleaning up punctuation so that the synthesized speech sounds more natural.

The system features a voice profile mixer that blends multiple voice models using adjustable weight ratios to create personalized synthetic voices. It also includes an automated export system that sends completed audio files and metadata to a remote Audiobookshelf server via a web API.
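
The weighted blending described above can be illustrated with a minimal sketch. It assumes each voice is represented as a plain vector of floats and that mixing is a normalized weighted average; the function `blend_voices`, the voice names, and the toy vectors are hypothetical and are not taken from Abogen's code.

```python
from typing import Dict, List


def blend_voices(voices: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of the named voice vectors (illustrative only)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    length = len(next(iter(voices.values())))
    blended = [0.0] * length
    for name, weight in weights.items():
        vector = voices[name]
        if len(vector) != length:
            raise ValueError(f"voice {name!r} has a different vector length")
        share = weight / total  # normalize so ratios such as 3:1 are accepted
        for i, value in enumerate(vector):
            blended[i] += share * value
    return blended


# Hypothetical voice names and toy 3-dimensional vectors.
voices = {"voice_a": [1.0, 0.0, 2.0], "voice_b": [0.0, 1.0, 4.0]}
mixed = blend_voices(voices, {"voice_a": 3, "voice_b": 1})
```

Normalizing by the weight total means the ratios only need to be relative, so `3` and `1` behave the same as `0.75` and `0.25`.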

The project manages the end-to-end audiobook production workflow, covering document-to-audio conversion, chapter segmentation, and the embedding of metadata such as titles, authors, and cover art. It supports batch processing through a queue-based model and can generate synchronized subtitle files that match the timing of the generated speech.
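
As a rough illustration of how subtitles can follow the timing of generated speech, the sketch below builds SRT text from a list of segments. It assumes the synthesizer reports a duration in seconds for each text segment; `format_timestamp` and `build_srt` are hypothetical helpers, not Abogen functions.

```python
from typing import List, Tuple


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(segments: List[Tuple[str, float]]) -> str:
    """Build SRT text from (text, duration_in_seconds) pairs played back to back."""
    entries = []
    start = 0.0
    for index, (text, duration) in enumerate(segments, start=1):
        end = start + duration  # each cue ends where the next one begins
        entries.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
        start = end
    return "\n".join(entries)


srt_text = build_srt([("Hello.", 1.5), ("World.", 2.0)])
```

Because each cue starts where the previous one ended, the subtitle timeline stays aligned with the concatenated audio as long as the reported durations are accurate.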

## Tags

### Artificial Intelligence & ML

- [Audiobook Converters](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/audiobook-converters.md) — Transforms digital documents and subtitle files into narrated audiobooks using language models for natural speech synthesis.
- [Voice Identity Interpolators](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-reconstruction/weight-interpolators/voice-identity-interpolators.md) — Creates custom vocal profiles by interpolating multiple voice models using adjustable weight ratios.
- [Text-to-Speech Conversions](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-and-text-conversion/text-to-speech-conversions.md) — Converts digital documents and subtitles into high-quality audio files with natural speech and timing control.
- [Hybrid Voice Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning/voice-identity-conversions/hybrid-voice-synthesis.md) — Blends multiple voice models using adjustable weights to create unique hybrid vocal identities.
- [Synthetic Voice Design](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning/voice-identity-conversions/synthetic-voice-design.md) — Blends multiple voice models with adjustable weight ratios to create personalized synthetic vocal identities.

### Data & Databases

- [Document-to-Audio Synthesis](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/batch-processing-systems/batch-processing-utilities/audio-batch-utilities/text-to-audio-batch-conversion/document-to-audio-synthesis.md) — Transforms digital documents and subtitle files into high-quality audiobooks with synchronized subtitle tracks. ([source](https://cdn.jsdelivr.net/gh/denizsafak/abogen@main/README.md))
- [Task Queues](https://awesome-repositories.com/f/data-databases/batch-processing/task-queues.md) — Manages a queue of document processing tasks with support for per-file configuration overrides.
- [Text-to-Audio Batch Conversion](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/batch-processing-systems/batch-processing-utilities/audio-batch-utilities/text-to-audio-batch-conversion.md) — Implements a queue-based system for converting multiple text files into audiobooks using batch processing.
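
A queue with per-file configuration overrides, as described in the entries above, might look like the following minimal sketch. The `Job` class, the `run_queue` function, and the setting names are assumptions made for illustration; the actual conversion step is replaced by a placeholder that only records the effective settings.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, Dict, List


@dataclass
class Job:
    """One queued document plus any settings that differ from the defaults."""
    path: str
    overrides: Dict[str, Any] = field(default_factory=dict)


def run_queue(jobs: Deque[Job], defaults: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Process jobs in first-in, first-out order and return the settings used for each."""
    results = []
    while jobs:
        job = jobs.popleft()
        settings = {**defaults, **job.overrides}  # per-file values win over defaults
        # A real converter would synthesize audio here; this sketch only records settings.
        results.append({"path": job.path, **settings})
    return results


# Hypothetical file names and setting keys.
defaults = {"voice": "voice_a", "speed": 1.0}
queue = deque([Job("book1.epub"), Job("book2.txt", {"speed": 1.25})])
processed = run_queue(queue, defaults)
```

Merging the override dictionary on top of the defaults keeps each job self-describing, so changing a default later does not alter files that set their own value.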

### Content Management & Publishing

- [Audio Tagging](https://awesome-repositories.com/f/content-management-publishing/metadata-tagging/audio-tagging.md) — Embeds titles, authors, and cover images into audio containers via automatic extraction or tagging. ([source](https://cdn.jsdelivr.net/gh/denizsafak/abogen@main/README.md))
- [Chapter Segmentation Tools](https://awesome-repositories.com/f/content-management-publishing/chapter-segmentation-tools.md) — Splits source documents into separate audio files or merged tracks using internal structural markers.
- [E-Book Splitting](https://awesome-repositories.com/f/content-management-publishing/media-management/audiobook-servers/chapter-editors/e-book-splitting.md) — Splits e-books into separate audio files per chapter or merges them into a single file. ([source](https://cdn.jsdelivr.net/gh/denizsafak/abogen@main/README.md))
- [Timestamped Subtitle Generators](https://awesome-repositories.com/f/content-management-publishing/media-management/subtitle-management-systems/timestamped-subtitle-generators.md) — Generates synchronized subtitle files that match the exact timing of the synthesized speech. ([source](https://cdn.jsdelivr.net/gh/denizsafak/abogen@main/README.md))

### Graphics & Multimedia

- [Voice Customization](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-processing/text-to-speech-engines/text-to-speech-engines/voice-customization.md) — Blends multiple voice models with adjustable weights to create personalized synthetic voice profiles. ([source](https://cdn.jsdelivr.net/gh/denizsafak/abogen@main/README.md))
- [Audio-to-Text Alignment](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-processing/text-to-speech-engines/text-to-speech-engines/audio-to-text-alignment.md) — Aligns speech synthesis timing by mapping audio segments to specific markers in source subtitle files.

### Software Engineering & Architecture

- [Media Metadata Injections](https://awesome-repositories.com/f/software-engineering-architecture/contextual-data-injection/contextual-metadata-injection/media-metadata-injections.md) — Extracts document properties to embed titles, authors, and cover art directly into audio files.
- [Text-to-Speech Normalizers](https://awesome-repositories.com/f/software-engineering-architecture/string-validation-and-normalization/speech-to-text-normalizers/custom-text-normalizers/text-to-speech-normalizers.md) — Uses language models to normalize text, expanding contractions and cleaning punctuation for natural speech synthesis. ([source](https://cdn.jsdelivr.net/gh/denizsafak/abogen@main/README.md))
- [Batch Document Processing](https://awesome-repositories.com/f/software-engineering-architecture/batch-document-processing.md) — Implements a system for queuing multiple documents for sequential processing with individual settings. ([source](https://cdn.jsdelivr.net/gh/denizsafak/abogen@main/README.md))
