# souzatharsis/podcastfy

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/souzatharsis-podcastfy).**

6,051 stars · 706 forks · Python · apache-2.0

## Links

- GitHub: https://github.com/souzatharsis/podcastfy
- Homepage: https://www.podcastfy.ai
- awesome-repositories: https://awesome-repositories.com/repository/souzatharsis-podcastfy.md

## Topics

`elevenlabs` `gemini` `genai` `notebooklm` `openai` `podcast`

## Description

Podcastfy is an AI content-to-podcast generator that converts text, URLs, PDFs, images, and videos into conversational audio podcasts. It integrates with over 100 language models for transcript creation and multiple text-to-speech engines for audio output, with support for customizable dialogue style and optional local transcript generation for privacy.

Its architecture decouples job submission from result retrieval through asynchronous polling, normalizes heterogeneous inputs into uniform text, and routes content through pluggable LLM and TTS backends with template-driven dialogue assembly. Configuration files control conversation tone, speaker roles, dialogue structure, and creativity level. Transcript generation can also run on a local language model for greater privacy and offline use.
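
To make the idea of separating job submission from result retrieval concrete, here is a minimal sketch using `asyncio`. It is not Podcastfy's code: `Job`, `JobQueue`, `poll`, and `demo` are hypothetical names, the job store is in memory, and the short sleep stands in for the real transcript and audio work.

```python
import asyncio
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Job:
    """Hypothetical job record; not part of the Podcastfy API."""
    status: str = "pending"  # moves from "pending" to "done"
    result: Optional[str] = None


class JobQueue:
    """In-memory stand-in for a service that separates submit from fetch."""

    def __init__(self) -> None:
        self._jobs: Dict[int, Job] = {}
        self._ids = itertools.count(1)
        self._tasks = set()  # keep references so tasks are not garbage collected

    def submit(self, text: str) -> int:
        """Register a job and return its id immediately, without waiting."""
        job_id = next(self._ids)
        self._jobs[job_id] = Job()
        task = asyncio.create_task(self._run(job_id, text))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return job_id

    async def _run(self, job_id: int, text: str) -> None:
        await asyncio.sleep(0.05)  # placeholder for transcript and audio work
        job = self._jobs[job_id]
        job.result = f"podcast for: {text}"
        job.status = "done"

    def status(self, job_id: int) -> Job:
        return self._jobs[job_id]


async def poll(queue: JobQueue, job_id: int,
               interval: float = 0.01, timeout: float = 2.0) -> str:
    """Check the job at a fixed interval until it finishes or times out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        job = queue.status(job_id)
        if job.status == "done" and job.result is not None:
            return job.result
        await asyncio.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")


async def demo() -> str:
    queue = JobQueue()
    job_id = queue.submit("an article about solar power")  # returns at once
    return await poll(queue, job_id)                        # fetched later


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The caller holds only a job id between the two steps, so submission stays fast even when generation is slow. The polling interval and timeout are arbitrary values chosen for the sketch.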

Beyond core podcast generation, the system supports content extraction from websites, videos, images, and documents, multilingual audio generation, Q&A content generation from text, and topic-based podcast creation through real-time web search. It also offers transcript-only generation and the ability to produce audio from pre-written transcripts.

## Tags

### Content Management & Publishing

- [Podcast Generators](https://awesome-repositories.com/f/content-management-publishing/media-management/media-automation-tools/document-generation/ai-content-generation/podcast-generators.md) — Transforms text, URLs, PDFs, or images into spoken audio conversations using generative AI. ([source](https://github.com/souzatharsis/podcastfy/blob/59563ee105a0d1dbb46744e0ff084471670dd725/usage/api.md))
- [Content-to-Podcast Converters](https://awesome-repositories.com/f/content-management-publishing/media-management/podcast-clients/video-to-podcast-converters/content-to-podcast-converters.md) — Turns articles, documents, images, and web pages into spoken audio conversations using generative AI.
- [AI Content-to-Podcast Generators](https://awesome-repositories.com/f/content-management-publishing/media-management/podcast-clients/video-to-podcast-converters/ai-content-to-podcast-generators.md) — Converts text, URLs, PDFs, and images into conversational audio podcasts using generative AI and text-to-speech.
- [Multilingual Audio Generators](https://awesome-repositories.com/f/content-management-publishing/content-management-systems/content-architecture-modeling/documentation-tooling/generation-publishing/documentation-generators/multilingual-generation/multilingual-audio-generators.md) — Produces spoken audio content in multiple languages from text, images, or URLs.
- [Transcript-to-Audio Renderers](https://awesome-repositories.com/f/content-management-publishing/media-management/media-automation-tools/document-generation/ai-content-generation/podcast-generators/transcript-to-audio-renderers.md) — Accepts a pre-written transcript file and renders it as an audio conversation. ([source](https://github.com/souzatharsis/podcastfy/blob/59563ee105a0d1dbb46744e0ff084471670dd725/usage/cli.md))
- [Podcast Transcript and Audio Customizers](https://awesome-repositories.com/f/content-management-publishing/media-management/podcast-clients/video-to-podcast-converters/podcast-transcript-and-audio-customizers.md) — Adjusts conversation style, language, structure, and voices to tailor generated podcasts. ([source](https://cdn.jsdelivr.net/gh/souzatharsis/podcastfy@main/README.md))

### Artificial Intelligence & ML

- [AI Model Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-model-integrations.md) — Integrates with over 100 language models for transcript generation through a unified interface. ([source](https://cdn.jsdelivr.net/gh/souzatharsis/podcastfy@main/README.md))
- [Local LLM Transcript Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/real-time-transcription/regional-language-transcription/local-llm-transcript-generators.md) — Creates conversation transcripts on the user's own machine using a local language model for privacy and control.
- [LLM-Agnostic Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/transcript-generators/llm-agnostic-generators.md) — Generates conversation transcripts by routing prompts to any of over 100 language models through a unified interface, supporting both cloud and local inference.
- [Text-to-Speech](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech.md) — Synthesizes cleaned text into audio files using third-party text-to-speech services. ([source](https://podcastfy.readthedocs.io/en/latest/podcastfy.html))
- [Multi-Modal Audio Conversation Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/conversational-audio-streams/multi-modal-audio-conversation-generators.md) — Transforms text, images, websites, PDFs, and videos into multilingual audio conversations using generative AI. ([source](https://cdn.jsdelivr.net/gh/souzatharsis/podcastfy@main/README.md))
- [Conversation Style Configurations](https://awesome-repositories.com/f/artificial-intelligence-ml/conversational-session-management/conversational-tone-adaptation/conversation-style-configurations.md) — Adjusts the tone, structure, and format of generated conversations using user-defined configuration files. ([source](https://github.com/souzatharsis/podcastfy/blob/59563ee105a0d1dbb46744e0ff084471670dd725/usage/cli.md))
- [Local Model Execution](https://awesome-repositories.com/f/artificial-intelligence-ml/local-model-execution.md) — Runs transcript generation on the user's own machine using a local language model for privacy.
- [Multimodal Data Extractors](https://awesome-repositories.com/f/artificial-intelligence-ml/multimodal-data-extractors.md) — Pulls text from websites, videos, images, and documents to feed into podcast generation pipelines.
- [Customizable Dialogue Synthesizers](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-interfaces/conversational-dialogue-systems/customizable-dialogue-synthesizers.md) — Adjusts conversation tone, speaker roles, structure, and creativity level to produce tailored podcast episodes.
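
The style-related entries above describe adjusting tone, speaker roles, structure, and creativity through user-defined configuration. One common way to implement that is to overlay user overrides on a set of defaults. The sketch below assumes that approach; the key names and default values are illustrative and are not taken from Podcastfy's configuration schema.

```python
from copy import deepcopy
from typing import Any, Dict

# Illustrative defaults; key names are assumptions, not Podcastfy's schema.
DEFAULT_STYLE: Dict[str, Any] = {
    "tone": ["engaging", "informative"],
    "roles": {"speaker_1": "host", "speaker_2": "guest expert"},
    "dialogue_structure": ["introduction", "main discussion", "conclusion"],
    "creativity": 0.7,
}


def merge_style(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config with overrides applied on top of defaults."""
    merged = deepcopy(defaults)  # never mutate the shared defaults
    for key, value in overrides.items():
        if key not in defaults:
            raise KeyError(f"unknown style option: {key}")
        if isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key].update(value)      # nested dicts merge key by key
        else:
            merged[key] = deepcopy(value)  # lists and scalars are replaced
    if not 0.0 <= merged["creativity"] <= 1.0:
        raise ValueError("creativity must be between 0.0 and 1.0")
    return merged


if __name__ == "__main__":
    custom = merge_style(
        DEFAULT_STYLE,
        {"tone": ["formal"], "roles": {"speaker_2": "skeptic"}, "creativity": 0.2},
    )
    print(custom)
```

Rejecting unknown keys is a design choice made for this sketch: it surfaces typos in a config file early instead of silently ignoring them.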

### Part of an Awesome List

- [Text To Speech](https://awesome-repositories.com/f/awesome-lists/media/text-to-speech.md) — Converts cleaned text into audio files using third-party text-to-speech services.
- [Multi-Speaker Dialogue Templates](https://awesome-repositories.com/f/awesome-lists/ai/foundation-model-adaptation/dialogue-adaptation/dialogue-prompt-templating/multi-speaker-dialogue-templates.md) — Constructs multi-speaker conversation scripts using user-defined configuration files for tone, structure, and roles.
- [Podcast Style Customizers](https://awesome-repositories.com/f/awesome-lists/media/podcasting/podcast-style-customizers.md) — Adjusts conversation tone, speaker roles, dialogue structure, and creativity level for generated audio. ([source](https://github.com/souzatharsis/podcastfy/blob/59563ee105a0d1dbb46744e0ff084471670dd725/usage/api.md))
- [Topic-Based Podcast Creators](https://awesome-repositories.com/f/awesome-lists/media/podcasts/topic-based-podcast-creators.md) — Generates podcast episodes from user-provided topics by performing real-time web searches for content.
- [Media and Communication](https://awesome-repositories.com/f/awesome-lists/media/media-and-communication.md) — Converts multi-modal content into podcast-style dialogues.

### Graphics & Multimedia

- [Multi-LLM Podcast Engines](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-processing/text-to-speech-engines/llm-based-engines/multi-llm-podcast-engines.md) — Integrates with over 100 language models for transcript creation and multiple TTS engines for audio output.
- [Pluggable TTS Backends](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-processing/text-to-speech-engines/text-to-speech-engines/pluggable-tts-backends.md) — Converts transcript text into audio by selecting among multiple third-party TTS engines through a common abstraction layer.
- [Text-to-Speech Engines](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-processing/text-to-speech-engines/text-to-speech-engines.md) — Chooses between different text-to-speech services to produce the final audio output. ([source](https://github.com/souzatharsis/podcastfy/blob/59563ee105a0d1dbb46744e0ff084471670dd725/usage/api.md))
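
Selecting among several TTS engines behind a common abstraction can be sketched with a small registry. Everything below is hypothetical: `TTSBackend`, `FakeBackend`, `REGISTRY`, and `render` are invented for illustration, and the fake backend returns labelled bytes instead of calling a real speech service.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class TTSBackend(Protocol):
    """Interface every engine must satisfy (hypothetical, for illustration)."""

    def synthesize(self, speaker: str, text: str) -> bytes: ...


class FakeBackend:
    """Stand-in engine that returns labelled bytes instead of real audio."""

    def __init__(self, label: str) -> None:
        self.label = label

    def synthesize(self, speaker: str, text: str) -> bytes:
        return f"[{self.label}|{speaker}] {text}\n".encode("utf-8")


# Engine names are placeholders; a real registry would map to actual services.
REGISTRY: Dict[str, Callable[[], TTSBackend]] = {
    "engine_a": lambda: FakeBackend("A"),
    "engine_b": lambda: FakeBackend("B"),
}


def render(transcript: List[Tuple[str, str]], engine: str) -> bytes:
    """Synthesize each (speaker, line) pair with the chosen engine and join them."""
    try:
        backend = REGISTRY[engine]()
    except KeyError:
        raise ValueError(f"unknown TTS engine: {engine}") from None
    return b"".join(backend.synthesize(speaker, line) for speaker, line in transcript)


if __name__ == "__main__":
    lines = [("host", "Welcome back."), ("guest", "Glad to be here.")]
    print(render(lines, "engine_b").decode("utf-8"))
```

Because `render` depends only on the `synthesize` method, adding an engine means adding one registry entry and leaving the calling code unchanged.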

### Web Development

- [Content-to-Podcast Converters](https://awesome-repositories.com/f/web-development/url-generators/content-to-podcast-converters.md) — Converts web article URLs into spoken audio conversations using text-to-speech models. ([source](https://github.com/souzatharsis/podcastfy/blob/59563ee105a0d1dbb46744e0ff084471670dd725/usage/cli.md))

### Data & Databases

- [Content Extraction](https://awesome-repositories.com/f/data-databases/content-extraction.md) — Pulls text content from websites, videos, and documents by delegating to specialized extractors for each source type. ([source](https://podcastfy.readthedocs.io/en/latest/podcastfy.html))
- [Multi-Format Content Extractors](https://awesome-repositories.com/f/data-databases/content-extraction/multi-format-content-extractors.md) — Pulls text from websites, PDFs, images, and videos using specialized extractors for each source type.
- [Multi-Modal Content Normalizers](https://awesome-repositories.com/f/data-databases/multi-source-content-aggregation/multi-modal-content-normalizers.md) — Transforms heterogeneous inputs like text, URLs, images, and PDFs into a uniform text representation.
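
Delegating to a specialized extractor per source type and then normalizing to plain text could look roughly like the sketch below. The classification rules, the `EXTRACTORS` table, and the stub extractors are all assumptions made for illustration. Real extractors would fetch pages, parse PDFs, or transcribe video, and Podcastfy's own routing logic may differ.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict
from urllib.parse import urlparse

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def classify(source: str) -> str:
    """Guess the source type from its URL or file suffix (heuristic only)."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    if is_url:
        host = parsed.netloc.lower()
        if host.endswith("youtube.com") or host.endswith("youtu.be"):
            return "video"
    suffix = PurePosixPath(parsed.path if is_url else source).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_SUFFIXES:
        return "image"
    return "website" if is_url else "text"


# Stub extractors: each returns a placeholder string instead of real content.
EXTRACTORS: Dict[str, Callable[[str], str]] = {
    "video": lambda s: f"[video transcript of {s}]",
    "pdf": lambda s: f"[pdf text of {s}]",
    "image": lambda s: f"[image description of {s}]",
    "website": lambda s: f"[page text of {s}]",
    "text": lambda s: s,  # raw text passes through unchanged
}


def normalize(source: str) -> str:
    """Route the source to its extractor, then collapse whitespace."""
    extracted = EXTRACTORS[classify(source)](source)
    return " ".join(extracted.split())


if __name__ == "__main__":
    for item in ["https://example.com/post", "report.pdf", "  some   raw text "]:
        print(classify(item), "->", normalize(item))
```

Downstream stages then see a single string regardless of where the content came from, which is what lets the transcript step stay independent of input format.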

### Security & Cryptography

- [Local Language Model Hosting](https://awesome-repositories.com/f/security-cryptography/privacy-data-protection/local-only-data-processing/local-language-model-hosting.md) — Generates conversation transcripts using a language model hosted on your own computer instead of a cloud service. ([source](https://github.com/souzatharsis/podcastfy/blob/59563ee105a0d1dbb46744e0ff084471670dd725/usage/cli.md))
