# lipku/livetalking

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/lipku-livetalking).**

8,042 stars · 1,287 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/lipku/LiveTalking
- Homepage: https://www.livetalking.ai
- awesome-repositories: https://awesome-repositories.com/repository/lipku-livetalking.md

## Topics

`aigc` `digihuman` `digital-human` `er-nerf` `lip-sync` `metahuman-stream` `musetalk` `nerf` `realtime` `streaming` `talking-head` `virtualhumans` `wav2lip`

## Description

LiveTalking is an interactive talking head engine and AI avatar management platform designed to synchronize synthetic speech with facial movements. It functions as a real-time orchestrator that connects large language models and text-to-speech services to neural-rendered digital humans.

The project distinguishes itself through low-latency streaming capabilities and the ability to handle real-time conversational interruptions. It supports advanced audio-visual customization, including human voice cloning and the ability to drive avatar expressions using real-time webcam data.

The platform covers a broad range of capabilities, including digital human animation, real-time video streaming via WebRTC and RTMP, and virtual camera broadcasting. It also provides tools for managing character profiles, coordinating idle animations, and rendering multiple avatars within a single frame.

The engine can be deployed from container images or on cloud instances, which keeps the runtime environment consistent across machines.

## Tags

### Artificial Intelligence & ML

- [Real-Time Lip Synchronization](https://awesome-repositories.com/f/artificial-intelligence-ml/real-time-speech-translation/real-time-video-audio-dubbing/real-time-lip-synchronization.md) — Aligns neural rendering models with audio or text streams to generate real-time lip-synchronized facial movements.
- [Audio-Driven Talking Head Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/video-generation/image-to-video-generation/audio-driven-talking-head-synthesis.md) — Implements a rendering engine that synthesizes talking head videos by synchronizing facial movements with synthetic speech.
- [AI Audio-to-Video Synchronization](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-tasks/ai-audio-to-video-synchronization.md) — Generates lip-synced digital human animations using neural rendering models to align visual speech with audio inputs. ([source](https://doc.livetalking.ai/))
- [Conversational Response Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/language-model-response-generators/response-generation-configurations/conversational-response-generation.md) — Connects to large language models to automatically generate conversational text responses based on user input. ([source](https://cdn.jsdelivr.net/gh/lipku/livetalking@main/README.md))
- [Real-Time Conversational AI Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/real-time-conversational-ai-frameworks.md) — Integrates large language models to create real-time, AI-driven conversational interactions for the avatar. ([source](https://doc.livetalking.ai/docs/usage/))
- [Text-to-Speech Conversions](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-and-text-conversion/text-to-speech-conversions.md) — Converts written text into spoken audio using a model optimized for fast inference and short-form audio. ([source](https://doc.livetalking.ai/docs/tts/cosyvoice/))
- [Text-to-Speech Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-to-text-integrations/text-to-speech-integrations.md) — Connects external voice synthesis services to transform text into audible speech for avatar animation. ([source](https://doc.livetalking.ai/))
- [Talking Head Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/talking-head-generators.md) — Synchronizes facial expressions and head movements with audio to create an interactive talking head stream. ([source](https://doc.livetalking.ai/docs/quickstart/))
- [Speech Interruption Management](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/speech-to-speech-models/speech-to-speech-frameworks/speech-interruption-management.md) — Immediately halts active audio output to allow for mid-sentence transitions and real-time interaction. ([source](https://cdn.jsdelivr.net/gh/lipku/livetalking@main/README.md))
- [Voice Cloning Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/speech-synthesis/voice-cloning-tools.md) — Includes tools for managing custom audio recordings to synthesize speech mimicking specific individuals. ([source](https://doc.livetalking.ai/docs/tts/omnitts/))
- [Video Generation Optimizations](https://awesome-repositories.com/f/artificial-intelligence-ml/video-generation-optimizations.md) — Manages resources and caches data to maintain high performance when generating long-form video content. ([source](https://doc.livetalking.ai/en/))
- [Voice Cloning](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-cloning.md) — Synthesizes realistic audio for digital avatars using customized voice profiles. ([source](https://cdn.jsdelivr.net/gh/lipku/livetalking@main/README.md))

### User Interface & Experience

- [Interactive Video Avatar Generators](https://awesome-repositories.com/f/user-interface-experience/avatars/realtime-avatar-renderers/interactive-video-avatar-generators.md) — Synchronizes audio and video in real time to create an interactive digital human that speaks provided text. ([source](https://doc.livetalking.ai/docs/quickstart))
- [Avatar Behavior Management](https://awesome-repositories.com/f/user-interface-experience/avatars/realtime-avatar-renderers/interactive-video-avatar-generators/avatar-session-management/avatar-behavior-management.md) — Provides a comprehensive platform for configuring digital character profiles and managing real-time avatar behaviors.
- [Avatar Speech Control](https://awesome-repositories.com/f/user-interface-experience/avatars/realtime-avatar-renderers/interactive-video-avatar-generators/avatar-speech-control.md) — Allows sending text content to trigger real-time speech and synchronized body movements of the digital human. ([source](https://doc.livetalking.ai/docs/api/))
- [Interruption Handlers](https://awesome-repositories.com/f/user-interface-experience/avatars/realtime-avatar-renderers/interactive-video-avatar-generators/avatar-speech-control/interruption-handlers.md) — Automatically stops the audio-video stream when a user interrupts the avatar to enable immediate responses. ([source](https://doc.livetalking.ai/docs/feature/))
- [Lip Synchronization Engines](https://awesome-repositories.com/f/user-interface-experience/avatars/realtime-avatar-renderers/lip-synchronization-engines.md) — Aligns character mouth movements with synthesized audio in real time using neural rendering models. ([source](https://doc.livetalking.ai/docs/usage/))
- [Avatar Appearance Configurators](https://awesome-repositories.com/f/user-interface-experience/avatars/avatar-appearance-configurators.md) — Integrates user-defined images or models to change the visual identity of the digital human. ([source](https://doc.livetalking.ai/docs/feature/))
- [Idle Animation Triggers](https://awesome-repositories.com/f/user-interface-experience/user-idleness-detectors/idle-animation-triggers.md) — Plays predefined looping videos or movements when the avatar is not speaking to maintain natural presence. ([source](https://cdn.jsdelivr.net/gh/lipku/livetalking@main/README.md))

### Part of an Awesome List

- [Digital Human Synthesis](https://awesome-repositories.com/f/awesome-lists/ai/human-generation-and-synthesis/digital-human-synthesis.md) — Creates lifelike virtual humans by combining cloned voices and lip-synced video from uploaded media. ([source](https://cdn.jsdelivr.net/gh/lipku/livetalking@main/README.md))
- [Animation Drivers](https://awesome-repositories.com/f/awesome-lists/ai/human-generation-and-synthesis/digital-human-synthesis/animation-drivers.md) — Drives avatar expressions and mouth movements using text, audio inputs, or real-time webcam data for natural visual presence.

### Graphics & Multimedia

- [AI Avatar Streaming Bridges](https://awesome-repositories.com/f/graphics-multimedia/ai-avatar-streaming-bridges.md) — Ships a low-latency streaming bridge that delivers neural-rendered avatar animations to browsers and media servers.
- [Interactive Live Streaming](https://awesome-repositories.com/f/graphics-multimedia/interactive-live-streaming.md) — Delivers synchronized audio-visual AI avatar content via real-time connections for low-latency interaction. ([source](https://doc.livetalking.ai/docs/quickstart/))
- [Low-Latency Video Streaming](https://awesome-repositories.com/f/graphics-multimedia/low-latency-video-streaming.md) — Uses real-time protocols to deliver synchronized AI avatar video and audio with minimal end-to-end delay. ([source](https://doc.livetalking.ai/docs/quickstart))
- [Generative Video Streaming](https://awesome-repositories.com/f/graphics-multimedia/low-latency-video-streaming/generative-video-streaming.md) — Delivers low-latency, synchronized audio and video streams of AI-generated characters to browsers or broadcasting software. ([source](https://cdn.jsdelivr.net/gh/lipku/livetalking@main/README.md))
- [Real-Time Media Streaming](https://awesome-repositories.com/f/graphics-multimedia/real-time-media-streaming.md) — Delivers synchronized audio and video to clients using low-latency protocols for interactive digital human experiences.
- [Video Streaming](https://awesome-repositories.com/f/graphics-multimedia/streaming-distribution/streaming-broadcasting/media-streaming/video-streaming.md) — Transmits digital human renders via streaming protocols or virtual cameras for use in live broadcasts or meetings. ([source](https://doc.livetalking.ai/docs/feature/))
- [Webcam-Driven Expressions](https://awesome-repositories.com/f/graphics-multimedia/audio-music/audio-processing/audio-emotion-classifiers/emotional-modulation/facial-expression-modulators/webcam-driven-expressions.md) — Translates real-time facial expressions from a webcam into avatar lip-sync and gestures. ([source](https://doc.livetalking.ai/docs/service/))
- [Multi-Model Orchestrators](https://awesome-repositories.com/f/graphics-multimedia/audio-music/speech-synthesis-tts/multi-model-orchestrators.md) — Provides a unified interface to route text through various voice synthesis engines for realistic spoken audio.
- [Virtual Camera Drivers](https://awesome-repositories.com/f/graphics-multimedia/camera-systems/virtual-camera-drivers.md) — Sends the generated avatar video stream to a virtual camera device for use in broadcasting software. ([source](https://doc.livetalking.ai/docs/usage/))
- [Multi-Avatar Rendering](https://awesome-repositories.com/f/graphics-multimedia/multi-avatar-rendering.md) — Renders multiple digital humans in one frame with assigned voices and speech tasks for each. ([source](https://doc.livetalking.ai/docs/service/))
- [Unlimited-Duration Talking Video Generators](https://awesome-repositories.com/f/graphics-multimedia/unlimited-duration-talking-video-generators.md) — Limits memory usage through a caching system to support virtually unlimited video length during live streaming. ([source](https://doc.livetalking.ai/docs/service/))

### Networking & Communication

- [Digital Human Connection Management](https://awesome-repositories.com/f/networking-communication/network-reliability-diagnostics/connection-session-management/connection-management/connection-lifecycle-managers/digital-human-connection-management.md) — Establishes real-time connections to receive synchronized audio and video streams of an AI avatar. ([source](https://doc.livetalking.ai/docs/api/))
- [WebRTC Streaming](https://awesome-repositories.com/f/networking-communication/webrtc-streaming.md) — Provides real-time delivery of synchronized AI avatar video to browsers via WebRTC. ([source](https://doc.livetalking.ai/docs/usage/))

### Software Engineering & Architecture

- [Stream Interruption](https://awesome-repositories.com/f/software-engineering-architecture/stream-interruption.md) — Implements the ability to immediately stop active video streams when user audio input is detected for natural conversational transitions.
- [Inference Session Isolation](https://awesome-repositories.com/f/software-engineering-architecture/inference-session-isolation.md) — Manages concurrent user connections by isolating individual avatar rendering pipelines using unique session identifiers.
- [Frame Memory Buffers](https://awesome-repositories.com/f/software-engineering-architecture/memory-buffering/frame-memory-buffers.md) — Uses a sliding-window memory buffer for video frames to prevent disk read bottlenecks during long-duration streaming.
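
The sliding-window frame buffer described in the last item can be illustrated with a minimal Python sketch. This is not LiveTalking's actual implementation: the class name `SlidingFrameBuffer`, the `load_frame` callback, and the `window` parameter are all hypothetical names chosen for illustration. The sketch assumes an avatar clip of `total_frames` frames that loops forever, and keeps only `window` frames in memory at a time.

```python
from collections import deque
from typing import Callable, Deque, Tuple


class SlidingFrameBuffer:
    """Hold at most `window` frames of a looping clip in memory (illustrative only)."""

    def __init__(self, load_frame: Callable[[int], bytes],
                 total_frames: int, window: int) -> None:
        if total_frames <= 0 or window <= 0:
            raise ValueError("total_frames and window must be positive")
        self._load_frame = load_frame      # hypothetical loader, e.g. reads one image from disk
        self._total = total_frames
        self._window = min(window, total_frames)
        self._frames: Deque[Tuple[int, bytes]] = deque()
        self._next_to_load = 0             # index of the next frame to prefetch
        self.loads = 0                     # number of loader calls, kept for inspection
        for _ in range(self._window):      # pre-fill the window before streaming starts
            self._load_next()

    def _load_next(self) -> None:
        idx = self._next_to_load
        self._frames.append((idx, self._load_frame(idx)))
        self._next_to_load = (idx + 1) % self._total   # wrap around so the clip loops
        self.loads += 1

    def next_frame(self) -> Tuple[int, bytes]:
        """Return the oldest buffered frame and slide the window forward by one."""
        item = self._frames.popleft()
        self._load_next()
        return item

    def __len__(self) -> int:
        return len(self._frames)


if __name__ == "__main__":
    # Fake loader standing in for a disk read; each frame is a single byte.
    buf = SlidingFrameBuffer(lambda i: bytes([i]), total_frames=5, window=3)
    for _ in range(7):
        index, _frame = buf.next_frame()
        print(index, len(buf))
```

Memory use stays bounded by `window` regardless of how long the stream runs. In this sketch the refill happens synchronously inside `next_frame`; a production version would more plausibly refill from a background thread so that the consumer never waits on a disk read.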

### Business & Productivity Software

- [Voice-to-Text Input Automation](https://awesome-repositories.com/f/business-productivity-software/voice-to-text-input-automation.md) — Converts spoken audio into text via automatic speech recognition to drive AI avatar responses. ([source](https://doc.livetalking.ai/docs/usage/))
