# google-ai-edge/mediapipe

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/google-ai-edge-mediapipe).**

33,820 stars · 5,799 forks · C++ · apache-2.0

## Links

- GitHub: https://github.com/google-ai-edge/mediapipe
- Homepage: https://ai.google.dev/edge/mediapipe
- awesome-repositories: https://awesome-repositories.com/repository/google-ai-edge-mediapipe.md

## Topics

`android` `audio-processing` `c-plus-plus` `calculator` `computer-vision` `deep-learning` `framework` `graph-based` `graph-framework` `inference` `machine-learning` `mediapipe` `mobile-development` `perception` `pipeline-framework` `stream-processing` `video-processing`

## Description

MediaPipe is a cross-platform machine learning framework designed for deploying vision, audio, and text processing models across mobile, desktop, and web environments. It functions as an on-device inference engine that executes complex models locally on edge hardware, ensuring low latency and privacy without requiring a constant internet connection.

The framework utilizes a graph-based pipeline orchestration system where data flows through a directed network of modular calculators to ensure synchronized and deterministic processing. It distinguishes itself through a unified runtime that provides consistent hardware abstraction and high-performance data pipelines, which manage synchronized streams of audio, video, and sensor data. To maximize throughput, the system employs hardware-accelerated tensor execution and zero-copy memory management, offloading heavy mathematical computations to specialized GPU or NPU backends.

Beyond local inference, the platform includes a generative AI integration layer that connects applications to remote language models. This interface supports real-time conversational interactions, streaming responses, and multi-turn prompts, with built-in capabilities for request structuring, response parsing, and authentication. These features allow developers to combine local media analysis with remote generative services within a single, modular architecture.

## Tags

### Artificial Intelligence & ML

- [Machine Learning Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-frameworks.md) — A development environment for deploying vision, audio, and text processing models across mobile, desktop, and web platforms.
- [Cross-Platform Inference Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/cross-platform-inference-frameworks.md) — Building machine learning features once and deploying them consistently across mobile, web, and desktop environments using a unified framework.
- [Model Deployment Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/model-deployment-frameworks.md) — Provides a cross-platform runtime for deploying and executing pre-trained machine learning models on mobile, desktop, and web environments. ([source](https://developers.google.com/mediapipe))
- [On-Device Inference Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-inference-engines.md) — A high-performance runtime that executes complex machine learning models locally on edge hardware to ensure low latency and privacy.
- [Pipeline Orchestration Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/pipeline-orchestration-frameworks.md) — Data flows through a directed acyclic network of modular calculators to ensure synchronized and deterministic processing of complex tasks.
- [Generative AI Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-integrations.md) — A standardized interface for connecting applications to remote large language models while managing conversational state and streaming response data.
- [Generative AI Interfaces](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-interfaces.md) — Gemini API generates content using standard web requests, streaming event updates, or persistent connections to facilitate real-time and bi-directional conversational interactions. ([source](https://ai.google.dev/api/))
- [Hardware Acceleration Backends](https://awesome-repositories.com/f/artificial-intelligence-ml/hardware-acceleration-backends.md) — Heavy mathematical computations are offloaded to specialized GPU or NPU backends to ensure high performance on edge devices.
- [Generative AI Integration Layers](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-integration-layers.md) — Connecting applications to remote language models to handle conversational interactions, streaming responses, and complex multi-turn user prompts.
- [Prompt Engineering Templates](https://awesome-repositories.com/f/artificial-intelligence-ml/prompt-engineering-templates.md) — Gemini API structures request bodies using content and part objects to represent conversation history, including support for sending raw media data alongside text prompts. ([source](https://ai.google.dev/api/))
- [Stream Synchronization Utilities](https://awesome-repositories.com/f/artificial-intelligence-ml/stream-synchronization-utilities.md) — Data streams are aligned and synchronized using precise temporal metadata to maintain consistency across disparate input sources.

### Data & Databases

- [Data Processing Pipelines](https://awesome-repositories.com/f/data-databases/data-processing-pipelines.md) — Managing synchronized streams of audio, video, and sensor data through modular processing graphs for low-latency media analysis.

### Graphics & Multimedia

- [Computer Vision Frameworks](https://awesome-repositories.com/f/graphics-multimedia/computer-vision-frameworks.md) — Processing live video streams to detect objects, track movement, or recognize gestures instantly on mobile and desktop devices.

### Programming Languages & Runtimes

- [Cross-Platform Runtimes](https://awesome-repositories.com/f/programming-languages-runtimes/cross-platform-runtimes.md) — A single core implementation provides consistent performance and hardware abstraction across diverse operating systems and device architectures.
- [Hardware Acceleration Runtimes](https://awesome-repositories.com/f/programming-languages-runtimes/hardware-acceleration-runtimes.md) — A low-level execution environment that delegates heavy mathematical computations to specialized GPU or NPU hardware for maximum throughput.

### Software Engineering & Architecture

- [Data Processing Pipelines](https://awesome-repositories.com/f/software-engineering-architecture/data-processing-pipelines.md) — A modular architecture that processes streaming data through directed networks using synchronized timestamp management for deterministic and efficient execution.