# moonshine-ai/moonshine

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/moonshine-ai-moonshine).**

4,243 stars · 209 forks · C

## Links

- GitHub: https://github.com/moonshine-ai/moonshine
- awesome-repositories: https://awesome-repositories.com/repository/moonshine-ai-moonshine.md

## Description

Moonshine is a cross-platform AI inference core and toolkit for running quantized neural networks locally on edge hardware. It serves as a runtime for large language models and speech processing tasks that works without cloud connectivity.

The project functions as an edge speech-to-text engine and an on-device text-to-speech synthesizer. It enables the creation of voice interfaces by combining real-time transcription, intent recognition, and the ability to generate audible speech from written text using phonetic lexicons.

The system covers several broad capability areas, including speaker diarization to distinguish individual voices, phonetic text processing using the International Phonetic Alphabet, and conversational flow management. It also includes tools for model weight quantization and multicore compute distribution to optimize performance on local hardware.
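
To make the idea of weight quantization concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is an illustration under stated assumptions, not Moonshine's actual tooling: the function names `quantize_int8` and `dequantize_int8` are hypothetical, and a single per-tensor scale is assumed.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (illustrative, hypothetical helper).

    Maps each float to an integer in [-127, 127] using one shared scale.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when every weight is 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [v * scale for v in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the trade made for storing one byte per weight instead of four.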

## Tags

### Artificial Intelligence & ML

- [Cross-Platform Inference Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/cross-platform-inference-frameworks.md) — Implements a high-performance inference core that enables local execution of large language models across different operating systems.
- [Speech-to-Text Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-to-text-engines.md) — Provides a high-performance engine for transforming spoken audio into text across multiple languages. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-c-api.h))
- [On-Device Text-to-Speech Synthesizers](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/on-device-text-to-speech-synthesizers.md) — Provides an on-device text-to-speech engine that generates speech using local acoustic models. ([source](https://github.com/moonshine-ai/moonshine/blob/main/CMakeLists.txt))
- [Intent Classification Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/agent-architectures/user-intent-modeling/intent-classification-pipelines.md) — Detects goals from transcribed speech by comparing utterance embeddings against registered triggers. ([source](https://github.com/moonshine-ai/moonshine/blob/main/examples/python/intent_recognition.py))
- [Multilingual Transcription](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-language-model-implementations/multilingual-transcription.md) — Converts spoken audio to text across various languages by loading language-specific models. ([source](https://github.com/moonshine-ai/moonshine/blob/main/python))
- [Acoustic Model Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-architectures/phonetic-text-analysis/phonetic-representations/lexicon-mappings/acoustic-model-synthesis.md) — Generates audible speech by mapping phonetic representations to acoustic models and voice style tensors.
- [Grapheme To Phoneme Conversion](https://awesome-repositories.com/f/artificial-intelligence-ml/grapheme-to-phoneme-conversion.md) — Transforms written text into International Phonetic Alphabet notation using lexicons and syllable parsing for synthesis.
- [Grapheme-to-Phoneme Converters](https://awesome-repositories.com/f/artificial-intelligence-ml/grapheme-to-phoneme-converters.md) — Transforms Arabic and other text into International Phonetic Alphabet notation using lexicons and vocalization models. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-tts/data/ar_msa))
- [Large Language Models](https://awesome-repositories.com/f/artificial-intelligence-ml/large-language-models.md) — Executes large language model inference locally on hardware to ensure high performance and low latency. ([source](https://github.com/moonshine-ai/moonshine/blob/main/CMakeLists.txt))
- [Edge AI Model Deployment](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-deployment-and-serving/local-and-on-device-inference/edge-ai-model-deployment.md) — Runs language models locally on cross-platform hardware to enable AI inference without cloud connectivity. ([source](https://github.com/moonshine-ai/moonshine/blob/main/examples/windows/cli-transcriber/download-lib.bat))
- [On-Device Speech Recognizers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/speech-processing/automatic-speech-recognition/on-device-speech-recognizers.md) — Converts spoken letters and digits into text using a quantized neural network running entirely on-device. ([source](https://github.com/moonshine-ai/moonshine/blob/main/micro/README.md))
- [On-Device Inference Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-inference-engines.md) — Executes large language models locally on edge hardware to ensure low latency and offline availability. ([source](https://github.com/moonshine-ai/moonshine/tree/main/examples))
- [On-Device Models](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-models.md) — Runs large language models locally on device hardware for offline data processing. ([source](https://github.com/moonshine-ai/moonshine/blob/main/swift))
- [Local LLM Execution](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-models/local-llm-execution.md) — Enables the execution of large language models on local hardware to provide inference without cloud connectivity.
- [On-Device Speech-to-Text SDKs](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-models/on-device-speech-to-text-sdks.md) — Provides an on-device speech-to-text engine for real-time transcription of audio streams and files.
- [Weight Quantization](https://awesome-repositories.com/f/artificial-intelligence-ml/quantized-inference-runtimes/weight-quantization.md) — Reduces neural network precision to 8-bit to lower memory usage and bundle size for edge deployment.
- [Real-Time Audio Transcribers](https://awesome-repositories.com/f/artificial-intelligence-ml/real-time-audio-transcribers.md) — Provides real-time transcription of audio captured from microphones and other data sources. ([source](https://github.com/moonshine-ai/moonshine/tree/main/python))
- [Real-Time Speaker Identifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/speaker-diarization/real-time-speaker-identifiers.md) — Distinguishes between multiple speakers in a live audio stream by assigning unique vocal identifiers. ([source](https://github.com/moonshine-ai/moonshine/blob/main/README.md))
- [Speech Processing Toolkits](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-processing-toolkits.md) — Offers a comprehensive toolkit for local LLM execution, speech-to-text, and text-to-speech on edge hardware.
- [Speech-to-Text Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-to-text-libraries.md) — Provides libraries to convert live audio streams and files into text segments. ([source](https://github.com/moonshine-ai/moonshine#readme))
- [Text-to-Speech](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech.md) — Converts written text into audible speech using pre-trained models and language lexicons. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-tts/data))
- [On-Device Synthesizers](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/cli-speech-synthesizers/on-device-synthesizers.md) — Converts text into audible speech across multiple languages and voices for playback. ([source](https://github.com/moonshine-ai/moonshine#readme))
- [Intent Recognition](https://awesome-repositories.com/f/artificial-intelligence-ml/agent-architectures/user-intent-modeling/intent-recognition.md) — Matches spoken phrases against preprogrammed commands using semantic fuzzy matching to trigger actions. ([source](https://github.com/moonshine-ai/moonshine#readme))
- [Conversational Flow Controllers](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-systems-frameworks/conversational-voice-interaction/conversational-ai-agents/conversational-turn-detection/conversational-flow-controllers.md) — Orchestrates multi-step voice interactions using branching logic and conversational flow control. ([source](https://github.com/moonshine-ai/moonshine#readme))
- [Real-Time Transcription](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/real-time-transcription.md) — Processes audio input incrementally in real time to reduce transcription delay. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-c-api.h))
- [Phonetic Text Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-architectures/phonetic-text-analysis.md) — Transforms written text into International Phonetic Alphabet notation to ensure precise pronunciation during speech synthesis.
- [Phonetic String Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-architectures/phonetic-text-analysis/phonetic-representations/phonetic-string-encoders/phonetic-string-generators.md) — Transforms written text into International Phonetic Alphabet strings without producing audio. ([source](https://github.com/moonshine-ai/moonshine#readme))
- [Multilingual Model Loading](https://awesome-repositories.com/f/artificial-intelligence-ml/multilingual-model-loading.md) — Processes speech across global languages by loading the appropriate language-specific models. ([source](https://github.com/moonshine-ai/moonshine/tree/main/python))
- [Inference Optimizations](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-networks/inference-optimizations.md) — Distributes neural network computations across multiple processor cores to reduce inference latency. ([source](https://github.com/moonshine-ai/moonshine/tree/main/micro))
- [Phonetic Text Processors](https://awesome-repositories.com/f/artificial-intelligence-ml/phonetic-text-processors.md) — Translates text into International Phonetic Alphabet notation using lexicon lookups and syllable parsing. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-tts/data/hi))
- [Speaker Diarization](https://awesome-repositories.com/f/artificial-intelligence-ml/speaker-diarization.md) — Distinguishes between multiple speakers in a single recording by analyzing unique vocal characteristics. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-c-api.h))
- [Offline Media Transcribers](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-transcription/automated-video-transcribers/offline-media-transcribers.md) — Converts speech from recorded files into text using on-device pre-trained models without an internet connection. ([source](https://github.com/moonshine-ai/moonshine/blob/main/examples/windows/cli-transcriber))
- [Voice Activity Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-activity-detection.md) — Identifies the start and end of spoken phrases in an audio stream to trigger speech processing. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-c-api.h))
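
The voice activity detection entry above describes finding where spoken phrases start and end. As a rough illustration of that idea only, the sketch below uses a simple frame-energy threshold, which is a swapped-in technique and not the detector Moonshine ships. The function name `detect_speech_segments`, the frame size, and the threshold are all assumptions chosen for the example.

```python
def detect_speech_segments(samples, frame_size=4, threshold=0.01):
    """Return (start, end) sample indices for runs of high-energy frames.

    Hypothetical helper: a frame counts as speech when its mean squared
    amplitude exceeds `threshold`.
    """
    segments = []
    start = None
    for offset in range(0, len(samples), frame_size):
        frame = samples[offset:offset + frame_size]
        energy = sum(s * s for s in frame) / len(frame)
        if energy > threshold and start is None:
            start = offset                      # phrase begins
        elif energy <= threshold and start is not None:
            segments.append((start, offset))    # phrase ends
            start = None
    if start is not None:
        # Audio ended while a phrase was still active.
        segments.append((start, len(samples)))
    return segments


# Silence, a short burst of "speech", then silence again.
signal = [0.0] * 8 + [0.5, -0.5, 0.4, -0.4] * 2 + [0.0] * 8
segments = detect_speech_segments(signal)
```

A downstream transcriber would then only be run on the returned sample ranges rather than on the whole stream.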

### Part of an Awesome List

- [Text-to-Speech](https://awesome-repositories.com/f/awesome-lists/ai/text-to-speech.md) — Synthesizes audible speech from French text using IPA lexicons and pronunciation rules. ([source](https://github.com/moonshine-ai/moonshine/blob/main/core/moonshine-tts/data/fr))

### Graphics & Multimedia

- [Audio Streaming Pipelines](https://awesome-repositories.com/f/graphics-multimedia/audio-music/audio-streaming-engines/audio-playback-engines/chunked-audio-streaming/generative-audio-chunking/audio-streaming-pipelines.md) — Processes audio input in incremental chunks through a pipeline to produce real-time text transcriptions with low latency.
- [Text-to-Speech Engines](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/audio-processing-systems/audio-processing/text-to-speech-engines/text-to-speech-engines.md) — Implements a background pipeline to synthesize text into spoken audio for seamless utterance playback. ([source](https://github.com/moonshine-ai/moonshine/blob/main))
- [Audio Feature Extraction](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/media-manipulation/media-processing-workflows/audio-analysis-synthesis/audio-feature-extraction.md) — Transforms raw audio into normalized spectrograms to prepare data for neural network processing. ([source](https://github.com/moonshine-ai/moonshine/blob/main/micro/README.md))
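
The chunked streaming described in this section can be sketched as a small generator pipeline. This is a minimal sketch under assumptions, not the project's API: `chunk_audio`, `stream_transcribe`, and `fake_transcriber` are hypothetical names, and the stand-in transcriber only reports the chunk length instead of running a model.

```python
from typing import Callable, Iterable, Iterator, List


def chunk_audio(samples: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield fixed-size chunks as samples arrive; the last chunk may be shorter."""
    buffer: List[float] = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) == chunk_size:
            yield buffer
            buffer = []
    if buffer:
        yield buffer


def stream_transcribe(
    samples: Iterable[float],
    chunk_size: int,
    transcribe_chunk: Callable[[List[float]], str],
) -> Iterator[str]:
    """Emit one partial result per chunk instead of waiting for all audio."""
    for chunk in chunk_audio(samples, chunk_size):
        yield transcribe_chunk(chunk)


def fake_transcriber(chunk: List[float]) -> str:
    # Placeholder for a real model call (assumption for this example).
    return f"<{len(chunk)} samples>"


results = list(stream_transcribe(range(10), 4, fake_transcriber))
```

Because results are yielded per chunk, a caller can display partial text while later audio is still being captured.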

### Software Engineering & Architecture

- [Semantic Intent Mapping](https://awesome-repositories.com/f/software-engineering-architecture/intent-based-coordination/intent-to-skill-mappings/semantic-intent-mapping.md) — Triggers system actions by comparing utterance embeddings against registered command triggers using semantic fuzzy matching.
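
Comparing an utterance embedding against registered triggers is commonly done with cosine similarity. The sketch below shows that general approach with toy three-dimensional vectors. It is not taken from the Moonshine codebase: `match_intent`, the trigger names, the vectors, and the `0.8` threshold are illustrative assumptions, and a real system would obtain embeddings from a model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_intent(utterance_vec, triggers, threshold=0.8):
    """Return the best-matching trigger name, or None if nothing is close enough."""
    best_name, best_score = None, threshold
    for name, trigger_vec in triggers.items():
        score = cosine_similarity(utterance_vec, trigger_vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy embeddings standing in for model output (assumption).
triggers = {
    "lights_on": [1.0, 0.0, 0.0],
    "play_music": [0.0, 1.0, 0.0],
}
close_match = match_intent([0.9, 0.1, 0.0], triggers)
no_match = match_intent([0.5, 0.5, 0.7], triggers)
```

The threshold is what makes the matching "fuzzy": phrasing that lands near a trigger still fires it, while unrelated speech returns `None` and triggers nothing.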

### User Interface & Experience

- [Voice Command Mapping](https://awesome-repositories.com/f/user-interface-experience/input-mapping/voice-command-mapping.md) — Triggers specific functions by mapping transcribed audio to programmed intents using an embedding model. ([source](https://github.com/moonshine-ai/moonshine/blob/main/python))
- [Voice Interfaces](https://awesome-repositories.com/f/user-interface-experience/voice-interfaces.md) — Provides a framework for building voice-controlled systems that process speech and trigger actions locally. ([source](https://github.com/moonshine-ai/moonshine/blob/main/examples/raspberry-pi/my-dalek/README.md))

### Operating Systems & Systems Programming

- [Neural Compute Distribution](https://awesome-repositories.com/f/operating-systems-systems-programming/neural-compute-distribution.md) — Spreads neural network calculations across multiple processor cores to reduce latency while maintaining bit-identical output.
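
One way to spread a computation across workers while keeping output bit-identical is to partition the work so that each output value is still accumulated in the same order as in the serial version. The sketch below illustrates this with a matrix-vector product split by row. It is an assumption-laden illustration, not Moonshine's scheduler: the function names are hypothetical, and Python threads are used only to show the partitioning, not to demonstrate a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(row, vec):
    """Accumulate left to right so the floating-point rounding order is fixed."""
    total = 0.0
    for a, b in zip(row, vec):
        total += a * b
    return total


def matvec_serial(matrix, vec):
    return [dot(row, vec) for row in matrix]


def matvec_parallel(matrix, vec, workers=4):
    """Compute each row on a worker; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: dot(row, vec), matrix))


matrix = [[0.1 * (i + j) for j in range(8)] for i in range(6)]
vec = [0.3 * j - 1.0 for j in range(8)]

serial_result = matvec_serial(matrix, vec)
parallel_result = matvec_parallel(matrix, vec)
```

Splitting by output row avoids combining partial sums from different workers, which is the step that would otherwise change the rounding order and break exact equality.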