# runanywhereai/runanywhere-sdks

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/runanywhereai-runanywhere-sdks).**

8,781 stars · 265 forks · C++ · other

## Links

- GitHub: https://github.com/RunanywhereAI/runanywhere-sdks
- Homepage: https://www.runanywhere.ai
- awesome-repositories: https://awesome-repositories.com/repository/runanywhereai-runanywhere-sdks.md

## Topics

`android` `apple-intelligence` `cpp` `diffusion-models` `edge` `flutter` `inference` `ios` `kotlin` `llamacpp` `llm` `multimodal` `ollama` `on-device-ai` `react-native` `swift` `vlm` `voice-ai` `web` `websdk`

## Description

This project is an on-device AI SDK providing a framework for running large language models, vision models, and speech models locally. It serves as an orchestration layer for local LLM execution. Because inference runs on the device itself, data stays private and the models remain available offline, while hardware acceleration is used to speed up execution.

The SDK is distinguished by its voice and multimodal capabilities, including a coordinated voice pipeline for voice activity detection, speech-to-text, and text-to-speech synthesis. It also provides a dedicated kit for local retrieval-augmented generation and tools for processing combined image and text inputs via vision-language models.
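
The local retrieval-augmented generation flow mentioned above (chunk documents, embed them, retrieve the closest chunks, ground the prompt) can be illustrated with a minimal sketch. This is not the SDK's API: every function name here is hypothetical, and the word-count "embedding" is a stand-in assumption for a real on-device embedding model.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble a grounded prompt that a local LLM could answer from."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "The battery lasts ten hours on a full charge.",
    "The case is made of recycled aluminium.",
]
top = retrieve("How long does the battery last?", docs, k=1)
prompt = build_prompt("How long does the battery last?", top)
```

In a real pipeline the `embed` step would call an embedding model and the ranked chunks would come from a vector index, but the data flow stays the same.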

The broader capability surface covers model lifecycle management, including downloading, caching, and the dynamic swapping of fine-tuned adapters. It includes support for structured output generation, tool calling for external function integration, and hardware-accelerated image generation.
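
As a rough illustration of the download-and-cache part of model lifecycle management, the sketch below fetches a model file only when it is not already in a local cache directory. It is an assumption-laden example rather than the SDK's implementation: `ensure_model`, the `.bin` naming, and the injected `fetch` callable are all hypothetical.

```python
from pathlib import Path
from typing import Callable


def ensure_model(model_id: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return the cached path for model_id, calling fetch only on a cache miss.

    model_id is assumed to be a plain file-name-safe identifier (no slashes).
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{model_id}.bin"
    if target.exists():
        return target  # cache hit: no network request needed

    # Write to a temporary name first so an interrupted download
    # never leaves a partial file under the final name.
    partial = target.with_name(target.name + ".part")
    partial.write_bytes(fetch(model_id))
    partial.replace(target)
    return target
```

Passing the downloader in as `fetch` keeps the caching logic independent of any particular network library and makes it easy to exercise with a fake.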

The system also incorporates performance monitoring for inference metrics and capture tools for camera and microphone input.

## Tags

### Artificial Intelligence & ML

- [Local Model Execution](https://awesome-repositories.com/f/artificial-intelligence-ml/local-model-execution.md) — Enables the execution of large language and vision models locally on hardware for privacy and offline availability. ([source](https://docs.runanywhere.ai/kotlin/installation.md))
- [On-Device Inference Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-inference-engines.md) — Implements a high-performance runtime for executing large language and vision models locally using hardware acceleration.
- [Voice Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-systems-frameworks/conversational-voice-interaction/voice-agents/voice-activity-detection/voice-pipelines.md) — Implements a coordinated offline pipeline for voice activity detection, speech-to-text, and text-to-speech synthesis.
- [Voice Interaction Management](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-systems-frameworks/conversational-voice-interaction/voice-agents/voice-activity-detection/wake-word-detection/voice-interaction-management.md) — Maintains a continuous voice session with automatic activity detection and event-based callbacks. ([source](https://docs.runanywhere.ai/react-native/voice-agent.md))
- [AI Integration Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-integration-tools.md) — Connects local models to external functions and custom code via structured tool calling and argument parsing.
- [Speech-to-Text Translation](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/end-to-end-pipelines/speech-to-text-translation.md) — Converts mono PCM audio into text using specialized speech-to-text models. ([source](https://docs.runanywhere.ai/kotlin/quick-start.md))
- [Conversational Voice AI](https://awesome-repositories.com/f/artificial-intelligence-ml/conversational-voice-ai.md) — Provides a coordinated pipeline for building hands-free assistants using voice activity detection, speech-to-text, and text-to-speech.
- [Conversational Voice Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/conversational-voice-pipelines.md) — Coordinates voice activity detection, speech-to-text, and text-to-speech into a continuous conversational loop.
- [GPU Acceleration](https://awesome-repositories.com/f/artificial-intelligence-ml/gpu-acceleration.md) — Utilizes hardware-specific GPU drivers to increase the processing speed of local AI models. ([source](https://docs.runanywhere.ai/sdks.md))
- [Hardware-Accelerated Inference](https://awesome-repositories.com/f/artificial-intelligence-ml/hardware-accelerated-inference.md) — Provides mechanisms to verify if the inference engine is utilizing hardware-accelerated GPU execution. ([source](https://docs.runanywhere.ai/web/quick-start.md))
- [Local Model Orchestrators](https://awesome-repositories.com/f/artificial-intelligence-ml/local-model-orchestrators.md) — Orchestrates the lifecycle, hardware acceleration, and streaming execution of local machine learning models.
- [Local RAG Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/local-rag-implementations.md) — Implements retrieval-augmented generation using private local datasets and offline LLMs for grounded answers.
- [Local Model Lifecycle Managers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-management/local-model-lifecycle-managers.md) — Provides tools for downloading, caching, and loading model assets to optimize local device storage and RAM.
- [Multimodal Vision Inputs](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/multimodal-processing-tools/multimodal-vision-inputs.md) — Provides tools to process and interpret combined image and text inputs for visual analysis and descriptions.
- [Natural Language Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-generation.md) — Produces human-like natural language responses from prompts using configurable temperature and token limits. ([source](https://docs.runanywhere.ai/kotlin/quick-start.md))
- [On-Device Models](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-models.md) — Ships a comprehensive SDK for running large language, vision, and speech models locally on edge hardware.
- [Real-Time Audio Transcribers](https://awesome-repositories.com/f/artificial-intelligence-ml/real-time-audio-transcribers.md) — Transcribes live audio input and emits partial results for immediate user feedback. ([source](https://docs.runanywhere.ai/react-native/stt/stream.md))
- [Incremental Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/speech-synthesis/incremental-synthesis.md) — Produces audio chunks incrementally so playback begins before the entire synthesis task completes. ([source](https://docs.runanywhere.ai/kotlin/tts/stream.md))
- [Local Speech Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-speech/local-speech-synthesis.md) — Converts text into natural-sounding audio using neural models running locally on-device. ([source](https://cdn.jsdelivr.net/gh/runanywhereai/runanywhere-sdks@main/README.md))
- [Tool Calling](https://awesome-repositories.com/f/artificial-intelligence-ml/tool-calling.md) — Connects on-device models to external functions via typed tool definitions and automated execution. ([source](https://docs.runanywhere.ai/web/tool-calling.md))
- [Vector Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings.md) — Provides vector embeddings of text to enable semantic search and memory operations directly on-device. ([source](https://cdn.jsdelivr.net/gh/runanywhereai/runanywhere-sdks@main/README.md))
- [Vector Retrieval Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-retrieval-systems.md) — Implements local document chunking and similarity search via embedding models to provide grounded AI responses.
- [Vision-Language Inference](https://awesome-repositories.com/f/artificial-intelligence-ml/vision-language-inference.md) — Executes multimodal models that process combined image and text inputs to generate analytical descriptions. ([source](https://docs.runanywhere.ai/kotlin/introduction.md))
- [Voice Activity Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-activity-detection.md) — Identifies when a user starts and stops speaking using configurable sensitivity thresholds. ([source](https://docs.runanywhere.ai/kotlin/configuration.md))
- [Voice Conversational Loops](https://awesome-repositories.com/f/artificial-intelligence-ml/voice-conversational-loops.md) — Coordinates a single cycle of audio-to-text, AI reasoning, and text-to-audio synthesis. ([source](https://docs.runanywhere.ai/react-native/voice-agent.md))
- [Word-Level Timestamps](https://awesome-repositories.com/f/artificial-intelligence-ml/audio-transcription/word-level-timestamps.md) — Tracks start and end times for individual words to synchronize text and audio playback. ([source](https://docs.runanywhere.ai/react-native/stt/options.md))
- [Retrieval-Augmented Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/conversational-interfaces/retrieval-augmented-generation.md) — Implements a retrieval-augmented generation pipeline that grounds AI responses using locally indexed documents. ([source](https://docs.runanywhere.ai/kotlin/introduction))
- [Argument Parsing](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/decoding-generation-controls/tool-calling/argument-repairers/argument-parsing.md) — Extracts tool names and arguments from raw model text for manual execution logic. ([source](https://docs.runanywhere.ai/react-native/tool-calling.md))
- [Diffusion Model Managers](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-models/diffusion-model-managers.md) — Manages the registration and downloading of diffusion model packages for local image generation. ([source](https://docs.runanywhere.ai/swift/diffusion.md))
- [Inference Telemetry](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/generative-ai/generative-text-inference/inference-telemetry.md) — Tracks tokens used and generation speed to monitor local model performance and efficiency. ([source](https://docs.runanywhere.ai/kotlin/llm/stream.md))
- [Image Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/image-generation.md) — Creates images from text prompts using hardware-accelerated diffusion models on-device. ([source](https://docs.runanywhere.ai/swift/diffusion.md))
- [Model Preloading Endpoints](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-deployment-and-serving/deployment-pipelines-and-endpoints/serving-endpoints/model-preloading-endpoints.md) — Loads specific models into memory during idle time to reduce latency when starting a user task. ([source](https://docs.runanywhere.ai/flutter/best-practices.md))
- [Incremental Inference Streaming](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-inference-serving/model-integration-pipelines/model-inference/inference-result-processors/incremental-inference-streaming.md) — Streams model outputs token-by-token using asynchronous iterators to reduce perceived user latency. ([source](https://docs.runanywhere.ai/kotlin/quick-start.md))
- [Model Adapters](https://awesome-repositories.com/f/artificial-intelligence-ml/model-adapters.md) — Injects lightweight LoRA adapters into base models at runtime to change behavior without reloading full weights.
- [Behavioral Constraints](https://awesome-repositories.com/f/artificial-intelligence-ml/model-behavioral-analysis/prompt-engineering-workflows/behavioral-constraints.md) — Defines personas, operational constraints, and response formats for models through system prompts. ([source](https://docs.runanywhere.ai/react-native/llm/system-prompts.md))
- [Model Persistence](https://awesome-repositories.com/f/artificial-intelligence-ml/model-training/model-persistence.md) — Caches downloaded models in persistent storage to prevent redundant network requests. ([source](https://docs.runanywhere.ai/web/best-practices.md))
- [LoRA Adapter Loaders](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-management/lora-adapter-loaders.md) — Applies lightweight LoRA adapters to base models at runtime to modify behavior without full reloading. ([source](https://docs.runanywhere.ai/kotlin/lora.md))
- [Structured Output Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-code-generators/structured-generation-engines/structured-output-generators.md) — Constrains model responses to specific JSON schemas to ensure machine-readable and type-safe outputs. ([source](https://cdn.jsdelivr.net/gh/runanywhereai/runanywhere-sdks@main/README.md))
- [Speaker Diarization](https://awesome-repositories.com/f/artificial-intelligence-ml/speaker-diarization.md) — Detects and labels different speakers within a single recording to distinguish between voices. ([source](https://docs.runanywhere.ai/react-native/stt/options.md))
- [Concurrent Model Execution](https://awesome-repositories.com/f/artificial-intelligence-ml/stateful-model-execution/concurrent-model-execution.md) — Supports the simultaneous execution of multiple models in memory to facilitate complex processing pipelines. ([source](https://docs.runanywhere.ai/web/best-practices.md))
- [Text Generation Controls](https://awesome-repositories.com/f/artificial-intelligence-ml/text-generation-controls.md) — Provides controls for adjusting creativity, token limits, and streaming behavior during text generation. ([source](https://docs.runanywhere.ai/kotlin/configuration.md))
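
Several entries above describe a single voice turn: detect speech, transcribe it, generate a reply, and synthesize audio. The following sketch shows one way such a turn could be wired together. It is illustrative only: the energy-threshold detector is a simplified assumption, and `transcribe`, `respond`, and `synthesize` are hypothetical callables standing in for speech-to-text, LLM, and text-to-speech models.

```python
from typing import Callable, Iterable, List, Optional, Sequence

Frame = Sequence[int]  # one frame of mono PCM samples


def is_speech(frame: Frame, threshold: float = 500.0) -> bool:
    """Simple energy-based detector: RMS amplitude at or above threshold."""
    if not frame:
        return False
    rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
    return rms >= threshold


def collect_utterance(frames: Iterable[Frame], threshold: float = 500.0) -> List[int]:
    """Gather samples from the first run of speech frames, stopping at the next silence."""
    samples: List[int] = []
    started = False
    for frame in frames:
        if is_speech(frame, threshold):
            started = True
            samples.extend(frame)
        elif started:
            break  # silence after speech ends the utterance
    return samples


def voice_turn(
    frames: Iterable[Frame],
    transcribe: Callable[[List[int]], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    threshold: float = 500.0,
) -> Optional[bytes]:
    """Run one audio -> text -> reply -> audio cycle; return None if nothing was spoken."""
    audio = collect_utterance(frames, threshold)
    if not audio:
        return None
    return synthesize(respond(transcribe(audio)))
```

The `threshold` parameter plays the role of the configurable sensitivity mentioned in the voice activity detection entry; production detectors are usually model-based rather than purely energy-based.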

### Data & Databases

- [Local Model Loading](https://awesome-repositories.com/f/data-databases/local-model-loading.md) — Handles the downloading of model files from remote URLs and loading them into device memory. ([source](https://docs.runanywhere.ai/flutter/quick-start.md))
- [Document Ingestion Pipelines](https://awesome-repositories.com/f/data-databases/document-ingestion-pipelines.md) — Provides a pipeline for chunking, embedding, and indexing raw documents to facilitate local vector search. ([source](https://docs.runanywhere.ai/kotlin/rag.md))
- [Schema-Constrained Sampling](https://awesome-repositories.com/f/data-databases/json-schema-modeling/schema-validators/schema-constrained-sampling.md) — Enforces structured JSON or XML output formats by constraining token sampling during the model generation process.

### Graphics & Multimedia

- [Multimodal Analysis Engines](https://awesome-repositories.com/f/graphics-multimedia/media-processing-analysis/media-manipulation/media-processing-workflows/generative-visual-engines/multimodal-analysis-engines.md) — Implements a multimodal analysis engine to process visual content and text prompts together for image description. ([source](https://docs.runanywhere.ai/web/vlm.md))
- [Raw Audio Captures](https://awesome-repositories.com/f/graphics-multimedia/audio-music/audio-capture-and-playback/raw-audio-captures.md) — Records raw audio at specified sample rates, providing audio chunks and volume levels for AI processing. ([source](https://docs.runanywhere.ai/web/configuration.md))

### Part of an Awesome List

- [Model Memory Reclamation](https://awesome-repositories.com/f/awesome-lists/ai/memory-and-caching/model-memory-reclamation.md) — Removes unused large language or speech models from memory to reclaim system resources. ([source](https://docs.runanywhere.ai/flutter/best-practices.md))
- [Model Serving & Deployment](https://awesome-repositories.com/f/awesome-lists/ai/model-serving-deployment.md) — Runs AI models on-device for mobile platforms.

### Mobile Development

- [Camera Feed Capture](https://awesome-repositories.com/f/mobile-development/mobile-capabilities/camera-integration/camera-feed-capture.md) — Accesses the device camera to provide live RGB frames for analysis by on-device vision models. ([source](https://docs.runanywhere.ai/web/configuration.md))

### Networking & Communication

- [Token Streaming](https://awesome-repositories.com/f/networking-communication/real-time-event-streams/token-streaming.md) — Delivers AI model generated tokens to the user interface in real-time via asynchronous streams.

### System Administration & Monitoring

- [Event Monitoring Streams](https://awesome-repositories.com/f/system-administration-monitoring/event-monitoring-streams.md) — Provides an event stream to track AI-specific lifecycle events, including generation status and model loading. ([source](https://docs.runanywhere.ai/swift/best-practices.md))
- [Inference Performance Monitoring](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/inference-performance-monitoring.md) — Tracks inference-specific metrics such as tokens per second, latency, and time to first token. ([source](https://docs.runanywhere.ai/kotlin/best-practices.md))
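
The inference metrics named above (tokens per second, latency, time to first token) can be derived by timing a token stream. The sketch below is a generic illustration and not the SDK's telemetry API; `StreamMetrics` and `measure_stream` are hypothetical names, and the injectable `clock` exists only to make the example testable.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class StreamMetrics:
    token_count: int = 0
    time_to_first_token: Optional[float] = None  # seconds from start to first token
    total_time: float = 0.0                      # seconds from start to latest token

    @property
    def tokens_per_second(self) -> float:
        """Average throughput over the whole stream, including the wait for the first token."""
        return self.token_count / self.total_time if self.total_time > 0 else 0.0


def measure_stream(
    tokens: Iterable[str],
    metrics: StreamMetrics,
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[str]:
    """Pass tokens through unchanged while recording timing into metrics."""
    start = clock()
    for token in tokens:
        now = clock()
        if metrics.time_to_first_token is None:
            metrics.time_to_first_token = now - start
        metrics.token_count += 1
        metrics.total_time = now - start
        yield token
```

Wrapping the stream in a generator keeps measurement out of the UI code: the caller iterates as usual and reads `metrics` once the stream is exhausted.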