What are the best open-source alternatives to Mistral.rs?

30 open-source projects similar to ericlbuehler/mistral.rs, ranked by shared features. Top picks: abetlen/llama-cpp-python, sgl-project/sglang, modeltc/lightllm, langroid/langroid, openvinotoolkit/openvino, livekit/agents, openbmb/minicpm, opengvlab/internvl, cloudwego/eino, coaidev/coai.

Is abetlen/llama-cpp-python a good alternative to Mistral.rs?

llama-cpp-python provides a Python interface for the llama.cpp library, enabling the execution of large language models with hardware acceleration. It functions as a GGUF model loader and a structured text generator capable of running inference servers and multimodal runtimes for processing both te…

Is sgl-project/sglang a good alternative to Mistral.rs?

Sglang is a high-performance inference engine and serving system designed for large language and multimodal models. It provides a programmable interface for orchestrating complex generation workflows, enabling developers to coordinate multi-turn dialogues, tool invocations, and reasoning chains thr…

Is modeltc/lightllm a good alternative to Mistral.rs?

LightLLM is a high-performance serving framework for deploying and executing large language models. It functions as a multi-GPU inference engine and server capable of handling dense architectures, mixture-of-experts designs, and multimodal models that process both text and images. The system is di…

Is langroid/langroid a good alternative to Mistral.rs?

Langroid is a multi-agent orchestration framework and tool integration suite designed for building complex AI applications. It serves as a multi-modal integration layer that connects diverse local and remote language models with an agentic retrieval-augmented generation system. The project disting…

Is openvinotoolkit/openvino a good alternative to Mistral.rs?

OpenVINO is an AI inference engine and model serving platform designed to execute optimized deep learning models across CPUs, GPUs, and NPUs through a unified API. It includes a model optimization toolkit for converting, quantizing, and compressing models from various frameworks, alongside a specia…

Is livekit/agents a good alternative to Mistral.rs?

This project is a framework for developing multimodal AI agents that function as programmable participants in real-time communication rooms. It enables the construction of agents that can see, hear, and speak by integrating speech-to-text, large language models, and text-to-speech pipelines to faci…

Is openbmb/minicpm a good alternative to Mistral.rs?

MiniCPM is a collection of small language models designed for local, on-device deployment in resource-constrained environments. The project focuses on running dense Transformer models on consumer hardware, including GPUs, CPUs, and Apple Silicon, without requiring custom code forks. The project di…

Is opengvlab/internvl a good alternative to Mistral.rs?

InternVL is a vision-language model framework that fuses a visual encoder with a large language model to translate image features into textual tokens for reasoning. It provides a system for multimodal inference and dialogue, enabling the processing of images and text to answer questions or generate…

Is cloudwego/eino a good alternative to Mistral.rs?

Eino is an AI agent development kit and LLM application framework designed for building autonomous agents and orchestrating complex language model workflows. It serves as a multi-agent orchestration engine and workflow orchestrator, providing a graph-based execution model to route data between mode…

Is coaidev/coai a good alternative to Mistral.rs?

CoAI is an enterprise-grade, self-hostable AI gateway platform that unifies access to over 200 AI models from more than 35 providers through a single OpenAI-compatible API endpoint. It functions as a multi-tenant gateway, routing requests across providers with load balancing, automatic failover, an…

Back to ericlbuehler/mistral.rs

Open-source alternatives to Mistral.rs

30 open-source projects similar to ericlbuehler/mistral.rs, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Mistral.rs alternative.

abetlen/llama-cpp-python
abetlen/llama-cpp-python
9,993View on GitHub
llama-cpp-python provides a Python interface for the llama.cpp library, enabling the execution of large language models with hardware acceleration. It functions as a GGUF model loader and a structured text generator capable of running inference servers and multimodal runtimes for processing both text and image inputs. The project distinguishes itself through a local inference server that exposes model capabilities via an OpenAI-compatible web API. It supports advanced execution techniques including speculative decoding, weight quantization, and layer-based GPU offloading to manage memory acro
Python
View on GitHub9,993
sgl-project/sglang
sgl-project/sglang
29,079View on GitHub
Sglang is a high-performance inference engine and serving system designed for large language and multimodal models. It provides a programmable interface for orchestrating complex generation workflows, enabling developers to coordinate multi-turn dialogues, tool invocations, and reasoning chains through a domain-specific language. The platform is built to support production-scale deployments, offering an OpenAI-compatible API that allows for integration with existing application ecosystems. The system distinguishes itself through a disaggregated architecture that separates compute-intensive pr
Pythonattentionblackwellcuda
View on GitHub29,079
modeltc/lightllm
ModelTC/LightLLM
3,901View on GitHub
LightLLM is a high-performance serving framework for deploying and executing large language models. It functions as a multi-GPU inference engine and server capable of handling dense architectures, mixture-of-experts designs, and multimodal models that process both text and images. The system is distinguished by its specialized support for Mixture-of-Experts models using expert parallelism and fused kernels. It implements structured text generation through deterministic state machines and pushdown automata to enforce precise output formats. To optimize throughput, the framework employs specula
Pythondeep-learninggptllama
View on GitHub3,901

Open-source alternatives to Mistral.rs

abetlen/llama-cpp-python

sgl-project/sglang

ModelTC/LightLLM

langroid/langroid

openvinotoolkit/openvino

livekit/agents

OpenBMB/MiniCPM

OpenGVLab/InternVL

cloudwego/eino

coaidev/coai

xiangsx/gpt4free-ts

OpenNMT/CTranslate2

kserve/kserve

meta-llama/llama-models

kubeflow/kfserving

google-ai-edge/LiteRT-LM

zai-org/GLM-4.5

zhaochenyang20/Awesome-ML-SYS-Tutorial

openai/openai-agents-python

camel-ai/camel

j3ssie/Osmedeus

google-ai-edge/LiteRT

josStorer/RWKV-Runner

OpenLMLab/MOSS

anthropics/anthropic-cookbook

deepseek-ai/DeepSeek-Coder-V2

fauxpilot/fauxpilot

crmne/ruby_llm

ollama/ollama-python

argmaxinc/WhisperKit