What are the best open-source alternatives to Mistral Src?

30 open-source projects similar to mistralai/mistral-src, ranked by shared features. Top picks: facebookresearch/llama, pytorch-labs/gpt-fast, intel/ipex-llm, ModelTC/LightLLM, mistralai/mistral-inference, turboderp-org/exllamav2, openvinotoolkit/openvino, OpenNMT/CTranslate2, NVIDIA/NeMo, QwenLM/Qwen-7B.

Is facebookresearch/llama a good alternative to Mistral Src?

Llama is a large language model runtime and inference engine designed to load and execute autoregressive transformer models. It enables the generation of natural language text completions from prompts using pretrained weights. The system features multi-GPU model parallelism, which distributes mode…

Is pytorch-labs/gpt-fast a good alternative to Mistral Src?

gpt-fast is a PyTorch transformer inference engine designed for low-latency text generation. It functions as a distributed GPU inference library, a quantized model runner, and a speculative decoding framework. The system utilizes a speculative decoding workflow where a small draft model predicts t…

Is intel/ipex-llm a good alternative to Mistral Src?

Intel XPU LLM Acceleration Library is a toolkit designed to accelerate large language model inference and finetuning on Intel CPUs, GPUs, and NPUs. It provides a distributed inference engine for scaling models across multiple accelerators, a multimodal model runtime for vision and speech tasks, and…

Is ModelTC/LightLLM a good alternative to Mistral Src?

LightLLM is a high-performance serving framework for deploying and executing large language models. It functions as a multi-GPU inference engine and server capable of handling dense architectures, mixture-of-experts designs, and multimodal models that process both text and images. The system is di…

Is mistralai/mistral-inference a good alternative to Mistral Src?

Mistral Inference is a library for running Mistral large language models on a GPU, generating text from prompts with token streaming. It loads pretrained model weights from local disk or a remote registry into GPU memory, then produces output tokens one by one for real-time display in interactive a…

Is turboderp-org/exllamav2 a good alternative to Mistral Src?

exllamav2 is a high-performance inference engine and framework for executing large language models locally on consumer-class GPUs. It provides a complete system for local model deployment, including a specialized inference engine and tools for model quantization. The project features a multi-GPU i…

Is openvinotoolkit/openvino a good alternative to Mistral Src?

OpenVINO is an AI inference engine and model serving platform designed to execute optimized deep learning models across CPUs, GPUs, and NPUs through a unified API. It includes a model optimization toolkit for converting, quantizing, and compressing models from various frameworks, alongside a specia…

Is OpenNMT/CTranslate2 a good alternative to Mistral Src?

CTranslate2 is a C++ inference engine and runtime for Transformer models, designed to execute models on both CPU and GPU with optimizations for speed and memory efficiency. It functions as a model format converter, quantization tool, and REST API server, enabling deployment of neural machine transl…

Is NVIDIA/NeMo a good alternative to Mistral Src?

NeMo is a multimodal AI framework and toolkit designed for the development, training, and scaling of large language models, generative AI systems, and speech-based models. It functions as an automatic speech recognition toolkit, a text-to-speech engine, and a framework for building models that proc…

Is QwenLM/Qwen-7B a good alternative to Mistral Src?

Qwen-7B is a pretrained causal language model designed for natural language generation, text processing, and complex reasoning tasks. It is available as an instruction-tuned model optimized for conversational interactions and a tool-use model capable of executing function calls and interacting with…


Open-source alternatives to Mistral Src

30 open-source projects similar to mistralai/mistral-src, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Mistral Src alternative.

facebookresearch/llama
59,466 stars · Python
Llama is a large language model runtime and inference engine designed to load and execute autoregressive transformer models. It enables the generation of natural language text completions from prompts using pretrained weights. The system features multi-GPU model parallelism, which distributes model weights and workloads across multiple graphics processors to support larger parameter counts. It also incorporates a content safety filter that uses classifiers to intercept and block unsafe inputs or outputs during the inference process. The project covers broad capabilities in distributed model…

pytorch-labs/gpt-fast
6,225 stars · Python
gpt-fast is a PyTorch transformer inference engine designed for low-latency text generation. It functions as a distributed GPU inference library, a quantized model runner, and a speculative decoding framework. The system utilizes a speculative decoding workflow where a small draft model predicts token sequences for verification by a larger model to accelerate generation. It supports quantized model execution to reduce memory footprint and implements tensor parallelism to split computations across multiple GPUs. The project includes a standardized evaluation harness to measure the accuracy an…

intel/ipex-llm
8,836 stars · Python
Intel XPU LLM Acceleration Library is a toolkit designed to accelerate large language model inference and finetuning on Intel CPUs, GPUs, and NPUs. It provides a distributed inference engine for scaling models across multiple accelerators, a multimodal model runtime for vision and speech tasks, and a low-bit model quantization tool for converting weights into INT4, FP8, and GGUF formats. The project features a parameter-efficient finetuning framework that enables model adaptation using QLoRA and DPO on Intel hardware. It distinguishes itself by providing specialized optimizations for Intel XP…

Open-source alternatives to Mistral Src

facebookresearch/llama

pytorch-labs/gpt-fast

intel/ipex-llm

ModelTC/LightLLM

mistralai/mistral-inference

turboderp-org/exllamav2

openvinotoolkit/openvino

OpenNMT/CTranslate2

NVIDIA/NeMo

QwenLM/Qwen-7B

intel-analytics/BigDL

OpenAccess-AI-Collective/axolotl

microsoft/DeepSpeed

QwenLM/Qwen

facebookresearch/metaseq

Facico/Chinese-Vicuna

zai-org/GLM-4

b4rtaz/distributed-llama

google-ai-edge/LiteRT-LM

turboderp/exllamav2

NVIDIA/FasterTransformer

sgl-project/sglang

google-ai-edge/gallery

microsoft/unilm

mlabonne/llm-course

meta-llama/llama3

NVIDIA/tacotron2

elder-plinius/OBLITERATUS

THUDM/CogVLM

Tiiny-AI/PowerInfer