What are the best open-source alternatives to gpt-fast?

30 open-source projects similar to pytorch-labs/gpt-fast, ranked by shared features. Top picks: meta-pytorch/gpt-fast, mistralai/mistral-src, intel/ipex-llm, OpenNMT/CTranslate2, NVIDIA/FasterTransformer, facebookresearch/metaseq, OpenBMB/MiniCPM, intel-analytics/BigDL, microsoft/DeepSpeed, skyzh/tiny-llm.

Is meta-pytorch/gpt-fast a good alternative to gpt-fast?

gpt-fast is a PyTorch transformer inference engine designed for text generation using a native tensor library implementation. It provides a runtime for executing large language models without the need for external C++ extensions. The project implements speculative decoding to accelerate generation…

Is mistralai/mistral-src a good alternative to gpt-fast?

This project is a large language model inference library and framework designed to run models for text generation, problem solving, and coding assistance. It includes a multimodal framework for processing combined image and text inputs and a tool-use implementation that enables the execution of ext…

Is intel/ipex-llm a good alternative to gpt-fast?

Intel XPU LLM Acceleration Library is a toolkit designed to accelerate large language model inference and finetuning on Intel CPUs, GPUs, and NPUs. It provides a distributed inference engine for scaling models across multiple accelerators, a multimodal model runtime for vision and speech tasks, and…

Is OpenNMT/CTranslate2 a good alternative to gpt-fast?

CTranslate2 is a C++ inference engine and runtime for Transformer models, designed to execute models on both CPU and GPU with optimizations for speed and memory efficiency. It functions as a model format converter, quantization tool, and REST API server, enabling deployment of neural machine transl…

Is NVIDIA/FasterTransformer a good alternative to gpt-fast?

FasterTransformer is a high-performance inference optimization library and distributed runtime designed to accelerate the execution of transformer models. It provides a toolkit for reducing model precision and parallelizing execution across multiple GPUs to increase throughput and reduce latency fo…

Is facebookresearch/metaseq a good alternative to gpt-fast?

Metaseq is a transformer sequence modeling toolkit designed for training, fine-tuning, and deploying sequence-to-sequence models using open pre-trained weights. It provides a comprehensive framework for large language model training, including dedicated tools for sequence dataset processing and a s…

Is OpenBMB/MiniCPM a good alternative to gpt-fast?

MiniCPM is a collection of small language models designed for local, on-device deployment in resource-constrained environments. The project focuses on running dense Transformer models on consumer hardware, including GPUs, CPUs, and Apple Silicon, without requiring custom code forks. The project di…

Is intel-analytics/BigDL a good alternative to gpt-fast?

BigDL is a PyTorch acceleration framework and distributed inference engine designed for large language models. It provides a toolkit for running models on Intel hardware, integrating quantization tools and libraries for parameter-efficient fine-tuning. The project distinguishes itself through the…

Is microsoft/DeepSpeed a good alternative to gpt-fast?

DeepSpeed is a distributed deep learning optimization library and framework designed for the training and inference of massive AI models. It serves as a model parallelism orchestrator and a toolkit for scaling large language models across multiple GPUs and compute nodes. The project distinguishes…

Is skyzh/tiny-llm a good alternative to gpt-fast?

tiny-llm is a large language model inference engine and transformer model implementation. It serves as a quantized model runtime and paged key-value cache manager, providing a specialized inference stack optimized for Apple Silicon. The system distinguishes itself through high-throughput execution…

Open-source alternatives to gpt-fast

30 open-source projects similar to pytorch-labs/gpt-fast, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best gpt-fast alternative.

meta-pytorch/gpt-fast
6,223 stars · Python
gpt-fast is a PyTorch transformer inference engine designed for text generation using a native tensor library implementation. It provides a runtime for executing large language models without the need for external C++ extensions. The project implements speculative decoding to accelerate generation by using a small draft model for token prediction and a larger model for verification. It further optimizes performance through a compiled prefill stage and a multi-GPU tensor parallelism library that shards linear layers across multiple graphics processing units. Memory efficiency is managed throu…

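The draft-and-verify loop described above can be illustrated with a minimal greedy sketch. The toy `target_model` and `draft_model` functions are hypothetical stand-ins for a large and a small language model, not part of gpt-fast's API, and the acceptance rule shown is the simple greedy variant. A real engine would score all drafted tokens in one batched forward pass of the large model instead of calling it once per token.

```python
from typing import Callable, List, Sequence

# A "model" here is just a function mapping a token prefix to the next token.
NextToken = Callable[[Sequence[int]], int]


def target_model(prefix: Sequence[int]) -> int:
    """Hypothetical stand-in for the large, slow model."""
    return (sum(prefix) + len(prefix)) % 5


def draft_model(prefix: Sequence[int]) -> int:
    """Hypothetical stand-in for the small, fast model.

    It agrees with the target except at every third position,
    so some drafted tokens get rejected during verification.
    """
    tok = target_model(prefix)
    return (tok + 1) % 5 if len(prefix) % 3 == 0 else tok


def greedy_decode(model: NextToken, prompt: Sequence[int], n_new: int) -> List[int]:
    """Plain autoregressive decoding: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: Sequence[int],
    n_new: int,
    k: int = 4,
) -> List[int]:
    """Greedy speculative decoding with a draft length of k tokens."""
    out = list(prompt)
    goal = len(out) + n_new
    while len(out) < goal:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target model verifies the proposal position by position.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
            accepted.append(tok)
        else:
            # Every drafted token matched: the target adds one bonus token.
            accepted.append(target(out + accepted))

        out.extend(accepted)
    return out[:goal]
```

Because every accepted token is either confirmed or supplied by the target model, `speculative_decode(target_model, draft_model, [1, 2], 10)` returns the same sequence as `greedy_decode(target_model, [1, 2], 10)`. Any speedup in practice comes from verifying several drafted tokens per large-model pass.
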
mistralai/mistral-src
10,821 stars · Jupyter Notebook
This project is a large language model inference library and framework designed to run models for text generation, problem solving, and coding assistance. It includes a multimodal framework for processing combined image and text inputs and a tool-use implementation that enables the execution of external functions based on model reasoning. The system features a distributed GPU inference engine that spreads large model workloads across multiple graphics processors to increase processing speed and meet memory requirements. It also provides containerized model deployment through pre-packaged imag…

intel/ipex-llm
8,836 stars · Python
Intel XPU LLM Acceleration Library is a toolkit designed to accelerate large language model inference and finetuning on Intel CPUs, GPUs, and NPUs. It provides a distributed inference engine for scaling models across multiple accelerators, a multimodal model runtime for vision and speech tasks, and a low-bit model quantization tool for converting weights into INT4, FP8, and GGUF formats. The project features a parameter-efficient finetuning framework that enables model adaptation using QLoRA and DPO on Intel hardware. It distinguishes itself by providing specialized optimizations for Intel XP…

Open-source alternatives to Gpt Fast

meta-pytorch/gpt-fast

mistralai/mistral-src

intel/ipex-llm

OpenNMT/CTranslate2

NVIDIA/FasterTransformer

facebookresearch/metaseq

OpenBMB/MiniCPM

intel-analytics/BigDL

microsoft/DeepSpeed

skyzh/tiny-llm

OpenNMT/OpenNMT-py

meta-llama/llama-models

LostRuins/koboldcpp

ggerganov/llama.cpp

timdettmers/bitsandbytes

OpenAccess-AI-Collective/axolotl

facebookresearch/llama

unslothai/unsloth

QwenLM/Qwen

hpcaitech/ColossalAI

rustformers/llm

intel-analytics/ipex-llm

ModelTC/LightLLM

ztxz16/fastllm

guillaumekln/faster-whisper

vllm-project/vllm-omni

openvinotoolkit/openvino

FMInference/FlexGen

PaddlePaddle/PaddleNLP

InternLM/lmdeploy