What are the best open-source alternatives to Alpaca.cpp?

30 open-source projects similar to antimatter15/alpaca.cpp, ranked by shared features. Top picks: ggerganov/llama.cpp, pytorch/executorch, facico/chinese-vicuna, tiiny-ai/powerinfer, ggerganov/whisper.cpp, apple/ml-fastvlm, openvinotoolkit/openvino, kvcache-ai/ktransformers, mlc-ai/mlc-llm, microsoft/bitnet.

Is ggerganov/llama.cpp a good alternative to Alpaca.cpp?

llama.cpp is a high-performance C++ inference engine and runtime for executing large language models locally across various hardware architectures. It provides the core components for local model execution, including a dedicated model quantizer for compressing weights into the GGUF format and a sys…

Is pytorch/executorch a good alternative to Alpaca.cpp?

ExecuTorch is a lightweight C++ runtime for deploying PyTorch models on mobile, embedded, and edge hardware. It provides an ahead-of-time compilation pipeline that exports, quantizes, and lowers model graphs into compact serialized programs, then executes them through a minimal runtime with hardwar…

Is facico/chinese-vicuna a good alternative to Alpaca.cpp?

Chinese-Vicuna is a Chinese large language model and instruction-following AI based on the LLaMA architecture. It is specifically designed for natural language understanding and generation in the Chinese language, utilizing an instruction-tuned model to follow complex user prompts across conversati…

Is tiiny-ai/powerinfer a good alternative to Alpaca.cpp?

PowerInfer is a high-performance local large language model inference engine and sparse inference framework. It provides a runtime for executing models on consumer-grade hardware, utilizing a GPU acceleration backend to optimize tensor operations for graphics processors. The system distinguishes i…

Is ggerganov/whisper.cpp a good alternative to Alpaca.cpp?

whisper.cpp is a C++ implementation of the Whisper speech-to-text model, serving as a lightweight machine learning inference engine and quantized runtime. It provides high-performance automatic speech recognition and real-time audio transcription without requiring a Python environment. The project…

Is apple/ml-fastvlm a good alternative to Alpaca.cpp?

This project is a vision language model framework and vision-to-text pipeline designed for deploying and optimizing models that process both images and text. It provides an on-device inference engine to run quantized models locally on mobile and desktop hardwar…

Is openvinotoolkit/openvino a good alternative to Alpaca.cpp?

OpenVINO is an AI inference engine and model serving platform designed to execute optimized deep learning models across CPUs, GPUs, and NPUs through a unified API. It includes a model optimization toolkit for converting, quantizing, and compressing models from various frameworks, alongside a specia…

Is kvcache-ai/ktransformers a good alternative to Alpaca.cpp?

KTransformers is a framework for running, fine-tuning, and serving large language models. It functions as a heterogeneous inference engine and quantized execution runtime, enabling the deployment of massive models by distributing computational workloads across both C…

Is mlc-ai/mlc-llm a good alternative to Alpaca.cpp?

MLC LLM is a machine learning compiler and inference engine designed to execute large language models locally across diverse hardware platforms, including desktop, mobile, and web environments. By utilizing machine learning compilation, the project transforms high-level model definitions into speci…

Is microsoft/bitnet a good alternative to Alpaca.cpp?

BitNet is a quantized inference engine designed to execute highly compressed language models by performing arithmetic on low-precision, bit-level weight data. It functions as a model optimization toolkit and a high-performance kernel library, enabling the execution of large language models on consu…

Open-source alternatives to Alpaca.cpp

30 open-source projects similar to antimatter15/alpaca.cpp, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Alpaca.cpp alternative.

ggerganov/llama.cpp
116,912 stars
llama.cpp is a high-performance C++ inference engine and runtime for executing large language models locally across various hardware architectures. It provides the core components for local model execution, including a dedicated model quantizer for compressing weights into the GGUF format and a system for generating text embeddings for semantic search. The project distinguishes itself through specialized memory and execution optimizations, such as block-wise weight quantization to reduce memory footprints and memory-mapped model loading. It supports structured text generation by using formal
Language: C++
pytorch/executorch
4,296 stars
ExecuTorch is a lightweight C++ runtime for deploying PyTorch models on mobile, embedded, and edge hardware. It provides an ahead-of-time compilation pipeline that exports, quantizes, and lowers model graphs into compact serialized programs, then executes them through a minimal runtime with hardware acceleration and on-device large language model inference capabilities. The project distinguishes itself through a hardware accelerator delegate system that partitions model subgraphs and offloads computation to specialized backends including NPUs, GPUs, and DSPs from Apple, Arm, Intel, MediaTek,
Language: Python. Topics: deep-learning, embedded, gpu
Facico/Chinese-Vicuna
4,121 stars
Chinese-Vicuna is a Chinese large language model and instruction-following AI based on the LLaMA architecture. It is specifically designed for natural language understanding and generation in the Chinese language, utilizing an instruction-tuned model to follow complex user prompts across conversations. The project provides a LoRA fine-tuning framework and quantization systems to enable model adaptation and inference on consumer hardware. It implements quantized inference to reduce memory usage on both CPUs and GPUs, supported by a low-level C++ implementation to minimize system resource requi
Language: C. Topics: alpaca, chinese, llama

Open-source alternatives to Alpaca.cpp

ggerganov/llama.cpp

pytorch/executorch

Facico/Chinese-Vicuna

Tiiny-AI/PowerInfer

ggerganov/whisper.cpp

apple/ml-fastvlm

openvinotoolkit/openvino

kvcache-ai/ktransformers

mlc-ai/mlc-llm

microsoft/BitNet

soniqo/speech-swift

timdettmers/bitsandbytes

ymcui/Chinese-LLaMA-Alpaca

apple/ml-stable-diffusion

LostRuins/koboldcpp

facebookresearch/metaseq

QwenLM/Qwen-7B

TingsongYu/PyTorch-Tutorial-2nd

facebookresearch/codellama

meta-pytorch/gpt-fast

GeeeekExplorer/nano-vllm

llmware-ai/llmware

nomic-ai/gpt4all-ui

RunanywhereAI/runanywhere-sdks

vikhyat/moondream

vllm-project/llm-compressor

google-ai-edge/LiteRT-LM

menloresearch/jan

meta-llama/llama-models

Lightning-AI/litgpt