30 open-source projects similar to alexsjones/llmfit, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Llmfit alternative.
An open-source LLM router that optimizes your agent for cost and performance with every run.
Local proxy that compresses your LLM API requests so you pay less, with no change to the answers. Trims wasted tokens from prompts, history, tool output, and code before they're sent: -31% input / -74% output, measured live. Any provider, no extra model calls. Also an MCP server and embeddable library (Rust, Python, Ruby, Kotlin, Swift).
This project is a terminal-based command line interface client and agent orchestrator for interacting with multiple large language model providers. It functions as an OpenAI API client and a local API gateway that exposes chat completions and embeddings through an HTTP server. The system distinguishes itself by providing a retrieval-augmented generation tool for indexing local files and URLs into a vector database to provide custom document context. It allows for the creation of specialized AI agents that combine custom system prompts with tool calling and external function execution. The to
DeepSeek-TUI is an AI coding agent orchestrator and framework designed to automate complex programming tasks. It functions as a harness for coordinating AI models that can read source code, edit files, and execute shell commands through automated agent workflows. The system is distinguished by its multi-agent coordination capabilities, which allow for the spawning of parallel sub-agents to handle concurrent investigations or implementation slices. It employs autonomous goal-seeking loops to pursue objectives across multiple turns and utilizes a tool integration gateway to connect models to ex
Screenpipe is a local-first platform designed to record, index, and analyze desktop activity. By capturing screen, audio, and keyboard input, it creates a comprehensive and searchable history of computer usage. The system functions as an activity recorder and automation framework, providing a persistent, context-aware memory that allows artificial intelligence agents to observe and interact with local desktop environments. The platform distinguishes itself through a privacy-focused architecture that processes all data locally. It utilizes on-device computer vision and speech recognition to tr
Open-source AI agent harness in native Rust — GUI, CLI, headless, and webapp from one binary. Multi-provider, MCP, skills, plugins, agent teams.
A lightweight desktop app to manage, sync, and organize AI agent skills across 15+ coding tools — Cursor, Claude Code, Codex, Copilot, and more.
OpenHuman is an AI application framework for building private intelligence systems and personal AI layers. It provides a system for deploying private AI assistants that execute technical tasks and manage personal knowledge bases. The project features a model-agnostic request proxy that routes AI workloads to different large language models based on requirements for reasoning, speed, or vision. It integrates an OAuth-driven data integrator to synchronize personal information from external services into a local knowledge base composed of hierarchical Markdown summaries. The framework also inclu
This project is an automated technical writing tool that functions as a documentation-as-code framework. It parses source code and configuration files to generate structured instructional manuals and operational guides, ensuring that technical documentation remains synchronized with software updates through version control systems. The system utilizes large language model orchestration and static analysis to interpret codebase metadata and system definitions. By applying template-driven logic and context-aware prompt engineering, it transforms raw technical data into consistent, human-readabl
Let AI agents message, watch, and spawn each other across terminals. Claude Code, Codex, Antigravity CLI, Cursor CLI, OpenCode, Kilo, Pi, Kimi
Like htop, but for AI coding agents. Monitor Claude Code & Codex CLI sessions, tokens, context window, rate limits, and ports in real-time.
Ultra-fast token usage and cost tracker for LLM tools (e.g. Claude Code).
A simple TUI for serving local LLMs. Pick a model, pick a backend, serve it.
LARK is a development toolkit for training, fine-tuning, and deploying large language models and multimodal models based on PaddlePaddle. It functions as a comprehensive framework that includes an LLM training orchestrator, an inference server, and a multimodal model framework for processing text, image, and video inputs. The project features a retrieval-augmented generation system for building conversational applications that integrate web search and private knowledge bases. It provides specific capabilities for multimodal reasoning and complex logic, enabling the extraction of structured da
This is a collection of Jupyter notebooks that serve as educational guides for training, fine-tuning, and deploying machine learning models within the Hugging Face ecosystem. The notebooks cover the full lifecycle of model development, from loading and configuring pre-trained transformers to packaging trained models for real-time inference via scalable endpoints. The notebooks demonstrate a range of capabilities including diffusion model training and fine-tuning for image generation and editing, transformer model adaptation for natural language processing tasks, and parameter-efficient fine-t
AutoGluon is an automated machine learning framework and multimodal library designed to automate the end-to-end pipeline from data preprocessing to high-accuracy model training and validation. It functions as an automated model trainer for tabular, image, text, and time series data, as well as a tool for time series forecasting and foundation model finetuning. The project is distinguished by its ability to jointly process and fuse different data types, allowing for the construction of multimodal neural networks that integrate images, text, and structured tables. It supports zero-shot inferenc
The PyTorch Tutorials repository is a collection of educational resources that provides step-by-step guidance on building, training, and deploying neural networks using the PyTorch framework. It covers the complete machine learning workflow, from data loading and model definition through optimization loops and model persistence, with dedicated guides for distributed training, model fine-tuning, and deployment. The tutorials offer practical demonstrations of adapting pre-trained models to new tasks through transfer learning, scaling training across multiple GPUs or machines using PyTorch's dis
Ramalama is a containerized runtime and management tool for large language models. It functions as an OCI AI model manager and registry client, allowing users to package, distribute, and execute AI models as standardized container images. The project differentiates itself by using OCI-compliant distribution for models and retrieval-augmented generation assets, enabling the packaging of vector databases into immutable container images. It features hardware-aware image selection that automatically detects GPU or CPU capabilities to pull the most optimized image for the host environment. The sy
Genkit is an open-source framework for building AI-powered applications. It provides a unified interface for connecting to hundreds of generative AI models from multiple providers, enabling text, image, audio, and video generation through a single API. The framework structures multi-step AI interactions—including chat, retrieval-augmented generation, tool use, and agentic workflows—as composable, traceable flows with built-in streaming and state management. The framework distinguishes itself through a comprehensive developer toolkit that includes a command-line interface and a local developer
Examples of Machine Learning code using Comet.ml
AITemplate is an ahead-of-time deep learning compiler that translates PyTorch neural networks into standalone C++ source code. It functions as a PyTorch to C++ compiler and a GPU kernel fusion engine, producing self-contained executable binaries that run inference without requiring a Python interpreter or deep learning framework runtime. The project generates optimized CUDA and HIP C++ code specifically for NVIDIA TensorCores and AMD MatrixCores. It focuses on maximizing throughput for half-precision floating-point operations through a system that combines multiple neural network operators in
Cherry Studio is a cross-platform desktop application that serves as a centralized workspace for managing and interacting with multiple artificial intelligence models. It functions as a local-first orchestrator, prioritizing user privacy by storing all conversation history and knowledge bases directly on the user's device. By providing a unified interface for both cloud-based and local AI services, the platform simplifies API key management and allows for consistent model interaction across different operating systems. The application distinguishes itself through a robust retrieval-augmented genera
Exo is a distributed inference engine designed to run machine learning models across local hardware. It functions as a network orchestration layer that automatically discovers available devices to form a unified computing cluster, allowing users to scale artificial intelligence workloads by distributing computational tasks across multiple machines. The platform distinguishes itself through its ability to manage the entire lifecycle of local models while providing a standardized gateway for external applications. By translating local model outputs into industry-standard formats, it enables exi