30 open-source projects similar to 1rgs/claude-code-proxy, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Claude Code Proxy alternative.
OmniRoute is a unified LLM API gateway that connects multiple AI providers to a single endpoint. Its primary purpose is to simplify the integration of various AI models into tools and agents by translating different provider formats into a standardized API. The project distinguishes itself through a multi-strategy request routing system that optimizes for cost, speed, and availability, including automatic model fallbacks and a circuit-breaker resilience model to isolate provider failures. It employs a local-first security posture, using AES-256-GCM encryption to store API keys and conversatio
ClawRouter is an AI model router and API gateway designed to classify query complexity and assign prompts to the most efficient model tier. It operates as a multi-model AI proxy that orchestrates traffic between various large language models and AI media generators through a unified interface. The project distinguishes itself by integrating a non-custodial micropayment processor using the x402 protocol. This allows for per-request API access and USDC settlement on Base and Solana chains, replacing static API keys with wallet-based authentication and real-time budget enforcement. The system c
ruby_llm is an LLM integration framework and AI agent orchestrator designed to connect applications to multiple large language model providers through a unified interface. It serves as a toolkit for building autonomous assistants with custom personas, managing structured output via JSON schemas, and implementing vector embedding engines for semantic search. The project distinguishes itself as an observability suite and multimodal toolkit. It provides specialized capabilities for tracking token usage, calculating model costs, and tracing workflows via OpenTelemetry, while supporting the proces
gpt4free-ts is a TypeScript-based LLM API proxy and gateway that provides a unified interface for accessing large language models without paid subscriptions or official API keys. It functions as a containerized AI bridge that routes requests to various free third-party providers to retrieve chat completions. The project acts as an OpenAI API wrapper, translating requests and responses into the standard OpenAI chat completions format to ensure compatibility with existing AI tools. It utilizes a provider-based routing system to distribute request loads across available endpoints. The gateway s
Plano is an AI agent orchestrator and LLM gateway proxy that unifies access to multiple AI providers through a single interoperable interface. It functions as a model routing engine that decouples applications from specific vendors using semantic aliases, allowing traffic to be shifted between providers without modifying application code. The system distinguishes itself with intent-based agent routing, which directs prompts to specialized agents based on semantic analysis. It features an interceptor-based filter chain system that acts as guardrail middleware to enforce safety policies, rewrit
TaskWeaver is an LLM agent framework that interprets natural language requests and executes them as Python code, SQL queries, or shell commands. It functions as a conversational code interpreter that maintains stateful data structures across turns, generating executable code from user prompts within a session-based environment. The system is designed as a self-hosted AI agent platform that can be deployed in Docker, managing sessions and providing a web UI for data analytics and automation tasks. The framework distinguishes itself through a role-based multi-agent architecture that divides the
This project is a multi-provider AI gateway and proxy server that intercepts and routes requests between AI clients and various large language model providers. It functions as an API protocol translator and model router, mapping incoming requests to specific upstream providers or local runners to provide a unified interface for multiple models. The system distinguishes itself by bridging chat platforms and command line interfaces, converting messages from chat services into managed command line sessions. It further optimizes traffic by executing certain web search and fetch requests locally a
LangChainJS is an AI agent orchestrator and application framework designed for building autonomous systems that use large language models to plan and execute tasks. It serves as an integration library that connects language models with tools, memory, and external data sources to create context-aware logic and complex workflows. The project provides a provider-agnostic interface and model provider abstraction, allowing applications to switch between different language model providers without rewriting core logic. It includes a toolkit for retrieval-augmented generation, utilizing retrievers to
Goose is an autonomous coding assistant and extensible AI agent framework designed to automate software development workflows. It functions as an orchestration engine that can install, execute, and test code, as well as manage local files and shell commands. The platform is model-agnostic, providing a flexible interface to connect with diverse cloud-based or self-hosted large language model providers. It distinguishes itself through a standardized context protocol for integrating external tools and extensions, and a recipe system that allows users to define and repeat complex, multi-step AI w
Mods is a terminal-based AI client that sends prompts to large language models and streams responses back to the command line. It functions as a multi-provider AI gateway, routing queries to OpenAI, Cohere, Groq, Gemini, and local endpoints, and includes a conversation history manager that saves, caches, branches, and resumes text-based interactions. The tool also operates as a Model Context Protocol client, connecting to external MCP servers via stdio, SSE, or HTTP to extend model capabilities with specialized tools and data. The project distinguishes itself through a config-driven provider
mcp-context-forge is a Model Context Protocol federation gateway that unifies diverse AI tool servers and APIs into a single consistent interface for discovery and execution. It acts as a centralized proxy that aggregates multiple servers and APIs, allowing AI agents to access and invoke a unified set of tools, prompts, and resources. The project distinguishes itself through a multi-protocol translation bridge that converts communication between standard I/O, SSE, gRPC, and REST to enable interoperability between disparate tool servers. It includes a comprehensive LLM evaluation framework for
mistral.rs is an inference engine for large language models that runs locally and exposes models behind OpenAI- and Anthropic-compatible APIs. It serves as a multi-model serving platform, capable of loading several models in a single server process with per-request routing and on-demand loading and unloading. The engine supports multimodal inference, processing text alongside images, video, audio, and speech inputs, and includes a quantized model deployment runtime that reduces memory use and speeds up inference on consumer hardware. The project distinguishes itself through an agentic tool exe
Hermes-webui is a self-hosted AI orchestrator and web interface for managing autonomous agents. It serves as a multi-provider gateway that connects cloud and local large language models, providing a central hub to execute scheduled background jobs, run shell commands, and manage agent memory on private hardware. The system distinguishes itself through a persistent memory manager that utilizes knowledge graphs and Markdown files for long-term context across sessions. It features a Model Context Protocol host for extending agent capabilities with standardized tools and supports the orchestratio
This project is a terminal-based command-line interface client and agent orchestrator for interacting with multiple large language model providers. It functions as an OpenAI API client and a local API gateway that exposes chat completions and embeddings through an HTTP server. The system distinguishes itself with a retrieval-augmented generation tool that indexes local files and URLs into a vector database to supply custom document context. It allows for the creation of specialized AI agents that combine custom system prompts with tool calling and external function execution. The to
Helicone is an AI gateway and observability platform designed to intercept, manage, and monitor interactions with large language models. By acting as a reverse-proxy, it provides a centralized layer for routing requests across multiple AI providers, allowing developers to maintain consistent application logic while gaining deep visibility into model performance, usage, and costs. The platform distinguishes itself through a robust suite of traffic management and prompt engineering tools. It enables policy-driven control, including automatic failover between providers, rate limiting, and edge-b
BAML is a prompt engineering framework and LLM client generator that defines AI prompts as type-safe functions. It serves as a structured data extraction tool and workflow orchestrator, transforming unstructured model responses into strongly typed objects using a custom schema language and alignment algorithms. The project distinguishes itself by using a compiler to generate language-specific boilerplate code for API communication and output parsing. It features a dedicated environment for designing complex prompt templates with conditional logic and reusable snippets, and employs genetic alg
The BeeAI Framework is an LLM agent framework and multi-agent orchestration engine used to build autonomous agents that coordinate reasoning, tool execution, and complex workflows. It functions as a structured AI output controller and RAG integration library, providing a unified interface to manage multiple language model providers. The framework is distinguished by its implementation of the Model Context Protocol, allowing agents, tools, and models to be shared between different AI platforms and hosted as agentic tooling servers. It enables the design of collaborative agent teams through dec
This project is an AI model API gateway and proxy server designed to provide a unified interface for interacting with diverse artificial intelligence service providers. It functions as a centralized middleware platform that routes, load balances, and translates API requests across multiple models, enabling developers to access text, image, audio, and video generation capabilities through a single, standardized integration. The gateway distinguishes itself through comprehensive administrative and financial controls, including event-driven usage accounting, real-time token consumption tracking,
This project is a framework for developing multimodal AI agents that function as programmable participants in real-time communication rooms. It enables the construction of agents that can see, hear, and speak by integrating speech-to-text, large language models, and text-to-speech pipelines to facilitate low-latency, natural conversations. The system is distinguished by its advanced orchestration of real-time media and conversational flow, including support for full-duplex speech, preemptive response generation, and sophisticated interruption management. It further differentiates itself throu
PrivateGPT is a private AI document assistant and local knowledge base manager designed for querying private files and documents using retrieval-augmented generation. It functions as a local language model application and API gateway, allowing users to obtain cited answers from unstructured data without sending information to external servers. The system differentiates itself by acting as a tool integrator that connects language models to external functions, including web search, tabular data analysis, and custom action extensions. It provides a standardized API layer that allows local infere
This project is a secure intermediary proxy gateway for large language model APIs. It functions as a relay service that forwards requests to AI providers while managing service accounts and routing traffic. The service provides a compatibility layer that supports multiple endpoint formats, allowing different third-party AI clients to communicate with a single provider. It distinguishes itself through a service account management system that assigns individual proxy settings to multiple accounts to prevent IP bans and distributes traffic via load balancing to avoid rate limits. The system inc
Langroid is a multi-agent orchestration framework and tool integration suite designed for building complex AI applications. It serves as a multi-modal integration layer that connects diverse local and remote language models with an agentic retrieval-augmented generation system. The project distinguishes itself through a collaborative message-exchange paradigm, allowing specialized agents to delegate tasks hierarchically and coordinate via structured communication. It features an advanced state management system for conversational AI, including the ability to rewind and prune conversation hist
This project is an autonomous AI agent framework and workflow orchestrator designed to automate machine learning engineering. It functions as a reasoning engine that reads research papers and writes code to train and deploy machine learning models through iterative reasoning loops and tool execution. The system distinguishes itself by integrating a GPU-accelerated sandboxed execution environment, allowing it to run and verify machine learning scripts in isolated remote containers. It utilizes a model provider integration gateway to route inference requests across various hosted or local endpo
Llama-swap is a local inference orchestrator and API gateway for large language models. It functions as an OpenAI API proxy that manages the lifecycle of multiple local model servers, automatically starting and stopping them to swap models based on incoming request identifiers. The project distinguishes itself through dynamic model swapping and hardware optimization. It utilizes a specialized matrix-based concurrency control to define which models can run simultaneously and employs cost-based eviction to remove inactive servers from memory based on relative resource costs. The system provide
Exo is a distributed inference engine designed to run machine learning models across local hardware. It functions as a network orchestration layer that automatically discovers available devices to form a unified computing cluster, allowing users to scale artificial intelligence workloads by distributing computational tasks across multiple machines. The platform distinguishes itself through its ability to manage the entire lifecycle of local models while providing a standardized gateway for external applications. By translating local model outputs into industry-standard formats, it enables exi
Axonhub is an AI gateway and multi-model API proxy that provides a unified interface for routing requests to multiple large language model providers. It functions as a load balancer and translation layer, converting a standardized API format into provider-specific payloads to enable communication with various AI models without provider-specific code. The system manages traffic through rule-based routing and automatic failover to maintain high availability. It distinguishes itself with a provider-agnostic interface that decouples client requests from specific model backends us
This project is an AI-focused API gateway and proxy system designed to intercept, standardize, and route requests across heterogeneous language model providers. It functions as a middleware layer that normalizes incoming traffic and manages authentication, ensuring consistent integration across diverse service interfaces. The system features a programmable routing engine that executes user-defined scripts to evaluate request content in real time. This allows for dynamic traffic management, where requests are inspected, transformed, and redirected to specific model endpoints based on custom lo
Antigravity-Manager is an artificial intelligence model orchestration platform that functions as a unified gateway for interacting with multiple external service providers. It standardizes heterogeneous vendor data structures into a consistent internal schema, allowing third-party tools to interface with various models through a single, normalized API. The system distinguishes itself through automated infrastructure management, including the lifecycle tracking of service accounts and the secure rotation of authentication credentials. By acting as a middleware layer, it intercepts traffic to p