Gpt Load

gpt-load is a transparent proxy gateway that routes API requests to multiple AI providers—including OpenAI, Google Gemini, and Anthropic Claude—through a single endpoint while preserving each provider's native format and authentication. It acts as a centralized routing layer, allowing applications to switch between AI services by changing only the base URL without modifying any client code or business logic.
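As a rough illustration of the base-URL swap, the sketch below rewrites a provider URL so that it points at a proxy while keeping the provider's native path and query string. The proxy address `http://localhost:3001`, the group name `openai`, and the `/proxy/<group>` path prefix are assumptions made for this example, not details confirmed by the text above; check the project's documentation for the actual route layout.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values for illustration only.
PROXY_BASE = "http://localhost:3001"
GROUP = "openai"


def to_proxy_url(upstream_url, proxy_base=PROXY_BASE, group=GROUP):
    """Keep the provider's native path and query, but send it to the proxy."""
    upstream = urlsplit(upstream_url)
    proxy = urlsplit(proxy_base)
    # Assumed layout: <proxy>/proxy/<group>/<native provider path>
    path = f"/proxy/{group}{upstream.path}"
    return urlunsplit((proxy.scheme, proxy.netloc, path, upstream.query, ""))


print(to_proxy_url("https://api.openai.com/v1/chat/completions"))
```

In a real client the same effect usually comes from setting the SDK's base URL option once, so request-building code stays unchanged.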

The proxy distinguishes itself through intelligent traffic management across pools of API keys, offering automatic key rotation, weighted or round-robin load balancing, and failover that detects unhealthy keys and reroutes requests to healthy ones within the same group. Configuration is managed dynamically through a web interface or REST API with hot-reload, applying changes immediately without service restarts, and settings are organized across three tiers—environment, system, and group—for flexible deployment. API keys are stored encrypted in a MySQL database with support for enabling, disabling, or rotating the encryption key, and separate authentication tokens secure the management interface and proxy requests.
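The rotation and failover behavior described above can be sketched as a small key pool. This is a minimal illustration, not the project's implementation: the class `KeyPool`, the function `send_with_failover`, and the `max_attempts` parameter are hypothetical names introduced here.

```python
class KeyPool:
    """Round-robin over a group's keys, skipping keys marked unhealthy."""

    def __init__(self, keys):
        self.keys = list(keys)
        self.unhealthy = set()
        self._cursor = 0

    def mark_unhealthy(self, key):
        self.unhealthy.add(key)

    def next_key(self):
        # Try each key at most once, starting from the cursor.
        for _ in range(len(self.keys)):
            key = self.keys[self._cursor]
            self._cursor = (self._cursor + 1) % len(self.keys)
            if key not in self.unhealthy:
                return key
        raise RuntimeError("no healthy keys left in this group")


def send_with_failover(pool, call, max_attempts=3):
    """Run call(key); on failure, mark the key unhealthy and try the next one."""
    last_error = None
    for _ in range(max_attempts):
        key = pool.next_key()
        try:
            return call(key)
        except Exception as exc:
            pool.mark_unhealthy(key)
            last_error = exc
    raise RuntimeError("all attempts failed") from last_error
```

Here `call` stands in for whatever function forwards the request upstream with the chosen key; a failing key is removed from rotation so later requests in the same group skip it.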

For production deployments, gpt-load supports high-availability cluster setups with a master-slave architecture, horizontal scaling, configuration synchronization across nodes, and graceful shutdown protocols. Real-time monitoring provides health checks, request logs, and usage statistics through a web dashboard, while in-memory caching and rate-limiting use fast key-value stores for low-latency access. The project includes a web-based management interface for viewing and modifying settings, managing API keys and provider groups, and monitoring cluster health without editing files or restarting the service.

Features

AI Proxy - Provides real-time statistics, health checks, and detailed request logs through a web management dashboard.

API Key Failovers - Detects failed or unhealthy API keys and reroutes requests to healthy ones within the same group without interrupting service.

Configuration and Log Persistence - Stores API keys, user authentication data, usage statistics, and system logs in a relational database for durability.

API Migration Tools - Switches an existing application to the proxy by changing only the base URL, leaving all business logic and SDK calls untouched.

Proxy Configuration Reloaders - Updates system and group settings without restarting the service by applying changes immediately through hot-reload.

Real-Time Runtime Updates - Updates settings dynamically without restarting the service so you can tune performance in real time.

API Key Failure Retriers - Blacklists failing API keys and restores them after a recovery period to keep the proxy service running without interruption.
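A minimal sketch of blacklisting with timed recovery follows. It assumes that a key becomes usable again purely after a fixed interval; the class name `Blacklist`, the `recovery_seconds` parameter, and the injectable `clock` are hypothetical, and the project may instead re-validate keys before restoring them.

```python
import time


class Blacklist:
    """Block a failing key for a fixed recovery period, then allow it again."""

    def __init__(self, recovery_seconds=60.0, clock=time.monotonic):
        self.recovery_seconds = recovery_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._blocked_until = {}

    def report_failure(self, key):
        self._blocked_until[key] = self.clock() + self.recovery_seconds

    def is_available(self, key):
        until = self._blocked_until.get(key)
        if until is None:
            return True
        if self.clock() >= until:
            # Recovery period has elapsed: restore the key.
            del self._blocked_until[key]
            return True
        return False
```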

Configuration Synchronizers - Uses a master node to manage configuration and database writes, while follower nodes handle proxy traffic and sync state.

Graceful Shutdowns - Removes a node from the load balancer, waits for in-flight requests to finish, then stops the service and cleans up resources.

Leader-Follower Proxy Clusters - Supports leader-follower architecture with horizontal scaling for high availability in enterprise environments.

Master-Slave Proxy Clusters - Deploys a master-slave architecture where the master manages configuration and slaves handle proxy traffic for failover and load distribution.

Proxy Cluster Deployments - Deploys a master-slave architecture with horizontal scaling, configuration sync, and graceful shutdown for enterprise use.

Horizontal Scaling Deployments - Adds slave nodes that automatically sync configuration and integrate with the load balancer to handle increased traffic.

API Key Rotation - Manages a pool of API keys across providers with automatic rotation, load balancing, and recovery from failures.

Multi-Provider Key Rotators - Routes requests to multiple AI providers with automatic API key rotation and failover.

AI Proxy Clusters - Deploys a master-slave architecture for horizontal scaling, failover, and centralized configuration management.

AI Model Load Balancers - Distributes requests across multiple upstream endpoints using weighted load balancing to improve availability.

AI Provider Proxies - Routes requests to OpenAI, Gemini, and Claude through a single proxy endpoint while preserving native API formats and authentication.

Upstream Endpoint Load Balancing - Distributes requests among multiple upstream AI endpoints using weighted load balancing to improve availability.
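The text does not say which weighted algorithm is used. The sketch below shows one common choice, smooth weighted round-robin, purely as an illustration; the class name `WeightedUpstreams` and the example URLs are hypothetical.

```python
class WeightedUpstreams:
    """Smooth weighted round-robin over upstream endpoints."""

    def __init__(self, weights):
        # weights: mapping of upstream URL -> positive integer weight
        self.weights = dict(weights)
        self.current = {url: 0 for url in self.weights}

    def pick(self):
        total = sum(self.weights.values())
        # Every upstream gains its weight; the leader is chosen and pays back the total.
        for url, weight in self.weights.items():
            self.current[url] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= total
        return chosen


upstreams = WeightedUpstreams({"https://a.example": 3, "https://b.example": 1})
picks = [upstreams.pick() for _ in range(8)]
print(picks.count("https://a.example"), picks.count("https://b.example"))
```

Over any full cycle each upstream is picked in proportion to its weight, and picks are interleaved rather than sent in bursts to the heaviest endpoint.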

API Key Load Balancers - Distributes incoming requests evenly across all active API keys in a group to maximize throughput and reliability.

Round-Robin API Key Distributors - Routes requests across a pool of API keys using round-robin or weighted allocation with automatic failover on failure.

AI Provider Group Managers - Organizes API keys into groups for different AI services and applies group-specific proxy keys and configurations.

API Key Management - Adds, removes, and monitors API keys for multiple providers from a single management interface.

API Key Encryption at Rest - Persists API keys, usage statistics, and logs in a MySQL database with at-rest encryption support.

AI Provider Gateways - Forwards requests to OpenAI, Google Gemini, and Anthropic Claude through a single unified endpoint.

Web-Based Configuration Dashboards - Lets you view and modify settings through an online dashboard without editing files or restarting the service.

Configuration Hot-Reloading - Applies system and group configuration changes immediately via hot-reload, eliminating service downtime.

Hot-Reload Configuration Managers - Applies changes to system settings and group configurations immediately without requiring a service restart.

Web & API Configuration Managers - Updates system and group settings dynamically without restarting the service through a web interface or REST API.

Proxy Performance Monitoring - Displays real-time statistics, health checks, and detailed request logs through a web management interface.

Real-Time Monitoring Systems - Displays live statistics, health checks, and detailed request logs through a web management interface.

Service Health Monitoring - Provides real-time statistics, health checks, and detailed request logging through a web management interface.

AI Provider Routing - Routes requests to multiple AI providers through a single endpoint while preserving each provider's native API format and authentication.

API Proxy Routings - Forwards OpenAI-compatible API calls through a proxy service that handles routing, streaming responses, and error retries without changing client code.

Configuration APIs - Reads and updates configuration settings from external tools through a programmatic interface.

Environment-System-Group Override Layers - Loads settings from environment, system, and group layers so higher layers override lower ones for flexible deployment.
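Layered overrides of this kind can be sketched with a lookup chain. The precedence used here (group over system over environment) is an assumption, since the description does not state which layer wins; the setting names `request_timeout` and `max_retries` and the function `effective_settings` are likewise hypothetical.

```python
from collections import ChainMap


def effective_settings(environment, system, group):
    """Merge three layers; assumed precedence: group > system > environment."""
    # ChainMap searches its mappings left to right, so the first match wins.
    return dict(ChainMap(group, system, environment))


environment = {"request_timeout": 600, "max_retries": 3}
system = {"max_retries": 5}
group = {"request_timeout": 120}

print(effective_settings(environment, system, group))
```

A group only needs to list the values it wants to change; anything it leaves out falls through to the system layer and then to the environment defaults.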

Proxy Authentication Schemes - Uses distinct authentication keys for the management interface and proxy requests with global and group-level keys.

Encrypted Key Storage with Rate Limiting - Stores API keys encrypted in a database and enforces per-client rate limits with authentication for proxy and management interfaces.

via REST API - Creates, updates, and deletes API keys, views statistics, and adjusts system settings through programmatic endpoints.

Encrypted Key-Value Stores - Encrypts API keys at rest and allows enabling, disabling, or changing the encryption key at any time.

Encrypted Key Registries - Stores API keys encrypted in a relational database with support for enabling, disabling, or rotating the encryption key.

Separate Management - Uses distinct authentication keys for the management interface and proxy requests with global and group-level keys.

Configuration Validation - Checks all configuration values against strict rules to catch errors before they affect operation.

Rate Limiting with Usage Monitoring - Enforces per-client rate limits, authenticates requests, and tracks usage statistics in real time through a web dashboard.
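Per-client rate limiting is often implemented with a token bucket. The sketch below shows that general technique and is not a description of the project's limiter; `TokenBucketLimiter`, `rate_per_second`, `burst`, and the injectable `clock` are names introduced for this example.

```python
import time


class TokenBucketLimiter:
    """Allow each client `burst` requests at once, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, burst, clock=time.monotonic):
        self.rate = rate_per_second
        self.burst = burst
        self.clock = clock  # injectable so the behavior can be tested
        self._state = {}  # client id -> (tokens, time of last update)

    def allow(self, client_id):
        now = self.clock()
        tokens, last = self._state.get(client_id, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._state[client_id] = (tokens - 1, now)
            return True
        self._state[client_id] = (tokens, now)
        return False
```

A single-process dictionary like this only works on one node; a clustered deployment would need the counters in a shared store.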

AI Proxy Cluster Health Monitors - Tracks request volume, response time, and database or cache connection status for each node in the cluster.

Real-Time Node Health Monitors - Provides a web dashboard with real-time health checks, request logs, and usage statistics for the proxy cluster.

tbphp/gpt-load