1 repo
Caching mechanisms for storing and retrieving model outputs to optimize latency and costs.
Distinguishing note: Focuses on caching specifically for AI model responses.
Explore 1 awesome GitHub repository matching data & databases · Response Caching. Refine with filters or upvote what's useful.
LiteLLM is a unified gateway and proxy server designed to centralize access to over one hundred language model providers. It provides a standardized API interface that abstracts vendor-specific schemas, allowing developers to interact with diverse models through a single, consistent format. By acting as a central traffic management layer, it enables organizations to route, secure, and govern model interactions across multiple deployments. The platform distinguishes itself through its policy-driven architecture, which uses configuration-based routing to manage traffic distribution, load balanc
Stores and retrieves previous model outputs in a cache to reduce latency and operational costs.