# meta-llama/llama-stack

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/meta-llama-llama-stack).**

8,417 stars · 1,316 forks · Python · MIT

## Links

- GitHub: https://github.com/meta-llama/llama-stack
- Homepage: https://ogx-ai.github.io/
- awesome-repositories: https://awesome-repositories.com/repository/meta-llama-llama-stack.md

## Description

Llama Stack is a standardized orchestration stack and generative AI API gateway. It provides a single, consistent interface for deploying, managing, and interacting with large language models across providers and deployments.

The system functions as an agent framework that manages tool execution and versioned skill bundles to automate complex tasks. It includes a batch processing system that handles large volumes of asynchronous requests offline, and a vector database interface for storing and searching documents to enable retrieval-augmented generation.
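The retrieval step described above can be illustrated with a minimal, self-contained sketch. It is not the Llama Stack API: `ToyVectorStore`, `embed`, and `build_prompt` are hypothetical names, and the bag-of-words "embedding" is a stand-in assumption for a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" (assumption); real systems use a learned model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: index documents, then search by similarity."""

    def __init__(self) -> None:
        self._docs: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._docs.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(store: ToyVectorStore, question: str, k: int = 1) -> str:
    # Retrieval-augmented generation: prepend retrieved documents to the prompt.
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


store = ToyVectorStore()
store.add("batch jobs run offline")
store.add("vector stores index documents")
print(build_prompt(store, "how are documents indexed in vector stores"))
```

The prompt passed to the model then carries the most similar stored document as context, which is the core idea behind using a vector store for retrieval-augmented generation.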

At a high level, the stack covers AI agent orchestration, model deployment, and standardized model APIs, so applications can switch between providers without rewriting code.
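As a rough illustration of provider switching behind one interface, the sketch below routes the same application call to two interchangeable backends. Every name in it (`ChatProvider`, `EchoProvider`, `Gateway`, `summarise`, `demo-model`) is hypothetical and chosen for the example; it does not reproduce the project's actual client or server API.

```python
from dataclasses import dataclass
from typing import Protocol


class ChatProvider(Protocol):
    """Hypothetical provider interface; not the Llama Stack API."""

    def chat(self, model: str, prompt: str) -> str: ...


@dataclass
class EchoProvider:
    """Stand-in backend that echoes the prompt instead of calling a model."""

    name: str

    def chat(self, model: str, prompt: str) -> str:
        return f"[{self.name}/{model}] {prompt}"


class Gateway:
    """Routes each request to a registered provider by name."""

    def __init__(self) -> None:
        self._providers: dict[str, ChatProvider] = {}

    def register(self, name: str, provider: ChatProvider) -> None:
        self._providers[name] = provider

    def chat(self, provider: str, model: str, prompt: str) -> str:
        if provider not in self._providers:
            raise KeyError(f"unknown provider: {provider}")
        return self._providers[provider].chat(model, prompt)


def summarise(gateway: Gateway, provider: str, text: str) -> str:
    # Application code depends only on the gateway, never on a specific backend.
    return gateway.chat(provider, "demo-model", f"Summarise: {text}")


gateway = Gateway()
gateway.register("local", EchoProvider("local"))
gateway.register("hosted", EchoProvider("hosted"))

print(summarise(gateway, "local", "hello"))
print(summarise(gateway, "hosted", "hello"))
```

Switching from `local` to `hosted` changes only the provider name passed in; `summarise` itself is untouched, which is the property the description attributes to standardized model APIs.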

## Tags

### Artificial Intelligence & ML

- [LLM Orchestrators](https://awesome-repositories.com/f/artificial-intelligence-ml/llm-orchestrators.md) — Implements a standardized orchestration layer that manages workflows between diverse LLM deployments and external tools via a unified API.
- [Agentic LLM Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-llm-frameworks.md) — Provides a platform for building autonomous agents with integrated support for tool use, memory, and skill bundles.
- [Agentic Tool Orchestration](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-tool-orchestration.md) — Manages the discovery, planning, and execution of tool calls within autonomous agent workflows. ([source](https://github.com/meta-llama/llama-stack#readme))
- [AI Agent Orchestration](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-agent-orchestration.md) — Coordinates specialized agents by combining custom instructions, tool execution, and document retrieval.
- [AI Model APIs](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-model-apis.md) — Provides unified interfaces for interacting with diverse AI models through a single compatible API. ([source](https://github.com/meta-llama/llama-stack#readme))
- [LLM Provider Interfaces](https://awesome-repositories.com/f/artificial-intelligence-ml/llm-provider-interfaces.md) — Provides interfaces for communicating with various LLM providers to exchange messages and structured responses.
- [Provider-Agnostic Model Interfaces](https://awesome-repositories.com/f/artificial-intelligence-ml/provider-agnostic-model-interfaces.md) — Provides an abstraction layer that standardizes inputs and outputs across multiple LLM providers.
- [Tool Execution Orchestrators](https://awesome-repositories.com/f/artificial-intelligence-ml/step-based-schedulers/step-execution-engines/execution-step-controllers/sequential-step-orchestrators/multi-step-query-orchestrators/tool-execution-orchestrators.md) — Coordinates sequences of external server calls and file searches to automate complex, multi-step agent tasks.
- [High Volume AI Processing](https://awesome-repositories.com/f/artificial-intelligence-ml/high-volume-ai-processing.md) — Manages large-scale asynchronous batch requests to optimize the processing of generative AI tasks.
- [Standardized Model Communication Protocols](https://awesome-repositories.com/f/artificial-intelligence-ml/standardized-model-communication-protocols.md) — Implements a standardized messaging protocol to ensure seamless interoperability between diverse generative AI deployments.

### Software Engineering & Architecture

- [AI Provider Gateways](https://awesome-repositories.com/f/software-engineering-architecture/api-gateways/ai-provider-gateways.md) — Acts as a unified communication layer that routes requests to various AI model providers.
- [Unified Model Interfaces](https://awesome-repositories.com/f/software-engineering-architecture/unified-model-interfaces.md) — Implements a standardized execution interface for processing and streaming across different language model providers. ([source](https://github.com/meta-llama/llama-stack#readme))
- [Batch Request Processing](https://awesome-repositories.com/f/software-engineering-architecture/asynchronous-request-processing/batch-request-processing.md) — Implements an offline queue system to process large volumes of asynchronous requests for improved throughput and reduced costs.
- [Skill Manifests](https://awesome-repositories.com/f/software-engineering-architecture/declarative-manifest-systems/service-manifests/extension-manifests/skill-manifests.md) — Uses declarative manifest files to define and version the tools and functions an agent can execute.
- [Asynchronous Batch Requesting](https://awesome-repositories.com/f/software-engineering-architecture/request-batching/asynchronous-batch-requesting.md) — Handles large volumes of generative AI requests through offline processing to increase throughput and reduce costs. ([source](https://github.com/meta-llama/llama-stack#readme))

### Data & Databases

- [Vector Store Orchestrators](https://awesome-repositories.com/f/data-databases/in-memory-data-stores/vector-stores/vector-store-orchestrators.md) — Ships a standardized interface for managing indexing and retrieval logic across vector stores to enable retrieval-augmented generation.
- [Vector Stores](https://awesome-repositories.com/f/data-databases/in-memory-data-stores/vector-stores.md) — Provides vector storage capabilities to index and search documents for retrieval-augmented generation. ([source](https://github.com/meta-llama/llama-stack#readme))
- [Vector-Store Augmented Generation](https://awesome-repositories.com/f/data-databases/in-memory-data-stores/vector-stores/vector-store-augmented-generation.md) — Uses vector databases to inject relevant document shards into model prompts for augmented generation.

### Development Tools & Productivity

- [Skill Versioning Systems](https://awesome-repositories.com/f/development-tools-productivity/version-management/agent-versioning/skill-versioning-systems.md) — Organizes agent capabilities using versioned manifest archives to ensure consistent function invocation. ([source](https://github.com/meta-llama/llama-stack#readme))

### DevOps & Infrastructure

- [GenAI Application Deployment](https://awesome-repositories.com/f/devops-infrastructure/application-deployment-tools/genai-application-deployment.md) — Deploys generative AI applications with integrated networking and vector database support.
- [AI Stack Deployments](https://awesome-repositories.com/f/devops-infrastructure/self-hosted-deployments/ai-stack-deployments.md) — Provides a standardized software stack for deploying and managing large language model interfaces. ([source](https://github.com/meta-llama/llama-stack#readme))

### Part of an Awesome List

- [Agent Frameworks](https://awesome-repositories.com/f/awesome-lists/ai/agent-frameworks.md) — Core building blocks for deploying generative AI applications at scale.
- [Application Frameworks](https://awesome-repositories.com/f/awesome-lists/ai/application-frameworks.md) — Standardized framework for building applications with Llama models.