Tabby is a self-hosted AI coding assistant that provides real-time code completion and interactive chat within development environments. Because it runs as a private server application, teams keep control of their infrastructure and data while integrating code generation directly into their existing workflows.
The platform distinguishes itself through its repository-aware knowledge retrieval and multi-model orchestration. It indexes local and remote source code repositories and technical documentation into a searchable vector-based knowledge graph, enabling the assistant to provide context-specific answers and code suggestions. The system manages distinct pipelines for completion, chat, and embedding models, allowing users to tune performance and hardware utilization based on specific task requirements.
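The retrieval idea described above can be illustrated with a toy index. This is a minimal sketch and not Tabby's implementation: the `embed` function is a bag-of-words stand-in for a real embedding model, and `SnippetIndex`, the file paths, and the snippet texts are all hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase term counts.

    Assumption: a real deployment would call a neural embedding model here.
    """
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SnippetIndex:
    """Hypothetical in-memory index over repository snippets and docs."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, Counter]] = []

    def add(self, path: str, text: str) -> None:
        # Store the vector alongside the path it came from.
        self._entries.append((path, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank entries by similarity to the query and drop non-matches.
        q = embed(query)
        scored = [(cosine(q, vec), path) for path, vec in self._entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [path for score, path in scored[:k] if score > 0]


# Illustrative content only; these files do not refer to a real repository.
index = SnippetIndex()
index.add("auth/login.py", "def login(user, password): verify password hash and create session")
index.add("db/pool.py", "class ConnectionPool: open database connection and reuse it")
index.add("docs/deploy.md", "deploy the server with docker compose and set replicas")

print(index.search("open a database connection", k=1))
```

The paths returned by `search` are what an assistant would attach to a prompt as context, which is how indexed source code and documentation can shape an answer or suggestion.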
The architecture supports scalable, containerized deployment for consistent behavior across local and cloud environments. Infrastructure and service replicas are managed through declarative configuration, and development environments connect through standard messaging interfaces. Because models are configured per task, each deployment can be matched to its own performance targets and hardware constraints.
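One way to picture per-task model selection under hardware constraints is the sketch below. It is an assumption-laden illustration, not Tabby's configuration format: `ModelSpec`, `pick_models`, the model names, and the memory figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str       # hypothetical identifier, not a real registry entry
    task: str       # "completion", "chat", or "embedding"
    vram_gb: float  # illustrative memory footprint, not a measured value


# Hypothetical catalog of candidate models.
CATALOG = [
    ModelSpec("completion-small", "completion", 4.0),
    ModelSpec("completion-large", "completion", 12.0),
    ModelSpec("chat-small", "chat", 6.0),
    ModelSpec("chat-large", "chat", 16.0),
    ModelSpec("embedding-base", "embedding", 1.0),
]


def pick_models(catalog: list[ModelSpec], budget_gb: dict[str, float]) -> dict[str, ModelSpec]:
    """For each task, choose the largest model that fits that task's memory budget."""
    chosen: dict[str, ModelSpec] = {}
    for task, budget in budget_gb.items():
        fitting = [m for m in catalog if m.task == task and m.vram_gb <= budget]
        if fitting:
            chosen[task] = max(fitting, key=lambda m: m.vram_gb)
    return chosen


selection = pick_models(CATALOG, {"completion": 8.0, "chat": 16.0, "embedding": 2.0})
for task, model in selection.items():
    print(f"{task}: {model.name} ({model.vram_gb} GB)")
```

Treating each task as its own budget reflects the separation of completion, chat, and embedding pipelines: a latency-sensitive completion model can stay small while the chat model uses whatever memory remains.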