# k8sgpt-ai/k8sgpt

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/k8sgpt-ai-k8sgpt).**

7,922 stars · 1,013 forks · Go · Apache-2.0

## Links

- GitHub: https://github.com/k8sgpt-ai/k8sgpt
- Homepage: http://k8sgpt.ai
- awesome-repositories: https://awesome-repositories.com/repository/k8sgpt-ai-k8sgpt.md

## Description

k8sgpt is a suite of Kubernetes-focused tools designed for AI-powered debugging, cluster diagnostics, and self-healing. It functions as an automated analyzer and debugger that uses large language models to explain cluster errors, suggest remediation steps, and identify resource failures.

The project distinguishes itself through an extensible analysis framework that supports custom diagnostic plugins and a Model Context Protocol server, which exposes cluster diagnostics as tools for AI assistants. It includes a self-healing agent capable of automatically generating and applying fixes for detected anomalies, as well as data anonymization middleware to mask sensitive information before it is transmitted to external AI providers.

The toolset covers a broad range of operational capabilities, including continuous health monitoring via an operator, compliance auditing against policy engines, and multi-cluster orchestration to identify widespread failure patterns. It can also export diagnostic results, publish them as metrics to observability platforms, and troubleshoot pod failures.

## Tags

### System Administration & Monitoring

- [Kubernetes Cluster Diagnostics](https://awesome-repositories.com/f/system-administration-monitoring/kubernetes-cluster-diagnostics.md) — Uses large language models to explain Kubernetes cluster errors and provide actionable remediation instructions.
- [Cluster Health Monitoring](https://awesome-repositories.com/f/system-administration-monitoring/cluster-health-monitoring.md) — Scans Kubernetes resources for issues and generates automated explanations for identified problems. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/MCP.md))
- [Cluster Monitoring Systems](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/observability-platforms/metric-performance-monitors/cluster-monitoring-systems.md) — Performs background scanning as an operator to collect and aggregate metrics for monitoring cluster health. ([source](https://github.com/k8sgpt-ai/k8sgpt#readme))
- [Kubernetes Monitors](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/observability-platforms/metric-performance-monitors/infrastructure-monitoring/kubernetes-monitors.md) — Scans multiple Kubernetes environments continuously to track health and export diagnostic metrics to observability platforms.
- [Cluster Component Diagnostics](https://awesome-repositories.com/f/system-administration-monitoring/cluster-component-diagnostics.md) — Analyzes core resources including nodes and services to identify failures and provide automated resolutions. ([source](https://github.com/k8sgpt-ai/k8sgpt#readme))
- [Pod Troubleshooting](https://awesome-repositories.com/f/system-administration-monitoring/diagnostic-tools/diagnostics/failure-analysis-tools/build-failure-troubleshooting/startup-failure-diagnostics/instance-startup-failure-troubleshooting/pod-troubleshooting.md) — Retrieves container logs and system events to troubleshoot specific deployments through guided workflows. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/MCP.md))
- [Cluster Health Monitors](https://awesome-repositories.com/f/system-administration-monitoring/real-time-monitoring-systems/continuous-evaluation-monitors/cluster-health-monitors.md) — Runs as a background operator to perform recurring cluster scans and export diagnostic results as observability metrics.

### Artificial Intelligence & ML

- [Operational Self-Healing](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-automation-frameworks/operational-self-healing.md) — Provides autonomous agent capabilities to monitor system health and apply fixes to resolve Kubernetes anomalies.
- [AI Backend Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-backend-integrations.md) — Connects the diagnostic tool to various cloud-based, local, or custom LLM backends for data processing. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/INTEGRATIONS.md))
- [AI Assistant Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-assistant-integrations.md) — Exposes Kubernetes diagnostic tools to AI clients via the Model Context Protocol.
- [Model Context Protocol Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/ai-assistant-integrations/model-context-protocol-integrations.md) — Implements the Model Context Protocol to expose cluster diagnostic functions to AI assistants. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/MCP.md))
- [MCP Server Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/artificial-intelligence-tooling/agent-and-tool-integrations/mcp-server-integrations.md) — Exposes diagnostic tools and system data to external AI clients via a Model Context Protocol server.
- [External Diagnostic Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/external-service-integrations/external-diagnostic-integrations.md) — Connects to external HTTP services to execute user-defined analysis logic within diagnostic workflows. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/SECURITY_SELF_ASSESSMENT.md))
- [Model Context Protocol Servers](https://awesome-repositories.com/f/artificial-intelligence-ml/model-context-protocol-servers.md) — Implements a Model Context Protocol server that exposes cluster diagnostics as tools for AI assistants.
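
To make the Model Context Protocol entries above more concrete, here is a minimal sketch of the JSON-RPC 2.0 payload an MCP client sends when it invokes a server-side tool. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; the tool name `analyze` and the `namespace` argument are assumptions chosen for illustration and may not match what the k8sgpt MCP server actually exposes (see the project's `MCP.md` for the real tool list).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// callParams mirrors the params object of an MCP "tools/call" request.
type callParams struct {
	Name      string                 `json:"name"`
	Arguments map[string]interface{} `json:"arguments"`
}

// request is a JSON-RPC 2.0 request envelope.
type request struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// NewToolCall encodes a tools/call request for the given tool.
// The tool name and arguments are supplied by the caller; nothing here
// is specific to k8sgpt.
func NewToolCall(id int, tool string, args map[string]interface{}) ([]byte, error) {
	return json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: tool, Arguments: args},
	})
}

func main() {
	// "analyze" and "namespace" are hypothetical names used for illustration.
	payload, err := NewToolCall(1, "analyze", map[string]interface{}{"namespace": "default"})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(payload))
}
```

How the payload is delivered (stdio or HTTP) depends on how the server is started, which this sketch does not cover.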

### Part of an Awesome List

- [Sensitive Data Redaction](https://awesome-repositories.com/f/awesome-lists/devtools/information-extraction/sensitive-data-identification/sensitive-data-redaction.md) — Masks sensitive information and resource labels to prevent data leakage when transmitting cluster data to AI providers. ([source](https://github.com/k8sgpt-ai/k8sgpt#readme))
- [Detection and Monitoring](https://awesome-repositories.com/f/awesome-lists/ai/detection-and-monitoring.md) — Scans Kubernetes clusters to diagnose and triage issues in plain English.

### Development Tools & Productivity

- [Infrastructure Diagnostic Explanations](https://awesome-repositories.com/f/development-tools-productivity/lsp-client-extensions/ai-diagnostic-explanations/infrastructure-diagnostic-explanations.md) — Provides human-readable explanations and remediation steps for Kubernetes cluster errors using generative AI. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/GENERAL_TECHNICAL_REVIEW.md))
- [Custom Detection Logic](https://awesome-repositories.com/f/development-tools-productivity/custom-detection-logic.md) — Allows the execution of user-defined analysis logic to detect project-specific cluster issues. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/CHANGELOG.md))

### DevOps & Infrastructure

- [Infrastructure Analysis](https://awesome-repositories.com/f/devops-infrastructure/infrastructure/version-control-systems/git-based-repositories/git-based-code-analysis-platforms/llm-based-analysis/infrastructure-analysis.md) — Sends Kubernetes resource data to generative AI backends for human-readable technical analysis.
- [Platform Resource Diagnostics](https://awesome-repositories.com/f/devops-infrastructure/platform-resource-management/platform-resource-diagnostics.md) — Diagnoses issues within standard clusters and platform-specific extensions to identify resource failures. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/INTEGRATIONS.md))
- [Multi-Cluster Orchestrators](https://awesome-repositories.com/f/devops-infrastructure/kubernetes-cluster-management/multi-cluster-orchestrators.md) — Coordinates the scanning and diagnosis of multiple separate Kubernetes environments to identify widespread failure patterns.
- [Multi-Cluster Diagnostics](https://awesome-repositories.com/f/devops-infrastructure/multi-cluster-diagnostics.md) — Scans and diagnoses issues across several environments simultaneously to identify widespread failure patterns. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/ROADMAP.md))
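
One way to read "widespread failure patterns" in the multi-cluster entries above is: a finding that shows up in several clusters at once. The sketch below illustrates that idea only. The function name `WidespreadPatterns`, the string-based findings, and the threshold parameter are all assumptions for illustration, not k8sgpt's actual data model or algorithm.

```go
package main

import (
	"fmt"
	"sort"
)

// WidespreadPatterns returns the findings that occur in at least
// minClusters distinct clusters, sorted alphabetically.
// byCluster maps a cluster name to the findings reported for it.
func WidespreadPatterns(byCluster map[string][]string, minClusters int) []string {
	clusterCount := make(map[string]int)
	for _, findings := range byCluster {
		// Count each finding once per cluster, even if it repeats there.
		seen := make(map[string]bool)
		for _, f := range findings {
			if !seen[f] {
				seen[f] = true
				clusterCount[f]++
			}
		}
	}

	var widespread []string
	for finding, n := range clusterCount {
		if n >= minClusters {
			widespread = append(widespread, finding)
		}
	}
	sort.Strings(widespread)
	return widespread
}

func main() {
	// Hypothetical scan results from three clusters.
	results := map[string][]string{
		"prod-eu": {"ImagePullBackOff", "CrashLoopBackOff"},
		"prod-us": {"ImagePullBackOff"},
		"staging": {"ImagePullBackOff", "Pending PVC"},
	}
	for _, p := range WidespreadPatterns(results, 3) {
		fmt.Println("seen in every cluster:", p)
	}
}
```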

### Education & Learning Resources

- [Kubernetes AI Diagnostics](https://awesome-repositories.com/f/education-learning-resources/error-diagnostics/kubernetes-ai-diagnostics.md) — Functions as an automated analyzer using LLMs to explain cluster errors and suggest remediation steps.

### Security & Cryptography

- [Kubernetes Infrastructure Remediation](https://awesome-repositories.com/f/security-cryptography/automated-configuration-remediation/kubernetes-infrastructure-remediation.md) — Provides a self-healing agent that automatically generates and applies fixes for detected Kubernetes anomalies.
- [Cluster Resource Remediation](https://awesome-repositories.com/f/security-cryptography/automated-configuration-remediation/cluster-resource-remediation.md) — Generates and applies self-healing recommendations to resolve detected anomalies within the cluster. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/ROADMAP.md))
- [Compliance Security Audits](https://awesome-repositories.com/f/security-cryptography/compliance-security-audits.md) — Checks policy reports and configuration resources to identify security or compliance misconfigurations. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/CHANGELOG.md))
- [Security and Compliance](https://awesome-repositories.com/f/security-cryptography/governance-policy-frameworks/compliance-governance/security-and-compliance.md) — Integrates with policy engines to analyze cluster configurations against governance and compliance rules. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/ROADMAP.md))
- [Kubernetes Policy Engines](https://awesome-repositories.com/f/security-cryptography/infrastructure-policy-enforcement/security-policy-enforcers/kubernetes-policy-engines.md) — Integrates with policy engines to analyze cluster configurations against security and compliance rules.
- [Kubernetes Compliance Monitoring](https://awesome-repositories.com/f/security-cryptography/kubernetes-compliance-monitoring.md) — Performs continuous auditing of Kubernetes clusters against regulatory and security frameworks to identify misconfigurations.
- [Middleware-Based Masking](https://awesome-repositories.com/f/security-cryptography/sensitive-data-access-controls/sensitive-content-obscuration/middleware-based-masking.md) — Masks sensitive resource names and labels in cluster data before transmission to external AI providers.
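
The masking entries above describe replacing sensitive names before cluster data leaves for an AI provider. A minimal sketch of that idea follows, assuming a hash-derived placeholder scheme and a reverse mapping for restoring names in the provider's answer. The `Masker` type, the placeholder format, and the restore step are all hypothetical and are not taken from the k8sgpt source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// Masker swaps sensitive strings for stable placeholders and keeps the
// mapping so the original values can be restored later.
type Masker struct {
	forward map[string]string // original -> placeholder
	reverse map[string]string // placeholder -> original
}

func NewMasker() *Masker {
	return &Masker{forward: map[string]string{}, reverse: map[string]string{}}
}

// Mask returns the placeholder for value. The same input always yields
// the same placeholder. Only 4 hash bytes are kept, so collisions are
// possible in principle; a real implementation would need to handle them.
func (m *Masker) Mask(value string) string {
	if p, ok := m.forward[value]; ok {
		return p
	}
	sum := sha256.Sum256([]byte(value))
	p := "masked-" + hex.EncodeToString(sum[:4])
	m.forward[value] = p
	m.reverse[p] = value
	return p
}

// MaskText replaces every occurrence of each sensitive string in text.
// Longer strings are replaced first so that a name containing another
// name is masked as a whole.
func (m *Masker) MaskText(text string, sensitive []string) string {
	names := append([]string(nil), sensitive...)
	sort.Slice(names, func(i, j int) bool { return len(names[i]) > len(names[j]) })
	for _, s := range names {
		text = strings.ReplaceAll(text, s, m.Mask(s))
	}
	return text
}

// Unmask restores the original values in text, for example in a
// response that echoes the placeholders back.
func (m *Masker) Unmask(text string) string {
	for p, orig := range m.reverse {
		text = strings.ReplaceAll(text, p, orig)
	}
	return text
}

func main() {
	m := NewMasker()
	prompt := "Pod payments-api-7f9c in namespace billing is in CrashLoopBackOff"
	masked := m.MaskText(prompt, []string{"payments-api-7f9c", "billing"})
	fmt.Println(masked)           // names replaced by masked-<hex> placeholders
	fmt.Println(m.Unmask(masked)) // original prompt restored
}
```

Keeping the mapping in memory means placeholders are only meaningful within one run, which is usually what you want when the goal is to avoid leaking names to a third party.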

### Software Engineering & Architecture

- [Analysis Plugin Frameworks](https://awesome-repositories.com/f/software-engineering-architecture/analysis-plugin-frameworks.md) — Utilizes a modular framework of analyzers to detect cluster issues and generate technical findings.
- [Triage Workflows](https://awesome-repositories.com/f/software-engineering-architecture/error-recovery/triage-workflows.md) — Triages system problems using analyzers to identify errors and provide automated resolutions. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/main.go))
- [Plugin-Based Extensibility](https://awesome-repositories.com/f/software-engineering-architecture/plugin-based-extensibility.md) — Provides a plugin-based framework for creating extensible diagnostic checks in any language. ([source](https://github.com/k8sgpt-ai/k8sgpt/blob/main/GENERAL_TECHNICAL_REVIEW.md))
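
The modular analyzer framework described above can be pictured as a set of independent checks behind a common interface. The sketch below is an in-process simplification under that assumption. The `Analyzer` interface, `Registry`, `Resource`, and `Finding` types are hypothetical names for illustration and do not reproduce k8sgpt's real interfaces, which (per the entries above) can also call out to external services for custom checks.

```go
package main

import "fmt"

// Resource is a simplified stand-in for a cluster object.
type Resource struct {
	Kind   string
	Name   string
	Status string
}

// Finding is one problem reported by an analyzer.
type Finding struct {
	Analyzer string
	Resource string
	Message  string
}

// Analyzer is the contract every diagnostic check implements.
type Analyzer interface {
	Name() string
	Analyze(resources []Resource) []Finding
}

// Registry holds the analyzers to run, in registration order.
type Registry struct {
	analyzers []Analyzer
}

// Register adds an analyzer to the registry.
func (r *Registry) Register(a Analyzer) {
	r.analyzers = append(r.analyzers, a)
}

// Run executes every registered analyzer and concatenates their findings.
func (r *Registry) Run(resources []Resource) []Finding {
	var findings []Finding
	for _, a := range r.analyzers {
		findings = append(findings, a.Analyze(resources)...)
	}
	return findings
}

// crashLoopAnalyzer is an example check that flags pods stuck restarting.
type crashLoopAnalyzer struct{}

func (crashLoopAnalyzer) Name() string { return "crash-loop" }

func (c crashLoopAnalyzer) Analyze(resources []Resource) []Finding {
	var out []Finding
	for _, res := range resources {
		if res.Kind == "Pod" && res.Status == "CrashLoopBackOff" {
			out = append(out, Finding{
				Analyzer: c.Name(),
				Resource: res.Name,
				Message:  "container is restarting repeatedly",
			})
		}
	}
	return out
}

func main() {
	reg := &Registry{}
	reg.Register(crashLoopAnalyzer{})

	findings := reg.Run([]Resource{
		{Kind: "Pod", Name: "web-1", Status: "Running"},
		{Kind: "Pod", Name: "web-2", Status: "CrashLoopBackOff"},
	})
	for _, f := range findings {
		fmt.Printf("[%s] %s: %s\n", f.Analyzer, f.Resource, f.Message)
	}
}
```

Because each check only depends on the interface, new analyzers can be added without touching the code that runs them; the findings would then be handed to an LLM backend for explanation.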
