# prometheus-operator/kube-prometheus

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/prometheus-operator-kube-prometheus).**

7,682 stars · 2,048 forks · Jsonnet · Apache-2.0

## Links

- GitHub: https://github.com/prometheus-operator/kube-prometheus
- Homepage: https://prometheus-operator.dev/
- awesome-repositories: https://awesome-repositories.com/repository/prometheus-operator-kube-prometheus.md

## Topics

`alerts` `cluster-monitoring` `dashboards` `hacktoberfest` `kubernetes` `operator` `prometheus` `prometheus-operator`

## Description

kube-prometheus is a monitoring stack deployment and orchestration framework. It uses an operator pattern to automate the installation and lifecycle management of Prometheus and Alertmanager via custom resource definitions.

The project scales data collection in two ways: hash-based sharding, which uses consistent hashing to distribute scrape targets across multiple Prometheus instances, and topology-aware distribution, which reduces cross-zone traffic. It also reloads configuration through a sidecar-based mechanism.
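
To make the idea of hash-based target sharding concrete, here is a minimal Python sketch. It assumes a plain hash-modulo scheme (MD5 of the target address, reduced modulo the shard count), which is simpler than a true consistent-hash ring and is not taken from the project's source. The function names `shard_for`, `targets_for_shard`, and `assign` are hypothetical.

```python
import hashlib


def shard_for(address: str, shards: int) -> int:
    """Map a target address to a shard index in [0, shards)."""
    digest = hashlib.md5(address.encode("utf-8")).digest()
    # Assumption for this sketch: interpret the last 8 bytes of the digest
    # as an unsigned big-endian integer, then reduce modulo the shard count.
    return int.from_bytes(digest[8:], "big") % shards


def targets_for_shard(targets: list[str], shard: int, shards: int) -> list[str]:
    """Keep only the targets that hash to the given shard."""
    return [t for t in targets if shard_for(t, shards) == shard]


def assign(targets: list[str], shards: int) -> dict[int, list[str]]:
    """Build the full shard -> targets plan; every target lands in one shard."""
    return {s: targets_for_shard(targets, s, shards) for s in range(shards)}
```

Each instance only needs to know its own shard index and the total shard count to decide which targets to keep, so no coordination between instances is required. The trade-off of a plain modulo scheme is that changing the shard count can move many targets to a different shard.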

The stack covers metric collection, distributed alerting rule evaluation, and alert notification routing. It persists data through dynamically provisioned storage and supports long-term retention via object storage backups and remote write configurations.

The implementation is primarily written in Jsonnet.

## Tags

### DevOps & Infrastructure

- [Operator-Based Lifecycles](https://awesome-repositories.com/f/devops-infrastructure/deployment-lifecycle-managers/operator-based-lifecycles.md) — Automates the deployment, scaling, and upgrading of monitoring stacks using the Kubernetes Operator pattern.
- [Automatic Monitoring Shard Scaling](https://awesome-repositories.com/f/devops-infrastructure/automatic-compute-scaling/automatic-monitoring-shard-scaling.md) — Adjusts the number of monitoring instances automatically to distribute targets across shards based on workload. ([source](https://prometheus-operator.dev/docs/proposals/accepted/shard-autoscaling/))
- [Custom Resource Mappings](https://awesome-repositories.com/f/devops-infrastructure/configuration-management/file-based-configuration/custom-resource-mappings.md) — Translates high-level Kubernetes custom resources into Prometheus and Alertmanager configuration files using the operator pattern.
- [Monitoring Stack Automation](https://awesome-repositories.com/f/devops-infrastructure/deployment-management/installation-package-management/automated-installers/monitoring-stack-automation.md) — Automates the installation and lifecycle of Prometheus and Alertmanager using custom resource definitions. ([source](https://prometheus-operator.dev/docs/getting-started/introduction/))
- [Monitoring Shard Scaling](https://awesome-repositories.com/f/devops-infrastructure/devops/operational-reliability/stateful-workload-orchestration/stateful-workload-scaling/monitoring-shard-scaling.md) — Implements hash-based target sharding to distribute scrape workloads across multiple Prometheus instances for scalability. ([source](https://prometheus-operator.dev/docs/getting-started/introduction/))
- [Alertmanager Deployment](https://awesome-repositories.com/f/devops-infrastructure/kubernetes-deployments/alertmanager-deployment.md) — Orchestrates the deployment of high-availability Alertmanager clusters integrated with custom configuration objects. ([source](https://prometheus-operator.dev/docs/api-reference/api/))
- [Lightweight Metrics Agents](https://awesome-repositories.com/f/devops-infrastructure/kubernetes-deployments/metric-infrastructure-deployments/lightweight-metrics-agents.md) — Runs lightweight instances focused on scraping and forwarding metrics without utilizing local storage. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [Dynamic Rule Management](https://awesome-repositories.com/f/devops-infrastructure/configuration-management/configuration-resolution-engines/project-configuration-managers/dynamic-rule-management.md) — Supports the dynamic loading of alert and recording rules without requiring a process restart. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [High-Availability Instance Redundancy](https://awesome-repositories.com/f/devops-infrastructure/high-availability-instance-redundancy.md) — Runs multiple identical monitoring instances with unique labels to ensure high availability and eliminate single points of failure. ([source](https://prometheus-operator.dev/docs/platform/high-availability/))
- [Remote Write Configuration Delegation](https://awesome-repositories.com/f/devops-infrastructure/storage-management/remote-write-endpoints/remote-write-configuration-delegation.md) — Allows users to define their own remote write settings via scoped resources without requiring administrative access. ([source](https://prometheus-operator.dev/docs/proposals/accepted/remote-write/))
- [Storage Provisioning](https://awesome-repositories.com/f/devops-infrastructure/storage-provisioning.md) — Integrates with storage classes to automatically create and mount high-performance volumes across cloud providers. ([source](https://prometheus-operator.dev/docs/platform/storage/))
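
The "Custom Resource Mappings" entry above describes translating high-level custom resources into Prometheus configuration. The following Python sketch illustrates that translation for a heavily simplified ServiceMonitor-like dictionary. The input field names (`selector.matchLabels`, `endpoints[].port`, `path`, `interval`) loosely mirror the ServiceMonitor resource, but the job naming scheme and the exact shape of the output are assumptions for illustration, not the operator's actual generated configuration. `sanitize` and `scrape_configs` are hypothetical helpers.

```python
import re


def sanitize(label: str) -> str:
    """Replace characters that are not valid in a Prometheus label name."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", label)


def scrape_configs(monitor: dict) -> list[dict]:
    """Produce one simplified scrape config per endpoint of the monitor."""
    meta = monitor["metadata"]
    spec = monitor["spec"]
    match_labels = spec.get("selector", {}).get("matchLabels", {})
    configs = []
    for index, endpoint in enumerate(spec.get("endpoints", [])):
        # Keep only services whose labels match the selector. Values are used
        # as-is; regex metacharacters are not escaped in this sketch.
        relabel = [
            {
                "action": "keep",
                "source_labels": [f"__meta_kubernetes_service_label_{sanitize(k)}"],
                "regex": v,
            }
            for k, v in sorted(match_labels.items())
        ]
        if "port" in endpoint:
            relabel.append({
                "action": "keep",
                "source_labels": ["__meta_kubernetes_endpoint_port_name"],
                "regex": endpoint["port"],
            })
        config = {
            # Assumed naming scheme: kind/namespace/name/endpoint-index.
            "job_name": f"serviceMonitor/{meta['namespace']}/{meta['name']}/{index}",
            "metrics_path": endpoint.get("path", "/metrics"),
            "kubernetes_sd_configs": [
                {"role": "endpoints", "namespaces": {"names": [meta["namespace"]]}}
            ],
            "relabel_configs": relabel,
        }
        if "interval" in endpoint:
            config["scrape_interval"] = endpoint["interval"]
        configs.append(config)
    return configs
```

The point of the mapping is that users declare what to monitor with selectors, and the controller owns the verbose service-discovery and relabeling details.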

### System Administration & Monitoring

- [Observability Stack Deployments](https://awesome-repositories.com/f/system-administration-monitoring/observability-stack-deployments.md) — Automates the installation and management of a full monitoring stack, including Prometheus, Alertmanager, and associated dashboards. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [Alert Notification Management](https://awesome-repositories.com/f/system-administration-monitoring/alert-notification-management.md) — Deduplicates alerts and routes grouped notifications to external integrations such as email or chat. ([source](https://prometheus-operator.dev/docs/platform/platform-guide/))
- [Alert Routing](https://awesome-repositories.com/f/system-administration-monitoring/alert-routing.md) — Configures how alerts are grouped, inhibited, and routed to specific receivers based on matching metadata rules. ([source](https://prometheus-operator.dev/docs/api-reference/api/))
- [Label-Based Targeting](https://awesome-repositories.com/f/system-administration-monitoring/automation-target-specifications/label-based-targeting.md) — Automatically discovers monitoring targets by matching Kubernetes resource labels against predefined selector patterns.
- [Pull-Based Metric Scraping](https://awesome-repositories.com/f/system-administration-monitoring/logging-and-telemetry/metric-data-ingestion/pull-based-metric-scraping.md) — Implements a pull-based mechanism to collect telemetry from Kubernetes pods using service discovery. ([source](https://prometheus-operator.dev/docs/api-reference/api/))
- [Monitoring Rule Definitions](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/rule-based-alerting-engines/alerting-rule-validators/monitoring-rule-definitions.md) — Creates rules for triggering alerts or pre-calculating metrics based on discovered system data. ([source](https://prometheus-operator.dev/docs/api-reference/api/))
- [Scrape Configuration Automators](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/scrape-configuration-automators.md) — Automates the generation of scrape configurations for various targets using custom resource definitions. ([source](https://prometheus-operator.dev/docs/proposals/implemented/scrape-config/))
- [Monitoring Instance Provisioning](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-service-deployment-templates/monitoring-instance-provisioning.md) — Sets up required permissions and service accounts to discover and scrape targets within a cluster. ([source](https://prometheus-operator.dev/docs/platform/platform-guide/))
- [Pod Discovery](https://awesome-repositories.com/f/system-administration-monitoring/pod-discovery.md) — Finds and configures monitoring for specific pods using Kubernetes label selectors. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [Prometheus Cluster Management](https://awesome-repositories.com/f/system-administration-monitoring/prometheus-cluster-management.md) — Automates the deployment and scaling of Prometheus instances using cluster-wide stateful sets or node-local scopes. ([source](https://prometheus-operator.dev/docs/proposals/implemented/prometheus-agent/))
- [Service Discovery](https://awesome-repositories.com/f/system-administration-monitoring/service-discovery.md) — Identifies active Kubernetes pods and services as monitoring endpoints using label selectors. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [Alert Notification Routing](https://awesome-repositories.com/f/system-administration-monitoring/alert-notification-routing.md) — Configures routing, inhibition rules, and custom receivers for alert notifications. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [Object Storage Backups](https://awesome-repositories.com/f/system-administration-monitoring/backup-disaster-recovery/object-storage-backups.md) — Streams time-series data to external object storage targets to enable global querying and long-term retention. ([source](https://prometheus-operator.dev/docs/platform/thanos/))
- [Configuration Hot-Reloading](https://awesome-repositories.com/f/system-administration-monitoring/configuration-hot-reloading.md) — Uses a sidecar container to monitor configuration changes and trigger API reloads without restarting the main process.
- [External Service Monitoring](https://awesome-repositories.com/f/system-administration-monitoring/external-service-monitoring.md) — Probes ingresses and static endpoints using a prober service to monitor external availability. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [Metric Data Preservation](https://awesome-repositories.com/f/system-administration-monitoring/metric-data-preservation.md) — Preserves metric samples during network outages or pod evictions by utilizing persistent storage. ([source](https://prometheus-operator.dev/docs/proposals/implemented/prometheus-agent/))
- [Alerting Rule Validators](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/rule-based-alerting-engines/alerting-rule-validators.md) — Implements validation checks to ensure monitoring rules are syntactically and semantically correct before application. ([source](https://prometheus-operator.dev/docs/getting-started/introduction/))
- [PromQL-Based Alerting and Recording Rule Evaluation](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/rule-based-alerting-engines/alerting-rule-validators/sql-based-alerting-rules/promql-based-alerting-and-recording-rule-evaluation.md) — Executes PromQL alerting and recording rules against a distributed query API to generate alerts. ([source](https://prometheus-operator.dev/docs/platform/thanos/))
- [Distributed Rule Processing](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/rule-based-alerting-engines/alerting-rule-validators/sql-based-alerting-rules/promql-based-alerting-and-recording-rule-evaluation/distributed-rule-processing.md) — Executes recording and alerting rules across multiple instances using a centralized ruler for scalability. ([source](https://prometheus-operator.dev/docs/getting-started/design/))
- [Rule Evaluators](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/rule-based-alerting-engines/rule-evaluation-debuggers/rule-evaluators.md) — Deploys specialized instances to evaluate monitoring rules via compatible API endpoints. ([source](https://prometheus-operator.dev/docs/api-reference/api/))
- [Reusable Security Profiles](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/scrape-configuration-automators/reusable-security-profiles.md) — Defines reusable TLS and authorization profiles to avoid redundant configuration across monitoring resources. ([source](https://prometheus-operator.dev/docs/proposals/implemented/scrapeclasses/))
- [Node-Local Monitoring Agents](https://awesome-repositories.com/f/system-administration-monitoring/node-local-monitoring-agents.md) — Runs monitoring agents on every node to optimize metric collection from local targets and reduce network traffic. ([source](https://prometheus-operator.dev/docs/proposals/accepted/agent-daemonset/))
- [Scrape Job Aggregation](https://awesome-repositories.com/f/system-administration-monitoring/prometheus-exporters/custom-metric-scraping/scrape-job-aggregation.md) — Aggregates scrape configurations using service discovery mechanisms like DNS, cloud providers, and cluster APIs. ([source](https://prometheus-operator.dev/docs/api-reference/api/))
- [Agent-Mode Metric Forwarding](https://awesome-repositories.com/f/system-administration-monitoring/prometheus-metric-ingestion/agent-mode-metric-forwarding.md) — Provides a lightweight agent deployment that collects metrics and forwards them to remote storage without utilizing local long-term storage. ([source](https://prometheus-operator.dev/docs/api-reference/api/))
- [Remote Write Resource Constraints](https://awesome-repositories.com/f/system-administration-monitoring/remote-write-ingestion/remote-write-resource-constraints.md) — Imposes constraints on queue capacity and metadata transmission to maintain cluster stability during remote writes. ([source](https://prometheus-operator.dev/docs/proposals/accepted/remote-write/))
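
Several entries above (Label-Based Targeting, Pod Discovery, Service Discovery) rely on Kubernetes label selectors. The Python sketch below shows how selector matching works in principle, using the `matchLabels` and `matchExpressions` fields of a Kubernetes label selector with the `In`, `NotIn`, `Exists`, and `DoesNotExist` operators. The pod dictionaries and the `matches` and `discover` helpers are hypothetical and exist only for illustration.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Return True if the labels satisfy every requirement in the selector."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            # A missing key also satisfies NotIn.
            ok = labels.get(key) not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    # An empty selector has no requirements and therefore matches everything.
    return True


def discover(selector: dict, pods: list[dict]) -> list[str]:
    """Return the names of pods whose labels match the selector."""
    return [p["name"] for p in pods if matches(selector, p.get("labels", {}))]
```

All requirements are combined with a logical AND, which is why adding a requirement can only narrow the set of discovered targets.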

### Data & Databases

- [Push-Based Controller Topologies](https://awesome-repositories.com/f/data-databases/cluster-topology-management/push-based-controller-topologies.md) — Distributes scrape targets based on network topology or zones to reduce cross-zone traffic. ([source](https://prometheus-operator.dev/docs/proposals/accepted/zone-aware-sharding/))
- [Topology Aware Sharding](https://awesome-repositories.com/f/data-databases/distributed-sharding-architectures/process-sharding/topology-aware-sharding.md) — Pins shards to specific zones to restrict scraping to local targets and reduce network traffic. ([source](https://prometheus-operator.dev/docs/platform/sharding/))
- [Data Forwarders](https://awesome-repositories.com/f/data-databases/data-engineering-infrastructure/data-persistence-storage/data-storage/data-forwarders.md) — Streams collected metrics to remote long-term storage without requiring local evaluation. ([source](https://prometheus-operator.dev/docs/platform/prometheus-agent/))
- [Data Persistence](https://awesome-repositories.com/f/data-databases/data-persistence.md) — Ensures metric data survives pod restarts and rescheduling by coupling pods with persistent volume claims.
- [Shard Data Preservation](https://awesome-repositories.com/f/data-databases/data-sharding/shard-data-preservation.md) — Retains pods from scaled-down shards so historical metric data remains queryable until the retention period expires. ([source](https://prometheus-operator.dev/docs/platform/sharding/))
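
The topology-aware entries above describe pinning shards to zones so that each shard scrapes only local targets. A minimal Python sketch of one way this could work follows. It assumes shards are assigned to zones round-robin and that targets are hashed only among the shards in their own zone. Both choices, along with the names `shards_in_zone` and `shard_for_target`, are assumptions for illustration rather than the project's documented algorithm.

```python
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash that does not vary between Python processes."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big")


def shards_in_zone(zone: str, zones: list[str], shards: int) -> list[int]:
    """List the shard indexes pinned to a zone (assumed round-robin layout)."""
    return [s for s in range(shards) if zones[s % len(zones)] == zone]


def shard_for_target(address: str, zone: str, zones: list[str], shards: int) -> int:
    """Pick a shard for the target, considering only shards in its zone."""
    local = shards_in_zone(zone, zones, shards)
    if not local:
        raise ValueError(f"no shard is pinned to zone {zone!r}")
    return local[stable_hash(address) % len(local)]
```

With two zones and four shards, for example, shards 0 and 2 would serve the first zone and shards 1 and 3 the second, so a scrape never crosses a zone boundary.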

### Development Tools & Productivity

- [Automated Target Configuration](https://awesome-repositories.com/f/development-tools-productivity/configuration-generators/configuration-file-generators/automated-target-configuration.md) — Generates target configurations automatically using label selectors instead of manual configuration files. ([source](https://prometheus-operator.dev/docs/getting-started/introduction/))
- [Dynamic Scrape Target Discovery](https://awesome-repositories.com/f/development-tools-productivity/targeting-utilities/dynamic-scrape-target-discovery.md) — Identifies monitoring targets for scraping using label selectors and intermediary services. ([source](https://prometheus-operator.dev/docs/developer/getting-started/))
- [Scrape Target Sharding](https://awesome-repositories.com/f/development-tools-productivity/targeting-utilities/dynamic-scrape-target-discovery/scrape-target-sharding.md) — Splits scrape targets into groups assigned to different instances to scale collection capacity. ([source](https://prometheus-operator.dev/docs/platform/high-availability/))
- [Target Distribution](https://awesome-repositories.com/f/development-tools-productivity/targeting-utilities/dynamic-scrape-target-discovery/target-distribution.md) — Splits scrape targets across multiple instances using hash-based distribution to scale data collection. ([source](https://prometheus-operator.dev/docs/platform/sharding/))
- [Custom Scrape Targets](https://awesome-repositories.com/f/development-tools-productivity/targeting-utilities/dynamic-scrape-target-discovery/custom-scrape-targets.md) — Configures external monitoring targets or complex scrape patterns that exceed standard service monitor capabilities. ([source](https://prometheus-operator.dev/docs/developer/scrapeconfig/))
- [Sharding Override Controls](https://awesome-repositories.com/f/development-tools-productivity/targeting-utilities/dynamic-scrape-target-discovery/target-distribution/sharding-override-controls.md) — Determines which shard handles a target by overriding the default sharding hash with specific labels. ([source](https://prometheus-operator.dev/docs/platform/sharding/))

### Networking & Communication

- [Alerting Integrations](https://awesome-repositories.com/f/networking-communication/alerting-integrations.md) — Connects monitoring instances to an alerting cluster to automatically discover and forward triggered alerts. ([source](https://prometheus-operator.dev/docs/platform/platform-guide/))
- [Service Discovery Integrations](https://awesome-repositories.com/f/networking-communication/service-discovery-integrations.md) — Integrates with cloud providers and DNS to identify monitoring targets across diverse environments. ([source](https://prometheus-operator.dev/docs/developer/scrapeconfig/))
- [Health Status APIs](https://awesome-repositories.com/f/networking-communication/proxy-servers/status-monitoring-apis/health-status-apis.md) — Exposes operational status, replica counts, and health conditions via a dedicated API subresource. ([source](https://prometheus-operator.dev/docs/proposals/implemented/status-subresource/))

### Software Engineering & Architecture

- [Hash-Based Data Distribution](https://awesome-repositories.com/f/software-engineering-architecture/hash-based-data-distribution.md) — Distributes scrape targets across multiple Prometheus instances using consistent hashing to scale data collection.
- [Zonal Target Deployment](https://awesome-repositories.com/f/software-engineering-architecture/service-instance-managers/specialized-instance-deployment/zonal-target-deployment.md) — Generates node selectors to deploy instances within the same zone as the targets they monitor. ([source](https://prometheus-operator.dev/docs/proposals/accepted/zone-aware-sharding/))

### Testing & Quality Assurance

- [Node-Local Target Filtering](https://awesome-repositories.com/f/testing-quality-assurance/label-based-filtering/profiling-target-filters/node-local-target-filtering.md) — Ensures agents only target pods on the local node using field selectors to reduce API load. ([source](https://prometheus-operator.dev/docs/proposals/accepted/agent-daemonset/))
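
Node-local filtering, as described above, restricts each agent to pods scheduled on its own node. The Python sketch below builds a `spec.nodeName` field selector string and emulates the filtering locally on plain dictionaries. In a real cluster the API server applies the selector, so the `filter_pods` helper is only a stand-in, and the pod data is made up for illustration.

```python
def node_field_selector(node_name: str) -> str:
    """Build a field selector that matches pods scheduled on one node."""
    return f"spec.nodeName={node_name}"


def get_field(obj: dict, dotted_path: str):
    """Resolve a dotted path such as 'spec.nodeName' in a nested dict."""
    current = obj
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def filter_pods(pods: list[dict], field_selector: str) -> list[dict]:
    """Emulate server-side filtering for a single 'field=value' selector."""
    field, sep, expected = field_selector.partition("=")
    if not sep:
        raise ValueError(f"expected 'field=value', got {field_selector!r}")
    return [p for p in pods if get_field(p, field) == expected]
```

Filtering on the server side means each agent receives updates only for its own node's pods, which is where the reduction in API load comes from.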
