# Self-Hosted Datadog Alternatives

> Search results for `self-hosted alternative to Datadog for full observability` on awesome-repositories.com. 117 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/self-hosted-alternative-to-datadog-for-full-observability

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/self-hosted-alternative-to-datadog-for-full-observability).**

## Results

- [getsentry/self-hosted](https://awesome-repositories.com/repository/getsentry-self-hosted.md) (9,426 ⭐) — This project is a containerized error tracking platform and monitoring suite designed for self-hosted deployment on private infrastructure. It provides a collection of services for capturing and analyzing software crashes and exceptions, ensuring that sensitive application data remains within a controlled environment.

The system includes specialized tooling for air-gapped deployment, allowing the software to be installed and operated on servers without internet access through the manual transfer of container images. It also supports corporate network integration via proxy configurations to maintain connectivity within restricted firewall environments.

The operational surface covers infrastructure health monitoring through dedicated status endpoints and request routing via a reverse proxy. Persistent storage is managed through volume mapping to decouple data from container lifecycles.
- [langchain-ai/langchainjs](https://awesome-repositories.com/repository/langchain-ai-langchainjs.md) (17,818 ⭐) — LangChain.js is a framework for building, executing, and monitoring stateful agentic applications. It provides an orchestration engine that models workflows as directed graphs, allowing developers to connect language models, data sources, and external tools into modular, multi-step processes.

The platform distinguishes itself through its focus on stateful execution and human-in-the-loop control. It manages agent lifecycles by persisting execution state across threads, enabling fault tolerance and the ability to pause workflows at designated breakpoints for manual review or modification. This architecture supports both autonomous agent orchestration and complex multi-agent systems, with built-in capabilities for streaming real-time execution updates and managing long-term memory.

Beyond core orchestration, the project offers a comprehensive suite of tools for the entire application lifecycle. This includes integrated observability for tracing and evaluating agent performance, schema-enforced data serialization for reliable communication, and extensive support for deployment, security, and infrastructure management.

The project provides a TypeScript-based software development kit and a command-line interface to facilitate local development, testing, and deployment of agentic workflows.
- [kedacore/keda](https://awesome-repositories.com/repository/kedacore-keda.md) (10,314 ⭐) — KEDA is a Kubernetes event-driven autoscaler and cloud event scaling engine. It functions as a custom metrics provider that monitors external event sources—including message brokers, databases, and cloud metrics—to dynamically adjust the replica counts of containerized workloads.

The project is distinguished by its scale-to-zero workflow, which reduces workloads to zero replicas during inactivity and automatically restarts them when new events are detected. It operates as a multi-cloud event trigger system, using a pluggable scaler interface to integrate with a wide array of third-party services and cloud identity providers.

The system manages the scaling of various resource types, including deployments and discrete Kubernetes jobs. It provides comprehensive identity and authentication support via integration with cloud secret managers, IAM roles, and vault services. Additionally, it includes observability features for exporting telemetry via OpenTelemetry and tools for calculating complex scaling logic using multi-source metric aggregation.
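The scale-to-zero behaviour described for KEDA can be illustrated with a small sketch. This is not KEDA's code or API: `desired_replicas`, its parameters, and the queue-length metric are hypothetical, and a real deployment also involves the Kubernetes Horizontal Pod Autoscaler, polling intervals, and cooldown periods that are omitted here.

```python
import math


def desired_replicas(queue_length: int, target_per_replica: int, max_replicas: int) -> int:
    """Return a replica count for an event-driven workload (illustrative only).

    queue_length: pending events reported by a hypothetical external source.
    target_per_replica: events one replica is expected to handle.
    max_replicas: upper bound on the replica count.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if queue_length <= 0:
        # No pending events: scale the workload down to zero replicas.
        return 0
    # Otherwise size the workload to the backlog, capped at the maximum.
    return min(max_replicas, math.ceil(queue_length / target_per_replica))
```

With a target of 5 events per replica, an empty queue yields 0 replicas and a backlog of 12 events yields 3.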
- [berriai/litellm](https://awesome-repositories.com/repository/berriai-litellm.md) (50,579 ⭐) — LiteLLM is a unified gateway and proxy server designed to centralize access to over one hundred language model providers. It provides a standardized API interface that abstracts vendor-specific schemas, allowing developers to interact with diverse models through a single, consistent format. By acting as a central traffic management layer, it enables organizations to route, secure, and govern model interactions across multiple deployments.

The platform distinguishes itself through its policy-driven architecture, which uses configuration-based routing to manage traffic distribution, load balancing, and automatic fallbacks without requiring code changes. It incorporates a robust security and compliance layer that enforces content moderation, secret redaction, and fine-grained access control. Additionally, it supports complex operational requirements such as semantic routing, rule-based complexity scoring, and persistent virtual key management for multi-tenant environments.

Beyond core routing, the project provides comprehensive governance and observability tools to monitor usage, track spending, and log request metadata across teams. It includes an integrated software development kit for tool calling and agent orchestration, alongside support for advanced features like response caching, batch processing, and structured output configuration. The system is designed for enterprise-wide deployment, offering features for audit logging, single sign-on integration, and granular cost reporting.
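As a rough illustration of the automatic fallbacks mentioned for LiteLLM, the sketch below tries providers in order and returns the first successful response. It does not use LiteLLM's API: `complete_with_fallbacks`, `AllProvidersFailed`, and the provider callables are hypothetical stand-ins.

```python
from typing import Callable, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the fallback chain fails."""


def complete_with_fallbacks(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

In a configuration-driven gateway the ordered provider list would come from a config file rather than from code, which is what lets the routing change without a redeploy.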
- [stoatchat/self-hosted](https://awesome-repositories.com/repository/stoatchat-self-hosted.md) (2,497 ⭐) — This project is a self-hosted communication suite and private messaging infrastructure. It is a containerized chat platform designed for deployment on independent hardware to maintain full control over user data and server dependencies.

The system features a modular plugin framework that allows custom features and behaviors to be loaded into the client at runtime via manifest files. It is designed as a proxy-compatible service, supporting configurable network port routing to operate behind external reverse proxy servers.

The platform covers capabilities for containerized service orchestration, private communication infrastructure deployment, and custom plugin development.
- [signoz/signoz](https://awesome-repositories.com/repository/signoz-signoz.md) (27,355 ⭐) — SigNoz is a full-stack observability platform designed to collect, store, and visualize metrics, logs, and distributed traces in a unified environment. It leverages OpenTelemetry-based data collection to ingest telemetry from diverse sources using vendor-neutral protocols, ensuring interoperability across complex microservices architectures. The platform utilizes a high-performance columnar storage engine to enable rapid aggregation and filtering, providing a centralized backend for monitoring application health and performance.

What distinguishes the platform is its focus on automated instrumentation and semantic correlation. It allows users to capture telemetry data across various programming languages and frameworks without manual code changes, often requiring only simple environment variable updates. Once ingested, the system automatically links logs, metrics, and traces through shared identifiers, enabling seamless navigation between different telemetry types during root cause analysis. The frontend further supports this by using virtualized rendering to efficiently display complex distributed traces containing millions of spans.

The platform provides a comprehensive suite of tools for infrastructure monitoring, application performance tracking, and log management. Users can define complex alert conditions and manage monitoring configurations as version-controlled resources, ensuring consistency across deployment environments. Additionally, the system includes specialized support for monitoring large language model applications and provides visual query pipelines that translate user-defined filters into optimized database queries for real-time dashboard generation.

The entire observability stack can be deployed using container orchestration tools, with built-in utilities for verifying service status and managing data retention.
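The idea of linking logs and traces through shared identifiers, as described for SigNoz, can be sketched in a few lines. This is an assumption-laden illustration rather than SigNoz's implementation: the record shapes and the `trace_id` field name are hypothetical, and a real backend would perform this join in its storage engine.

```python
from collections import defaultdict


def correlate_by_trace(spans, logs):
    """Group spans and log records that share a trace identifier."""
    grouped = defaultdict(lambda: {"spans": [], "logs": []})
    for span in spans:
        grouped[span["trace_id"]]["spans"].append(span)
    for log in logs:
        trace_id = log.get("trace_id")
        if trace_id is not None:  # logs emitted outside a trace are skipped
            grouped[trace_id]["logs"].append(log)
    return dict(grouped)
```

Once records are grouped this way, a UI can jump from a slow span to the log lines written during the same request.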
- [cockroachdb/cockroach](https://awesome-repositories.com/repository/cockroachdb-cockroach.md) (32,207 ⭐) — CockroachDB is a distributed SQL database designed to scale horizontally across multiple nodes while maintaining strict ACID compliance and global data consistency. It functions as a relational database engine that automatically partitions data into ranges, rebalancing them across a cluster to accommodate growing storage and throughput requirements. By utilizing a distributed consensus protocol, the system ensures that all nodes agree on the order of operations, providing fault tolerance and continuous availability even in the event of hardware failures.

The system distinguishes itself through a layered architecture that separates the relational SQL abstraction from a distributed key-value store. It achieves global consistency without requiring perfectly synchronized hardware clocks by employing a hybrid logical clock synchronization mechanism. To support high-concurrency environments, it utilizes multi-version concurrency control and lock-free transaction execution, which allow for consistent snapshots and efficient conflict resolution. Furthermore, the engine is built for compatibility, implementing the PostgreSQL wire protocol to support existing PostgreSQL drivers and tools.

Beyond its core transactional capabilities, the platform includes comprehensive tooling for cluster orchestration, security, and performance diagnostics. It supports a variety of deployment models, ranging from self-hosted on-premises configurations to fully managed cloud services. The system provides a command-line interface for session management and query execution, ensuring that administrators can monitor cluster health and manage workloads through standard relational interfaces.
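The hybrid logical clock mentioned above can be sketched using the textbook algorithm. This is a generic illustration, not CockroachDB's implementation: the `Timestamp` and `HybridLogicalClock` names are hypothetical, and the physical clock is injected as a callable so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True, order=True)
class Timestamp:
    wall: int     # physical component, e.g. nanoseconds
    logical: int  # counter that breaks ties within one wall time


class HybridLogicalClock:
    def __init__(self, physical_clock: Callable[[], int]):
        self._now = physical_clock
        self._last = Timestamp(0, 0)

    def tick(self) -> Timestamp:
        """Timestamp a local event or an outgoing message."""
        wall = self._now()
        if wall > self._last.wall:
            self._last = Timestamp(wall, 0)
        else:
            # Physical clock has not advanced: bump the logical counter.
            self._last = Timestamp(self._last.wall, self._last.logical + 1)
        return self._last

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp received from another node."""
        latest = max(self._now(), self._last.wall, remote.wall)
        if latest == self._last.wall and latest == remote.wall:
            logical = max(self._last.logical, remote.logical) + 1
        elif latest == self._last.wall:
            logical = self._last.logical + 1
        elif latest == remote.wall:
            logical = remote.logical + 1
        else:
            logical = 0
        self._last = Timestamp(latest, logical)
        return self._last
```

Timestamps produced this way never move backwards on a node, even when a peer's clock is ahead of the local physical clock, which is why perfectly synchronized hardware clocks are not required for ordering.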
- [keephq/keep](https://awesome-repositories.com/repository/keephq-keep.md) (11,938 ⭐) — Keep is an open-source AIOps alert management platform that aggregates, deduplicates, and orchestrates the lifecycle of alerts from multiple monitoring tools. It functions as a multi-provider integration hub to centralize the flow of data between observability, ticketing, and communication tools.

The platform distinguishes itself through incident workflow automation and AI-powered enrichment. It uses a declarative workflow engine to execute multi-step operational sequences and integrates large language models to summarize event data and correlate technical logs for faster incident resolution.

The system provides broader capabilities for unified alert routing and bi-directional state synchronization across external platforms. It includes a containerized observability stack for telemetry and employs role-based access control and database-backed authentication to secure system entry.

The platform is deployed as a series of containerized services, including frontend, backend, and websocket layers.
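Alert deduplication of the kind described for Keep is commonly done by fingerprinting the fields that identify an alert. The sketch below shows that general technique under stated assumptions: the field names (`source`, `name`, `host`) and both functions are hypothetical and are not taken from Keep's codebase.

```python
import hashlib
import json


def fingerprint(alert: dict, fields=("source", "name", "host")) -> str:
    """Hash the identifying fields of an alert into a stable key."""
    key = json.dumps([alert.get(field) for field in fields])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def deduplicate(alerts, fields=("source", "name", "host")):
    """Keep only the most recent alert for each fingerprint."""
    latest = {}
    for alert in alerts:  # later alerts overwrite earlier ones with the same identity
        latest[fingerprint(alert, fields)] = alert
    return list(latest.values())
```

Fields such as status or timestamp are deliberately left out of the fingerprint so that a firing alert and its later resolution collapse into one entry.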
- [datadog/terraform-provider-datadog](https://awesome-repositories.com/repository/datadog-terraform-provider-datadog.md) (451 ⭐) — Terraform Datadog provider
- [langfuse/langfuse](https://awesome-repositories.com/repository/langfuse-langfuse.md) (29,190 ⭐) — Langfuse is an open-source observability and evaluation platform designed for language model applications. It provides a centralized system for tracking execution traces, monitoring performance metrics, and managing prompt templates. By capturing hierarchical units of work and telemetry data, the platform enables developers to debug complex application lifecycles and analyze token usage, latency, and model interactions in production environments.

The platform distinguishes itself through an integrated evaluation framework that allows for systematic benchmarking and automated scoring of model outputs. Users can perform comparative experimentation by running multiple prompt or model versions side-by-side, and convert production traces into versioned test datasets to validate performance against ground truth. A dedicated prompt management system further decouples logic from application code, offering a playground for refinement and dynamic fetching of versioned templates.

Beyond core observability, the project supports a comprehensive suite of administrative and operational tools, including organizational access controls, identity provider integration, and automated workflow triggers. It is built for flexible deployment, supporting containerized orchestration in private, cloud, or Kubernetes-based environments to ensure data control and high-availability scaling.

The platform is designed for self-hosting and provides infrastructure-as-code templates to facilitate consistent environment setup. It integrates with standard observability ecosystems through OpenTelemetry support and offers programmatic interfaces for headless management and automated deployment workflows.
- [datadog/jmeter-datadog-backend-listener](https://awesome-repositories.com/repository/datadog-jmeter-datadog-backend-listener.md) (0 ⭐) — Datadog Backend Listener for Apache JMeter is a JMeter plugin used to send test results to the Datadog platform.
- [formbricks/formbricks](https://awesome-repositories.com/repository/formbricks-formbricks.md) (12,391 ⭐) — Formbricks is an open-source survey and feedback platform designed to help teams capture and analyze user insights through targeted, in-app, and website-based interactions. It functions as a comprehensive customer experience analytics system that allows organizations to maintain full control over their data, user attributes, and survey workflows.

The platform distinguishes itself through its event-driven architecture, which enables precise behavioral targeting by triggering surveys based on specific user actions or application events. It supports deep integration with external ecosystems by automatically synchronizing response data to CRMs, databases, and communication tools, while providing programmatic interfaces for managing resources and automating feedback loops.

Beyond core collection, the system includes advanced logic for conditional branching, scoring, and personalized routing to create adaptive survey experiences. It offers extensive customization options, including white-labeling, CSS overrides, and multi-channel distribution across web, mobile, and email environments.

The platform is built for self-hosting, supporting containerized deployments with built-in multi-tenant data isolation and enterprise-grade security features like single sign-on and role-based access control.
- [pgsty/pigsty](https://awesome-repositories.com/repository/pgsty-pigsty.md) (4,703 ⭐) — Pigsty is a full-stack orchestration suite for deploying, monitoring, and managing high-availability PostgreSQL clusters and their supporting infrastructure. It functions as a cluster management platform and high-availability suite that automates failover, manages virtual IPs, and ensures data consistency through distributed consensus.

The project distinguishes itself by providing a comprehensive database infrastructure-as-code framework and a dedicated observability stack. It incorporates a backup and recovery manager supporting point-in-time recovery via S3-compatible object storage, alongside compatibility layers that allow PostgreSQL to emulate the wire protocols of Oracle, MySQL, and MongoDB.

Its broader capabilities cover database security hardening through role-based access control and traffic encryption, performance tuning for specific workloads, and advanced traffic management via connection pooling and load balancing. The platform also supports the deployment of integrated components such as Redis, Kafka, and vector search for retrieval-augmented generation tasks.

The system uses idempotent playbooks for infrastructure automation and provides a graphical user interface for cluster administration and web-based database exploration.
- [prometheus-operator/prometheus-operator](https://awesome-repositories.com/repository/prometheus-operator-prometheus-operator.md) (9,941 ⭐) — The Prometheus Operator is a Kubernetes monitoring orchestrator and controller that manages Prometheus clusters and observability components through declarative custom resources. It functions as a custom resource controller that translates high-level Kubernetes resource definitions into the configuration files required by the underlying monitoring software.

The project automates the deployment, scaling, and lifecycle of an observability stack, including the integration of components like Thanos and Alertmanager. It distinguishes itself by syncing monitoring targets, alerting rules, and scrape configurations directly via the Kubernetes API to maintain a consistent desired state across the cluster.

The system covers several capability areas, including automated target discovery via label queries, declarative alerting and recording rule management, and the configuration of remote storage endpoints. It also handles infrastructure state management, synthetic endpoint probing, and the synchronization of notification routing and receivers.

Resource correctness is maintained through admission webhooks that validate configuration rules and resource schemas before they are persisted to the cluster.
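Target discovery via label queries, as described for the Prometheus Operator, comes down to matching a selector against resource labels. The sketch below covers only equality-based matching and is not the operator's code: `matches`, `discover_targets`, and the service records are hypothetical.

```python
def matches(selector: dict, labels: dict) -> bool:
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def discover_targets(selector: dict, services: list) -> list:
    """Return the names of services whose labels satisfy the selector."""
    return [svc["name"] for svc in services if matches(selector, svc["labels"])]
```

An empty selector matches every service here, so a real configuration would normally scope discovery with at least one label.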
- [sindresorhus/observable-to-promise](https://awesome-repositories.com/repository/sindresorhus-observable-to-promise.md) (51 ⭐) — Convert an Observable to a Promise
- [hoppscotch/hoppscotch](https://awesome-repositories.com/repository/hoppscotch-hoppscotch.md) (79,618 ⭐) — Hoppscotch is an open-source API development ecosystem designed for building, testing, and debugging REST, GraphQL, and real-time APIs. It provides a unified platform that functions across web browsers, desktop applications, and command-line interfaces, allowing developers to manage the entire API lifecycle from a single environment.

The platform distinguishes itself through a highly interactive, command-driven interface that utilizes a global spotlight palette and keyboard shortcuts to streamline complex workflows. It supports advanced request manipulation and validation by executing JavaScript-based scripts and assertions within a sandboxed runtime. Furthermore, it integrates AI-assisted tools to automate the generation of request payloads, test scripts, and documentation, while maintaining compatibility with existing API definitions and collections from other formats.

Beyond core testing capabilities, the project offers a collaborative workspace for teams to organize, share, and synchronize API collections and environment variables. It includes robust support for diverse authorization methods, proxy interception for network requests, and enterprise-grade features such as SCIM user provisioning and activity auditing. The software is available for self-hosted deployment via containerized architectures, ensuring consistent behavior across various production and development environments.
- [coollabsio/coolify](https://awesome-repositories.com/repository/coollabsio-coolify.md) (57,055 ⭐) — This project is a self-hosted platform-as-a-service that provides a centralized management interface for deploying, configuring, and monitoring containerized applications and databases on private infrastructure. It functions as a visual control plane, automating the end-to-end lifecycle of services from source code to production. By managing container orchestration, networking, and resource allocation, it allows users to maintain full control over their own hardware while streamlining the delivery of software.

The platform distinguishes itself through its agentless architecture, which uses secure shell connections to execute administrative tasks and manage remote servers without requiring persistent local software. It integrates directly with version control systems to trigger automated build and deployment pipelines, including the creation of temporary, isolated preview environments for every pull request. This workflow is supported by a declarative engine that uses templates to standardize the deployment of complex multi-container architectures and persistent database engines.

Beyond core orchestration, the system handles the operational requirements of hosted services by managing dynamic reverse-proxy routing and automated SSL certificate lifecycles. It provides a comprehensive suite of infrastructure management tools, including browser-based terminal access for debugging, automated system dependency installation, and persistent state management via a central database. These capabilities ensure that infrastructure remains synchronized and consistent across multiple remote environments.
- [coleam00/local-ai-packaged](https://awesome-repositories.com/repository/coleam00-local-ai-packaged.md) (3,539 ⭐) — This project is a containerized local AI infrastructure stack designed to deploy large language models and vector databases on private hardware. It functions as an orchestration platform that combines AI runners, knowledge graphs, and a visual workflow builder for creating agentic chatflows and automating tasks via tool integration.

The platform distinguishes itself through a low-code approach to agent orchestration, utilizing a visual interface to design complex sequences and connect agents to external tools and search engines. It includes a dedicated local observability stack to track prompts, traces, and application performance, as well as hardware-specific optimization profiles to maximize inference speed on graphics processors and central processing units.

The system covers a broad range of operational capabilities, including retrieval-augmented generation via vector database storage, centralized traffic routing with reverse proxy encryption, and shared-volume filesystem mounting for local data synchronization. It also manages network exposure to toggle between private and public web traffic configurations.

The infrastructure is deployed as a pre-configured set of Docker-based services.
- [datadog/ddqa](https://awesome-repositories.com/repository/datadog-ddqa.md) (109 ⭐) — Datadog's QA manager for releases of GitHub repositories
- [datadog/datadog-agent](https://awesome-repositories.com/repository/datadog-datadog-agent.md) (3,519 ⭐) — The Datadog Agent is an infrastructure monitoring agent and host telemetry collector. It functions as a background process that gathers system metrics and application health data to send to a centralized monitoring platform.

The project operates as a plugin-based metric collector, using a modular system of independent check scripts to gather data from various third-party services and applications. It serves as a remote telemetry transmitter, providing a pipeline to stream infrastructure and system information to a remote analysis and alerting backend.

Its capabilities cover application performance monitoring, host resource tracking, and infrastructure performance monitoring. The agent collects low-level system telemetry from the operating system kernel and filesystem while aggregating application-level performance data to identify service degradation.
- [dubinc/dub](https://awesome-repositories.com/repository/dubinc-dub.md) (23,722 ⭐) — This project is a comprehensive link management and marketing attribution platform designed for creating, tracking, and analyzing shortened URLs. It functions as a centralized hub for marketing analytics, providing tools to monitor link performance, visualize conversion funnels, and manage affiliate programs through a unified dashboard.

The platform distinguishes itself by integrating advanced attribution modeling and partner management directly into the link infrastructure. It supports complex marketing workflows, including automated commission calculations, fraud detection, and payout distribution for affiliates, alongside granular traffic redirection based on device, location, or A/B testing requirements. By utilizing custom domains and reverse proxy configurations, it ensures reliable data collection that bypasses common browser-based tracking restrictions.

Beyond core link operations, the system offers extensive programmatic capabilities, including a robust API, SDKs, and event-driven webhooks for real-time integration with external services. It also incorporates enterprise-grade administrative features such as multi-tenant workspace isolation, role-based access control, and single sign-on integration to support collaborative team environments.

The platform is built to be deployed within private infrastructure, allowing organizations to maintain full control over their data and system configuration.
- [jasontaylordev/cleanarchitecture](https://awesome-repositories.com/repository/jasontaylordev-cleanarchitecture.md) (19,657 ⭐) — This project is a comprehensive template for building enterprise-grade applications using clean architecture principles. It provides a structured foundation that decouples core business logic from infrastructure concerns, ensuring that domain entities remain independent of specific frameworks or database implementations. By utilizing a mediator-based request dispatching pattern, the system separates state-mutating commands from read-only queries, promoting a clean separation of concerns across the entire codebase.

The architecture is organized into vertical slices, grouping related logic and dependencies into self-contained feature folders to prevent code bloat and simplify navigation. This approach is supported by an automated request pipeline that handles cross-cutting concerns such as validation, authorization, and logging consistently. The template also includes robust scaffolding tools that generate standardized project structures, allowing developers to quickly initialize multi-layered solutions with pre-configured database persistence and API integration.

Beyond the core structure, the project provides extensive tooling for the full development lifecycle. This includes automated database management, integrated service orchestration for local development, and a multi-layered testing suite that covers unit, integration, and acceptance scenarios. The framework also incorporates built-in observability features, such as request auditing, performance monitoring, and standardized error handling, to maintain system reliability and transparency in distributed environments.
- [appwrite/appwrite](https://awesome-repositories.com/repository/appwrite-appwrite.md) (56,318 ⭐) — Appwrite is a backend-as-a-service platform that provides a unified development environment for building full-stack applications. It integrates essential infrastructure components—including authentication, databases, storage, and serverless functions—into a single, centralized interface to simplify application development and resource management.

The platform distinguishes itself through a container-based microservices architecture that ensures consistent execution across diverse infrastructure. It features a versatile connectivity layer that links frontend applications with third-party services, databases, and external APIs through standardized interfaces. Developers can manage and automate the configuration of these backend resources using infrastructure-as-code tools, while granular role-based access control enforces security policies across all platform resources and API endpoints.

Beyond its core services, the platform offers a broad capability surface that includes cross-platform data synchronization, event-driven webhooks, and comprehensive billing and usage monitoring. It supports extensive integrations for AI utilities, payment processing, messaging, and logging, allowing developers to extend application functionality through modular, event-driven workflows.

The platform is designed for both managed and self-hosted deployments, providing tools for production environment optimization, data migration, and custom domain configuration.
- [quarkusio/quarkus](https://awesome-repositories.com/repository/quarkusio-quarkus.md) (15,479 ⭐) — Quarkus is a Kubernetes-native Java framework designed for building high-performance, memory-efficient applications. It utilizes ahead-of-time native compilation to transform Java code into standalone, optimized binaries that eliminate the need for a virtual machine, enabling rapid startup and reduced memory consumption. By performing code augmentation during the build phase, it shifts heavy processing tasks away from runtime, ensuring that applications are optimized for cloud-native environments.

The framework distinguishes itself through a unified approach to reactive and imperative programming, allowing developers to mix non-blocking, event-driven logic with traditional blocking code. It features a specialized dependency injection container optimized for build-time resolution and supports virtual thread concurrency to improve throughput in high-concurrency environments. Its container-native lifecycle management ensures seamless integration with cloud infrastructure, providing automated health monitoring and service orchestration.

Quarkus covers a broad capability surface, including comprehensive support for RESTful web services, event-driven messaging, and secure identity management. It integrates with standard enterprise specifications and provides extensive tooling for automated infrastructure provisioning, distributed tracing, and observability. The platform also includes a developer-focused dashboard and live-coding capabilities to streamline the development lifecycle.

The project provides extensive documentation and a modular extension system that allows developers to add features while maintaining native compatibility. It is designed to be installed and managed through standard build automation tools, supporting a wide range of deployment targets including serverless functions and managed Kubernetes clusters.
- [datawranglerai/self-host-n8n-on-gcr](https://awesome-repositories.com/repository/datawranglerai-self-host-n8n-on-gcr.md) (608 ⭐) — Self-host n8n on Google Cloud without the subscription fees or server headaches - because your automation workflows shouldn't cost more than your coffee budget
- [johanmorganti/osm-datadog](https://awesome-repositories.com/repository/johanmorganti-osm-datadog.md) (0 ⭐) — Monitoring OpenStreetMap with Datadog
- [gravitl/netmaker](https://awesome-repositories.com/repository/gravitl-netmaker.md) (11,630 ⭐) — Netmaker is a platform for automating and managing virtual mesh networks built on WireGuard. It functions as a centralized control plane that orchestrates encrypted, peer-to-peer tunnels across distributed infrastructure, including cloud environments, on-premise data centers, and containerized clusters. By automating the configuration of routing tables and access policies, the system enables secure, private connectivity between diverse devices and services without requiring manual network administration.

The platform distinguishes itself through its focus on zero-trust network access and software-defined perimeters, which hide network resources from the public internet while enforcing granular, identity-based security policies. It supports complex network topologies by providing dynamic relay-based routing for firewall-traversal and gateway-based bridging for isolated subnets. These capabilities allow for the creation of scalable, high-performance overlays that maintain consistent connectivity even when direct peer-to-peer paths are unavailable.

Beyond core connectivity, the project provides a comprehensive suite of management tools, including automated node provisioning, private service discovery via integrated DNS, and multi-tenant infrastructure support. It also offers robust observability features, such as administrative audit logging and network health monitoring, to ensure operational visibility. The entire networking stack can be self-hosted to maintain data sovereignty, and the platform integrates with external identity providers to streamline authentication and device onboarding.
- [linkedin/school-of-sre](https://awesome-repositories.com/repository/linkedin-school-of-sre.md) (8,093 ⭐) — This project is a comprehensive educational resource and curriculum focused on site reliability engineering, distributed systems, and infrastructure operations. It provides technical guides, a systems engineering course, and instructional manuals designed to teach the principles of managing large-scale computing environments.

The curriculum covers high-level architectural design for scalability and resilience, including fault-tolerant infrastructure, high-availability patterns, and microservices decomposition. It emphasizes the practical application of site reliability engineering through the study of system design, resource estimation, and the elimination of single points of failure.

The material extends into broad operational capabilities, including container orchestration, continuous integration and delivery pipelines, layered observability, and network routing. It also provides detailed instruction on Linux system administration, database management, security auditing, and the implementation of service level indicators and objectives.
- [datadog/dd-trace-java](https://awesome-repositories.com/repository/datadog-dd-trace-java.md) (724 ⭐) — Datadog APM client for Java
- [collabnix/dockerlabs](https://awesome-repositories.com/repository/collabnix-dockerlabs.md) (8,008 ⭐) — dockerlabs is a collection of educational labs and technical tutorials designed to teach the fundamentals of containerization and microservice architecture. It provides instructional material and hands-on exercises covering image optimization, security training, infrastructure setup, and cluster orchestration.

The project features specific courses and guides focused on reducing image size through multi-stage builds, securing workloads via vulnerability scanning and encrypted networks, and deploying multi-node clusters with high availability using Swarm orchestration.

The materials cover a broad range of operational capabilities, including container lifecycle management, persistent data storage, and complex networking configurations. They also include guidance on implementing observability stacks for monitoring and logging, as well as the administration of private image registries.
- [amruthpillai/reactive-resume](https://awesome-repositories.com/repository/amruthpillai-reactive-resume.md) (38,613 ⭐) — This project is a web-based platform designed for creating, managing, and sharing professional resumes. It functions as a structured document builder that integrates artificial intelligence to assist with content generation, editing, and analysis. Users can maintain a collection of resumes, customize their visual presentation through various templates, and export them into multiple formats for job applications.

The platform distinguishes itself through its autonomous AI agent capabilities, which can perform research, suggest incremental edits, and apply data patches directly to documents. It also provides a secure, self-hostable environment that allows users to maintain full control over their data and infrastructure. The system supports advanced authentication methods, including passkeys and federated identity providers, ensuring that personal and professional information remains protected.

Beyond core editing, the application includes tools for document organization, such as tagging, filtering, and legacy data migration. It features a robust document generation engine that separates content from design, allowing for precise layout control and styling. Users can share their resumes via password-protected public URLs and monitor document performance through integrated analytics.

The application is designed for containerized deployment, utilizing Docker Compose to facilitate consistent installation across private infrastructure. It includes built-in health monitoring and feature flagging to manage system performance and functionality without requiring code redeployments.
- [roberthein/observable](https://awesome-repositories.com/repository/roberthein-observable.md) (378 ⭐) — The easiest way to observe values in Swift.
- [macrozheng/springcloud-learning](https://awesome-repositories.com/repository/macrozheng-springcloud-learning.md) (6,924 ⭐) — This project is a reference implementation of a distributed system built using Spring Cloud Alibaba, Spring Boot, and JDK 17. It serves as a comprehensive model for implementing a microservices architecture.

The system integrates a wide range of distributed patterns, including global transaction coordination for data consistency, OAuth2 and JWT for identity management, and Kubernetes-based container orchestration. It features a dedicated observability stack for distributed request tracing, log aggregation, and service health monitoring.

The implementation covers several functional domains, including e-commerce operations such as product inventory management, order processing, and marketing campaign execution. It also incorporates technical capabilities for asynchronous message queuing, distributed data caching, full-text search, and cloud object storage.

The project provides deployment templates for Kubernetes to manage the scaling and reliability of the microservices cluster.
- [grpc-ecosystem/go-grpc-middleware](https://awesome-repositories.com/repository/grpc-ecosystem-go-grpc-middleware.md) (6,749 ⭐) — go-grpc-middleware is a gRPC middleware framework for Go designed to handle cross-cutting concerns, reliability, and observability. It provides a collection of interceptors that can be used to modify inbound and outbound calls to enforce system-wide policies.

The framework distinguishes itself through specialized toolkits for service reliability, including automatic retry logic for failed client calls and panic recovery mechanisms that translate runtime crashes into standard error responses. It also features an observability suite for collecting performance metrics and recording request activity via adapter-based logging.

The project covers broad capability areas including traffic management through rate limiting and timeout enforcement, as well as API integration for inbound message validation and request authentication verification. It employs architectural patterns such as interceptor chaining and conditional execution logic to control how middleware is applied to specific service methods.
- [cfahlgren1/observers](https://awesome-repositories.com/repository/cfahlgren1-observers.md) (0 ⭐) — A Lightweight Library for AI Observability
- [asciinema/asciinema](https://awesome-repositories.com/repository/asciinema-asciinema.md) (16,852 ⭐) — Asciinema is a platform for capturing, replaying, and sharing command-line sessions. It provides a comprehensive suite of tools to record terminal activity into lightweight, text-based files that preserve ANSI escape sequences, allowing users to document technical workflows, troubleshooting steps, and software demonstrations with high fidelity.

The project distinguishes itself through its versatile playback and distribution capabilities. It features a web-based player that renders interactive terminal sessions directly in the browser, supporting features like seeking, playback speed control, and custom visual themes. Beyond interactive playback, it includes utilities for converting recordings into animated images or videos, and provides infrastructure for self-hosting recording servers to maintain full control over data storage and security.

The platform supports a wide range of integration and automation needs, including embedding interactive sessions into technical documentation, broadcasting live terminal activity to remote viewers, and programmatically generating recordings via scripts. It also offers robust management tools for indexing, searching, and organizing historical session data.

The software is designed for flexible deployment, with server and storage components packaged into containerized units for independent hosting.
- [alibaba/higress](https://awesome-repositories.com/repository/alibaba-higress.md) (7,558 ⭐) — Higress is an AI API gateway and cloud-native traffic manager that functions as a Kubernetes ingress controller. It provides a centralized system for routing, securing, and optimizing traffic directed toward large language models, AI agents, and microservice architectures.

The project distinguishes itself through deep AI orchestration, including the ability to host and manage Model Context Protocol servers that transform REST APIs into tools for AI agents. It features specialized AI infrastructure for model request proxying, protocol translation across multiple providers, and semantic-based caching to reduce token consumption and latency.

Broad capabilities cover API lifecycle management and traffic control, including canary releases, load balancing, and rate limiting. The system includes a comprehensive security suite with WAF filtering, OIDC and OAuth2 identity integration, and automated TLS certificate management. Extensibility is provided via a WebAssembly-based plugin system that allows for hot-loading custom logic without interrupting traffic.

The gateway can be deployed to Kubernetes or Docker and supports the Kubernetes Gateway API and Ingress standards.
- [fosrl/pangolin](https://awesome-repositories.com/repository/fosrl-pangolin.md) (21,255 ⭐) — Pangolin is a zero-trust remote access platform designed to provide secure, identity-aware connectivity to private network resources. It functions as a cloud-native network controller that orchestrates encrypted tunnels, traffic routing, and access policies across distributed environments. By leveraging WireGuard for secure data transport, the platform enables authenticated access to internal web applications, terminal sessions, and remote desktops without exposing services to the public internet.

The platform distinguishes itself through a declarative infrastructure model that synchronizes network state using version-controlled manifests. It supports complex connectivity requirements through peer-to-peer NAT traversal, which facilitates direct encrypted connections between nodes, with automatic fallback to server-based relaying when necessary. Additionally, it provides browser-based access to remote resources, eliminating the need for local client software for many common administrative and service-access tasks.

Beyond its core tunneling capabilities, the platform includes a comprehensive suite of tools for traffic management, security, and observability. It features granular access control policies based on user identity, geolocation, and network attributes, alongside automated certificate management and multi-factor authentication. The system also provides extensive monitoring, audit logging, and alerting capabilities to track infrastructure health and security events across multi-site deployments.

Pangolin is designed for containerized and multi-site environments, offering flexible deployment options through standard packaging and automated reconciliation workflows.
- [n8n-io/self-hosted-ai-starter-kit](https://awesome-repositories.com/repository/n8n-io-self-hosted-ai-starter-kit.md) (14,997 ⭐) — This project provides a dockerized AI workflow stack and orchestration templates for deploying a self-hosted AI environment. It establishes a localized infrastructure for building autonomous agents and model chains that process private data on-premises without external cloud dependencies.

The environment is designed to support autonomous agent development, allowing models to dynamically select tools, execute shell commands, and interact with local file systems. It includes integrated vector database support to enable retrieval-augmented generation and private document analysis.

The stack covers a broad range of capabilities, including local model inference hosting, node-based workflow sequencing, and stateful conversation memory. It also incorporates text analysis tools for embedding generation, structured information extraction, and automated file system change triggers.
- [apache/incubator-skywalking](https://awesome-repositories.com/repository/apache-incubator-skywalking.md) (24,832 ⭐) — SkyWalking is a comprehensive observability stack and application performance monitoring platform. It functions as a distributed tracing system and an AI application monitor, providing a centralized suite for collecting and analyzing logs, metrics, and traces to maintain the health of containerized architectures.

The platform distinguishes itself through a service topology visualizer that renders interactive maps of infrastructure dependencies and communication patterns. It also includes specialized capabilities for generative AI workflow observation to track the execution flow and performance of AI components within a software stack.

The system covers a broad range of monitoring capabilities, including automated performance alerting driven by machine learning for anomaly detection. Its telemetry surface encompasses distributed request tracing, log pipeline management, and the aggregation of performance metrics for microservices and system resource profiling.
- [tpei/observable](https://awesome-repositories.com/repository/tpei-observable.md) (9 ⭐) — Implementation of the Observer pattern in Crystal
- [capsoftware/cap](https://awesome-repositories.com/repository/capsoftware-cap.md) (17,026 ⭐) — Cap is a self-hosted screen recording and video collaboration platform designed for teams to replace synchronous meetings with asynchronous video updates. It provides a comprehensive suite for capturing high-resolution desktop activity, including system audio, microphone input, and camera overlays, which are then processed through an integrated post-production workflow.

The platform distinguishes itself by offering full data sovereignty through containerized deployment and object storage abstractions, allowing users to host their media assets on private infrastructure or S3-compatible buckets. Beyond simple recording, it features keyframe-based video compositing, automated AI-powered transcription, and visual branding tools that enable creators to polish and annotate their content before sharing.

The system facilitates team engagement through a centralized workspace where viewers can provide feedback via timestamped comments, reactions, and playback analytics. It also includes programmatic interfaces for embedding videos into external applications, managing media assets, and automating distribution workflows.

The project is distributed as a containerized application, enabling deployment on private servers to maintain complete control over data storage and access permissions.
- [chatwoot/chatwoot](https://awesome-repositories.com/repository/chatwoot-chatwoot.md) (31,959 ⭐) — Chatwoot is a self-hosted, omnichannel customer support platform designed to aggregate messages from diverse social and digital channels into a single, collaborative team inbox. It provides organizations with full data ownership and control over their support infrastructure, ensuring strict logical separation of customer data through multi-tenant architecture. By centralizing communication, the platform enables teams to manage, route, and resolve inquiries within a unified workspace that maintains complete interaction history for every contact.

The platform distinguishes itself through an event-driven automation engine and a visual rule builder that allow teams to manage conversations and workflows without writing custom code. It incorporates intelligent features such as automated response drafting, conversation context recall, and a self-service knowledge base to improve agent efficiency. These capabilities are supported by granular role-based access controls and comprehensive performance analytics, which provide insights into agent productivity, inbox activity, and customer satisfaction trends.

Beyond its core messaging and routing functions, the system offers a broad suite of operational tools including proactive engagement triggers, team workload balancing, and multilingual support. It supports flexible deployment strategies, including containerized and cloud-native orchestration, to accommodate various production environments. The platform is designed for extensibility, allowing for custom attribute management and integration with external systems via webhooks and API-based channels.
- [kubesphere/kubesphere](https://awesome-repositories.com/repository/kubesphere-kubesphere.md) (16,842 ⭐) — KubeSphere is a distributed operating system for cloud-native application management that provides a centralized control plane for Kubernetes clusters. It functions as a comprehensive DevOps portal, enabling teams to orchestrate containerized workloads, manage CI/CD pipelines, and enforce security policies across hybrid cloud, datacenter, and edge environments.

The platform distinguishes itself through its multi-cluster federation capabilities and robust multi-tenancy model, which allow for logical resource isolation and granular access control across shared infrastructure. It integrates a modular plugin architecture that supports platform extensibility, enabling users to customize observability, storage, and security components to meet specific operational requirements.

Beyond core management, the platform provides a unified observability suite that aggregates metrics, logs, and distributed traces to visualize system health and microservice topology. It also includes advanced traffic governance tools, such as service mesh integration and automated release strategies, to maintain stability during application updates.

The project offers a web-based dashboard and a flexible installer to simplify the provisioning and administration of container platforms. It supports diverse infrastructure needs, ranging from bare metal load balancing to hardware accelerator management, through a unified graphical interface.
- [tc39/proposal-observable](https://awesome-repositories.com/repository/tc39-proposal-observable.md) (3,107 ⭐) — Observables for ECMAScript
- [getsentry/sentry](https://awesome-repositories.com/repository/getsentry-sentry.md) (44,108 ⭐) — This project is a comprehensive software observability suite and application performance monitoring platform designed to track runtime errors, performance bottlenecks, and system health. It functions as a centralized diagnostic service that aggregates and categorizes exceptions, providing the infrastructure necessary to visualize complex execution paths across distributed systems and microservices.

The platform distinguishes itself through a high-throughput distributed event ingestion pipeline and a columnar storage analytics engine that enables rapid aggregation of large-scale performance metrics. It utilizes runtime-level instrumentation hooks to capture execution data directly from the host environment and employs symbolication-based stack trace resolution to map minified code or raw memory addresses back to original source files. Furthermore, the system includes specialized capabilities for monitoring the operational performance of AI agents and ensuring sensitive data compliance through schema-driven scrubbing of incoming event payloads.

Beyond core error tracking and tracing, the platform supports a wide range of programming languages and frameworks, allowing for consistent visibility across diverse software architectures. It integrates with external services to automate incident response workflows and provides a command-line interface for managing releases, debug symbols, and project configurations. The system also features a modular, plugin-based architecture that facilitates connectivity with third-party tools for issue tracking and alerting.
- [redux-observable/redux-observable](https://awesome-repositories.com/repository/redux-observable-redux-observable.md) (0 ⭐) — RxJS-based middleware for Redux. Compose and cancel async actions to create side effects and more.
- [healthchecks/healthchecks](https://awesome-repositories.com/repository/healthchecks-healthchecks.md) (9,891 ⭐) — Healthchecks is a heartbeat monitoring service and cron job monitoring tool designed to track the execution and success of scheduled tasks and systemd timers. It functions as a dead man's switch, alerting users when expected periodic signals from remote processes fail to arrive.

The system accepts health signals via HTTP and SMTP, allowing it to track infrastructure heartbeats from sources ranging from CI/CD workflows to network routers. It distinguishes itself by supporting the capture of diagnostic data, including exit codes and execution logs, and by calculating the duration between start and success signals to detect hanging jobs.

The platform includes a health dashboard, status badge generation, and a Prometheus-compatible metrics exporter for external observability. Alerts are routed through a multi-channel notification system including webhooks and SMS, while large request payloads can be offloaded to S3-compatible object storage.

User security is managed through WebAuthn two-factor authentication and optional reverse proxy identity integration.
- [gitroomhq/postiz-app](https://awesome-repositories.com/repository/gitroomhq-postiz-app.md) (32,271 ⭐) — Postiz is an open-source social media management platform designed to centralize the scheduling, publishing, and analysis of content across diverse social networks, community forums, and blogging platforms. It functions as a unified hub where users can coordinate, review, and distribute content through a shared team workspace, while leveraging integrated artificial intelligence to assist in drafting text and generating multimedia assets.

The platform distinguishes itself through a modular architecture that utilizes a provider-specific adapter pattern to ensure consistent content distribution across various external services. It incorporates an AI-driven tool execution model that connects natural language models to internal functions, enabling automated content generation and media configuration. Furthermore, the system provides a programmatic API gateway that allows external applications to interact with its scheduling and management features via structured payloads.

Beyond core scheduling, the platform includes comprehensive tools for performance tracking, media storage abstraction, and collaborative workflows. It supports complex content strategies through features like multi-part thread scheduling and automated campaign execution, while maintaining secure identity management through OAuth-based mediation and support for external identity providers.

The application is designed for self-hosting and can be deployed into containerized environments using provided configuration charts.
- [dataelement/bisheng](https://awesome-repositories.com/repository/dataelement-bisheng.md) (11,455 ⭐) — Bisheng is an enterprise AI framework and LLM DevOps platform designed to manage the full lifecycle of large language models. It provides a unified system for dataset curation, supervised fine-tuning, model versioning, and performance evaluation.

The platform features a visual workflow orchestrator for building retrieval-augmented generation pipelines and complex task sequences using flowcharts with conditional logic and human intervention points. It also includes an AI agent framework that uses a specialized guidance language to embed domain expertise and professional business logic into autonomous agents.

The system covers comprehensive enterprise AI governance through role-based access control, single sign-on, and integrated observability tools for monitoring system health and traffic. Additional capabilities include layout-aware document parsing for extracting text and tables from printed or handwritten sources and high-availability infrastructure deployment.
