# Streaming, queues and change data capture

> Search results for `Streaming, queues and change data capture` on awesome-repositories.com. 120 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/streaming-queues-and-change-data-capture

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/streaming-queues-and-change-data-capture).**

## Results

- [mvdctop/movie_data_capture](https://awesome-repositories.com/repository/mvdctop-movie-data-capture.md) (7,405 ⭐) — Movie Data Capture is a media library organizer and movie metadata scraper designed to automatically categorize and name files in a local media collection. It functions as an automated content tagger that identifies movie files and applies descriptive tags by extracting film details from web databases.

The system utilizes an HTTP web scraper to fetch information from external APIs and remote HTML content. It employs a filename pattern parser to extract movie titles and release years from local files using regular expressions, which are then used to automate search queries.

The tool maps scraped metadata to folders on a local file system and persists movie details and organization mappings using a JSON data store. These capabilities support home media server management by ensuring local titles are matched with correct descriptions and technical details.
- [debezium/debezium](https://awesome-repositories.com/repository/debezium-debezium.md) (12,421 ⭐) — Debezium is a distributed change data capture platform that streams row-level database modifications as real-time events. By parsing database transaction logs, the system broadcasts structural and data changes to message brokers, enabling reactive processing and data integration across distributed architectures.

The platform uses log-based capture to extract modifications directly from transaction logs, which keeps the impact on source system performance low and preserves the original commit order of operations. It employs database-specific connectors to translate proprietary binary log formats into a unified event structure, with schema-registry-backed serialization to keep data definitions consistent. To establish a complete baseline for synchronization, the system takes an initial snapshot of the existing data before switching to continuous event streaming.

The tool supports a broad range of data integration tasks, including the maintenance of analytical stores and the synchronization of data across operational systems. Users can refine the data stream by applying filters to include or exclude specific tables, columns, or data types, and the system maintains an accurate representation of data models by parsing structural statements during the capture process.

The project is most commonly deployed as a set of source connectors for Kafka Connect, which makes it straightforward to integrate into existing event-driven pipelines.
- [clickhouse/clickhouse](https://awesome-repositories.com/repository/clickhouse-clickhouse.md) (48,229 ⭐) — ClickHouse is a high-performance, columnar analytical database designed for real-time query execution and large-scale data aggregation. It functions as a distributed data warehouse capable of processing petabytes of information, while also providing an embedded engine that integrates directly into applications for native query capabilities without external dependencies. The system is built to handle high-throughput ingestion and complex analytical workloads, delivering millisecond-level latency for interactive dashboards and operational monitoring.

The platform distinguishes itself through advanced storage and execution techniques, including vectorized query processing and a merge tree storage engine that maintains performance during massive insertions. It features adaptive subcolumn mapping for semi-structured data and supports native vector search for machine learning and generative AI applications. To facilitate efficient data movement, the engine utilizes zero-copy shared memory buffers, minimizing overhead when interacting with external analytical tools or processing diverse file formats like Parquet, JSON, and Arrow.

Beyond its core storage and processing capabilities, the project provides a comprehensive suite of tools for observability, security, and data integration. It includes built-in support for natural language querying, automated workflow orchestration for AI agents, and extensive diagnostic features for query plan inspection. The platform also offers robust cloud infrastructure management, including support for private networking, compliant deployment strategies, and integrated billing consolidation.
- [kkulma/climate-change-data](https://awesome-repositories.com/repository/kkulma-climate-change-data.md) (0 ⭐) — ML open projects - APIs - Open Data
- [fastapi/fastapi](https://awesome-repositories.com/repository/fastapi-fastapi.md) (99,260 ⭐) — FastAPI is a web framework for building APIs with Python. It leverages standard language type hints to provide automatic data validation, request parsing, and interactive API documentation generation. The framework supports asynchronous request handling and manages execution contexts to prevent blocking the main event loop.

The project includes a dependency injection system that allows for the resolution and injection of reusable components into request handlers. This system supports request-scoped caching, lifecycle management, and integration with security mechanisms like OAuth2 and JSON Web Tokens. Developers can organize applications into modular routers and mount sub-applications to manage complex routing logic.

Infrastructure features include middleware support for cross-origin resource sharing, background task management, and static file serving. The framework automatically generates OpenAPI specifications for defined endpoints, which can be customized through metadata and schema extensions. Testing utilities are provided to simulate HTTP and WebSocket connections, allowing for isolated verification of application behavior.
- [supabase/supabase](https://awesome-repositories.com/repository/supabase-supabase.md) (104,317 ⭐) — This project provides an integrated backend platform built around a relational database. It automatically generates REST and GraphQL APIs from database schemas, allowing for direct data interaction through standard requests and client libraries. The platform includes a comprehensive authentication system that manages user identity, session handling, and fine-grained access control through database-native row-level security policies.

Beyond core data management, the platform offers specialized services for object storage, vector data processing for semantic search, and real-time communication features like broadcast messaging and database change subscriptions. It also supports server-side logic execution through globally distributed edge functions, database-resident functions, and a native job scheduler for automated tasks.

Developers can manage the entire project lifecycle using a command-line interface and containerized local development environments. The platform supports both managed cloud services and self-hosted deployments, providing options for infrastructure control and data sovereignty.
- [avelino/awesome-go](https://awesome-repositories.com/repository/avelino-awesome-go.md) (175,576 ⭐) — This project serves as a comprehensive language ecosystem index, functioning as a centralized, community-curated directory for the Go programming language. It organizes a vast landscape of software components, libraries, and development tools into a structured, navigable hierarchy, enabling developers to efficiently discover resources tailored to specific functional domains.

The repository distinguishes itself through a decentralized contribution model, where community-driven updates ensure the index remains current with the rapidly evolving software landscape. Beyond simple resource listing, it acts as a technical knowledge repository, aggregating professional literature, style guides, and best practices to support developer onboarding and professional growth across the entire software development lifecycle.

The directory covers a broad capability surface, including essential utilities for distributed systems engineering, application security, data processing, and development productivity. It provides access to specialized tools for database management, web framework integration, testing, and build automation, alongside educational materials that help developers master language-specific architectural patterns.

The project is maintained as a static resource aggregation, providing a holistic view of external links and documentation to orient developers within the Go ecosystem.
- [bee-queue/bee-queue](https://awesome-repositories.com/repository/bee-queue-bee-queue.md) (4,032 ⭐) — A simple, fast, robust job/task queue for Node.js, backed by Redis.
- [permify/permify](https://awesome-repositories.com/repository/permify-permify.md) (5,812 ⭐)
- [princemaple/elixir-queue](https://awesome-repositories.com/repository/princemaple-elixir-queue.md) (34 ⭐) — Queue data structure for Elixir-lang
- [datahub-project/datahub](https://awesome-repositories.com/repository/datahub-project-datahub.md) (12,141 ⭐) — DataHub is a metadata management platform designed to unify technical, operational, and business context across diverse data ecosystems. By utilizing a graph-based metadata model and an event-driven ingestion architecture, it creates a centralized source of truth that maps complex data relationships, lineage, and ownership. This foundational framework enables organizations to maintain a synchronized view of their data landscape, supporting both human-led discovery and automated data operations.

The platform distinguishes itself through its focus on grounding artificial intelligence and autonomous agents in verified enterprise context. It provides specialized capabilities to inject provenance-aware lineage, business definitions, and quality signals into AI prompts, ensuring that generated insights are accurate and trustworthy. Through a policy-as-code governance engine, it enforces access controls and compliance rules directly within the metadata graph, allowing for programmatic oversight of data assets across hybrid environments.

Beyond its core identity, the project offers a comprehensive suite of tools for data discovery, observability, and lifecycle management. It includes features for automated lineage extraction, impact analysis, and semantic search, enabling users to navigate data dependencies and resolve quality issues efficiently. The platform also supports collaborative workflows, allowing teams to manage business glossaries, certify data assets, and automate access requests through integrated communication channels.

DataHub is built to scale, utilizing a distributed architecture that allows storage, search, and graph processing layers to operate independently. It provides standardized interfaces and a bridge-based connector framework to facilitate integration with heterogeneous data sources and external AI agent frameworks.
- [crystal-lang/crystal](https://awesome-repositories.com/repository/crystal-lang-crystal.md) (20,299 ⭐) — Crystal is a statically typed, compiled programming language designed for high performance and memory safety. It leverages an LLVM-based compiler to translate source code into optimized machine-executable binaries, while its type-inference-based static analysis enforces strict safety rules during the build process.

The language distinguishes itself through a fiber-based concurrent runtime that manages lightweight execution units for asynchronous input and output without blocking the main process. It also features a powerful compile-time macro system that allows for the inspection and transformation of the abstract syntax tree, enabling developers to automate repetitive tasks and generate code dynamically during compilation. Furthermore, Crystal provides a native foreign function interface that maps native memory layouts and function signatures to local identifiers, facilitating direct interaction with external system libraries.

Beyond its core language features, Crystal includes a comprehensive suite of tooling for the entire software lifecycle. This includes dependency management, automated testing frameworks, documentation generation, and project scaffolding utilities. The ecosystem supports high-performance systems programming, cross-architecture compilation, and the production of statically linked binaries to simplify deployment across diverse environments.
- [cakephp/queue](https://awesome-repositories.com/repository/cakephp-queue.md) (0 ⭐) — This is a Queue system for CakePHP.
- [flowiseai/flowise](https://awesome-repositories.com/repository/flowiseai-flowise.md) (53,641 ⭐) — Flowise is a low-code platform designed for building and deploying complex language model workflows through a visual, node-based interface. It functions as an orchestrator for autonomous multi-agent systems, allowing users to construct conversational pipelines by connecting language models, memory stores, and external tools on a drag-and-drop canvas.

The platform distinguishes itself through its support for sophisticated agentic patterns, including supervisor-worker delegation and iterative reasoning strategies. Users can design directed acyclic graphs to manage conditional branching, state persistence, and complex task distribution. It also provides a robust framework for retrieval-augmented generation, enabling the creation of self-correcting systems that can index document data and validate information autonomously.

Beyond its visual design capabilities, the project serves as a comprehensive backend for AI applications. It includes a secure credential management layer for third-party API keys, role-based access controls, and a RESTful API that allows for programmatic management of chat sessions, workflows, and assistant configurations.

The application is designed for flexible deployment, supporting containerized environments for consistent operation across local and cloud infrastructure. Detailed documentation and tutorials are available to guide users through the lifecycle of building, testing, and scaling production-ready AI agents.
- [ibis-project/ibis](https://awesome-repositories.com/repository/ibis-project-ibis.md) (6,574 ⭐) — Ibis is a portable Python dataframe library and multi-backend query engine that provides a unified interface for executing data transformations across diverse compute engines. It functions as a Python SQL expression compiler and dialect transpiler, allowing users to define data logic once and execute it across cloud warehouses, embedded databases, and distributed clusters without rewriting code.

The project distinguishes itself through a database backend abstraction that decouples transformation logic from the underlying execution engine. It enables polyglot data workflows by mixing raw SQL strings with programmatic Python expressions and provides a dataframe interop layer to convert query results between various memory formats.

Its capability surface covers a broad range of data analysis operations, including row filtering, aggregation, and geospatial processing, as well as comprehensive schema and database hierarchy management. The system also supports data streaming through a unified batch and streaming API and integrates with cloud storage and various file formats for data import and export.
- [encoredev/encore](https://awesome-repositories.com/repository/encoredev-encore.md) (12,049 ⭐) — Encore is a distributed systems framework designed to unify backend development, infrastructure provisioning, and observability. It functions as an infrastructure-as-code platform that allows developers to define cloud resources, databases, and messaging topics directly within their application code. By analyzing these declarations at compile-time, the system automatically manages the deployment of cloud resources and security policies, ensuring parity between local development and production environments.

The platform distinguishes itself through its integrated development experience, which includes a local workspace that mirrors production infrastructure to facilitate testing and debugging. It provides automated AI-assisted development tools that leverage application metadata and runtime telemetry to aid in code generation and performance analysis. Furthermore, the framework enforces architectural standards and automates the creation of ephemeral, production-like environments for every pull request, streamlining the validation process before deployment.

Beyond its core orchestration capabilities, the framework includes a comprehensive suite for building type-safe APIs and event-driven services. It handles the complexities of service communication, including automated client library generation, request validation, and distributed tracing instrumentation. The system also incorporates robust security primitives, such as identity token validation, secret management, and automated traffic control, to support the development of secure, scalable backend architectures.
- [arroyosystems/arroyo](https://awesome-repositories.com/repository/arroyosystems-arroyo.md) (4,819 ⭐) — Arroyo is a high-performance stream processing platform built in Rust. It executes continuous SQL queries on streaming data with event-time semantics, enabling accurate windowed aggregations, joins, and stateful computations on unbounded event streams. The platform uses native Rust execution for high throughput and low latency, with periodic checkpointing for exactly-once fault tolerance and horizontal scaling across distributed workers.

The system integrates deeply with Kafka for reading and writing topics with exactly-once delivery and supports change data capture (CDC) from MySQL and Postgres databases via Debezium. A wide range of source and sink connectors covers systems such as Kinesis, Redis, Delta Lake, Iceberg, MQTT, NATS, and more. SQL pipelines can be defined ad hoc or as derived streams, with support for user-defined functions written in Rust or Python for custom transformation logic. Deployment is managed through a web UI, CLI, and REST API, with options for single-node, multi-node, or Kubernetes clusters using Helm.

Event-time processing includes watermarking to handle out-of-order data and supports tumbling, sliding, and session windows. The engine provides comprehensive SQL functions for string manipulation, timestamp arithmetic, JSON and array operations, data type conversion, and mathematical computations. Additional operational features include anomaly detection by counting events over time windows, synthetic data generation for testing, and authentication and TLS encryption for secure access.
- [adrianbrad/queue](https://awesome-repositories.com/repository/adrianbrad-queue.md) (357 ⭐) — ⏪️ Go package providing multiple queue implementations. Developed in a thread-safe generic way.
- [cockroachdb/cockroach](https://awesome-repositories.com/repository/cockroachdb-cockroach.md) (32,207 ⭐) — CockroachDB is a distributed SQL database designed to scale horizontally across multiple nodes while maintaining strict ACID compliance and global data consistency. It functions as a relational database engine that automatically partitions data into ranges, rebalancing them across a cluster to accommodate growing storage and throughput requirements. By utilizing a distributed consensus protocol, the system ensures that all nodes agree on the order of operations, providing fault tolerance and continuous availability even in the event of hardware failures.

The system distinguishes itself through a layered architecture that separates the relational SQL abstraction from a distributed key-value store. It achieves global consistency without requiring perfectly synchronized hardware clocks by employing a hybrid logical clock synchronization mechanism. To support high-concurrency environments, it utilizes multi-version concurrency control and lock-free transaction execution, which allow for consistent snapshots and efficient conflict resolution. Furthermore, the engine is built for compatibility, implementing the PostgreSQL wire protocol to support existing relational database drivers and tools.

Beyond its core transactional capabilities, the platform includes comprehensive tooling for cluster orchestration, security, and performance diagnostics. It supports a variety of deployment models, ranging from self-hosted on-premises configurations to fully managed cloud services. The system provides a command-line interface for session management and query execution, ensuring that administrators can monitor cluster health and manage workloads through standard relational interfaces.
- [aws/aws-cdk](https://awesome-repositories.com/repository/aws-aws-cdk.md) (12,817 ⭐) — The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane.

The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It employs a language-agnostic intermediate representation to synthesize these definitions into platform-specific configurations, while supporting aspect-oriented policy injection to apply security and compliance rules across infrastructure definitions during the synthesis phase.

Beyond core provisioning, the project provides a modular component registry for distributing and reusing pre-configured infrastructure building blocks. It supports multi-account orchestration, allowing for the deployment of consistent resource sets across different regions and accounts from a single template, and includes capabilities for detecting infrastructure drift to ensure deployed environments remain aligned with their defined state.

The project is distributed as a software development kit, providing programmatic interfaces to manage the full lifecycle of cloud resources and integrate infrastructure definitions directly into application codebases.
- [diamondio/better-queue](https://awesome-repositories.com/repository/diamondio-better-queue.md) (549 ⭐) — Better Queue for NodeJS
- [gocolly/colly](https://awesome-repositories.com/repository/gocolly-colly.md) (25,101 ⭐) — Colly is a high-performance web scraping framework designed for the automated extraction of structured data from websites. It provides a programmable toolkit that manages the complexities of large-scale data collection, including concurrent request orchestration, automatic cookie handling, and robots.txt compliance. By utilizing an asynchronous execution model, the engine maintains high throughput while preventing resource exhaustion during recursive or distributed crawling tasks.

The framework is distinguished by its modular, event-driven architecture, which allows developers to hook into specific lifecycle stages of a network request to process content or control flow. It features a flexible middleware pipeline for handling proxy rotation, user agents, and rate limiting, alongside an interface-driven storage layer that supports swapping default in-memory state for persistent external databases. This design enables the coordination of multiple scraping instances and the maintenance of crawl history across application restarts.

Beyond its core engine, the project offers extensive customization options for network transport, including support for custom round-trippers to manage connection pooling and timeouts. It also provides robust observability tools, allowing for the attachment of custom debuggers and logging observers to monitor internal state during execution. Developers can further extend functionality through a plugin system or by sharing request context and configuration across different collector instances to support complex, multi-stage data extraction workflows.
- [electric-sql/electric](https://awesome-repositories.com/repository/electric-sql-electric.md) (9,909 ⭐) — Electric is a Postgres data synchronization engine and replication proxy designed to enable local-first software. It replicates data from Postgres databases to client-side stores in real time using logical replication, allowing applications to maintain a local embedded database for offline access and low-latency updates.

The system distinguishes itself by using shapes to filter and authorize specific subsets of database rows and columns before streaming them to clients or edge workers. It further supports multi-user collaboration by integrating a conflict-free replicated data type framework to ensure consistent state synchronization across different users.

The project covers a broad range of capabilities, including reactive state management and real-time data streaming to client interfaces and server-side renders. It provides tools for data shaping and transformation, database integration across various cloud and serverless Postgres providers, and security primitives such as token-based authorization and end-to-end encryption.

The service can be deployed as a containerized web service on cloud platforms with support for rolling deployment management.
- [mosaicml/streaming](https://awesome-repositories.com/repository/mosaicml-streaming.md) (1,521 ⭐) — A Data Streaming Library for Efficient Neural Network Training
- [ariya/phantomjs](https://awesome-repositories.com/repository/ariya-phantomjs.md) (29,489 ⭐) — PhantomJS is a scriptable, headless browser engine based on WebKit that provides a programmatic interface for automating web page interactions. It operates without a graphical user interface, allowing for the execution of JavaScript to navigate pages, manipulate the document object model, and perform functional testing of web applications.

The tool distinguishes itself by providing low-level control over the browser rendering lifecycle and network stack. It enables real-time interception and modification of network traffic, alongside the ability to generate visual snapshots and document exports from pages that rely on complex dynamic content. By maintaining a virtual display buffer and running the engine in an isolated memory space, it ensures consistent layout calculations and stability during automated sessions.

Beyond its core rendering capabilities, the project supports complex automation workflows through command-line configuration and inter-process communication. These features facilitate the integration of browser-based tasks into larger software systems, enabling automated data extraction, performance analysis, and the verification of web application behavior.
- [rqlite/rqlite](https://awesome-repositories.com/repository/rqlite-rqlite.md) (17,586 ⭐) — rqlite is a distributed relational database that replicates SQLite data across a cluster using the Raft consensus algorithm. It functions as a fault-tolerant storage system that provides high availability and a web API for executing SQL queries and managing relational data without requiring native database drivers.

The system distinguishes itself by using an HTTP SQL interface to expose database operations and cluster management. It features a real-time change data capture stream that pushes database mutations to external HTTP endpoints via webhooks and supports the scaling of read throughput through non-voting read replicas.

The project covers a broad range of distributed capabilities, including automated cluster discovery via DNS or Consul, TLS-encrypted transport for inter-node communication, and atomic request execution. It also includes tools for point-in-time snapshot backups, node health monitoring, and cluster leadership transfer.
- [mafintosh/stream-shift](https://awesome-repositories.com/repository/mafintosh-stream-shift.md) (0 ⭐) — Returns the next buffer/object in a stream's readable queue
- [pubkey/rxdb](https://awesome-repositories.com/repository/pubkey-rxdb.md) (23,048 ⭐) — This project is a reactive, offline-first NoSQL database engine designed for JavaScript applications. It provides a robust framework for managing application state by synchronizing data across browsers, mobile devices, and server-side runtimes. By treating local storage as the primary source of truth, it enables applications to remain functional without network connectivity, automatically reconciling changes with remote backends once a connection is restored.

The database distinguishes itself through a modular architecture that supports cross-environment synchronization and high-performance data management. It features a bidirectional replication protocol that handles conflict resolution and state convergence, alongside a pluggable storage abstraction that allows developers to swap between engines like IndexedDB, SQLite, or in-memory stores without altering application logic. To ensure responsiveness, the system offloads storage operations to background worker threads and coordinates database access across multiple browser tabs through a leader election mechanism.

The platform offers a comprehensive suite of capabilities for data integrity, performance, and security. It enforces strict data validation through schema-based definitions and optimizes storage footprints using transparent key compression. Developers can bind database query results directly to user interface components, enabling reactive state management where the UI automatically updates in response to local or remote data changes.

The project is built for extensibility, offering a wide range of plugins for encryption, full-text search, and integration with various backend protocols including GraphQL, REST, and peer-to-peer channels. It provides extensive documentation and standardized interfaces to facilitate integration into diverse application architectures.
- [sindresorhus/p-queue](https://awesome-repositories.com/repository/sindresorhus-p-queue.md) (4,217 ⭐) — Promise queue with concurrency control
- [google/comprehensive-rust](https://awesome-repositories.com/repository/google-comprehensive-rust.md) (33,049 ⭐) — Comprehensive Rust is a structured educational curriculum designed to teach the Rust programming language, focusing on its core principles of memory safety, performance, and type correctness. The project provides a comprehensive learning path for software engineers, covering the language's ownership model, borrow checking, and compile-time validation mechanisms that eliminate common memory-related errors without the need for a garbage collector.

The curriculum distinguishes itself by offering specialized modules that demonstrate how to apply these safety guarantees in diverse, high-performance environments. It includes dedicated training for systems programming, bare-metal development, and integration strategies for large-scale projects like Android and Chromium. By combining technical documentation with practical code examples, the resource helps developers transition to memory-safe systems development while mastering idiomatic patterns.

The materials cover the full breadth of the language, including its type system, generic programming, error handling, and concurrency primitives. It also addresses advanced topics such as metaprogramming, smart pointers, and the controlled use of unsafe blocks for low-level hardware access. The project is designed as a self-contained training resource, providing the necessary context and exercises to build proficiency in writing efficient, reliable software.
- [rethinkdb/rethinkdb](https://awesome-repositories.com/repository/rethinkdb-rethinkdb.md) (26,996 ⭐) — RethinkDB is a distributed, document-oriented database designed to store and manage JSON-formatted data across scalable clusters. It utilizes a custom log-structured storage engine with B-Tree indexing to ensure high-performance disk I/O and data persistence. The system maintains high availability through automatic sharding and replication, employing a primary-replica voting consensus mechanism to handle node failures and ensure consistent cluster operations.

A defining characteristic of the platform is its reactive changefeed engine, which allows applications to subscribe to live data updates. Instead of polling for changes, developers can maintain persistent cursors on tables to stream document modifications in real-time. This is complemented by a fluent, functional query language that translates native code constructs into optimized, parallelized execution plans. By embedding these queries directly into application code, the system provides a type-safe interface that helps prevent injection vulnerabilities while enabling complex data manipulation and aggregation.

The platform provides a comprehensive suite of administrative tools for managing production environments, including granular user permissions, TLS network encryption, and visual cluster monitoring. It supports advanced data modeling through document embedding and cross-table linking, as well as specialized geospatial processing for proximity-based queries. The system is designed for integration with modern web frameworks and message brokers, facilitating real-time synchronization with external services and search engines.

RethinkDB is configured via key-value files and command-line interfaces, with support for containerized deployment and automated infrastructure orchestration.
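
The changefeed idea described above, where a client subscribes once and receives each modification instead of polling, can be illustrated with a small in-memory sketch. This is not the RethinkDB driver API: `Table`, `changes`, `upsert`, and `drain` are hypothetical names chosen for illustration, and the `old_val`/`new_val` event shape is modelled on the one RethinkDB changefeeds emit.

```python
import queue


class Table:
    """Toy in-memory table that pushes every change to its subscribers.

    Hypothetical illustration only; it does not talk to a RethinkDB server.
    """

    def __init__(self):
        self._rows = {}
        self._feeds = []

    def changes(self):
        """Open a feed: a queue that receives every later modification."""
        feed = queue.Queue()
        self._feeds.append(feed)
        return feed

    def upsert(self, key, doc):
        old = self._rows.get(key)
        self._rows[key] = doc
        self._publish({"old_val": old, "new_val": doc})

    def delete(self, key):
        old = self._rows.pop(key, None)
        if old is not None:
            self._publish({"old_val": old, "new_val": None})

    def _publish(self, change):
        # Push the change to every open feed; nobody has to poll the table.
        for feed in self._feeds:
            feed.put(change)


def drain(feed):
    """Collect all changes currently waiting on a feed, in arrival order."""
    events = []
    while not feed.empty():
        events.append(feed.get_nowait())
    return events
```

An insert shows up as an event whose `old_val` is `None`, an update carries both values, and a delete has `new_val` set to `None`, so a subscriber can tell the three cases apart without re-reading the table.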
- [awesome-selfhosted/awesome-selfhosted](https://awesome-repositories.com/repository/awesome-selfhosted-awesome-selfhosted.md) (299,516 ⭐) — This project is a community-curated directory of open-source software designed for deployment in private server environments and home labs. It serves as a comprehensive resource for discovering independent, self-hosted alternatives to mainstream cloud services, enabling users to maintain full data ownership and control over their digital infrastructure.

The directory is structured through a hierarchical taxonomy that organizes a vast collection of applications into logical categories, ranging from media management and data analytics to private communication and team productivity tools. It distinguishes itself through a collaborative peer-review process, where community members validate the quality and relevance of each submission to ensure the directory remains accurate and reliable.

The project covers a broad capability surface, including infrastructure automation, container-based service deployment, and declarative configuration management. These tools assist users in maintaining reproducible server environments and managing complex service dependencies across private hardware.

The directory is maintained as a version-controlled repository, ensuring that all updates and community-driven changes are tracked and transparent.
- [factual/durable-queue](https://awesome-repositories.com/repository/factual-durable-queue.md) (407 ⭐) — A disk-backed queue for Clojure.
- [boto/boto3](https://awesome-repositories.com/repository/boto-boto3.md) (9,834 ⭐) — Boto3 is the AWS SDK for Python, providing a programmatic interface for managing and automating AWS cloud infrastructure and services. It serves as a cloud management API client and resource manager for provisioning, configuring, and scaling virtual servers, databases, and storage.

The library enables the implementation of infrastructure-as-code through declarative templates and scripts, allowing for the deployment of identical resource stacks across multiple accounts and geographic regions. It also provides a framework for coordinating distributed workflows, serverless functions, and containerized applications within the cloud ecosystem.

The toolkit covers a broad range of operational capabilities, including generative AI orchestration, identity and access control, and detailed cloud resource monitoring. It further extends to data lifecycle management, including automated backups and migrations, as well as comprehensive billing and cost optimization tools.
- [openhft/chronicle-queue](https://awesome-repositories.com/repository/openhft-chronicle-queue.md) (3,692 ⭐) — Chronicle Queue is a high-performance data handling system featuring off-heap message queues, memory-mapped file stores, and replicated message stores. It provides a binary-compatible memory layout that enables different programming languages to share data without serialization overhead.

The system utilizes a replicated message store to synchronize data across multiple nodes, ensuring high availability and instant failover. Its memory-mapped architecture supports deterministic replay from disk and low-latency data recording.

The project implements off-heap memory management and zero-allocation processing to eliminate garbage collection pauses and system jitter. It covers capability areas including inter-process communication, append-only sequential logging, and deterministic event sourcing.
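
As a rough illustration of append-only sequential logging with deterministic replay, the sketch below writes length-prefixed records to a plain file and reads them back in write order. It does not use memory mapping or off-heap storage and is not Chronicle Queue's file format; `append`, `replay`, and the 4-byte length prefix are assumptions made for the example.

```python
import os
import struct

# Assumed record layout: 4-byte little-endian length, then the payload bytes.
_HEADER = struct.Struct("<I")


def append(path, payload: bytes) -> None:
    """Append one record to the end of the log; earlier bytes are never rewritten."""
    with open(path, "ab") as f:
        f.write(_HEADER.pack(len(payload)))
        f.write(payload)


def replay(path, start_index=0):
    """Yield (index, payload) pairs in write order, starting at start_index."""
    if not os.path.exists(path):
        return
    with open(path, "rb") as f:
        index = 0
        while True:
            header = f.read(_HEADER.size)
            if len(header) < _HEADER.size:
                break  # clean end of log
            (length,) = _HEADER.unpack(header)
            payload = f.read(length)
            if len(payload) < length:
                break  # incomplete trailing record, e.g. an interrupted write
            if index >= start_index:
                yield index, payload
            index += 1
```

Because records are only ever appended, replaying the same file twice yields the same sequence, which is the property event-sourcing consumers rely on.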
- [gevent/gevent](https://awesome-repositories.com/repository/gevent-gevent.md) (6,440 ⭐) — Gevent is a Python coroutine concurrency library and asynchronous task manager designed for high-concurrency I/O tasks. It provides a cooperative networking framework for building asynchronous TCP, UDP, and HTTP servers, as well as a WSGI web server implementation for hosting web applications.

The project is distinguished by its standard library monkey-patching tool, which replaces blocking synchronous functions with cooperative versions to enable asynchronous behavior in third-party code. This allows for a cooperative multitasking workflow where the system yields execution during I/O waits to maximize resource utilization.

The library covers a broad range of capabilities, including asynchronous task dispatch and lifecycle control, concurrent resource access management through locks and semaphores, and non-blocking OS integration for file I/O and subprocess execution. It also includes monitoring and observability tools for detecting blocking code and inspecting coroutine hierarchies.
- [apple/foundationdb](https://awesome-repositories.com/repository/apple-foundationdb.md) (16,446 ⭐) — FoundationDB is an ACID-compliant distributed transactional key-value store. It functions as a scalable database engine that ensures strict serializability and data consistency across a cluster of servers using a shared-nothing architecture.

The system is distinguished by its multi-region replication capabilities, allowing data to be synchronized across different datacenters for high availability and disaster recovery. It utilizes optimistic concurrency control to manage distributed transactions and employs a majority-based coordination system to maintain cluster state.

The platform provides extensive support for custom data modeling, enabling the implementation of complex structures like priority queues and multidimensional tables on top of the ordered key-value store. Its operational surface includes multi-tenant isolation via named transaction domains, deterministic cluster simulation for testing, and zero-downtime hardware migration.

The database provides specialized client libraries for multi-language support and a system for managing client API versioning to ensure compatibility during cluster upgrades.
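
The priority-queue modelling mentioned above relies on the store keeping keys sorted. A minimal sketch of the idea, assuming a key layout of `(priority, sequence)` tuples, follows. `OrderedKV` is a hypothetical in-memory stand-in for an ordered key-value store, not the FoundationDB client, and a real implementation would perform the read and the clear inside a single transaction.

```python
import bisect
import itertools


class OrderedKV:
    """Hypothetical stand-in for an ordered key-value store (keys kept sorted)."""

    def __init__(self):
        self._keys = []
        self._vals = {}

    def set(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def first(self):
        """Return the (key, value) pair with the smallest key, or None if empty."""
        if not self._keys:
            return None
        key = self._keys[0]
        return key, self._vals[key]

    def clear(self, key):
        if key in self._vals:
            del self._vals[key]
            del self._keys[bisect.bisect_left(self._keys, key)]


class PriorityQueue:
    """Priority queue layered on the ordered store.

    Keys are (priority, sequence) tuples, so the smallest key is always the
    most urgent item, and items with equal priority pop in insertion order.
    """

    def __init__(self, kv):
        self._kv = kv
        self._seq = itertools.count()

    def push(self, priority, item):
        self._kv.set((priority, next(self._seq)), item)

    def pop(self):
        entry = self._kv.first()
        if entry is None:
            return None
        key, item = entry
        self._kv.clear(key)
        return item
```

The queue itself holds no ordering logic; it only chooses a key layout and lets the store's sort order do the work, which is why the same pattern extends to other structures built on ordered keys.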
- [nats-io/nats.go](https://awesome-repositories.com/repository/nats-io-nats-go.md) (6,657 ⭐) — This is a Golang client library for interacting with a cloud native distributed messaging system. It provides the necessary tools for Go applications to exchange messages using publish-subscribe and request-reply patterns, as well as specialized clients for managing persistent streams and distributed storage.

The library includes a JetStream client for durable message streaming and replay, a Key-Value store client for managing distributed state with versioning and watchers, and an Object Store client for the storage and retrieval of large binary files via chunked delivery.

The implementation covers a broad range of messaging and data capabilities, including distributed work queues with load balancing, hierarchical subject-based routing, and zero-trust security utilizing TLS and token-based authentication. It also supports complex network topologies through cluster connectivity management and automatic connection failover.
- [max0x7ba/atomic_queue](https://awesome-repositories.com/repository/max0x7ba-atomic-queue.md) (1,857 ⭐) — C++14 concurrent lock-free low-latency queue.
- [redisson/redisson](https://awesome-repositories.com/repository/redisson-redisson.md) (24,355 ⭐) — Redisson is a Java library and Redis client that functions as a distributed Java object mapper, caching provider, and locking framework. It maps Java collections and concurrency primitives to distributed implementations backed by Redis and Valkey, providing synchronous, asynchronous, and reactive APIs for interacting with these data stores.

The project distinguishes itself by providing a comprehensive suite of distributed coordination tools, including a locking framework for managing semaphores and countdown latches across multiple application nodes. It also serves as a distributed messaging system for implementing pub/sub patterns and reliable queues using event streams.

The framework covers a broad range of capabilities, including distributed state management through shared collections, objects, and transactions. It supports advanced data retrieval via vector similarity search, full-text search, and JSON querying, while offering performance optimizations such as probabilistic data structures, local caching, and command pipelining.

Redisson includes starter dependencies for the Spring Framework and Spring Boot to simplify application configuration and dependency management.
- [owl1n/nest-queue](https://awesome-repositories.com/repository/owl1n-nest-queue.md) (73 ⭐) — Queue manager for the NestJS framework, backed by Redis (via the bull package).
- [asciinema/asciinema](https://awesome-repositories.com/repository/asciinema-asciinema.md) (16,852 ⭐) — Asciinema is a platform for capturing, replaying, and sharing command-line sessions. It provides a comprehensive suite of tools to record terminal activity into lightweight, text-based files that preserve ANSI escape sequences, allowing users to document technical workflows, troubleshooting steps, and software demonstrations with high fidelity.

The project distinguishes itself through its versatile playback and distribution capabilities. It features a web-based player that renders interactive terminal sessions directly in the browser, supporting features like seeking, playback speed control, and custom visual themes. Beyond interactive playback, it includes utilities for converting recordings into animated images or videos, and provides infrastructure for self-hosting recording servers to maintain full control over data storage and security.

The platform supports a wide range of integration and automation needs, including embedding interactive sessions into technical documentation, broadcasting live terminal activity to remote viewers, and programmatically generating recordings via scripts. It also offers robust management tools for indexing, searching, and organizing historical session data.

The software is designed for flexible deployment, with server and storage components packaged into containerized units for independent hosting.
- [alibaba/canal](https://awesome-repositories.com/repository/alibaba-canal.md) (29,697 ⭐) — Canal is a database replication middleware that performs change data capture by simulating a database replica. It monitors transaction logs to stream incremental data modifications to downstream systems in real time, acting as an event streaming infrastructure that transforms low-level binary logs into structured, consumable message streams.

The project distinguishes itself through a high-throughput architecture that utilizes concurrent multi-threaded parsing and stateful log position tracking to ensure reliable data delivery. It employs a pluggable sink architecture that decouples data extraction from destination storage, allowing for flexible routing to various message queues or secondary databases. Users can manage data consistency and throughput through configurable message ordering and batching strategies, while dynamic configuration injection enables runtime adjustments to routing rules without requiring service restarts.

The platform includes comprehensive operational tools for monitoring system health and performance, including metrics for transaction latency and network bandwidth. It supports secure network connectivity for data transmission and provides specialized integration for cloud-based environments, including the ability to retrieve archived logs from object storage. The service is designed for containerized deployment, incorporating automated resource management to maintain synchronization pipelines.
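
Stateful log position tracking combined with batching can be sketched as follows: the consumer fetches a batch from its last acknowledged position and only advances that position once the sink has accepted the batch, so a failed delivery is retried rather than lost. `LogReader`, `pump`, and the list-backed log are hypothetical and do not reproduce Canal's client API.

```python
class LogReader:
    """Reads batches from an in-memory log and remembers the acknowledged offset."""

    def __init__(self, log, position=0):
        self._log = log
        self.position = position  # offset of the first unacknowledged event

    def fetch(self, batch_size):
        """Return up to batch_size events without advancing the position."""
        return self._log[self.position:self.position + batch_size]

    def ack(self, count):
        """Advance the position past count successfully delivered events."""
        self.position += count


def pump(reader, sink, batch_size):
    """Deliver one batch to sink and return how many events were delivered.

    If sink raises, the position is left unchanged, so the same batch is
    fetched again on the next call (at-least-once delivery).
    """
    batch = reader.fetch(batch_size)
    if not batch:
        return 0
    sink(batch)
    reader.ack(len(batch))
    return len(batch)
```

In a real pipeline the position would be persisted outside the process so that a restart resumes from the last acknowledged offset; the in-memory attribute here only shows the ordering of fetch, deliver, and acknowledge.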
- [bee-queue/arena](https://awesome-repositories.com/repository/bee-queue-arena.md) (0 ⭐) — An intuitive Web GUI for Bee Queue, Bull and BullMQ. Built on Express so you can run Arena standalone, or mounted in another app as middleware.
- [activepieces/activepieces](https://awesome-repositories.com/repository/activepieces-activepieces.md) (20,887 ⭐) — Activepieces is an open-source, self-hosted workflow automation platform designed to connect third-party applications through modular triggers and actions. It provides a low-code integration framework that allows users to build, manage, and execute complex business logic sequences within isolated, sandboxed environments.

The platform distinguishes itself through its focus on embeddability and enterprise-grade security. It features an embedded automation builder that can be integrated into external applications via iframes, supported by comprehensive identity and access management tools such as single sign-on, SCIM provisioning, and granular role-based access control. These capabilities allow organizations to maintain programmatic control over their automation infrastructure while ensuring secure user provisioning and centralized credential management.

Beyond its core automation engine, the system includes robust lifecycle management tools for versioning, deploying, and promoting workflows across different environments. It supports advanced operational requirements through distributed worker scaling, event queuing, and detailed observability features, including execution history inspection and telemetry exports. Developers can extend the platform by creating custom connectors using TypeScript, which can be validated, packaged, and synchronized with version control systems.

The project is built with TypeScript and provides a comprehensive CLI for managing database migrations, integration testing, and infrastructure provisioning.
- [risingwavelabs/risingwave](https://awesome-repositories.com/repository/risingwavelabs-risingwave.md) (9,093 ⭐) — RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats.

The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independent scaling and rapid recovery.

The platform covers a broad range of real-time data operations, including change data capture, streaming ETL pipelines, and the maintenance of incremental materialized views. It supports complex stream processing such as windowed aggregations, event-time tracking with watermarks, and the continuous export of processed data to downstream sinks.

The project can be deployed via Kubernetes and Helm, Docker Compose, or as a managed instance.
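
Windowed aggregation with event-time watermarks, as mentioned above, can be illustrated with a small tumbling-window sum. This is a plain-Python sketch of the general technique, not RisingWave's SQL or internals; `TumblingSum`, integer event times, and the fixed `lateness` allowance are assumptions made for the example.

```python
from collections import defaultdict


class TumblingSum:
    """Sums values per fixed-size event-time window.

    The watermark trails the largest event time seen by `lateness`. A window
    is emitted once the watermark reaches its end, and events that arrive for
    an already emitted window are dropped.
    """

    def __init__(self, size, lateness):
        self.size = size
        self.lateness = lateness
        self.watermark = float("-inf")
        self._open = defaultdict(int)  # window start -> running sum

    def add(self, event_time, value):
        """Ingest one event; return [(window_start, sum)] for windows that closed."""
        start = event_time - event_time % self.size
        if start + self.size <= self.watermark:
            return []  # too late: this window has already been emitted
        self._open[start] += value
        self.watermark = max(self.watermark, event_time - self.lateness)
        closed = sorted(s for s in self._open if s + self.size <= self.watermark)
        return [(s, self._open.pop(s)) for s in closed]
```

With `size=10` and `lateness=2`, an event at time 8 that arrives after one at time 5 is still counted in window 0, while an event at time 9 that arrives after time 12 has been seen is dropped, because the watermark has already reached the end of that window.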
- [anthropics/claude-code](https://awesome-repositories.com/repository/anthropics-claude-code.md) (132,728 ⭐) — Anthropic's terminal-native AI coding agent.
- [mafintosh/stream-each](https://awesome-repositories.com/repository/mafintosh-stream-each.md) (0 ⭐) — Iterate all the data in a stream
- [reactive-streams/reactive-streams-dotnet](https://awesome-repositories.com/repository/reactive-streams-reactive-streams-dotnet.md) (202 ⭐) — Reactive Streams for .NET
- [ag-ui-protocol/ag-ui](https://awesome-repositories.com/repository/ag-ui-protocol-ag-ui.md) (14,395 ⭐) — ag-ui is an agent-frontend interoperability layer and communication protocol designed to connect AI agent backends with web and mobile user interfaces. It provides a standardized event-driven framework for exchanging messages, session state, and tool calls, utilizing a generative UI framework to render dynamic interface components and structured content triggered by an agent.

The project distinguishes itself through an SSE-based event streamer that delivers real-time incremental model responses and reasoning telemetry. It enables bi-directional state synchronization and allows remote agents to trigger local client-side tool execution for accessing device hardware or private data.

The system covers a broad range of capabilities including session and conversation context management, schema-driven tool integration, and human-in-the-loop coordination. It also provides protocol event inspection for debugging and supports API request authentication via bearer tokens, API keys, or basic authentication.

A command-line tool is available for project scaffolding to quickly establish connectivity between clients and servers.
