# Build Your Own Database Engine

> Search results for `build your own database to learn how storage engines work` on awesome-repositories.com. 116 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/build-your-own-database-to-learn-how-storage-engines-work

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/build-your-own-database-to-learn-how-storage-engines-work).**

## Results

- [etcd-io/etcd](https://awesome-repositories.com/repository/etcd-io-etcd.md) (51,838 ⭐) — etcd is a distributed, strongly consistent key-value store designed to provide reliable storage for critical system metadata and coordination primitives. It functions as a distributed consensus engine, utilizing a replicated log and leader-based state machine to ensure that all nodes in a cluster maintain a synchronized view of data. By providing atomic operations and linearizable reads and writes, it serves as a foundational component for distributed systems requiring high availability and fault tolerance.

The system distinguishes itself through its multi-version concurrency control, which enables non-blocking read operations while maintaining strict consistency for concurrent writes. It supports complex distributed coordination through features like lease-based expiration, which allows for the automatic removal of data based on client activity, and asynchronous key change monitoring, which provides real-time event notifications for data modifications. These capabilities are supported by a persistent B-tree-based storage engine and write-ahead logging to ensure durability across system crashes.

Beyond its core storage functions, the project provides a comprehensive suite of tools for cluster management, including automated peer discovery via DNS or service registries and robust security enforcement. It includes built-in mechanisms for transport layer security, role-based access control, and certificate management to protect data in transit and at rest. Operational reliability is further maintained through snapshot-based disaster recovery, cluster health monitoring, and granular performance tuning for disk and network resources.

The system is configured through structured files or command-line flags, allowing for flexible deployment across diverse infrastructure environments.
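
The multi-version concurrency control described in the etcd entry above can be illustrated with a toy store that keeps every version of a key under a global revision counter. This is a minimal sketch for learning purposes, not etcd's implementation; `MVCCStore` and its method names are hypothetical.

```python
from bisect import bisect_right


class MVCCStore:
    """Toy multi-version key-value store (hypothetical, not etcd's API)."""

    def __init__(self):
        self.revision = 0      # global, monotonically increasing counter
        self._versions = {}    # key -> list of (revision, value), oldest first

    def put(self, key, value):
        # Every write gets a fresh revision and is appended, never overwritten.
        self.revision += 1
        self._versions.setdefault(key, []).append((self.revision, value))
        return self.revision

    def delete(self, key):
        # A delete is recorded as a tombstone (None) at a new revision.
        return self.put(key, None)

    def get(self, key, at_revision=None):
        """Return the value visible at `at_revision` (default: latest)."""
        rev = self.revision if at_revision is None else at_revision
        history = self._versions.get(key, [])
        # Locate the last version whose revision is <= rev.
        idx = bisect_right([r for r, _ in history], rev) - 1
        return history[idx][1] if idx >= 0 else None
```

Because old versions are never modified, a reader pinned to an earlier revision keeps seeing a consistent snapshot while later writes proceed.
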
- [pingcap/awesome-database-learning](https://awesome-repositories.com/repository/pingcap-awesome-database-learning.md) (10,672 ⭐) — This project is a curated collection of academic papers, books, and technical resources designed for studying the architecture and implementation of database management systems. It serves as a comprehensive educational guide for engineers and researchers looking to understand the fundamental principles behind modern data storage and retrieval.

The repository distinguishes itself by providing structured learning paths across critical database domains, including the design of persistent storage engines, the mechanics of query optimization, and the complexities of distributed transaction management. It covers the theoretical and practical aspects of system internals, such as buffer management, disk input and output, and the consensus algorithms required to maintain consistency across distributed nodes.

Beyond these core areas, the collection offers resources on concurrency control protocols, performance benchmarking, and advanced execution models. The materials are organized to support the study of how systems manage data integrity, optimize query planning, and utilize high-performance processing techniques.
- [codecrafters-io/build-your-own-x](https://awesome-repositories.com/repository/codecrafters-io-build-your-own-x.md) (516,240 ⭐) — This project provides a comprehensive framework for creating, managing, and executing educational programming challenges. It includes standardized systems for authoring instructional content, defining test cases, and structuring documentation to ensure consistent learning outcomes. The platform supports a wide range of programming languages through dedicated execution environments that handle compilation, dependency management, and automated testing.

The infrastructure facilitates both local and remote development workflows, offering command-line utilities for testing code without requiring version-control commits. It features an automated orchestration lifecycle for containerized test execution, complemented by diagnostic tools for debugging network protocols and monitoring program output. Additionally, the project includes maintenance workflows for repository history management and integration tools for synchronizing data with external version-control hosts.
- [react-native-async-storage/async-storage](https://awesome-repositories.com/repository/react-native-async-storage-async-storage.md) (5,067 ⭐) — React Native AsyncStorage is a persistent key-value storage library designed for React Native applications. It provides a unified local storage interface that works identically on both iOS and Android, ensuring saved data remains available across app restarts and when the device has no network connectivity.

The library uses an asynchronous background I/O queue to handle all storage operations without blocking the JavaScript thread, communicating with native storage engines through React Native's bridge protocol. It includes a serialization layer that converts JavaScript values to strings for storage and restores them on retrieval, along with an in-memory LRU cache that reduces disk reads while maintaining consistency through write-based invalidation. A multi-engine storage adapter abstracts platform-specific persistence backends, enabling flexible backend selection at initialization.

The library covers the full range of local storage tasks: storing, reading, and removing key-value pairs, with persistence that survives app restarts and supports offline data availability. Common use cases include preserving app state, caching frequently accessed data, and storing user preferences like theme or language settings.
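
The idea of an in-memory LRU cache kept consistent through write-based invalidation, as mentioned in the AsyncStorage entry above, can be sketched as follows. This is an illustrative Python toy, not the library's code; `CachedStore`, the dict-like `backend`, and the `disk_reads` counter are assumptions made for the example.

```python
from collections import OrderedDict


class CachedStore:
    """Toy read cache in front of a slower backend (hypothetical names)."""

    def __init__(self, backend, capacity=2):
        self.backend = backend        # any dict-like object standing in for disk
        self.capacity = capacity
        self.cache = OrderedDict()    # least recently used entry comes first
        self.disk_reads = 0           # counts cache misses, for illustration

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)   # mark as most recently used
            return self.cache[key]
        self.disk_reads += 1
        value = self.backend.get(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return value

    def set(self, key, value):
        self.backend[key] = value
        self.cache.pop(key, None)     # write-based invalidation of the stale entry
```

Invalidating on write, rather than updating the cache in place, keeps the backend as the single source of truth at the cost of one extra read after each write.
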
- [buildthingsuseful/build-your-own-kafka](https://awesome-repositories.com/repository/buildthingsuseful-build-your-own-kafka.md) (65 ⭐) — Build Your Own Kafka
- [balloonwj/cppguide](https://awesome-repositories.com/repository/balloonwj-cppguide.md) (6,030 ⭐) — CppGuide is a curated collection of educational resources and practical guides focused on C++ server development, Linux kernel internals, concurrent programming, network protocols, and security exploitation. It provides structured learning paths for backend developers, covering everything from interview preparation to building high-performance network servers and understanding operating system fundamentals.

The guide distinguishes itself by offering in-depth, hands-on tutorials that walk through real-world implementations, including building a Redis-like server from scratch, designing custom network protocols, and constructing remote control tools. It also delves into advanced topics such as shellcode injection, kernel module development, and the architecture of the Linux kernel, providing a mental model for how the kernel operates as a responsive, object-based system.

Beyond core C++ and kernel topics, the repository covers a broad range of supporting areas including memory management strategies, concurrency and synchronization patterns, network communication diagnostics, and performance optimization techniques. It also includes material on modern C++ language features, standard library usage, and software architecture patterns like the reactor model and event-driven design.

The documentation is organized as a series of guides and tutorials, with practical code examples and step-by-step explanations that trace execution paths through both user-space and kernel-space code.
- [bytebytegohq/system-design-101](https://awesome-repositories.com/repository/bytebytegohq-system-design-101.md) (83,491 ⭐) — This project is a centralized engineering knowledge repository that provides a structured curriculum for mastering system design, architectural patterns, and fundamental software development workflows. It serves as a professional development resource for engineers, offering foundational knowledge and real-world case studies to support the design of scalable, secure, and efficient distributed systems.

The repository distinguishes itself through a visual-first approach to knowledge synthesis, distilling complex technical concepts into high-density graphical diagrams and succinct illustrations. By employing cross-domain concept mapping and modular topic decomposition, it connects disparate engineering disciplines—such as infrastructure, security, and application layers—into granular, self-contained modules that facilitate rapid mental modeling and targeted learning.

The content covers a broad spectrum of technical domains, including API and web development, database scaling strategies, networking protocols, and DevOps deployment pipelines. These educational assets are organized as a static, version-controlled repository, allowing users to consume technical insights asynchronously at their own pace.
- [cstack/db_tutorial](https://awesome-repositories.com/repository/cstack-db-tutorial.md) (10,464 ⭐) — This project is an educational implementation of a relational database engine written in C. It functions as a SQLite clone, demonstrating the internal mechanics of a database system through a C-based systems project that focuses on manual memory management and file I/O.

The engine is distinguished by its use of a bytecode virtual machine: SQL statements are compiled into low-level instructions, which the virtual machine then executes. It organizes records in a B-tree, a balanced tree structure that keeps insertion, search, and range scanning efficient.

The system covers core database internals, including SQL compilation workflows, disk-based persistence, and B-tree indexing with recursive node traversal and node splitting. It also implements data cursor management for navigating result sets.
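
Node splitting, mentioned in the cstack/db_tutorial entry above, is the step that keeps a B-tree balanced. The sketch below shows it for a single leaf node, in Python rather than the tutorial's C. `LEAF_MAX_KEYS` and `insert_into_leaf` are hypothetical names, and using the first key of the right half as the separator is one common B+ tree convention that may differ from the tutorial's.

```python
from bisect import insort

LEAF_MAX_KEYS = 4  # assumed leaf capacity, chosen small for illustration


def insert_into_leaf(leaf, key):
    """Insert `key` into a sorted leaf, splitting it on overflow.

    Returns (left, separator, right). When no split is needed,
    separator and right are both None.
    """
    insort(leaf, key)                 # keep the keys in sorted order
    if len(leaf) <= LEAF_MAX_KEYS:
        return leaf, None, None
    mid = len(leaf) // 2
    left, right = leaf[:mid], leaf[mid:]
    # The separator is what a parent node would store to route
    # lookups between the two halves.
    return left, right[0], right
```

In a full tree the separator is inserted into the parent, which may overflow and split in turn; that recursion is what grows the tree upward.
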
- [forem/forem](https://awesome-repositories.com/repository/forem-forem.md) (22,726 ⭐) — Forem is an open-source platform designed for building and managing technical communities. It functions as a social publishing engine that enables members to share long-form content, participate in threaded discussions, and engage through social interactions. The platform provides tools for organizations to maintain branded profiles, host community hackathons, and facilitate collaborative learning through structured educational tracks.

Beyond its social features, Forem integrates advanced capabilities for AI agent workflow orchestration and codebase knowledge graphing. It allows developers to map project architecture, analyze dependency relationships, and automate complex coding tasks using autonomous agents. The system includes specialized infrastructure for LLM context optimization, such as token compression and persistent memory management, to improve the efficiency and performance of agent-driven development.

The platform supports a modular architecture that allows for extensibility through plugins and custom configuration. It includes comprehensive administrative tools for managing user permissions, moderating content, and tracking community engagement metrics. Forem is designed to be self-hosted, providing full control over deployment, data storage, and community governance.
- [peiyuanix/build-your-own-zerotier](https://awesome-repositories.com/repository/peiyuanix-build-your-own-zerotier.md) (603 ⭐) — Build your own layer-2 virtual switch in less than 300 lines of code
- [clickhouse/clickhouse](https://awesome-repositories.com/repository/clickhouse-clickhouse.md) (48,229 ⭐) — ClickHouse is a high-performance, columnar analytical database designed for real-time query execution and large-scale data aggregation. It functions as a distributed data warehouse capable of processing petabytes of information, while also providing an embedded engine that integrates directly into applications for native query capabilities without external dependencies. The system is built to handle high-throughput ingestion and complex analytical workloads, delivering millisecond-level latency for interactive dashboards and operational monitoring.

The platform distinguishes itself through advanced storage and execution techniques, including vectorized query processing and a merge tree storage engine that maintains performance during massive insertions. It features adaptive subcolumn mapping for semi-structured data and supports native vector search for machine learning and generative AI applications. To facilitate efficient data movement, the engine utilizes zero-copy shared memory buffers, minimizing overhead when interacting with external analytical tools or processing diverse file formats like Parquet, JSON, and Arrow.

Beyond its core storage and processing capabilities, the project provides a comprehensive suite of tools for observability, security, and data integration. It includes built-in support for natural language querying, automated workflow orchestration for AI agents, and extensive diagnostic features for query plan inspection. The platform also offers robust cloud infrastructure management, including support for private networking, compliant deployment strategies, and integrated billing consolidation.
- [tidwall/buntdb](https://awesome-repositories.com/repository/tidwall-buntdb.md) (4,834 ⭐) — BuntDB is an embedded key-value store for Go applications, providing in-memory storage with optional disk persistence. It structures data using a B-tree for ordered key-value access and an R-tree for spatial indexing, allowing both range scans and geometric intersection queries. Support for indexing on nested JSON document fields enables efficient lookups by values within JSON objects, and per-key time-to-live (TTL) expiration automatically removes stale entries.

The store uses copy-on-write transaction isolation, ensuring each transaction sees a consistent snapshot and changes are applied atomically as a single unit that commits or rolls back entirely. Durability is achieved through an append-only write-ahead log that records all mutations to a file, which is replayed on restart and periodically compacted. Custom B-tree indexes can be created on keys, values, JSON fields, or composite fields with support for multi-column ordering, descending order, and locale-specific collation, while the R-tree spatial index stores points and rectangles for intersection queries.

The system groups multiple read and write operations into atomic transactions and offers automatic data expiration via per-key TTL timers with a background sweeper. Together, these capabilities make it a compact embedded database suitable for range queries, geospatial lookups, and JSON field-indexed access within single Go programs.
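
The durability scheme described in the BuntDB entry above (append every mutation to a log, replay it on restart, compact it periodically) is a good first exercise when building a storage engine. The following is a minimal Python sketch under simplifying assumptions: one JSON record per line, no transactions, and hypothetical names (`AppendOnlyKV`, `compact`). It does not reflect BuntDB's actual file format.

```python
import json
import os


class AppendOnlyKV:
    """Toy in-memory key-value store persisted through an append-only log."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()                          # rebuild state from the log
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["op"] == "set":
                    self.data[record["key"]] = record["value"]
                else:
                    self.data.pop(record["key"], None)

    def _append(self, record):
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())            # force the record to disk

    def set(self, key, value):
        self._append({"op": "set", "key": key, "value": value})
        self.data[key] = value

    def delete(self, key):
        self._append({"op": "del", "key": key})
        self.data.pop(key, None)

    def compact(self):
        """Rewrite the log so it holds one record per live key."""
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            for key, value in self.data.items():
                f.write(json.dumps({"op": "set", "key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._log.close()
        os.replace(tmp, self.path)              # swap in the compacted file
        self._log = open(self.path, "a", encoding="utf-8")

    def close(self):
        self._log.close()
```

Logging each mutation before applying it in memory means a crash can lose at most an operation that was never acknowledged, and compaction bounds the log's growth by discarding overwritten and deleted entries.
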
- [lukemathwalker/build-your-own-jira-with-rust](https://awesome-repositories.com/repository/lukemathwalker-build-your-own-jira-with-rust.md) (0 ⭐) — You will be working through a series of test-driven exercises, or koans, to learn Rust while building your own JIRA clone!
- [danistefanovic/build-your-own-x](https://awesome-repositories.com/repository/danistefanovic-build-your-own-x.md) (516,495 ⭐) — Master programming by recreating your favorite technologies from scratch.
- [workiva/go-datastructures](https://awesome-repositories.com/repository/workiva-go-datastructures.md) (7,901 ⭐) — go-datastructures is a collection of thread-safe and lock-free data structures designed for high-performance concurrent applications in Go. It provides a modular library of specialized algorithmic toolsets, including a lock-free collection library and an immutable data structure library.

The project distinguishes itself through a suite of persistent AVL trees and hash array mapped tries that use branch-copying to preserve previous versions. It also implements non-blocking hash maps, queues, and tries that enable linearizable snapshots and concurrent updates without the use of mutual exclusion locks.

The library covers a broad range of capability areas, including dimensional indexing for range querying, graph algorithms optimized with Fibonacci heaps, and fast integer set operations using sparse bitmaps. It further includes ordered collections such as skiplists and B+ trees, as well as multithreaded bucket sorting for large datasets.

The toolset also provides synchronization primitives like thread-safe ring buffers and event broadcasting mechanisms for coordinating execution between goroutines.
- [braydie/howtobeaprogrammer](https://awesome-repositories.com/repository/braydie-howtobeaprogrammer.md) (16,218 ⭐) — HowToBeAProgrammer is a comprehensive software engineering career guide and professional development framework. It serves as a curated-knowledge repository and handbook designed to help programmers acquire technical habits and social competencies necessary for professional advancement.

The project distinguishes itself by integrating technical craftsmanship with a detailed manual for technical leadership and organizational navigation. It provides specific strategies for career progression, such as compensation negotiation, promotion readiness, and the management of professional boundaries to prevent burnout.

The guide covers a broad surface of engineering capabilities, including system performance optimization, technical debugging and testing, and software architecture. It also provides extensive resources on project management, quality assurance, and professional communication for interacting with non-technical stakeholders.

Content is organized into educational modules and supports multi-language localization to make its professional and technical advice accessible to a global audience.
- [etcd-io/bbolt](https://awesome-repositories.com/repository/etcd-io-bbolt.md) (9,573 ⭐) — bbolt is an ACID-compliant embedded key-value store for Go applications. It persists all data in a single memory-mapped file on disk, organizing information using B+ trees to facilitate sorted key iteration and efficient range queries.

The project distinguishes itself through a hierarchical data organization model, allowing buckets to be nested within other buckets to create a tree-like structure. It employs a single-writer, multi-reader locking mechanism and copy-on-write transactions to ensure serializable isolation and data integrity.

The system includes comprehensive data management capabilities, such as unique identifier generation, cursor-based iteration, and hot backup generation. Maintenance tools are provided for database compaction, consistency verification, and the repair of corrupted pages.

Command-line utilities are available for querying database content and inspecting internal structural metadata.
- [thoughtworks/build-your-own-radar](https://awesome-repositories.com/repository/thoughtworks-build-your-own-radar.md) (2,549 ⭐) — This project is a technology radar visualization tool and dockerized static site generator. It transforms JSON or CSV datasets into an interactive technology map used to track the adoption status and maturity of tools and techniques across an organization.

The tool enables enterprise architecture mapping by organizing portfolios of technologies into categories and maturity levels. It supports custom technical taxonomies, allowing the definition of specialized rings and quadrants to match specific organizational evaluation criteria.

The system covers automated radar generation and technology lifecycle tracking, using visual indicators to show how tools move between evaluation and adoption phases. It handles data ingestion from spreadsheets or public URLs and maps polar coordinate data into a visual layout of concentric rings.

The application is delivered as a portable container image for consistent deployment across different environments.
- [datahub-project/datahub](https://awesome-repositories.com/repository/datahub-project-datahub.md) (12,141 ⭐) — DataHub is a metadata management platform designed to unify technical, operational, and business context across diverse data ecosystems. By utilizing a graph-based metadata model and an event-driven ingestion architecture, it creates a centralized source of truth that maps complex data relationships, lineage, and ownership. This foundational framework enables organizations to maintain a synchronized view of their data landscape, supporting both human-led discovery and automated data operations.

The platform distinguishes itself through its focus on grounding artificial intelligence and autonomous agents in verified enterprise context. It provides specialized capabilities to inject provenance-aware lineage, business definitions, and quality signals into AI prompts, ensuring that generated insights are accurate and trustworthy. Through a policy-as-code governance engine, it enforces access controls and compliance rules directly within the metadata graph, allowing for programmatic oversight of data assets across hybrid environments.

Beyond its core identity, the project offers a comprehensive suite of tools for data discovery, observability, and lifecycle management. It includes features for automated lineage extraction, impact analysis, and semantic search, enabling users to navigate data dependencies and resolve quality issues efficiently. The platform also supports collaborative workflows, allowing teams to manage business glossaries, certify data assets, and automate access requests through integrated communication channels.

DataHub is built to scale, utilizing a distributed architecture that allows storage, search, and graph processing layers to operate independently. It provides standardized interfaces and a bridge-based connector framework to facilitate integration with heterogeneous data sources and external AI agent frameworks.
- [mostafa-samir/how-machine-learning-works](https://awesome-repositories.com/repository/mostafa-samir-how-machine-learning-works.md) (0 ⭐) — This repository contains the code accompanying the work done in How Machine Learning Works by Mostafa Samir, Manning Publications. All the code is written with python.
- [mbdavid/litedb](https://awesome-repositories.com/repository/mbdavid-litedb.md) (9,410 ⭐) — LiteDB is a serverless, embedded NoSQL document database for .NET applications. It persists data into a single portable file, functioning as a BSON data store that resides within the application process rather than running as a separate server.

The system is ACID compliant, utilizing write-ahead logging to ensure atomic, consistent, isolated, and durable transactions. It includes built-in encryption to provide secure local data storage and protect files on disk from unauthorized access.

The project covers object-document mapping to convert classes into document formats, indexed search capabilities via B-tree indexing, and specialized streaming for large binary objects. It also provides a dedicated administrative studio for visual data administration and modification.
- [fosrl/pangolin](https://awesome-repositories.com/repository/fosrl-pangolin.md) (21,255 ⭐) — Pangolin is a zero-trust remote access platform designed to provide secure, identity-aware connectivity to private network resources. It functions as a cloud-native network controller that orchestrates encrypted tunnels, traffic routing, and access policies across distributed environments. By leveraging WireGuard for secure data transport, the platform enables authenticated access to internal web applications, terminal sessions, and remote desktops without exposing services to the public internet.

The platform distinguishes itself through a declarative infrastructure model that synchronizes network state using version-controlled manifests. It supports complex connectivity requirements through peer-to-peer NAT traversal, which facilitates direct encrypted connections between nodes, with automatic fallback to server-based relaying when necessary. Additionally, it provides browser-based access to remote resources, eliminating the need for local client software for many common administrative and service-access tasks.

Beyond its core tunneling capabilities, the platform includes a comprehensive suite of tools for traffic management, security, and observability. It features granular access control policies based on user identity, geolocation, and network attributes, alongside automated certificate management and multi-factor authentication. The system also provides extensive monitoring, audit logging, and alerting capabilities to track infrastructure health and security events across multi-site deployments.

Pangolin is designed for containerized and multi-site environments, offering flexible deployment options through standard packaging and automated reconciliation workflows.
- [tokenrove/build-your-own-shell](https://awesome-repositories.com/repository/tokenrove-build-your-own-shell.md) (496 ⭐) — Guidance for mollusks (WIP)
- [cloudflare/workerd](https://awesome-repositories.com/repository/cloudflare-workerd.md) (8,346 ⭐) — workerd is a serverless edge runtime designed for executing lightweight, distributed functions at the network edge. It utilizes a V8-based JavaScript engine to provide fast startup and low memory overhead, while maintaining a WebAssembly-compatible execution environment that allows modules to run alongside JavaScript for high-performance computational tasks.

The runtime supports isolate-based multi-tenancy to run multiple independent execution contexts within a single process. It implements an event-driven execution model that triggers code based on network requests or scheduled events and includes support for privileged socket inheritance to operate under unprivileged user accounts.

The project covers a broad set of capabilities including serverless API development, AI inference deployment using GPU hardware and vector databases, and automated browser orchestration for web scraping. Additional functionality encompasses global state management via SQL databases and key-value stores, background job scheduling with message queues, and the delivery of static assets through a content delivery network.

Development is supported by a command-line interface for project management, custom build pipelines, and tools for pinning runtime behavior to specific dates to ensure consistency.
- [falkordb/falkordb](https://awesome-repositories.com/repository/falkordb-falkordb.md) (3,437 ⭐) — FalkorDB is a high-performance graph database management system and vector graph database. It serves as a knowledge graph construction tool and a GraphRAG knowledge store, integrating structured property graphs with vector search to provide grounded context for large language models. The engine is designed as a multi-tenant graph engine, capable of hosting thousands of isolated datasets within a single instance.

The system distinguishes itself by using linear algebra for query execution, expressing traversals over relationship matrices as matrix multiplications to achieve low-latency multi-hop queries. It utilizes sparse-matrix graph storage and vectorized traversals to process thousands of relationships simultaneously. These capabilities are combined with hybrid vector-graph indexing to unify semantic similarity search with structural graph exploration.

The platform covers a broad range of capabilities, including GraphRAG orchestration, AI agent memory implementation, and advanced graph analytics such as community detection and centrality ranking. It supports OpenCypher query execution and provides connectivity via the Bolt and RESP protocols. Additional functionality includes automated ontology loading, temporal data tracking, and real-time binary replication for high availability.

The database supports migration from Neo4j and can be deployed as a distributed cluster or as an embedded graph engine.
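
The connection between graph traversal and matrix multiplication noted in the FalkorDB entry above can be demonstrated with a Boolean sparse matrix product: multiplying an adjacency matrix by itself yields every pair of nodes joined by a two-hop path. The sketch below is a plain-Python illustration, not FalkorDB's engine; the dict-of-sets matrix layout and the `multiply` name are assumptions made for the example.

```python
def multiply(a, b):
    """Boolean product of two sparse matrices.

    Each matrix maps a row label to the set of column labels holding a
    nonzero entry. result[i] contains j when some k satisfies
    k in a[i] and j in b[k].
    """
    result = {}
    for i, cols in a.items():
        reached = set()
        for k in cols:
            reached |= b.get(k, set())
        if reached:                 # keep the result sparse: skip empty rows
            result[i] = reached
    return result


# Adjacency matrix of a small directed graph: a -> b, b -> c, b -> d, c -> a
adj = {"a": {"b"}, "b": {"c", "d"}, "c": {"a"}}
two_hops = multiply(adj, adj)
```

Here `two_hops["a"]` is `{"c", "d"}`, the nodes reachable from `a` in exactly two steps; multiplying by `adj` again extends the traversal by one more hop.
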
- [cloudquery/cloudquery](https://awesome-repositories.com/repository/cloudquery-cloudquery.md) (6,438 ⭐) — CloudQuery is a cloud infrastructure ETL tool and multi-cloud data pipeline designed to collect, synchronize, and normalize resource metadata from various cloud providers and SaaS platforms. It functions as a centralized asset inventory manager and security posture manager, extracting configuration and state data into relational databases, data lakes, or data warehouses.

The system distinguishes itself by transforming complex, nested cloud API responses into flat relational tables, enabling the use of standard SQL for asset querying and analysis. It employs a modular plugin system for data extraction and driver-based adapters for destination-agnostic loading, allowing metadata to be pushed into diverse storage backends.

The platform covers several broad capability areas, including cloud security posture management, FinOps cost optimization, and infrastructure compliance auditing. It utilizes SQL-based transformation pipelines to implement security frameworks, detect configuration drift, and identify underutilized resources. Additionally, the tool provides event-driven responses to fire webhooks or alerts when policy violations occur.
- [infaaa/build-your-own-x-vibe-coding](https://awesome-repositories.com/repository/infaaa-build-your-own-x-vibe-coding.md) (80 ⭐) — Master programming by recreating your favorite technologies from scratch with vibe coding.
- [walkccc/clrs](https://awesome-repositories.com/repository/walkccc-clrs.md) (5,060 ⭐) — This repository is a comprehensive collection of fully worked solutions to exercises and problems from the standard algorithms textbook by Cormen, Leiserson, Rivest, and Stein (CLRS). It serves as an educational reference for algorithm design and analysis, providing step-by-step reasoning, pseudocode, and mathematical proofs for a wide range of topics.

The content spans core computer science areas: algorithm analysis with asymptotic notation, recurrence solving, and amortized cost analysis; data structure implementation and operations for binary search trees, red-black trees, B-trees, Fibonacci heaps, hash tables, and more; graph algorithms covering traversal, shortest paths, minimum spanning trees, connectivity, and topological sorting; dynamic programming and greedy approaches for optimization problems; plus sorting, order statistics, and string/sequence algorithms.

The site is built as a static website using Markdown-driven content with KaTeX-rendered mathematical notation, organized via file-based routing for easy browsing of solutions by chapter and exercise.
- [npm/how-to-npm](https://awesome-repositories.com/repository/npm-how-to-npm.md) (0 ⭐) — A module to teach you how to module.
- [basecamp/handbook](https://awesome-repositories.com/repository/basecamp-handbook.md) (6,603 ⭐) — This project is a public company employee handbook that serves as a centralized reference for internal policies, organizational standards, and corporate governance for a distributed workforce. It functions as an operational guide and culture manifesto, detailing the shared values and social norms used to align a global team.

The handbook defines a remote-first operational model that emphasizes asynchronous communication and a distributed work infrastructure. It specifies unique organizational practices such as cycle-based development intervals, a customer-facing support rotation for all employees, and a compensation model based on industry market deciles.

The documentation covers a broad surface of human resources and operational capabilities. This includes detailed career frameworks with competency matrices for engineering, design, and support roles, as well as comprehensive benefits administration covering health insurance, retirement contributions, and paid leave. It also outlines corporate device management standards, including security baselines and remote wiping procedures for company hardware.
- [mariadb/server](https://awesome-repositories.com/repository/mariadb-server.md) (7,196 ⭐) — This project is an open source relational database management system and SQL database designed for storing and managing structured data. It functions as a relational database for ensuring consistency and reliability, while also operating as a vector database for storing and querying high-dimensional vector embeddings.

The system incorporates a columnar storage engine to optimize analytical query processing and large-scale data aggregation. It further enables vector similarity search, allowing users to find similar items by querying vector embeddings.

The software covers a broad capability surface including relational data management, analytical query execution, and database telemetry collection for gathering hardware and configuration statistics.
- [vasanthk/how-web-works](https://awesome-repositories.com/repository/vasanthk-how-web-works.md) (16,731 ⭐) — This project is a technical educational guide focused on browser architecture and the internal processes used to render web pages. It provides a detailed breakdown of the web request lifecycle, from the initial networking phase to the final visual output on a screen.

The guide covers specific technical sequences including the DNS resolution process across browser, operating system, and ISP caches, and the establishment of secure connections through the TLS handshake. It also details the communication flow between clients and servers using the HTTP protocol and server-side request handling.

The material explains the browser rendering pipeline, specifically how HTML and CSS are parsed to construct the Document Object Model and render tree. This includes the process of style resolution, recursive layout calculation, and the final painting of pixels using stacking contexts and layers.
- [anthropics/claude-code](https://awesome-repositories.com/repository/anthropics-claude-code.md) (132,728 ⭐) — Anthropic's terminal-native AI coding agent.
- [spacejam/sled](https://awesome-repositories.com/repository/spacejam-sled.md) (8,928 ⭐) — Sled is an embedded key-value store and ACID-compliant database designed for high-performance data persistence. It functions as a log-structured storage engine that organizes data using B+ trees to support efficient range queries and prefix scans.

The engine implements a zero-copy data store model, utilizing epoch-based reclamation to provide direct references to cached values without memory allocations. It distinguishes itself through a combination of write-ahead logging, page cache optimizations to reduce write amplification on flash storage, and serializable transactions for atomic multi-key updates.

The library covers a broad range of capabilities, including crash recovery through checkpointing, disk storage defragmentation, and binary stream replication. It also provides reactive data observation via key-prefix event subscriptions and supports custom merge logic for atomic value transformations.
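
The write-ahead logging and crash recovery ideas mentioned above can be illustrated with a small sketch. This is not sled's implementation or API (sled is a Rust library whose internals are far more involved); the `TinyLogStore` class, the JSON-lines record format, and all method names are hypothetical, chosen only to show the "log first, then apply" pattern and replay on startup.

```python
import json
import os


class TinyLogStore:
    """Hypothetical append-only key-value store; not sled's design or API."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # in-memory view, rebuilt from the log on startup
        self._replay()
        self.log = open(self.path, "ab")

    def _replay(self):
        # Crash recovery: re-apply every complete record, drop a torn tail.
        if not os.path.exists(self.path):
            return
        good = 0  # byte offset of the end of the last valid record
        with open(self.path, "rb") as f:
            for raw in f:
                if not raw.endswith(b"\n"):
                    break  # final write was cut off before the newline
                try:
                    rec = json.loads(raw)
                except ValueError:
                    break  # corrupt record: stop replaying here
                if rec["op"] == "set":
                    self.index[rec["key"]] = rec["value"]
                else:
                    self.index.pop(rec["key"], None)
                good += len(raw)
        os.truncate(self.path, good)

    def _append(self, rec):
        # Log first: the record reaches disk before the index changes.
        self.log.write(json.dumps(rec).encode("ascii") + b"\n")
        self.log.flush()
        os.fsync(self.log.fileno())

    def set(self, key, value):
        self._append({"op": "set", "key": key, "value": value})
        self.index[key] = value

    def delete(self, key):
        self._append({"op": "del", "key": key})
        self.index.pop(key, None)

    def get(self, key, default=None):
        return self.index.get(key, default)

    def close(self):
        self.log.close()
```

Reopening a `TinyLogStore` on the same path rebuilds the index from the log, so earlier `set` and `delete` calls survive a restart. A real engine would also compact the log and keep data in an ordered structure such as a B+ tree to support range scans, which this sketch leaves out.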
- [aishwaryanr/awesome-generative-ai-guide](https://awesome-repositories.com/repository/aishwaryanr-awesome-generative-ai-guide.md) (24,755 ⭐) — This project is a community-driven knowledge repository and technical learning resource focused on the field of generative artificial intelligence. It serves as a centralized hub for developers and practitioners to access curated research, tutorials, and foundational concepts necessary for building and deploying modern artificial intelligence applications.

The platform distinguishes itself through a collaborative, distributed contribution model that aggregates diverse learning materials into a structured, searchable knowledge base. It covers a wide range of specialized topics, including retrieval-augmented generation, large language model training, fine-tuning techniques, and agentic workflows. Beyond technical skill development, the repository functions as a professional development hub, offering interview preparation resources and guidance for those pursuing careers in the artificial intelligence industry.

The content is organized through a hierarchical taxonomy, allowing users to navigate complex subjects such as system evaluation, multimodal models, and security tools. The repository provides access to comprehensive code notebooks and structured tutorials, all maintained as static documentation within a version control system to ensure accessibility and ease of discovery.
- [realm/realm-swift](https://awesome-repositories.com/repository/realm-realm-swift.md) (16,608 ⭐) — This is a mobile object database and NoSQL local data store that replaces relational tables with a schema-based model. It functions as a reactive data store, using live object observations and change notifications to trigger automatic user interface refreshes.

The system provides built-in mobile cloud data synchronization to keep local datasets consistent with a remote server across multiple devices. It also includes security features for encrypted local storage, protecting sensitive on-disk data using at-rest encryption keys and fine-grained access control.

Broad capabilities include object-oriented data management, type-safe querying, and schema migration. The project supports geospatial data querying for location-based searches, as well as direct data binding for reactive user interface updates.
- [apache/gravitino](https://awesome-repositories.com/repository/apache-gravitino.md) (2,866 ⭐) — Gravitino is a federated metadata lake and unified data catalog designed to manage tables, files, and AI models across diverse data sources and cloud storage. It serves as a centralized interface for governing schemas, access controls, and tagging across relational databases, messaging queues, and object stores.

The project distinguishes itself by unifying the management of AI assets, such as machine learning models and their version lineages, alongside traditional tabular data. It also implements the Iceberg REST specification to provide a standardized metadata server and proxy for lakehouse tables across different compute engines.

The system covers a broad range of capabilities, including federated metadata management for relational and streaming sources, role-based access control with credential vending, and data lineage tracking using the OpenLineage standard. It further provides automation for table maintenance, metadata lookup caching for performance, and a Model Context Protocol server for AI tool integration.

Deployment options include Kubernetes Helm charts, standalone REST servers, and containerized local sandboxes.
- [ghuntley/how-to-build-a-coding-agent](https://awesome-repositories.com/repository/ghuntley-how-to-build-a-coding-agent.md) (5,145 ⭐) — This repository is a reference implementation and guided tutorial for building an AI coding agent that combines conversational interaction with file system manipulation and sandboxed shell execution. The agent uses a large language model as its core decision-making component, operating within a turn-based conversational loop where it can generate responses or invoke tools, and tool results are fed back into the dialogue. It provides primitives for reading, writing, and listing files on the local filesystem, as well as searching code using regular expressions.

The agent’s capabilities are extended through a plugin-based tool system, where each tool is defined by a name, a JSON Schema input specification, and a handler function. Shell commands run inside a sandboxed environment that isolates system access and enforces resource limits, enabling safe automation. A file system abstraction layer unifies file operations across the operating system, keeping the agent platform-agnostic.

The project covers the full development workflow for an AI coding agent, including automated code editing, regex-powered code search, and a customizable tool plugin framework. Its architecture rests on three foundational components: a conversational agent loop, LLM integration, and a plugin-based tool system.

The repository includes a step-by-step guide and a complete reference template for implementing an interactive chat agent with filesystem and shell access.
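
The tool system described above (a name, a JSON Schema input specification, and a handler function per tool) might be sketched as follows. This is an illustration under assumptions, not the repository's code: the `Tool` dataclass, the `TOOLS` registry, the `dispatch` function, and the shape of the call and result dictionaries are all hypothetical, and the schema check only looks at `required` keys rather than performing full JSON Schema validation.

```python
import json
import os
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    name: str
    input_schema: dict              # JSON Schema describing the arguments
    handler: Callable[[dict], str]  # receives parsed arguments, returns text


def read_file(args: dict) -> str:
    with open(args["path"], "r", encoding="utf-8") as f:
        return f.read()


def list_files(args: dict) -> str:
    return json.dumps(sorted(os.listdir(args.get("path", "."))))


PATH_PROP = {"path": {"type": "string"}}

# Hypothetical registry: tool name -> tool definition.
TOOLS: Dict[str, Tool] = {
    t.name: t
    for t in [
        Tool("read_file",
             {"type": "object", "properties": PATH_PROP, "required": ["path"]},
             read_file),
        Tool("list_files",
             {"type": "object", "properties": PATH_PROP},
             list_files),
    ]
}


def dispatch(call: dict) -> dict:
    """Run one tool call and build a result to feed back into the dialogue."""
    name = call["name"]
    args = call.get("input", {})
    tool = TOOLS.get(name)
    if tool is None:
        return {"name": name, "is_error": True, "content": "unknown tool"}
    # Simplified validation: only the schema's "required" keys are checked.
    missing = [k for k in tool.input_schema.get("required", []) if k not in args]
    if missing:
        return {"name": name, "is_error": True,
                "content": "missing arguments: " + ", ".join(missing)}
    try:
        return {"name": name, "is_error": False, "content": tool.handler(args)}
    except OSError as exc:
        return {"name": name, "is_error": True, "content": str(exc)}
```

In a full agent loop, the model's reply would either be plain text or a call such as `{"name": "read_file", "input": {"path": "main.py"}}`; the loop would pass that call to `dispatch` and append the returned result to the conversation before asking the model again.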
- [louischatriot/nedb](https://awesome-repositories.com/repository/louischatriot-nedb.md) (13,540 ⭐) — NeDB is a JavaScript embedded NoSQL document store designed for Node.js and the browser. It functions as an in-memory data store with the option to persist documents to a local file system, ensuring data survives application restarts.

The project utilizes a MongoDB-compatible API to perform data operations, allowing it to serve as a lightweight document indexing system and a persistent file database without requiring a separate database server.

Capabilities include querying, inserting, updating, and deleting documents, as well as the ability to create indexes on specific fields to accelerate retrieval and enforce uniqueness. The system also supports sorting, pagination, and the implementation of expiration timers for automatic data removal.
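
As a rough illustration of how a field index can both speed up lookups and enforce uniqueness, consider the sketch below. It is not NeDB's code or API (NeDB is a JavaScript library); `TinyCollection`, `ensure_index`, `find_one`, and `UniqueIndexError` are hypothetical names, and the sketch assumes indexed values are hashable.

```python
class UniqueIndexError(Exception):
    """Raised when a document would duplicate a uniquely indexed value."""


class TinyCollection:
    """Hypothetical in-memory document collection with unique field indexes."""

    def __init__(self):
        self.docs = {}      # _id -> stored document
        self.indexes = {}   # field name -> {field value: _id}
        self._next_id = 1

    def ensure_index(self, field):
        # Build the index over existing documents, rejecting duplicates.
        index = {}
        for _id, doc in self.docs.items():
            if field in doc:
                if doc[field] in index:
                    raise UniqueIndexError(f"duplicate {field}={doc[field]!r}")
                index[doc[field]] = _id
        self.indexes[field] = index

    def insert(self, doc):
        # Check every unique constraint before mutating any state.
        for field, index in self.indexes.items():
            if field in doc and doc[field] in index:
                raise UniqueIndexError(f"duplicate {field}={doc[field]!r}")
        _id = self._next_id
        self._next_id += 1
        stored = dict(doc, _id=_id)
        self.docs[_id] = stored
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]] = _id
        return stored

    def find_one(self, field, value):
        index = self.indexes.get(field)
        if index is not None:
            # Indexed field: one dictionary lookup instead of a scan.
            _id = index.get(value)
            return self.docs.get(_id) if _id is not None else None
        for doc in self.docs.values():  # no index: fall back to a full scan
            if doc.get(field) == value:
                return doc
        return None
```

Calling `ensure_index("email")` and then inserting two documents with the same `email` raises `UniqueIndexError` on the second insert, while `find_one("email", ...)` resolves through the index rather than scanning every document.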
- [fireproof-storage/mcp-database-server](https://awesome-repositories.com/repository/fireproof-storage-mcp-database-server.md) (32 ⭐) — Store and load JSON documents from LLM tool use
- [realm/realm-java](https://awesome-repositories.com/repository/realm-realm-java.md) (11,464 ⭐) — Realm Java is a NoSQL mobile object database and reactive database engine. It provides a persistent local data store that saves native objects directly to disk, replacing traditional SQL storage and object-relational mapping layers.

The system functions as a real-time data synchronizer, coordinating local database changes with a cloud backend across multiple devices. It integrates a reactive engine that uses change listeners and asynchronous event streams to automatically update user interfaces when underlying data changes.

The project covers object-oriented data modeling, CRUD operations, and schema-based versioning for database migrations. It includes security features such as hardware-optimized database file encryption and user authentication for identity management. Additional capabilities include storage optimization through file compaction and build-time code stripping to remove unused classes.
- [agno-agi/agno](https://awesome-repositories.com/repository/agno-agi-agno.md) (40,717 ⭐) — Agno is an agent operating system designed to manage the lifecycle, tool execution, and persistent state of autonomous agents across distributed infrastructure. It provides a unified runtime environment that wraps diverse agent frameworks into a consistent, interoperable protocol, allowing developers to build and deploy complex multi-agent systems that coordinate tasks and delegate sub-processes.

The platform distinguishes itself through a robust governance and orchestration layer that includes human-in-the-loop approval gates, role-based access control, and a centralized API gateway. It features a shared cultural knowledge layer that enables agents to reflect on interactions and store universal principles across sessions, alongside persistent memory architectures that manage chat history and context retrieval.

The system supports a wide range of operational capabilities, including real-time response streaming, asynchronous background task management, and automated performance evaluation. It integrates with external systems through standardized interfaces and provides comprehensive observability tools to trace autonomous decision paths and monitor agent accuracy in production environments.

Developers can configure the system using typed classes or YAML files, and the platform exposes agents as secure, scalable web services with built-in middleware for authentication and request validation.
- [gofiber/storage](https://awesome-repositories.com/repository/gofiber-storage.md) (0 ⭐) — Premade storage drivers that implement the Storage interface, designed to be used with various Fiber middlewares.
- [vonng/ddia](https://awesome-repositories.com/repository/vonng-ddia.md) (22,648 ⭐) — This project serves as a comprehensive technical reference for the architecture and design of data-intensive applications. It provides a structured analysis of the fundamental principles required to build reliable, scalable, and maintainable software systems, covering the core trade-offs inherent in modern data infrastructure.

The repository explores the mechanics of distributed data management, including strategies for replication, partitioning, and achieving consensus across multiple nodes. It details the design of storage engines, indexing techniques, and transaction management models, while also examining the architectural patterns for both batch and stream processing pipelines.

Beyond foundational theory, the project covers the implementation of event-driven systems, including event sourcing, log-structured storage, and message brokering. It addresses the complexities of maintaining system consistency, enforcing transactional integrity, and managing derived data views in environments prone to network failures and concurrency challenges.

The documentation is available in multiple formats, including an exportable digital book version, to support study and reference across various devices.
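
The event sourcing and derived-view ideas covered by the book can be sketched in a few lines. The example below is a generic illustration, not taken from the repository: the `Event` type, the account and event names, and the `apply_event` and `replay` functions are all hypothetical. It shows the core idea that current state is a derived view, computed by replaying an append-only log of events.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Event:
    account: str
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int


def apply_event(balances: Dict[str, int], event: Event) -> Dict[str, int]:
    """Return a new balances view with one event applied."""
    delta = event.amount if event.kind == "deposited" else -event.amount
    updated = dict(balances)
    updated[event.account] = updated.get(event.account, 0) + delta
    return updated


def replay(events: Iterable[Event]) -> Dict[str, int]:
    """Derive the current state by folding over the whole event log."""
    balances: Dict[str, int] = {}
    for event in events:
        balances = apply_event(balances, event)
    return balances
```

Because the log is the source of truth, replaying a prefix of it reconstructs the state at an earlier point in time, and a new kind of view can be built later by folding over the same events with a different function.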
- [bmad-code-org/bmad-method](https://awesome-repositories.com/repository/bmad-code-org-bmad-method.md) (49,528 ⭐) — BMAD-METHOD is a multi-agent orchestration framework designed to automate the entire software development lifecycle. It functions as a programmable engine that coordinates autonomous agents to handle complex tasks, ranging from initial requirement elicitation and project planning to code generation and system maintenance. By embedding architectural constraints into a central context file, the system ensures that all automated actions remain aligned with project goals and organizational standards.

The platform distinguishes itself through an adversarial review process, where a dual-agent system generates and critiques content to ensure robustness before finalization. It employs a multi-layer configuration model that allows teams to override global defaults with environment-specific settings, ensuring consistent execution across distributed workflows. Furthermore, the framework integrates evidence-based hypothesis testing to perform forensic debugging, systematically isolating root causes of system failures through rigorous verification.

Beyond its core orchestration capabilities, the project provides a structured methodology for collaborative governance and problem-solving. It supports the execution of modular workflow recipes, automated code fixes, and milestone validation to maintain project integrity throughout the development process. The system is designed for integration into scripted environments, supporting automated installation and the bundling of project assets for streamlined deployment.
- [cube-js/cube](https://awesome-repositories.com/repository/cube-js-cube.md) (20,251 ⭐) — Cube is a semantic data layer that provides a unified framework for defining business metrics, dimensions, and relationships across diverse data sources. By acting as a headless business intelligence engine, it transforms raw data into a governed model that can be queried via SQL, REST, and GraphQL interfaces. This architecture ensures consistent data definitions and logic across all downstream analytical applications and reporting tools.

The platform distinguishes itself through its integrated conversational AI capabilities, which allow users to explore data using natural language. It orchestrates these interactions by mapping questions to the underlying semantic model, ensuring that AI-generated insights remain accurate and context-aware. Furthermore, Cube is designed for multi-tenant environments, offering robust infrastructure isolation, row-level security, and dynamic context injection to ensure that data access is strictly governed and personalized for every user or tenant.

Beyond its core modeling and AI features, the platform includes a comprehensive suite of tools for performance optimization, including automated pre-aggregation caching and asynchronous query queuing. It supports a wide range of data sources and deployment models, from self-hosted containers to managed cloud environments. The system also provides extensive programmatic control over report management, dashboard publishing, and user identity synchronization, making it suitable for embedding interactive analytics directly into custom software applications.
- [ivopetiz/crypto-database](https://awesome-repositories.com/repository/ivopetiz-crypto-database.md) (0 ⭐) — Database to store all data from crypto exchanges, currently working with Binance, Bittrex, Cryptopia and Poloniex.
- [gyoogle/tech-interview-for-developer](https://awesome-repositories.com/repository/gyoogle-tech-interview-for-developer.md) (17,417 ⭐) — This project is a comprehensive technical interview preparation resource and computer science interview guide. It serves as an educational reference for developers to study core software engineering fundamentals and common coding patterns required for employment screenings.

The repository provides detailed guides and references covering data structures and algorithms, networking and security, operating systems, and web development. It specifically focuses on the implementation and complexity analysis of sorting, searching, and graph algorithms.

The material encompasses a wide breadth of computer science domains, including software engineering principles like SOLID and design patterns, language fundamentals across Java, C, and C++, and system architecture. It also covers database design and scaling, concurrency and multithreading, and frontend development lifecycles.

The project is primarily written in Java and is structured as a knowledge base for mastering technical interviews.
- [realm/realm-cocoa](https://awesome-repositories.com/repository/realm-realm-cocoa.md) (16,608 ⭐) — Realm-Cocoa is a NoSQL mobile database engine and reactive object database designed for local data storage on mobile devices. It serves as a non-relational alternative to Core Data and SQLite, storing data as objects rather than tables.

The system supports encrypted local storage to protect sensitive application data. It provides reactive data synchronization, allowing application objects and user interfaces to update automatically when the underlying database changes.
- [denoland/deno](https://awesome-repositories.com/repository/denoland-deno.md) (107,110 ⭐) — Deno is a high-performance runtime for JavaScript and TypeScript that prioritizes security and developer productivity. Built on the V8 engine, it provides a secure execution environment that enforces a default-deny security model, requiring explicit user authorization for access to system resources like the file system, network, and environment variables. The runtime natively supports modern web-standard APIs, ensuring consistent behavior and portability across different environments.

What distinguishes Deno is its integrated approach to the software development lifecycle. It bundles essential utilities (including a formatter, linter, test runner, and dependency manager) directly into the runtime, eliminating the need for external build tools or complex transpilation steps. The platform features a universal module resolution system that supports remote HTTPS URLs, local paths, and standard package registries, all backed by lockfiles to ensure build determinism and supply chain security.

Beyond its core runtime capabilities, Deno includes a built-in, persistent key-value database engine that supports atomic transactions and reactive data monitoring. It also provides a robust compatibility layer for the Node.js ecosystem, allowing for the seamless execution of legacy modules and native binary addons. For multi-tenant or distributed applications, the runtime offers isolated sandbox environments that manage resource constraints and security boundaries, facilitating secure code execution in shared infrastructure.

The project is distributed as a single binary, providing a unified toolchain for managing dependencies, executing tasks, and configuring runtime security policies.
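
One common way for a key-value store to offer atomic multi-key transactions is optimistic concurrency: a commit succeeds only if the keys it read have not changed since. The sketch below illustrates that general idea in Python. It is not Deno's API or implementation; `VersionedKV`, its `atomic` method, and the integer version counter are assumptions made for the example.

```python
import threading


class VersionedKV:
    """Hypothetical in-memory store with optimistic, all-or-nothing commits."""

    def __init__(self):
        self._data = {}   # key -> (value, version)
        self._clock = 0   # increases by one on every successful commit
        self._lock = threading.Lock()

    def get(self, key):
        """Return (value, version); a missing key yields (None, None)."""
        with self._lock:
            return self._data.get(key, (None, None))

    def atomic(self, checks, writes):
        """Apply all writes only if every checked key has the expected version.

        checks: {key: expected_version}, writes: {key: new_value}.
        Returns True on commit, False if any check failed (nothing is written).
        """
        with self._lock:
            for key, expected in checks.items():
                if self._data.get(key, (None, None))[1] != expected:
                    return False
            self._clock += 1
            for key, value in writes.items():
                self._data[key] = (value, self._clock)
            return True
```

A caller reads the keys it needs, computes new values, and passes the versions it saw as `checks`; if another writer committed in between, `atomic` returns `False` and the caller can re-read and retry.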