30 open-source projects similar to pingcap/talent-plan, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Talent Plan alternative.
This project serves as a comprehensive repository of best practices and documentation standards for managing open source software. It provides a foundational framework for establishing project governance, defining contributor roles, and structuring the lifecycle of collaborative software development. By centralizing knowledge on community building and operational transparency, it acts as a guide for launching, maintaining, and scaling healthy software projects. The project distinguishes itself by offering actionable strategies for the human and organizational aspects of software development t
This repository serves as the coordination hub for the Kubernetes community, focusing on open source contribution and project governance. It provides the structures necessary to manage the development of subprojects through a distributed governance model involving committees and working groups. The project manages community coordination by connecting contributors through a network of mailing lists and chat channels. It defines the requirements and responsibilities for contributor membership, including the process for becoming an official member or code reviewer. The repository utilizes a sta
This project is an educational implementation of a relational database engine written in C. It functions as a SQLite clone, demonstrating the internal mechanics of a database system with a focus on manual memory management and file I/O. The engine is distinguished by its use of a bytecode virtual machine: SQL statements are compiled into low-level instructions, which the virtual machine then executes. It uses a B-tree to organize records in a balanced tree structure, ensuring efficient insertion, search, and range scanning. The system covers co
This project is a comprehensive educational resource focused on the principles, patterns, and trade-offs required to design scalable, reliable, and high-performance distributed systems. It provides a structured curriculum that covers the fundamental architectural strategies necessary for building modern software infrastructure, ranging from high-level system decomposition to low-level networking and data management. The repository distinguishes itself by offering deep dives into complex architectural patterns, such as microservices-based decomposition, event-driven communication, and command-
This project provides educational materials and courseware focused on the theoretical and practical foundations of distributed systems design. It serves as a comprehensive curriculum covering the disciplines of consensus, data consistency, reliability engineering, and scalability. The instructional content focuses on achieving cluster agreement through consensus algorithms and managing system-wide state via coordination frameworks. It includes a dedicated guide to data theory, exploring replication strategies, consistency models, and data convergence. The courseware covers a broad capability
This project is a collection of Python programming scripts and educational mini-projects designed as a shared development environment. It serves as an open source code repository where developers can practice coding and explore data science concepts through hands-on implementation. The repository functions as a collaborative learning resource focused on the fork and pull request workflow. It utilizes a distributed version control system to coordinate community contributions and peer reviews of Python scripts.
The CppCoreGuidelines is a comprehensive software engineering standard that provides a curated framework of coding conventions and design principles for C++. It serves as an authoritative guide for writing safe, efficient, and maintainable code by establishing high-level architectural patterns and organizational principles for large-scale projects. The guidelines emphasize the use of a strong, static type system to ensure memory safety and enforce consistent resource management patterns. The project distinguishes itself by promoting the zero-overhead abstraction principle, ensuring that high-
This project is a community-maintained, open-access directory of high-quality public datasets. It serves as a centralized reference point for researchers, developers, and data scientists to locate reliable information sources across a wide spectrum of industries and scientific fields. By providing a structured index, the repository facilitates the discovery of data necessary for exploratory analysis, machine learning model training, and the development of data-intensive applications. The directory distinguishes itself through a lightweight, platform-agnostic approach to resource indexing that
MiniOB is an open-source educational relational database kernel designed for learning the internals of database systems. It implements a dual-engine storage architecture combining B+ Tree and LSM-Tree, supports SQL parsing and query execution, and provides transactional processing with multi-version concurrency control. The system communicates with clients using the MySQL wire protocol and includes a vector database extension for storing and querying high-dimensional vectors. The project distinguishes itself through its comprehensive coverage of core database concepts in a single, learnable c
Crossbeam is a concurrency toolkit for Rust providing low-level primitives for writing multi-threaded programs. It focuses on lock-free data structures and memory management primitives designed for shared-memory concurrent environments. The project includes work-stealing deques, double-ended queues that a scheduler can use to balance workloads across multiple processor cores and prevent bottlenecks. The toolkit covers broader capabilities for parallel algorithm development, multi-threaded task scheduling, and general co
This is an educational relational database engine used in Carnegie Mellon University's database systems course. Students learn internals by implementing core components of a working database, including storage, indexing, concurrency control, and crash recovery. The system covers key database architecture: a B+ tree index for fast key-based lookups and range scans, a disk-oriented buffer pool that caches pages from disk, an iterator-based query execution model that composes physical operators, page-based storage for records, two-phase locking for coordinating concurrent transactions, and write
This project is a developer onboarding tool and GitHub issue discovery portal. It serves as a curated directory and contribution guide designed to match new programmers with beginner-friendly open source tasks based on their technical skills and experience level. The platform operates as a static project directory, using a serverless catalog of repositories stored in JSON files. It provides the ability to filter tasks by programming language and difficulty, enabling users to identify approachable starting points in active software projects. The system includes capabilities for repository cur
nix is a Unix system API library and Rust system programming interface that provides type-safe bindings for invoking low-level system calls. It serves as a low-level operating system wrapper and POSIX compatibility layer, allowing for kernel interactions and administrative tasks through safe wrappers around platform-specific APIs. The project provides a kernel device interface for controlling hardware devices, managing kernel modules, and configuring terminal interfaces. It differentiates itself by offering type-safe wrappers for memory mapping and zero-copy input-output operations to reduce
Dgraph is a distributed graph database designed to store and query highly connected data. It organizes information as nodes and edges to represent complex relationships between entities, providing a platform for managing and analyzing deeply linked datasets. The system functions as a horizontally scalable cluster that partitions data across multiple nodes to maintain performance and availability as information volume increases. It utilizes a specialized query language built for low-latency navigation of interconnected data points, allowing for the execution of complex queries across large-sca
ScyllaDB is a distributed NoSQL database engine designed for high-throughput data storage and low-latency performance at scale. It functions as a shard-aware platform that manages large-scale datasets across distributed clusters, providing a foundation for real-time applications that require consistent availability and operational stability. The system distinguishes itself through a shared-nothing architecture that distributes data across independent CPU cores to eliminate lock contention. It incorporates a user-space networking stack and an asynchronous event-driven engine to maximize hardwa
This project is a distributed, document-oriented database system designed to store information in flexible, hierarchical structures. It supports horizontal scaling through automated sharding and maintains high availability across global clusters using a multi-node replication protocol. By executing multi-document operations as atomic units, the system ensures data integrity and consistency across distributed environments. The platform distinguishes itself by integrating advanced vector-based indexing, which enables semantic similarity searches alongside traditional geospatial and lexical quer
TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions. The platform distinguishes itself through its hybrid transactional and analytical proc
LibSQL is a high-performance, distributed SQL database engine that extends SQLite to support remote network access, edge computing, and real-time synchronization. It functions as an embedded database library that integrates directly into application processes while providing the infrastructure to maintain consistency across multiple geographic regions. The platform distinguishes itself by enabling database interaction over standard HTTP protocols, allowing applications to query remote data sources in serverless and edge environments without requiring local filesystem access. It includes nativ
OrbitDB is a decentralized NoSQL database that utilizes conflict-free replicated data types to ensure eventual consistency across a network of nodes. It functions as a peer-to-peer data store that uses IPFS for content-addressing and synchronization, allowing for the maintenance of application state without a central server or authority. The system is built upon a cryptographically verifiable, immutable operation log, which serves as the foundation for custom decentralized data models. This architecture enables the implementation of various data storage patterns, including JSON document stor
ToyDB is a distributed SQL database that provides a system for storing and querying data across multiple nodes. It focuses on maintaining strong consistency and fault tolerance through the implementation of a distributed consensus algorithm. The project distinguishes itself by supporting historical data versioning, enabling time-travel queries to retrieve the state of the database from a specific point in the past. It utilizes multi-version concurrency control to manage ACID transactions and ensure data integrity during concurrent operations. The system covers relational data modeling with t
OrbitDB is a decentralized data storage system that enables the creation of serverless databases residing across a network of peers. It functions as a peer-to-peer database that integrates with a content-addressed storage layer to distribute and replicate data without a central server. The system utilizes conflict-free replicated data types to ensure eventual consistency and state convergence across distributed nodes. It maintains an immutable record of updates using a directed acyclic graph to preserve causal ordering and cryptographic integrity. Access is managed through a decentralized ide
Cassandra is a distributed NoSQL database and wide-column store designed for high availability and linear scalability. It functions as a fault-tolerant distributed system that utilizes an LSM-tree storage engine to optimize write throughput and manage massive datasets. The system is a CQL-compliant database, using a structured query language to manage and retrieve tabular data stored across multiple nodes. It organizes information into rows and columns based on a flexible schema and primary keys. The project provides capabilities for horizontal database scaling, distributed data partitioning
This project is a Chinese language translation of the original research paper detailing the Raft consensus protocol. It serves as a technical research translation and a consensus protocol guide, making the specifications of the Raft algorithm accessible to Chinese speakers. The documentation covers the core mechanisms of distributed systems, including leader election, log replication, and safety protocols. It provides a detailed explanation of how to maintain a single source of truth across multiple servers to achieve fault-tolerant cluster management. The material addresses distributed stat
Procs is a high-performance system process monitor and viewer written in Rust. It serves as a replacement for the ps command, providing a command-line interface to track active system processes and their associated CPU and memory metrics. The tool is distinguished by its container awareness, which maps system tasks to their corresponding Docker container names. It also features an interactive process tree to visualize parent and child relationships in a hierarchical layout and a JSON exporter for programmatic data analysis and automation. The monitoring surface includes real-time activity tr
cheats.rs is a Rust syntax reference and technical documentation resource provided as a static site. It serves as a curated collection of examples and patterns designed to assist with Rust language learning. The project covers a wide range of language constructs, including memory management, the use of generics, and the implementation of asynchronous logic. It provides guidance on defining data structures, managing memory references, and organizing code modules. Additional coverage includes patterns for control flow, pattern matching, and the use of macros, as well as instructions for perfor
DefinitelyTyped is a community-maintained type store and centralized JavaScript type registry. It serves as a repository of static TypeScript type declarations for third-party JavaScript libraries, providing the necessary metadata to enable compile-time safety and editor intelligence for external modules not originally written in TypeScript. The project operates as a collaborative ecosystem where contributors define, validate, and maintain type declarations through a structured review process. This involves mapping type definition versions to specific library and compiler releases to ensure s
Xi Editor is a high-performance text editor core written in Rust. It employs a decoupled architecture that separates core logic from the presentation layer using a JSON-based client-server protocol. The project features a language-agnostic plugin system that communicates with external extensions via JSON messages over pipes. It manages text buffers using a persistent rope data structure to enable efficient editing of very large files. The system supports asynchronous editor workflows by running expensive operations in background threads using data snapshots. This prevents background processi
InternetArchitect is an educational collection of documents and source code designed as a high concurrency architecture course. It serves as a distributed systems implementation guide, providing technical patterns and practical examples for designing scalable internet architectures that maintain stability under heavy traffic loads. The project focuses on high-performance database optimization and microservices design patterns. It covers strategies for reducing latency and increasing throughput via database sharding and proxy layers, as well as coordinating global state across distributed clus
Hazelcast is a distributed data platform that combines an in-memory data grid with a stream processing engine to support real-time analytics and event-driven applications. It functions as a partitioned, distributed key-value store that replicates data across cluster nodes to provide low-latency access and high availability. The platform also serves as a distributed SQL query engine, allowing users to execute standard SQL statements against both in-memory datasets and external data sources. What distinguishes Hazelcast is its use of a distributed consensus subsystem to maintain strongly consis
This project provides a development framework for writing loadable Linux kernel modules using the Rust programming language. It establishes a methodology for safe systems programming by enforcing memory and thread safety within the restricted execution environment of the kernel, allowing developers to extend operating system functionality while preventing common memory corruption errors. The framework distinguishes itself through automated generation of type-safe foreign function interfaces, which bridge high-level code with low-level kernel headers and system structures. It maps high-level s