2 repos
Techniques for dividing large datasets into smaller, manageable segments for parallel processing.
Distinguishing note: Focuses on the storage organization strategy rather than the indexing algorithm.
2 GitHub repositories matching Data & Databases · Data Partitioning.
Milvus is a specialized vector database engine designed for the indexing, management, and high-speed similarity retrieval of high-dimensional vector embeddings. It functions as a similarity search engine capable of identifying nearest neighbors within large-scale vector spaces, supporting the storage and retrieval of billions of data points while maintaining consistent performance. The system utilizes a distributed architecture that decouples storage, query, and coordination into independent services, allowing for horizontal scaling across clusters. It employs a global indexing mechanism that
Partitions data into immutable segments to optimize memory usage and parallel search performance.
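As a rough illustration of segment-based parallel search, the sketch below splits a small set of vectors into read-only segments, searches each segment independently, and merges the per-segment results. It is not Milvus code and does not use the Milvus API: the names `make_segments`, `search_segment`, and `search` are hypothetical, the segments are plain tuples, and brute-force Euclidean distance stands in for whatever index a real engine would build per segment.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq
import math


def make_segments(rows, segment_size):
    """Split (id, vector) rows into fixed-size segments.

    Each segment is a tuple, so it cannot be modified after creation
    (an assumption standing in for "immutable segments").
    """
    return [tuple(rows[i:i + segment_size])
            for i in range(0, len(rows), segment_size)]


def search_segment(segment, query, k):
    """Brute-force top-k inside one segment, ranked by Euclidean distance."""
    scored = ((math.dist(vec, query), vid) for vid, vec in segment)
    return heapq.nsmallest(k, scored)


def search(segments, query, k):
    """Search all segments in parallel, then merge the partial top-k lists."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda seg: search_segment(seg, query, k), segments)
        candidates = [hit for part in partials for hit in part]
    # The global top-k is always contained in the union of per-segment top-k.
    return heapq.nsmallest(k, candidates)


# Hypothetical toy data: ten 2-D vectors keyed by integer id.
rows = [(i, (float(i), float(i % 3))) for i in range(10)]
segments = make_segments(rows, segment_size=4)
nearest = search(segments, query=(2.0, 2.0), k=2)
```

Because each segment is searched on its own, the work can be spread across threads or machines, and only `k` candidates per segment need to be merged at the end.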
This project is a comprehensive educational resource focused on the principles, patterns, and trade-offs required to design scalable, reliable, and high-performance distributed systems. It provides a structured curriculum that covers the fundamental architectural strategies necessary for building modern software infrastructure, ranging from high-level system decomposition to low-level networking and data management. The repository distinguishes itself by offering deep dives into complex architectural patterns, such as microservices-based decomposition, event-driven communication, and command-
Details the architectural pattern of horizontal partitioning to scale database performance.
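A minimal sketch of horizontal partitioning by hash, assuming rows are dictionaries and each shard is just an in-memory list. The function names `shard_for` and `partition_rows`, the `user_id` key, and the shard count are illustrative assumptions and are not taken from the repository.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a row key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() so the mapping
    stays the same across processes and restarts.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition_rows(rows, key_field, num_shards):
    """Distribute whole rows across shards according to their key."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(row[key_field], num_shards)].append(row)
    return shards


# Hypothetical example table: 100 user rows split across 4 shards.
users = [{"user_id": i, "name": f"user{i}"} for i in range(100)]
shards = partition_rows(users, "user_id", num_shards=4)

# A lookup only needs to touch the one shard that owns the key.
target = shard_for(42, 4)
match = [row for row in shards[target] if row["user_id"] == 42]
```

With a plain modulo scheme like this one, changing `num_shards` remaps most keys to different shards; consistent hashing is a common alternative when shards are added or removed often.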