Explore database management systems, data processing frameworks, and storage solutions for modern software architectures.
ScyllaDB is a distributed NoSQL database engine designed for high-throughput data storage and low-latency performance at scale. It functions as a shard-aware platform that manages large-scale datasets across distributed clusters, providing a foundation for real-time applications that require consistent availability and operational stability. The system distinguishes itself through a shared-nothing architecture that distributes data across independent CPU cores to eliminate lock contention. It incorporates a user-space networking stack and an asynchronous event-driven engine to maximize hardwa
ScyllaDB is a high-performance, distributed NoSQL database that natively supports vector search, time-series data, and self-hosted cluster deployments, making it a comprehensive solution for modern application data storage.
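The shared-nothing, shard-per-core idea in this entry can be sketched in a few lines. This is a toy illustration only: the `shard_for_key` function, the `ShardedStore` class, and the SHA-256 modulo mapping are assumptions made for the example and are not ScyllaDB's actual token-to-shard algorithm.

```python
import hashlib


def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Map a partition key to one shard (toy scheme, not ScyllaDB's real mapping)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")  # stable 64-bit token
    return token % num_shards


class ShardedStore:
    """Each shard owns a private dict, so no shard ever touches another's data."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for_key(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for_key(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
store.put("user:1", "alice")
```

Because a key always hashes to the same shard, a single core can serve it without taking locks shared with other cores, which is the property the description attributes to the architecture.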
MongoDB is a distributed, document-oriented database system designed to store information in flexible, hierarchical structures. It supports horizontal scaling through automated sharding and maintains high availability across global clusters using a multi-node replication protocol. By executing multi-document operations as atomic units, the system ensures data integrity and consistency across distributed environments. The platform distinguishes itself by integrating advanced vector-based indexing, which enables semantic similarity searches alongside traditional geospatial and lexical quer
MongoDB is a distributed, document-oriented database system that natively supports vector search, horizontal scaling, and self-hosting, making it a comprehensive solution for modern application data storage.
YugabyteDB is a distributed SQL database and relational data store designed for horizontal scalability and high availability across multiple nodes or regions. It functions as a cloud-native system that ensures continuous availability and supports PostgreSQL compatible query languages and drivers. The system includes specialized capabilities as a vector database for AI, utilizing high-dimensional indexing to perform similarity searches. It is engineered as a multi-region cloud database that synchronizes data across different geographic locations to maintain global availability. The project co
YugabyteDB is a distributed, self-hostable SQL database that natively supports vector search and horizontal scaling, making it a comprehensive solution for modern application development.
CockroachDB is a distributed SQL database designed to scale horizontally across multiple nodes while maintaining strict ACID compliance and global data consistency. It functions as a relational database engine that automatically partitions data into ranges, rebalancing them across a cluster to accommodate growing storage and throughput requirements. By using a distributed consensus protocol, the system ensures that all nodes agree on the order of operations, providing fault tolerance and continuous availability even in the event of hardware failures. The system distinguishes itself through
CockroachDB is a distributed, self-hostable SQL database that provides horizontal scalability and ACID compliance, making it a robust choice for modern application development.
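The range partitioning mentioned in this entry can be pictured as a sorted list of split keys, each range owned by one node. The sketch below is a simplified model under stated assumptions: the `RangeMap` class, its method names, and the node labels are hypothetical and do not reflect CockroachDB's internal data structures.

```python
import bisect


class RangeMap:
    """Toy key-range directory: split_keys mark where each new range begins."""

    def __init__(self, split_keys, nodes):
        # N split keys define N + 1 contiguous ranges, one owner node each.
        if len(nodes) != len(split_keys) + 1:
            raise ValueError("need exactly one node per range")
        self.split_keys = list(split_keys)
        self.nodes = list(nodes)

    def node_for(self, key):
        """Binary-search the split keys to find the range that holds key."""
        return self.nodes[bisect.bisect_right(self.split_keys, key)]

    def split(self, split_key, new_node):
        """Split the range containing split_key; the upper half moves to new_node."""
        if split_key in self.split_keys:
            raise ValueError("range already starts at this key")
        idx = bisect.bisect_right(self.split_keys, split_key)
        self.split_keys.insert(idx, split_key)
        self.nodes.insert(idx + 1, new_node)


ranges = RangeMap(split_keys=["g", "p"], nodes=["n1", "n2", "n3"])
before = ranges.node_for("m")  # "n2": "m" falls in ["g", "p")
ranges.split("k", "n4")        # ["g", "p") becomes ["g", "k") and ["k", "p")
after = ranges.node_for("m")   # "n4": the upper half was handed to the new node
```

Splitting a range and assigning one half to another node is the basic move behind rebalancing as storage or throughput grows; a real system would also move the data and replicate each range.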
OceanBase is a distributed SQL database designed for high availability and strong consistency across multiple nodes and regions. It functions as a hybrid transactional and analytical processing engine, allowing real-time analytics and transactions to execute on a single data copy. The system also serves as a vector database engine for indexing and querying vector data to power semantic search and recommendation systems. The platform features native compatibility layers for MySQL and Oracle, enabling the migration of legacy workloads without rewriting SQL code. It utilizes a Paxos-based distri
OceanBase is a distributed SQL database that natively supports vector search and hybrid transactional/analytical processing, making it a comprehensive solution for modern application data storage.
TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions. The platform distinguishes itself through its hybrid transactional and analytical proc
TiDB is a distributed, cloud-native SQL database that supports horizontal scaling, transactional and analytical processing, and integrated vector search, making it a comprehensive solution for modern application development.
LibSQL is a high-performance, distributed SQL database engine that extends SQLite to support remote network access, edge computing, and real-time synchronization. It functions as an embedded database library that integrates directly into application processes while providing the infrastructure to maintain consistency across multiple geographic regions. The platform distinguishes itself by enabling database interaction over standard HTTP protocols, allowing applications to query remote data sources in serverless and edge environments without requiring local filesystem access. It includes nativ
LibSQL is a distributed, SQLite-compatible SQL database engine that supports vector search and edge-native synchronization, making it a powerful storage solution for modern application development.
Chroma is a specialized vector database designed to index and retrieve high-dimensional data representations for semantic similarity search. It functions as a comprehensive platform for information retrieval, enabling the storage and management of unstructured documents alongside structured metadata. By mapping data into numerical representations, the system facilitates rapid similarity lookups across large datasets. The platform distinguishes itself through a hybrid search infrastructure that combines dense vector embeddings with sparse keyword and regular expression matching to balance sema
Chroma is a specialized vector database that provides robust document storage and retrieval capabilities, making it a highly effective solution for AI-driven application development despite lacking traditional SQL or time-series features.
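As a rough illustration of similarity lookup over embeddings combined with structured metadata, the sketch below ranks documents by cosine similarity after applying an exact-match filter. It is a brute-force toy, not Chroma's API or index: the `top_k` function, its `where` parameter, and the document layout are assumptions for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2, where=None):
    """Return ids of the k most similar documents that match the metadata filter."""
    candidates = [
        doc for doc in documents
        if where is None
        or all(doc["metadata"].get(field) == value for field, value in where.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda doc: cosine_similarity(query, doc["embedding"]),
        reverse=True,
    )
    return [doc["id"] for doc in ranked[:k]]


docs = [
    {"id": "a", "embedding": [1.0, 0.0], "metadata": {"lang": "de"}},
    {"id": "b", "embedding": [0.0, 1.0], "metadata": {"lang": "en"}},
    {"id": "c", "embedding": [0.9, 0.1], "metadata": {"lang": "en"}},
]
unfiltered = top_k([1.0, 0.0], docs)                      # ["a", "c"]
english_only = top_k([1.0, 0.0], docs, where={"lang": "en"})  # ["c", "b"]
```

A production vector database replaces the linear scan with an approximate nearest-neighbor index, but the ranking contract is the same.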
MariaDB is an open source relational database management system for storing and managing structured data. It provides relational storage with consistency and reliability guarantees, and it also operates as a vector database that stores high-dimensional vector embeddings and supports similarity search over them. The system incorporates a columnar storage engine to optimize analytical query processing and large-scale data aggregation. The software covers a broad capability su
MariaDB is a robust, self-hostable relational database management system that provides comprehensive SQL support and has expanded its capabilities to include vector search, though it lacks native NoSQL document store and time-series specific engines compared to specialized multi-model databases.
SurrealDB is a multi-model database engine designed to store and query document, graph, relational, and vector data within a single ACID-compliant platform. It functions as an AI-native data store, integrating vector search, graph traversal, and machine learning model execution directly into its query layer. By providing a unified declarative query language, the platform eliminates the need for external middleware to synchronize data across different storage models. The platform distinguishes itself through its ability to manage agent memory and complex workflows natively. It allows developer
SurrealDB is a multi-model database that natively supports SQL, document storage, vector search, and distributed deployment, making it a comprehensive solution for modern application development.
OpenSearch is a distributed search and analytics engine designed for indexing, searching, and analyzing massive volumes of structured and unstructured data in real time. It functions as a comprehensive platform that integrates enterprise-grade search capabilities, a vector database for high-dimensional similarity lookups, and a unified observability suite for monitoring logs, metrics, and traces across complex distributed environments. The platform distinguishes itself through its support for agentic workflow automation, allowing users to orchestrate multi-agent tasks and integrate foundation
OpenSearch is a distributed search and analytics engine that functions as a powerful NoSQL document store and vector database, making it a highly capable data storage solution for modern application development.
Dolt is a relational database engine that integrates version control directly into the database management layer. It functions as a version-controlled SQL database that tracks every row and schema change using a commit-based history, allowing users to branch, merge, and audit data modifications. By implementing a wire-protocol-compatible server, the system enables standard SQL clients and tools to interact with versioned data as if they were connecting to a traditional relational database. The platform distinguishes itself by applying repository-style workflows to data management, including s
Dolt is a relational SQL database engine that provides unique version control capabilities, making it a robust choice for application development that requires data auditing and branching.
NeDB is a JavaScript embedded NoSQL document store designed for Node.js and the browser. It functions as an in-memory data store with the option to persist documents to a local file system, ensuring data survives application restarts. The project utilizes a MongoDB-compatible API to perform data operations, allowing it to serve as a lightweight document indexing system and a persistent file database without requiring a separate database server. Capabilities include querying, inserting, updating, and deleting documents, as well as the ability to create indexes on specific fields to accelerate
NeDB is a lightweight, embedded NoSQL document store for Node.js and browser environments that provides a simple, serverless way to manage persistent JSON data within an application.
This project is a reactive, offline-first NoSQL database engine designed for JavaScript applications. It provides a robust framework for managing application state by synchronizing data across browsers, mobile devices, and server-side runtimes. By treating local storage as the primary source of truth, it enables applications to remain functional without network connectivity, automatically reconciling changes with remote backends once a connection is restored. The database distinguishes itself through a modular architecture that supports cross-environment synchronization and high-performance d
This is a reactive, client-side NoSQL database engine that excels at local-first data synchronization and state management, though it functions as an embedded library for applications rather than a standalone server-side database management system.
Manticore Search is a high-performance search engine and database designed for indexing and retrieving large datasets. It functions as a full-text search engine, a vector search database, and a SQL-based search database, providing a distributed search cluster architecture. The system provides an alternative to the Elasticsearch stack, offering a compatible API for indexing and searching structured and unstructured data. It distinguishes itself by supporting multiple retrieval methods, including vector matching for similarity search, geospatial queries, and traditional full-text ranking. The p
Manticore Search is a high-performance database and search engine that supports SQL, vector search, and distributed architecture, making it a robust choice for application data storage despite lacking native time-series specific features.
TDengine is a distributed time-series database designed for the high-speed ingestion, compression, and retrieval of timestamped metrics and sensor data. It functions as a SQL-compatible analytics engine, allowing users to perform complex operations on massive volumes of time-ordered information using standard relational syntax. The platform is built to serve as a backend foundation for industrial IoT environments, managing real-time data streams and device metadata through a cluster-based architecture. The system distinguishes itself through a distributed sharding architecture that uses consi
TDengine is a distributed, SQL-compatible database specifically optimized for time-series and IoT data, making it a robust choice for applications requiring high-throughput ingestion and storage of timestamped metrics.
Databend is a cloud-native data warehouse and OLAP database designed for large-scale analytics. It functions as a SQL-compliant engine and serverless analytics platform that separates compute from storage to allow for independent scaling. The system integrates vector database capabilities, indexing high-dimensional embeddings to enable semantic, hybrid, and full-text searches across massive datasets. It further distinguishes itself through serverless compute management that automatically scales resources based on demand and shuts them down during idle periods. The platform covers a broad set
Databend is a cloud-native OLAP database and data warehouse that provides SQL support and vector search capabilities, making it a powerful tool for large-scale analytical data storage.
Redis is an in-memory, key-value database designed to provide sub-millisecond latency for read and write operations. It functions as a versatile data platform, serving as a distributed cache, a message broker, a NoSQL document store, and a vector database. The system utilizes an event-driven, single-threaded loop to process requests efficiently, while maintaining data durability through append-only persistence logs and asynchronous snapshotting mechanisms. What distinguishes Redis is its ability to handle complex data structures—including strings, hashes, lists, sets, and sorted sets—alongsid
Redis is a high-performance, self-hostable, distributed key-value store that supports NoSQL document storage, vector search, and time-series data, though it lacks native SQL support.
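The append-only persistence idea in this entry can be shown with a toy key-value store that records every write and rebuilds its state by replaying the record. This is a conceptual sketch only: the `AppendOnlyStore` class and its in-memory list standing in for a log file are assumptions, not Redis's actual AOF format or implementation.

```python
class AppendOnlyStore:
    """Toy in-memory store that logs each mutation before applying it."""

    def __init__(self):
        self.data = {}
        self.log = []  # stands in for an append-only file on disk

    def set(self, key, value):
        self.log.append(("SET", key, value))
        self.data[key] = value

    def delete(self, key):
        self.log.append(("DEL", key, None))
        self.data.pop(key, None)

    @classmethod
    def replay(cls, log):
        """Rebuild a store by re-applying logged operations in order."""
        store = cls()
        for op, key, value in log:
            if op == "SET":
                store.set(key, value)
            elif op == "DEL":
                store.delete(key)
        return store


primary = AppendOnlyStore()
primary.set("counter", 1)
primary.set("counter", 2)
primary.set("temp", "x")
primary.delete("temp")

recovered = AppendOnlyStore.replay(primary.log)
```

Replaying the log in order reproduces the final state after a restart; a snapshot shortens recovery by letting replay start from a saved state instead of from an empty one.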
rqlite is a distributed relational database that replicates SQLite data across a cluster using the Raft consensus algorithm. It functions as a fault-tolerant storage system that provides high availability and a web API for executing SQL queries and managing relational data without requiring native database drivers. The system distinguishes itself by using an HTTP SQL interface to expose database operations and cluster management. It features a real-time change data capture stream that pushes database mutations to external HTTP endpoints via webhooks and supports the scaling of read throughput
rqlite is a distributed relational database that provides SQL support and high availability through SQLite replication, making it a robust self-hostable storage solution for application development.
AliSQL is a fork of MySQL by Alibaba that extends the relational database management system with enhancements for high performance, scalability, and enterprise-grade availability. It retains the core MySQL identity as a SQL-based database for storing, organizing, and retrieving structured data, while adding optimizations for large-scale transactional and analytical workloads. The project differentiates itself through a set of Alibaba-specific improvements, including a columnar engine for accelerating analytical queries directly on MySQL tables, and a distributed, shared-nothing NDB Cluster en
AliSQL is a high-performance, enterprise-grade fork of MySQL that provides robust SQL support, vector search, and document store capabilities, making it a powerful self-hostable database solution for application development.
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
LanceDB is a specialized vector database and columnar store that provides high-performance embedding retrieval and SQL-like filtering, making it a capable data storage solution for AI-driven application development.
Redis is a high-performance in-memory key-value store that functions as a distributed cache, message broker, and NoSQL database. It provides sub-millisecond read and write access to data stored in RAM and can operate as a vector database for indexing high-dimensional embeddings. The system supports a wide range of data storage and synchronization primitives, including the management of strings, hashes, lists, sets, and JSON documents. It enables real-time data operations through atomic transactions, hybrid persistence using snapshots and append-only logs, and high-availability configurations
Redis is a high-performance, distributed NoSQL database that supports vector search and self-hosting, though it lacks native SQL support and is primarily optimized for in-memory key-value storage rather than traditional relational data management.
Memvid is an embedded memory framework designed to provide persistent, versioned context for intelligent agents. It functions as a local vector database library that stores all data within a single binary file, removing the need for external database infrastructure or network dependencies. The system distinguishes itself by integrating in-process vector indexing with append-only versioning, allowing for high-speed semantic similarity searches alongside the ability to track and roll back state changes over time. It includes built-in transparent data encryption and masking to secure sensitive i
Memvid is an embedded vector database library designed for in-process agent memory rather than a general-purpose database management system for broader application development.
RethinkDB is a distributed, document-oriented database designed to store and manage JSON-formatted data across scalable clusters. It utilizes a custom log-structured storage engine with B-Tree indexing to ensure high-performance disk I/O and data persistence. The system maintains high availability through automatic sharding and replication, employing a primary-replica voting consensus mechanism to handle node failures and ensure consistent cluster operations. A defining characteristic of the platform is its reactive changefeed engine, which allows applications to subscribe to live data update
RethinkDB is a distributed, self-hostable document-oriented database that excels at real-time data streaming, though it lacks native SQL support and specialized vector search capabilities.
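The changefeed concept in this entry, where applications subscribe to live updates instead of polling, can be sketched with a small in-process table that notifies callbacks on every write. The `Table` class and its method names are hypothetical and are not the RethinkDB driver API; the `old_val`/`new_val` shape of each notification mirrors the format RethinkDB changefeeds use.

```python
class Table:
    """Toy document table that pushes every change to registered subscribers."""

    def __init__(self):
        self.rows = {}
        self.subscribers = []

    def changes(self, callback):
        """Register a callback invoked with each change notification."""
        self.subscribers.append(callback)

    def upsert(self, key, doc):
        old = self.rows.get(key)
        self.rows[key] = doc
        for notify in self.subscribers:
            notify({"old_val": old, "new_val": doc})


events = []
scores = Table()
scores.changes(events.append)  # collect notifications instead of polling

scores.upsert("alice", {"score": 10})
scores.upsert("alice", {"score": 15})
```

After the two writes, `events` holds one notification per change, the first with `old_val` set to `None` because the row did not exist yet.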
DuckDB is an in-process analytical database engine designed to run directly within an application process. As a zero-dependency, embedded system, it provides enterprise-grade SQL data processing capabilities without the overhead of managing a dedicated database server. It is built to handle complex analytical and aggregation tasks by storing and retrieving information in columns, allowing for high-performance relational data manipulation. The engine distinguishes itself through a columnar vectorized execution model that maximizes CPU cache efficiency during query operations. It employs adapti
DuckDB is a high-performance, in-process analytical database engine that provides robust SQL support for application development, though it is designed as an embedded library rather than a distributed, multi-model server.
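To make the column-oriented storage idea concrete, the sketch below pivots row records into per-column lists and aggregates over a single column. It is a plain-Python illustration of the layout, not DuckDB's engine or API; `to_columns` and `sum_column` are hypothetical helpers, and the example assumes every row has the same fields.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_column(columns, name):
    """Aggregate one column without touching any other field."""
    return sum(columns[name])


rows = [
    {"city": "Oslo", "sales": 10},
    {"city": "Lima", "sales": 25},
    {"city": "Oslo", "sales": 5},
]
columns = to_columns(rows)
total = sum_column(columns, "sales")  # 40
```

An aggregation reads only the values it needs, stored contiguously, which is the reason columnar layouts suit analytical queries.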
PGlite is a client-side relational database engine that runs a full-featured PostgreSQL instance directly within browser and Node.js environments. By leveraging WebAssembly, it provides a persistent SQL storage solution that enables complex data management and querying without requiring an external database server. The project distinguishes itself through a reactive SQL data layer that automatically synchronizes user interface components with live query results. It manages database operations using worker threads to prevent main-thread blocking and coordinates access across multiple browser t
PGlite is a full-featured PostgreSQL instance running in WebAssembly, providing a robust SQL-based storage solution for local-first and browser-based application development.