9 repositories
APIs for retrieving real-time state and health information from distributed clusters.
Distinguishing note: Focuses on querying cluster state rather than general data storage operations.
9 GitHub repositories matching Data & Databases · Cluster Query Interfaces.
Selenium is a comprehensive browser automation framework that provides a standardized interface for controlling web browsers to perform automated tasks, user interactions, and data extraction. It functions as a cross-browser testing tool, enabling developers to execute identical automation scripts across various browser engines and operating systems to ensure consistent application behavior. By implementing the WebDriver protocol, it maps high-level automation commands to browser-specific drivers using a standardized HTTP-based wire protocol. The project distinguishes itself through its distr
Provides a flexible interface to query node and cluster health status.
Apache Flink is a distributed processing engine designed for both high-throughput, low-latency data streams and finite batch workloads. It functions as a stateful stream processor and a SQL stream processing engine, providing a unified runtime to execute relational queries and event-based transformations. The system is distinguished by its ability to manage persistent operator state to ensure exactly-once processing guarantees and consistency during failures. It features specialized capabilities for complex event processing to detect temporal patterns and handles out-of-order events using eve
Enables writing and submitting processing queries directly to a cluster via a command-line interface.
Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without requiring data migration. By mapping remote metadata and structures to a unified relational namespace, it enables seamless cross-platform analysis through a standard SQL interface. The engine distinguishes itself through a pluggable connector architecture and a shared-nothing distributed processing
Retrieves real-time information and performance metrics about the running cluster using standard SQL queries.
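As a rough illustration of querying cluster state through SQL: Presto exposes a `system.runtime.nodes` table, and the sketch below pairs a query against it with a small helper that tallies nodes by state. The sample rows, the helper name `count_by_state`, and the node values are hypothetical; a real run would send the query through a Presto client instead of using canned rows.

```python
from collections import Counter

# Query against Presto's built-in system.runtime.nodes table.
NODES_QUERY = (
    "SELECT node_id, http_uri, node_version, coordinator, state "
    "FROM system.runtime.nodes"
)

def count_by_state(rows):
    """Tally rows by the 'state' column (fifth column of NODES_QUERY)."""
    return Counter(row[4] for row in rows)

# Hypothetical result rows, made up for illustration only.
SAMPLE_ROWS = [
    ("node-a", "http://10.0.0.1:8080", "0.280", True, "active"),
    ("node-b", "http://10.0.0.2:8080", "0.280", False, "active"),
    ("node-c", "http://10.0.0.3:8080", "0.280", False, "shutting_down"),
]

if __name__ == "__main__":
    print(NODES_QUERY)
    print(dict(count_by_state(SAMPLE_ROWS)))
```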
VictoriaMetrics is a high-performance, scalable time series database and observability platform designed for long-term storage and analysis of metric, log, and trace data. It functions as a unified backend for monitoring ecosystems, offering full compatibility with industry-standard protocols and query languages. The system is built to handle massive data volumes through a distributed architecture that supports horizontal scaling and efficient data lifecycle management. The platform distinguishes itself through a storage engine that utilizes consistent hashing for data sharding and log-struct
Queries multiple storage nodes or lower-level clusters through a unified interface to provide a single view of distributed data.
dbt-core is a command-line framework for transforming data within a warehouse using modular SQL and version control. It functions as a data transformation engine that enables users to define data structures and business logic through declarative configuration files, which the system then compiles into executable code. By managing complex data dependencies through a directed acyclic graph, it ensures that transformation tasks execute in the correct order while maintaining a manifest-driven state to track lineage and execution history. The project distinguishes itself through an adapter-based d
Provides interfaces for querying warehouse metadata and lineage remotely.
The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane. The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It
Retrieves detailed state and configuration metadata for streaming clusters to monitor operational health.
Elasticsearch Head is a web-based graphical interface for monitoring and administering Elasticsearch clusters. It serves as a cluster management UI, a topology visualizer for nodes and shards, and a REST API client for sending HTTP requests and analyzing JSON responses. The tool distinguishes itself by providing a visual map of cluster topology to monitor data distribution and health. It includes a local proxy to enable administration of remote clusters that are not directly accessible and supports the injection of basic authentication headers for secure request handling. The platform covers
Executes searches against the cluster to retrieve real-time state and data in JSON or tabular formats.
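To make the "JSON or tabular" distinction concrete, here is a minimal sketch that turns a cluster-health JSON document into aligned text rows. Elasticsearch returns such a document from `GET /_cluster/health`; the sample payload, the selected fields, and the function name `health_to_table` are assumptions for illustration, and this is not how Elasticsearch Head itself renders results.

```python
import json

# Fields picked for display; all are part of the cluster-health response.
FIELDS = ["cluster_name", "status", "number_of_nodes", "unassigned_shards"]

def health_to_table(payload: str) -> str:
    """Render selected cluster-health fields as aligned 'name  value' rows."""
    doc = json.loads(payload)
    width = max(len(name) for name in FIELDS)
    # Missing fields are shown as 'n/a' rather than raising.
    return "\n".join(
        f"{name.ljust(width)}  {doc.get(name, 'n/a')}" for name in FIELDS
    )

# Hypothetical response body; the values are made up.
SAMPLE = (
    '{"cluster_name": "demo", "status": "yellow", '
    '"number_of_nodes": 3, "unassigned_shards": 2}'
)

if __name__ == "__main__":
    print(health_to_table(SAMPLE))
```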
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
Allows retrieval of database object definitions and system metadata using standard SQL queries against system catalogs.
Ignite is a distributed in-memory data and computing platform. It functions as a distributed SQL database and storage engine designed to store and process large datasets in RAM to reduce latency and increase computation speed. The system is distinguished by a multi-tier storage engine that manages data placement across memory and disk to balance high-speed access with large capacity. It features a distributed compute grid that executes custom logic directly on the nodes where the data resides to reduce network traffic. The platform provides a broad set of capabilities including ACID transaction management, standard SQL querying, and key-value operations. It supports high-volume data ingestion via reactive streams and offers integration across multiple programming languages, standard database drivers, and a REST API. The system can be deployed as a distributed cluster using containers or orchestrated via Kubernetes. The project is written in Java and can be installed via binary archives.
Retrieves real-time cluster metrics and internal system information using SQL queries against system tables.