17 open-source projects similar to apache/incubator-zeppelin, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Incubator Zeppelin alternative.
This project is a suite of software for radio interferometry imaging, specialized in the processing, analysis, and reconstruction of Very Long Baseline Interferometry (VLBI) observations. It provides tools for reconstructing images from interferometry data using regularized maximum likelihood methods and managing the end-to-end data processing pipeline from raw visibilities to final images. The software distinguishes itself with a dedicated interstellar scattering simulator that models thin-screen scattering effects and applies scattering kernels to radio images. It also features a radio imag
Azure Docs is the official technical documentation repository for Microsoft Azure, the cloud computing platform. It provides comprehensive guidance on the full spectrum of Azure services, covering everything from core infrastructure components like virtual machines, Kubernetes clusters, and serverless computing to platform services for AI, machine learning, data analytics, and storage. The documentation details how to provision, manage, and govern cloud resources at scale, including policy enforcement, identity management, and cost optimization. The documentation distinguishes Azure through i
Apache Spark is a unified distributed data processing engine designed for large-scale data analysis and computation graphs. It functions as a distributed machine learning framework, a graph processing system, a real-time stream processor, and a SQL analytics engine. The system enables the execution of distributed SQL querying, large-scale graph analysis, and real-time stream analytics across clusters of machines. It also provides a scalable environment for implementing machine learning algorithms and predictive model development on massive datasets. The engine incorporates relational query e
D3 is a modular library providing low-level primitives for creating data-driven visualizations. It functions as a flexible framework that allows for direct control over visual presentation by mapping abstract data dimensions to graphical properties, such as position, color, and size, without imposing predefined chart abstractions. The library distinguishes itself by offering specialized tools for complex data representation, including algorithmic layouts for hierarchical structures and geographic projection utilities for mapping spherical coordinates. It also includes a comprehensive suite fo
A High Dynamic Range (HDR) Histogram
Storm is a distributed stream processing framework and fault-tolerant compute engine designed for executing real-time continuous computations across a cluster of machines. It functions as a stateful stream processor and cluster topology manager, enabling the deployment and monitoring of distributed data flow configurations. The system uses transactional state management to provide exactly-once semantics, so every message in a data stream is processed exactly one time. It further operates as a distributed RPC system, allowing for the integration of non-native languages throu
OpenRefine is a data cleaning tool and wrangling platform used to transform raw, messy datasets into consistent and structured formats. It operates as a Java-based data processor that runs a local server and provides a web browser interface for managing and manipulating data. The platform includes a data reconciliation engine for matching local entries against external knowledge bases to standardize entities. It also functions as a web data augmentation tool, allowing users to fetch and integrate information from external web sources to enrich their datasets. The system provides a transforma
Prisma1 is a TypeScript object-relational mapper and type-safe database client designed for interacting with relational databases. It functions as a system for declarative schema modeling, where database structures are defined in a single schema file that automatically synchronizes with the underlying database. The project provides a type-safe query builder that generates a custom client to ensure database queries match defined schema types at compile time. It also includes a database GUI administrator, providing a visual web interface for browsing, editing, and managing relational database r
Machine Learning Platform and Recommendation Engine built on Kubernetes
dplyr is an R data manipulation library that provides a grammar for transforming tabular data frames. It functions as an in-memory data frame processor and a relational data algebra tool, using a consistent set of verbs to filter, select, and summarize data. The project includes a SQL translation engine that converts high-level data manipulation expressions into optimized queries. This allows users to perform transformations directly on remote relational databases and cloud storage without pulling data locally. The library covers a broad range of tabular operations, including column mutation
ggplot2 is a data visualization library for R based on a formal grammar of graphics. It provides a declarative plotting framework that allows users to create complex graphics by combining geometric objects, statistical summaries, and coordinate systems. The system is distinguished by a layered approach to composition, where visualizations are built incrementally by stacking independent geometric, statistical, and coordinate layers. It utilizes a hierarchical styling engine to manage non-data elements such as backgrounds, fonts, and margins, and includes a multi-panel faceting tool for splitti
CMAK is a Kafka cluster management tool and web interface designed for the administration of brokers, topics, and partitions. It provides a centralized system for Kafka cluster governance, encompassing resource administration, access control, and data distribution optimization. The project features a management UI that allows for the creation, deletion, and update of topic configurations and partition counts. It includes a partition rebalancer for executing data reassignment and preferred replica elections to balance load across cluster nodes. The system provides observability through broker
Stream summarizer and cardinality estimator.
This project is an Android RPA framework designed for automating user interfaces and system tasks on rooted Android devices using Python and ADB. It provides a suite of tools for rooted device management, allowing for programmatic control of system settings, application lifecycles, and shell command execution via a remote API. The framework distinguishes itself through a combination of dynamic instrumentation and AI integration. It can inject scripts into running processes to hook Java interfaces and modify application behavior in real time. Additionally, it supports large language model in