30 open-source projects similar to ray-project/ray, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Ray alternative.
Dask is a parallel computing framework and distributed task scheduler designed to scale Python data science workflows from single machines to large clusters. It functions as a cluster resource manager that orchestrates computational logic by representing tasks and their dependencies as directed acyclic graphs. This architecture allows the system to automate the distribution of workloads across available hardware while managing complex execution requirements. The project distinguishes itself through a lazy evaluation engine that defers data operations until they are explicitly requested, enabl
Modin is a distributed dataframe library and parallel data processing engine designed to handle large datasets that exceed system memory. It functions as a distributed computing framework that parallelizes data manipulation tasks across multiple CPU cores or clusters to increase throughput and avoid memory errors. The project mirrors the Pandas API, allowing for the distribution of data workflows without changing core code logic. It utilizes a pluggable backend interface, which enables users to switch between different distributed execution engines to optimize performance based on available h
SparkInternals is a technical reference and architecture guide detailing the internal design and implementation of the Apache Spark distributed computing engine. It serves as a study of big data engine analysis, focusing on how the system manages cluster execution and the interaction between driver nodes, executors, and workers. The project provides a detailed breakdown of how logical plans are converted into physical execution stages. It specifically analyzes the mechanics of data shuffle operations, memory management, and the coordination of distributed job scheduling. The documentation co
This project serves as a comprehensive technical reference for the architecture and design of data-intensive applications. It provides a structured analysis of the fundamental principles required to build reliable, scalable, and maintainable software systems, covering the core trade-offs inherent in modern data infrastructure. The repository explores the mechanics of distributed data management, including strategies for replication, partitioning, and achieving consensus across multiple nodes. It details the design of storage engines, indexing techniques, and transaction management models, whi
This project is a community-maintained directory that serves as a comprehensive index of software tools, frameworks, and educational materials. It functions as an open-source knowledge base, organizing diverse engineering domains and technical resources into a structured taxonomy to assist developers in discovering high-quality content. The directory distinguishes itself through a decentralized peer-review model, where independent contributors curate, verify, and update entries to ensure accuracy and relevance. All information is stored in a version-controlled, flat-file markdown format, whic
Daft is a distributed dataframe library and multimodal data processor designed to handle large-scale structured and unstructured data. It functions as a vectorized execution engine that processes tables alongside images, audio, and video, utilizing a unified schema to manage diverse data types. The project distinguishes itself by combining distributed data engineering with large-scale AI inference. It provides an AI data pipeline for batch-optimizing model prompts and generating high-dimensional text embeddings, while utilizing zero-copy memory sharing to execute custom Python functions witho
Orleans is a .NET distributed actor framework designed for building scalable, cloud-native applications. It implements a virtual actor model where entities with stable identities manage their own state and lifecycle across a cluster of servers. The framework provides a distributed state management system with ACID transaction support and a distributed pub/sub streaming engine for real-time data processing. It distinguishes itself through location-transparent routing, automatic actor activation and deactivation, and elastic cluster scaling that redistributes workloads during node failures. Th
Apache Spark is a unified distributed data processing engine designed for large-scale data analysis and computation graphs. It functions as a distributed machine learning framework, a graph processing system, a real-time stream processor, and a SQL analytics engine. The system enables the execution of distributed SQL querying, large-scale graph analysis, and real-time stream analytics across clusters of machines. It also provides a scalable environment for implementing machine learning algorithms and predictive model development on massive datasets. The engine incorporates relational query e
Hazelcast is a distributed data platform that combines an in-memory data grid with a stream processing engine to support real-time analytics and event-driven applications. It functions as a partitioned, distributed key-value store that replicates data across cluster nodes to provide low-latency access and high availability. The platform also serves as a distributed SQL query engine, allowing users to execute standard SQL statements against both in-memory datasets and external data sources. What distinguishes Hazelcast is its use of a distributed consensus subsystem to maintain strongly consis
This project is a build orchestration engine and development toolkit designed for managing large-scale monorepos. It provides a unified workspace environment that maps project relationships and dependencies, enabling the system to perform intelligent impact analysis and execute only the tasks affected by specific code changes. The system distinguishes itself through a persistent daemon that monitors file changes for near-instant feedback and a content-addressable caching mechanism that stores task outputs to prevent redundant computation across local and remote environments. It further suppor
protoactor-go is a framework for building concurrent and distributed systems in Go using the actor model. It provides a distributed actor system that enables isolated entities to communicate via asynchronous messaging and share state across a cluster. The framework implements a multi-language actor protocol, allowing interoperability between actors written in Go, C#, and Java. It further supports a virtual actor implementation, where actors are automatically instantiated across a network based on a unique identity. The system includes a supervision model for managing actor lifecycles and fau
Prefect is a workflow orchestration platform designed to define, schedule, and monitor complex data pipelines as Python code. It functions as a container-native engine that wraps individual tasks in isolated environments, ensuring consistent dependencies and resource allocation across diverse infrastructure. By utilizing a state-machine-based orchestration model, the system tracks execution progress through discrete transitions and persistent event logs to maintain reliable and observable task processing. The platform distinguishes itself through a decoupled worker-API architecture, which sep
Tokio is an asynchronous runtime for the Rust programming language, designed to manage and execute concurrent tasks efficiently. It provides a multi-threaded execution environment that schedules lightweight tasks across available processor cores, utilizing a work-stealing scheduler to balance computational load. By employing a poll-based execution model and waker-based notifications, the runtime drives asynchronous operations forward without requiring active polling loops, ensuring efficient resource utilization. The project distinguishes itself through a comprehensive suite of tools for high
This project provides a comprehensive technical guide and framework for engineering large-scale machine learning systems. It covers the full lifecycle of model development, focusing on the infrastructure and computational principles required to build, train, and serve generative AI models across distributed GPU clusters. The repository distinguishes itself by offering deep-dive tutorials and implementation strategies for complex system challenges. It emphasizes high-performance architectural primitives, such as collective communication orchestration, distributed tensor sharding, and static gr
Horovod is a distributed deep learning framework and gradient synchronizer designed to scale model training across multiple GPUs and compute nodes. It functions as a distributed training orchestrator and an elastic training engine, utilizing an MPI collective communication library to synchronize weights and gradients across TensorFlow, PyTorch, Keras, and MXNet models. The system distinguishes itself through dynamic elastic scaling, which allows it to adjust the number of active workers at runtime and recover from node failures. It optimizes communication efficiency using tensor fusion batchi
Nushell is a cross-platform shell and programming language designed to treat all input and output as structured data rather than raw text streams. By enforcing data types and command signatures, it provides a consistent environment for building robust, pipeline-oriented workflows. The shell allows users to chain commands that pass structured objects between stages, enabling complex data processing and automation tasks that remain predictable across different operating systems. What distinguishes the project is its focus on interactive data exploration and modular extensibility. Users can quer
Vaex is a high-performance Apache Arrow DataFrame library and out-of-core data processing engine designed to handle billion-row tabular datasets in Python. It functions as a lazy evaluation framework that defers computations and transformations until results are required, enabling the processing of datasets that exceed available system RAM by mapping files directly from disk. The project distinguishes itself as a tool for big data visualization and exploration, specifically integrated for use within interactive notebooks. It provides specialized capabilities for machine learning feature engin
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Accelerate is a PyTorch distributed training library that abstracts the boilerplate required to run models across multiple GPUs, TPUs, and CPUs. It functions as a deep learning model scaler and distributed hardware orchestrator, allowing the same training script to run on different hardware backends without modifying the core logic. The project provides a distributed training command line interface for configuring compute environments and launching jobs across single or multi-node clusters. It includes a mixed precision training framework to implement FP16 and BF16 precision, reducing memory
This project is a community-driven directory of software resources, libraries, and tools designed to support iOS application development. It serves as a centralized reference point for developers, organizing a vast ecosystem of third-party components into a searchable, structured index to facilitate discovery and project integration. The repository distinguishes itself through its collaborative curation model, which aggregates disparate utilities into a single, maintainable catalog. By leveraging a flat-file documentation structure, it provides a clear overview of the tools available for nati
Servo is a high-performance, memory-safe web rendering engine designed for cross-platform embedding. It provides a modular framework that allows developers to integrate web content rendering into native applications across desktop, mobile, and embedded systems. By enforcing strict process isolation and memory safety, the engine creates a secure execution environment for processing web content. The engine distinguishes itself through a task-based, parallelized architecture that decouples layout, style, and rendering processes to maximize responsiveness. It utilizes a hardware-abstracted graphi
xxl-job is a distributed task scheduling platform and job orchestrator designed to manage and trigger timed jobs across a cluster of remote executor nodes. It provides a centralized system for scheduling tasks, linking dependent jobs, and managing complex execution lifecycles through a relational database that persists configurations and logs. The platform distinguishes itself through a web-based interface for cron job management, allowing users to create and update scheduled tasks without modifying source code. It supports cross-language task execution by triggering logic on third-party exec
Quasar is a JVM concurrency framework that implements the actor model and a lightweight thread library. It provides isolated execution units that communicate via asynchronous message passing to eliminate shared mutable state. The project distinguishes itself through a distributed actor system capable of operating across multiple cluster nodes with location-transparent registries and actor state migration. It utilizes a work-stealing fiber scheduler to manage millions of lightweight threads, allowing tasks to suspend during non-blocking I/O operations without stalling underlying system threads
Akka is an actor model framework and distributed systems platform used to build concurrent and distributed applications. It provides a toolkit for managing multi-threaded state and behavior through asynchronous message passing, allowing developers to create concurrent applications without manual locks or synchronization. The system functions as a cluster management and event sourcing framework, automating the scaling and coordination of high-availability clusters. It enables the deployment of elastic services that coordinate workloads across multiple network nodes and ensures fault tolerance
Nano is a distributed application framework designed for building systems using an actor-based messaging model. It functions as a distributed actor framework that decouples components through asynchronous messaging to maintain state isolation across a server cluster. The system acts as a cluster message dispatcher and session-aware request router, tracking client state to route incoming messages to the specific agent holding the session data. It utilizes a distributed agent registry to coordinate the dispatching of messages between multiple application instances acting as agents. The framewo
SeaTunnel is a distributed data integration engine designed to synchronize structured and unstructured data across diverse sources and sinks. It functions as a multi-engine execution framework that can run data integration tasks across different distributed computing backends to optimize workload performance. The project is distinguished by a visual data pipeline designer for configuring workflows without manual code and a specialized change data capture tool for streaming incremental database updates. It also includes an enrichment pipeline that integrates large language models and embedding
Moleculer is a Node.js microservices framework designed for building distributed systems. It functions as a distributed service broker, task orchestrator, and service mesh framework, enabling a decentralized architecture with built-in service discovery and load balancing. The project differentiates itself through a pluggable transport layer supporting protocols such as NATS, Redis, TCP, and Kafka, as well as a dedicated microservices API gateway that maps external HTTP and WebSocket requests to internal service actions. It includes built-in fault tolerance mechanisms, including circuit breake
Linera is a multi-chain smart contract platform designed for horizontal scalability through a microchain-based distributed ledger. By partitioning state into independent, parallel chains that share a common validator set, the protocol enables high-performance execution of modular applications. The system utilizes a WebAssembly-based runtime to ensure secure, platform-independent execution of contract logic across the network. The platform distinguishes itself through an asynchronous messaging framework that coordinates state changes between chains by queuing messages for execution in subseque
The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane. The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It
Skynet is a distributed game server framework designed for building scalable online game backends. It utilizes distributed actor-based clusters and real-time network communication to manage high-concurrency session coordination across multiple nodes. The framework includes a cluster management orchestrator for coordinating services via cluster-wide messaging and dynamic configuration updates. It features a multi-protocol network gateway supporting TCP, UDP, and WebSockets, alongside a data encoding layer using BSON and Sproto serialization for efficient information transfer between distribute