30 open-source projects similar to open-mpi/ompi, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Ompi alternative.
This project provides a comprehensive technical guide and framework for engineering large-scale machine learning systems. It covers the full lifecycle of model development, focusing on the infrastructure and computational principles required to build, train, and serve generative AI models across distributed GPU clusters. The repository distinguishes itself by offering deep-dive tutorials and implementation strategies for complex system challenges. It emphasizes high-performance architectural primitives, such as collective communication orchestration, distributed tensor sharding, and static gr
NCCL is a high-performance communication library and distributed GPU computing framework designed for executing collective and point-to-point data exchanges across multiple GPUs in single or multi-node systems. It serves as an RDMA GPU transport layer and memory orchestrator, facilitating high-bandwidth synchronization of data and model gradients for distributed GPU training and inference. The library is distinguished by its ability to execute communication primitives directly from GPU kernels, removing the host CPU from the critical path. It utilizes topology-aware path selection to optimize
Boost is a collection of portable, high-performance source libraries that extend the C++ standard library. It provides a wide range of reusable components, data structures, and algorithms designed to add capabilities to the base language across different platforms. The project is distinguished by its extensive focus on compile-time template metaprogramming and generic programming. It implements advanced architectural patterns such as policy-based design, concept-based type validation, and the use of SFINAE for conditional template resolution to minimize runtime overhead. The library covers a
This project is a technical curriculum and set of educational resources focused on parallel programming, high-performance computing, and systems programming. It provides a structured course covering the implementation of parallel algorithms and multithreading techniques for processing large datasets. The project includes a systems programming guide for modern language features, a framework for lock-free concurrency patterns, and a manual for optimizing CPU and GPU performance through assembly analysis and cache management. The material covers hardware performance tuning, the implementation o
Dask is a parallel computing framework and distributed task scheduler designed to scale Python data science workflows from single machines to large clusters. It functions as a cluster resource manager that orchestrates computational logic by representing tasks and their dependencies as directed acyclic graphs. This architecture allows the system to automate the distribution of workloads across available hardware while managing complex execution requirements. The project distinguishes itself through a lazy evaluation engine that defers data operations until they are explicitly requested, enabl
This project is a comprehensive engineering framework and technical reference for managing, scaling, and optimizing distributed machine learning infrastructure. It provides a suite of methodologies and diagnostic tools designed to support large-scale model training and inference on high-performance computing clusters. The project distinguishes itself through a specialized diagnostic toolkit and infrastructure optimization suite that addresses the complexities of multi-node environments. It enables precise control over cluster resources, including hardware maintenance, network topology configu
klib is a comprehensive C standard library extension and data structure toolkit. It provides a set of fundamental tools for memory management, data organization, and general-purpose utility functions for standalone C applications. The project features specialized capabilities for bioinformatics sequence analysis, including the parsing of FASTA, FASTQ, and Newick formats and the implementation of Smith-Waterman sequence alignment and Hidden Markov Models. It also includes a mathematical computation library for numerical routines and expression evaluation, as well as a lightweight HTTP and FTP
RetroArch is a cross-platform emulation host and multi-system game emulator that serves as a frontend for the Libretro API. It coordinates video, audio, and user input to maintain the application state for various emulator cores, allowing it to run a wide variety of vintage gaming hardware and software engines. The platform distinguishes itself through a low-latency emulation model that uses run-ahead processing to reduce input lag. It also features a real-time state tracking system that enables gameplay rewinding by saving periodic snapshots of the emulator memory state. The environment inc
libgit2 is a portable, cross-platform C library that provides a programmatic interface for integrating Git version control directly into applications. It serves as a linkable implementation of Git internals, allowing developers to manage repositories and manipulate version control data without requiring a system installation of the Git command line tool. The library functions as an embedded API and object database manager capable of reading and writing commits, trees, blobs, and tags. It includes a network transport client to handle the transfer of repository data over protocols such as SSH a
MPack - A C encoder/decoder for the MessagePack serialization format / msgpack.org[C]
https://github.com/json-c/json-c is the official code repository for json-c. See the wiki for release tarballs for download. API docs at http://json-c.github.io/json-c/
On-the-fly syntax checking for GNU Emacs
libhv is a high-performance C/C++ network library and event-driven I/O framework used to build TCP, UDP, SSL, HTTP, WebSocket, and MQTT clients and servers. It provides a non-blocking event loop for managing network sockets, timers, and system signals across multiple threads. The project is distinguished by its integrated support for specialized network roles, including a full HTTP web server with RESTful routing and middleware, an MQTT messaging client for IoT communication, and the ability to implement SOCKS5 and HTTP proxies. It also features a reliable UDP implementation to ensure ordered
sds is a C dynamic string library that provides a memory management wrapper for heap-allocated strings. It implements binary-safe storage by tracking string lengths explicitly, allowing the library to handle null characters within data. The library distinguishes itself through a memory architecture that uses interchangeable function pointers for allocation and freeing, enabling the integration of custom memory managers. It utilizes header-stored length tracking to provide constant-time length retrieval and maintains null-terminated buffer padding to ensure compatibility with standard C string
Gumbo-parser is a high-performance HTML5 parsing library written in pure C99. It transforms raw markup into a structured document tree by implementing the formal state-machine tokenization and error recovery rules defined in the HTML5 specification. The project serves as an HTML source mapping tool, linking parsed nodes back to their original byte offsets and pointers within the input buffer. This allows for the precise tracking of source locations for elements within the resulting parse tree. Beyond full document processing, the library handles isolated HTML fragments and provides a C-based
s2n is a C-based security library and TLS protocol implementation that serves as a secure network transport layer. It provides a modular cryptographic backend interface to encrypt data streams, manage handshakes, and handle mutual authentication between peers. The project focuses on post-quantum cryptography, integrating quantum-resistant key exchange and digital signatures to protect connections against future computing threats. It distinguishes itself through security hardening measures, such as memory-locked secret storage to prevent keys from being swapped to disk and timing-attack mitiga
An implementation of the MessagePack serialization format in C / msgpack.org[C]
C library/compiler for the Cap'n Proto serialization/RPC protocol
gperftools is a collection of specialized tools for profiling CPU usage, detecting memory errors, and providing high-performance memory allocation. It provides a memory profiling toolkit for C++ applications, including a sampling CPU profiler and a heap profiler for analyzing consumption patterns. The project includes a high-performance memory allocator designed as a multi-threaded replacement for standard allocation to reduce contention and improve execution speed. It further provides a memory debugger to identify illegal memory access and double frees. The toolkit covers broad diagnostic c
This project is a suite of runtime diagnostic tools designed to detect memory leaks, concurrency races, and language-specification violations during software execution. It provides a collection of dynamic analysis tools that identify addressability issues, uninitialized memory usage, and memory safety bugs in applications. The toolset includes a thread safety analyzer to identify data races and deadlocks in concurrent code, as well as an undefined behavior sanitizer to detect operations that violate language specifications. The system covers broad capabilities in memory safety monitoring and
clib is a C language package manager and dependency manager used to install, update, and manage external C libraries and executable dependencies from remote repositories. It functions as a distribution tool for structuring source code and metadata to publish C libraries and a development toolkit for maintaining consistent build environments. The project provides a framework for C library distribution and dependency resolution, utilizing manifest files to track required library versions and ensure reproducible builds across different systems. It streamlines the C development workflow by managi
Parser combinators for binary formats, in C. Yes, in C. What? Don't look at me like that.