Why is apache/spark a recommended Distributed Compute Frameworks repository?

Analyzes and transforms continuous real-time data streams for immediate insight and analytics.

Why is zama-ai/fhevm a recommended Distributed Compute Frameworks repository?

Provides an asynchronous computation service that offloads resource-intensive encrypted operations to maintain scalability.

Why is serengil/deepface a recommended Distributed Compute Frameworks repository?

Handles asynchronous processing of data streams to support real-time facial analysis tasks.

Why is mementum/backtrader a recommended Distributed Compute Frameworks repository?

Synchronizes data streams of varying granularities to evaluate long-term trends alongside short-term price movements.

Why is nats-io/nats-server a recommended Distributed Compute Frameworks repository?

Supports push and pull patterns for consuming persistent log data at scale.

Why is ornicar/lila a recommended Distributed Compute Frameworks repository?

Implements a distributed computing engine specialized for evaluating chess positions and calculating optimal moves in parallel.

Why is lichess-org/lila a recommended Distributed Compute Frameworks repository?

Offloads computationally intensive move evaluations to a cluster of specialized servers for real-time tactical insights.

Why is pingcap/tikv a recommended Distributed Compute Frameworks repository?

Implements a coprocessor for executing filtering and aggregation logic directly on storage nodes to minimize network latency.

Why is RaRe-Technologies/gensim a recommended Distributed Compute Frameworks repository?

Functions as a distributed computing engine for processing and transforming massive text corpora.

Why is aio-libs/aiohttp a recommended Distributed Compute Frameworks repository?

Handles high-volume network traffic through memory-efficient chunked data processing.

36 repositories

Awesome GitHub Repositories: Distributed Compute Frameworks

High-performance engines designed for parallelized data processing across clusters, distinct from high-level workflow orchestration tools.

Explore 36 awesome GitHub repositories matching data & databases · Distributed Compute Frameworks. Refine with filters or upvote what's useful.


apache/spark
43,467 · View on GitHub
Apache Spark is a unified distributed data processing engine designed for large-scale data analysis and computation graphs. It functions as a distributed machine learning framework, a graph processing system, a real-time stream processor, and a SQL analytics engine. The system enables the execution of distributed SQL querying, large-scale graph analysis, and real-time stream analytics across clusters of machines. It also provides a scalable environment for implementing machine learning algorithms and predictive model development on massive datasets. The engine incorporates relational query e
Analyzes and transforms continuous real-time data streams for immediate insight and analytics.
Scala · big-data, java, jdbc

zama-ai/fhevm
25,215 · View on GitHub
fhevm is a full-stack blockchain framework designed to integrate Fully Homomorphic Encryption into smart contracts. It provides a platform for developing confidential smart contracts that can process encrypted data and execute private on-chain computations without decrypting the underlying information. The framework utilizes a coprocessor system to offload resource-intensive encrypted operations to an asynchronous service, improving blockchain performance and scalability. It incorporates a secure key management service based on multi-party computation and a zero-knowledge proof verifier to en
Provides an asynchronous computation service that offloads resource-intensive encrypted operations to maintain scalability.
Rust · blockchain, fhe, privacy

serengil/deepface
22,226 · View on GitHub
Deepface is a comprehensive deep learning library for facial recognition and demographic analysis. It provides a modular pipeline that handles the entire lifecycle of facial processing, including detection, geometric alignment, and the transformation of facial images into high-dimensional numerical vector embeddings for identity verification and similarity comparison. The library distinguishes itself through a model ensemble approach, which combines predictions from multiple pre-trained neural networks to improve classification accuracy and reduce bias. It also integrates advanced security fe
Handles asynchronous processing of data streams to support real-time facial analysis tasks.
Python · age-prediction, arcface, deep-learning

mementum/backtrader
20,462 · View on GitHub
Backtrader is a Python framework designed for the development, backtesting, and live execution of algorithmic trading strategies. It provides a comprehensive environment for quantitative finance, allowing users to simulate trading logic against historical market data or connect directly to brokerage platforms for automated real-time trading. The project distinguishes itself through a unified event-driven architecture that treats backtesting and live trading with the same API. This consistency is supported by a flexible data-feed abstraction layer that normalizes diverse financial sources, ena
Synchronizes data streams of varying granularities to evaluate long-term trends alongside short-term price movements.
Python · backtesting, metaclass, python

nats-io/nats-server
20,076 · View on GitHub
NATS Server is a high-performance, lightweight messaging system designed for cloud-native applications, edge computing, and distributed microservices. It functions as a distributed publish-subscribe broker that routes messages using hierarchical, dot-separated subject strings, enabling decoupled communication between services without requiring centralized broker lookups. The system supports core messaging patterns including asynchronous publish-subscribe, request-reply, and load-balanced queue processing. The platform distinguishes itself through a decentralized architecture that eliminates t
Supports push and pull patterns for consuming persistent log data at scale.
Go · cloud, cloud-computing, cloud-native
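
The subject-based routing described for NATS Server can be illustrated with a small matcher over dot-separated subjects. This is a minimal Python sketch, not the server's Go implementation: the function names `subject_matches` and `route` are hypothetical, and the wildcard semantics assumed here are the ones NATS documents (`*` matches exactly one token, `>` matches one or more trailing tokens).

```python
from __future__ import annotations


def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a dot-separated subject matches a subscription pattern."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    # Without a trailing '>', both sides must have the same number of tokens.
    return len(p_tokens) == len(s_tokens)


def route(subscriptions: dict[str, list[str]], subject: str) -> list[str]:
    """Collect (hypothetical) subscriber names whose pattern matches the subject."""
    matched = []
    for pattern, subscribers in subscriptions.items():
        if subject_matches(pattern, subject):
            matched.extend(subscribers)
    return sorted(matched)
```

For example, `route({"orders.*.created": ["billing"], "orders.>": ["audit"]}, "orders.eu.created")` returns both subscribers, because each pattern matches the published subject independently and no central lookup table of exact subjects is needed.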

ornicar/lila
18,362 · View on GitHub
Lila is an open-source chess server and multiplayer platform designed for playing, analyzing, and streaming games. It functions as a comprehensive environment for hosting competitive play and managing player profiles. The platform integrates a distributed chess engine interface to evaluate complex positions and a collaborative analysis board that allows multiple users to study and coordinate insights in real time. It also includes an online tournament platform for organizing competitive events, simultaneous exhibitions, and structured player leagues. The system maintains a searchable game da
Implements a distributed computing engine specialized for evaluating chess positions and calculating optimal moves in parallel.
Scala

lichess-org/lila
18,362 · View on GitHub
Lila is a comprehensive, open-source chess gaming platform designed for real-time multiplayer interaction, competitive tournament management, and deep strategic analysis. It provides a global environment where users can engage in live matches, participate in structured competitions, and access extensive archives of historical game data for research and study. The platform distinguishes itself through a highly scalable architecture that utilizes actor-model concurrency and event-sourced game states to ensure precise match reconstruction and fault tolerance. It integrates distributed engine eva
Offloads computationally intensive move evaluations to a cluster of specialized servers for real-time tactical insights.
Scala · chess, free-software, functional-programming

pingcap/tikv
16,724 · View on GitHub
TiKV is a cloud-native distributed transactional key-value store and storage engine. It provides a distributed database designed for horizontal scalability and strong consistency across a cluster of physical nodes. The system uses a Raft-based consensus mechanism to maintain data availability and state synchronization. It ensures ACID compliance for distributed transactions through a two-phase commit workflow and manages data distribution via multi-Raft sharding. The engine handles massive datasets using automated range splitting and cluster load balancing to distribute data across different
Implements a coprocessor for executing filtering and aggregation logic directly on storage nodes to minimize network latency.
Rust

RaRe-Technologies/gensim
16,442 · View on GitHub
Gensim is an unsupervised natural language processing toolkit designed for topic modeling, word embedding training, and the processing of large-scale text corpora. It provides a framework for discovering latent themes and semantic structures in text without the need for labeled data. The toolkit is distinguished by its ability to handle datasets that exceed system memory through iterator-based data streaming from disk. It also supports distributed model training, allowing complex modeling tasks to be executed across computer clusters. The library covers a broad range of analysis capabilities
Functions as a distributed computing engine for processing and transforming massive text corpora.
Python
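
The iterator-based streaming mentioned for Gensim can be sketched with a re-iterable corpus object that reads one document per line and never loads the whole file. This is a standard-library illustration of the idea only; `StreamedCorpus` and `count_tokens` are hypothetical names, not part of Gensim's API, and the one-document-per-line file layout is an assumption.

```python
from collections import Counter
from typing import Iterator, List


class StreamedCorpus:
    """Iterable over a text file with one document per line.

    Each pass re-opens the file, so only the current line is held in memory.
    """

    def __init__(self, path: str) -> None:
        self.path = path

    def __iter__(self) -> Iterator[List[str]]:
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.lower().split()
                if tokens:  # skip blank lines
                    yield tokens


def count_tokens(corpus) -> Counter:
    """Consume the corpus one document at a time and tally token frequencies."""
    counts: Counter = Counter()
    for tokens in corpus:
        counts.update(tokens)
    return counts
```

Because `__iter__` opens the file afresh on every call, the same object can be passed to an algorithm that needs several passes over the data, which a plain generator could not support.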

aio-libs/aiohttp
16,351 · View on GitHub
This project is an asynchronous network framework for Python that provides both a client and a server for HTTP communication. It is designed to handle high-concurrency network operations by leveraging cooperative multitasking, allowing for the management of thousands of simultaneous connections without the overhead of traditional thread-per-request models. The framework distinguishes itself through its focus on efficient resource management and persistent communication. It utilizes connection pooling to reuse network sockets, which reduces latency during sequential requests, and supports full
Handles high-volume network traffic through memory-efficient chunked data processing.
Python · aiohttp, async, asyncio

official-stockfish/Stockfish
14,802 · View on GitHub
Stockfish is a high-performance chess engine designed to evaluate board positions and calculate optimal moves. It functions as a command-line tool that utilizes neural network-based search algorithms to assess complex game states and determine strategic advantages. The engine is fully compliant with the Universal Chess Interface, allowing it to exchange commands and move data with external graphical user interfaces and professional analysis software. The engine distinguishes itself through advanced computational strategies that maximize hardware efficiency and search depth. It employs multi-t
Functions as a high-performance UCI-compliant engine that evaluates board positions and calculates optimal moves.
C++ · chess, chess-engine, cpp

Hammerspoon/hammerspoon
14,497 · View on GitHub
Hammerspoon is a programmable automation engine for macOS that enables deep system-level control through a Lua scripting environment. By bridging high-level scripts with native Objective-C APIs, it allows users to interact with the operating system's accessibility tree, intercept hardware input streams, and manage the lifecycle of running applications. The project distinguishes itself through an event-driven architecture that registers asynchronous hooks for system notifications and hardware events. This allows for real-time automation, such as remapping keyboard and mouse inputs, managing wi
Provides callback-based processing for incoming network data streams.
Objective-C · automation, hammerspoon, irc

oxnr/awesome-bigdata
14,454 · View on GitHub
This project is a curated directory of software, frameworks, and educational resources designed for building, scaling, and maintaining distributed data processing and storage architectures. It serves as a comprehensive index for the distributed computing ecosystem, helping users identify the appropriate tools for managing large-scale information systems. The repository functions as a central hub for data engineering, offering categorized access to technologies that support batch and stream processing, machine learning, and interactive querying. By organizing these resources, it assists in the
Indexes a wide range of distributed computing engines and frameworks for batch, stream, and interactive data processing.
awesome, awesome-list, bigdata

ydataai/ydata-profiling
13,388 · View on GitHub
Ydata-profiling is an automated exploratory data analysis framework designed to generate comprehensive statistical reports and visual summaries from dataframes. It functions as a diagnostic tool for assessing data quality, identifying missing values, duplicates, and outliers, while providing a scalable engine for profiling massive datasets across distributed enterprise environments. The project distinguishes itself through its ability to handle large-scale data through distributed task orchestration and lazy stream processing, which minimizes memory overhead during complex computations. It in
Scales data profiling tasks across distributed enterprise environments to handle massive datasets efficiently.
Python · big-data-analytics, data-analysis, data-exploration

aws/aws-cdk
12,817 · View on GitHub
The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane. The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It
Consumes sequential data modification logs to trigger downstream workflows and maintain system state.
TypeScript · aws, cloud-infrastructure, hacktoberfest

TA-Lib/ta-lib-python
12,041 · View on GitHub
This project is a Python wrapper for the TA-Lib library, providing a technical analysis library for computing moving averages, momentum, and volatility metrics for financial time series analysis. It serves as a financial indicator calculator that processes price and volume arrays to generate technical signals and pattern recognition. The library includes an incremental data processor capable of computing the most recent technical indicator values as new streaming market data arrives. This allows for real-time price monitoring and the processing of streaming data without recalculating entire d
Processes continuous streams of market data in real time to update indicator values incrementally.
Cython · finance, pattern-recognition, python
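
The incremental processing described above, where only the newest indicator value is computed as each price arrives, can be sketched with a running-sum simple moving average. This is a plain-Python illustration of the general technique, not the TA-Lib wrapper's API; the class name `IncrementalSMA` and its `update` method are hypothetical.

```python
from collections import deque
from typing import Optional


class IncrementalSMA:
    """Simple moving average updated in O(1) per new price."""

    def __init__(self, period: int) -> None:
        if period <= 0:
            raise ValueError("period must be positive")
        self.period = period
        self._window: deque = deque()
        self._total = 0.0

    def update(self, price: float) -> Optional[float]:
        """Add one price; return the latest average, or None during warm-up."""
        self._window.append(price)
        self._total += price
        if len(self._window) > self.period:
            # Drop the oldest price instead of re-summing the whole window.
            self._total -= self._window.popleft()
        if len(self._window) < self.period:
            return None
        return self._total / self.period
```

Feeding prices one at a time, for instance `sma = IncrementalSMA(3)` followed by `sma.update(p)` for each tick, keeps the cost per update constant regardless of how much history has already been seen.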

dotnet/orleans
10,789 · View on GitHub
Orleans is a .NET distributed actor framework designed for building scalable, cloud-native applications. It implements a virtual actor model where entities with stable identities manage their own state and lifecycle across a cluster of servers. The framework provides a distributed state management system with ACID transaction support and a distributed pub/sub streaming engine for real-time data processing. It distinguishes itself through location-transparent routing, automatic actor activation and deactivation, and elastic cluster scaling that redistributes workloads during node failures. Th
Provides a managed system for processing continuous data streams in near-real time with checkpoints and batch delivery.
C# · actor-model, actors, cloud-computing

rust-bakery/nom
10,426 · View on GitHub
nom is a parser combinator framework for Rust used to build complex parsers by combining small, reusable parsing functions. It functions as a zero-copy parsing tool that minimizes memory overhead by returning slices of the original input instead of allocating new memory. The framework is designed for diverse data formats, serving as a binary data parser with configurable endianness and a bitstream processing library capable of extracting values of arbitrary bit length. It also functions as a streaming data parser that can process data arriving in chunks and signal when additional input is req
Processes data arriving in chunks and signals when more input is required for a result.
Rust · byte-array, grammar, nom
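
The streaming behaviour described for nom, where a parser handles data arriving in chunks and signals when it needs more input, can be sketched in Python. nom itself is a Rust library with its own result types; the names `Incomplete`, `Done`, `parse_frame`, and `feed` below are hypothetical stand-ins, and the frame format (one length byte followed by the payload) is an assumption chosen for illustration. Unlike nom, this sketch copies bytes rather than returning zero-copy slices.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple, Union


@dataclass
class Incomplete:
    needed: int  # how many more bytes are required to make progress


@dataclass
class Done:
    value: bytes  # parsed payload
    rest: bytes   # unconsumed input


def parse_frame(data: bytes) -> Union[Done, Incomplete]:
    """Parse one frame: a 1-byte length followed by that many payload bytes."""
    if len(data) < 1:
        return Incomplete(needed=1)
    length = data[0]
    available = len(data) - 1
    if available < length:
        return Incomplete(needed=length - available)
    return Done(value=data[1:1 + length], rest=data[1 + length:])


def feed(chunks: Iterable[bytes]) -> Tuple[List[bytes], bytes]:
    """Accumulate chunks, emitting every complete frame as soon as it is available."""
    buffer = b""
    frames: List[bytes] = []
    for chunk in chunks:
        buffer += chunk
        while True:
            result = parse_frame(buffer)
            if isinstance(result, Incomplete):
                break  # wait for the next chunk
            frames.append(result.value)
            buffer = result.rest
    return frames, buffer
```

The caller never needs the whole input up front: an `Incomplete` result simply means "retry after more bytes arrive", which is the contract a streaming parser relies on when reading from a network or a file.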

Geal/nom
10,422 · View on GitHub
nom is a Rust parser combinator framework used to build complex parsers for binary and text data. It functions as an abstract syntax tree generator and a bit-level binary parser, allowing users to construct structured data by combining small, reusable parsing functions. The framework provides specialized support for zero-copy binary parsing, extracting data as slices from raw byte arrays to avoid memory allocations. It also includes a streaming data parser capable of processing partial input chunks from networks or files and signaling when additional input is required. The project covers a b
Handles partial data chunks and requests more input to complete parsing operations.
Rust

modin-project/modin
10,389 · View on GitHub
Modin is a distributed dataframe library and parallel data processing engine designed to handle large datasets that exceed system memory. It functions as a distributed computing framework that parallelizes data manipulation tasks across multiple CPU cores or clusters to increase throughput and avoid memory errors. The project mirrors the Pandas API, allowing for the distribution of data workflows without changing core code logic. It utilizes a pluggable backend interface, which enables users to switch between different distributed execution engines to optimize performance based on available h
Provides a high-performance engine that parallelizes Pandas dataframe operations across multiple CPU cores or clusters.
Python · analytics, data-science, dataframe
