15 repositories
Techniques that improve computational performance by distributing tasks across multiple CPU cores or utilizing vector instructions.
Explore 15 GitHub repositories matching software engineering & architecture · Parallel Processing.
This project is a general-purpose command-line filter that provides an interactive interface for processing standard input streams. It enables real-time fuzzy searching, data selection, and transformation, allowing users to navigate complex information or file systems directly within their terminal. By utilizing a pipe-oriented architecture, it integrates into existing shell pipelines and workflows to facilitate efficient data exploration. What distinguishes this tool is its highly extensible, event-driven design that allows for deep integration with external processes. It supports asynchrono
Work queues distribute search tasks across multiple CPU cores to maximize computational throughput.
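The work-queue pattern named above can be sketched in a few lines. This is an illustrative sketch only, not the listed project's implementation: `search_with_work_queue`, `chunks`, and `needle` are hypothetical names, and plain substring matching stands in for real search logic. It uses threads to keep the example short; in CPython, threads share one interpreter lock, so CPU-bound work would need processes to actually occupy several cores.

```python
import queue
import threading


def search_with_work_queue(chunks, needle, num_workers=4):
    """Return sorted (chunk_index, line) pairs for lines containing `needle`."""
    tasks = queue.Queue()
    matches = []
    matches_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:          # sentinel: no more work for this worker
                tasks.task_done()
                return
            index, lines = item
            found = [(index, line) for line in lines if needle in line]
            with matches_lock:        # protect the shared result list
                matches.extend(found)
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for index, lines in enumerate(chunks):
        tasks.put((index, lines))     # each chunk is one unit of work
    for _ in threads:
        tasks.put(None)               # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(matches)
```

Each worker pulls the next chunk as soon as it is free, so uneven chunks balance themselves without any up-front scheduling.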
Tesseract is a neural network-based optical character recognition engine designed to convert scanned images and digital documents into machine-readable, searchable text. It functions as both a command-line utility for automating large-scale digitization workflows and a cross-platform library that can be embedded into desktop, mobile, or server-side applications. By utilizing long short-term memory networks, the engine provides robust text extraction across more than one hundred languages and dozens of scripts. The project distinguishes itself through a sophisticated document layout analysis f
Distribute recognition workloads across multiple CPU cores using multi-threading to accelerate large-scale document processing tasks.
This project is a comprehensive, curated directory of high-quality libraries, tools, and educational resources for C and C++ development. It serves as an ecosystem discovery index, helping developers navigate the vast landscape of third-party components, frameworks, and technical documentation available for the language. The collection is distinguished by its focus on high-performance systems programming and technical mastery. It provides deep coverage of specialized domains including SIMD-accelerated data processing, compile-time template metaprogramming, and asynchronous event-driven archit
Highlights performance-critical libraries that leverage processor-level instructions to execute parallel operations on data.
ripgrep is a command-line utility designed for searching through large file trees and source code repositories. It functions as a recursive text processor that traverses directories to locate and display matching patterns, serving as a high-performance alternative to traditional search tools. The tool distinguishes itself through a focus on execution speed and intelligent file handling. It utilizes a finite automata-based regular expression engine to ensure linear time complexity and employs hardware-level acceleration for literal byte sequence scanning. By integrating with version control sy
Distributes search workloads across multiple CPU cores to maximize throughput during intensive text processing tasks.
This is a Python facial recognition library designed to detect, encode, and identify human faces in images and video. It functions as a biometric identification tool that converts facial features into numerical encodings to compare and match identities. The library provides a computer vision command line interface for batch processing face detection and recognition tasks across image directories. It also supports a GPU accelerated vision API that utilizes CUDA and NVIDIA hardware to increase the speed of facial analysis and identification. Its capabilities cover human face detection and faci
Distributes image processing tasks across multiple CPU cores or GPU hardware to increase total throughput.
This project is a Chinese text segmentation library and tokenizer designed to split Chinese sentences into individual words. It serves as a natural language processing tool for splitting characters into words, tagging parts of speech, and extracting keywords using statistical analysis. The library distinguishes itself through support for custom dictionary configuration and vocabulary file management, allowing users to override default segmentation rules for domain-specific accuracy. It also includes a TF-IDF keyword extractor to identify significant words and core topics within documents. Th
Implements multi-process parallel execution to distribute heavy text segmentation workloads across multiple CPU cores.
Facefusion is a modular framework designed for automated image and video manipulation, specializing in tasks such as face swapping, enhancement, and restoration. It functions as a computer vision processing pipeline that chains independent machine learning modules to perform complex transformations, including facial animation, age modification, and lip synchronization. The system is built to handle both real-time interactive feeds and large-scale batch processing tasks. The platform distinguishes itself through a highly extensible architecture that supports custom processing modules and inter
Balances processing speed and system resources by adjusting concurrent execution threads.
ImageMagick is a comprehensive software suite for the creation, editing, composition, and conversion of digital images. It functions as both a command-line utility for batch processing and automation, and as a programming library that allows developers to integrate advanced image manipulation capabilities into external applications. The project is distinguished by its modular architecture, which supports hundreds of image formats through a pluggable coder system and external delegate libraries. It is designed for high-performance environments, utilizing memory-mapped pixel caching, stream-ori
Supports large-scale processing by offloading storage to remote servers and parallelizing tasks across hardware.
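YAPF is a Python code formatter and style-compliance tool. It operates as an AST-based reformatter, using a concrete syntax tree to ensure structural consistency and a uniform visual presentation across source files. The engine uses a penalty-based layout optimizer that determines the best line breaks by computing a numerical cost for each formatting choice. It employs a multi-process code processor to distribute the formatting of multiple files across multiple CPU cores. The tool covers source code reformatting through in-place file modification, diff analysis, and processing of partial code snippets. It includes a rule-based configuration system for managing style presets, layout rules, and project-level settings. Integration capabilities include style-compliance verification for continuous integration pipelines, Git hook automation, and editor-based format-on-save.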
Improves performance by distributing the formatting of multiple Python files across several CPU cores.
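As a rough illustration of per-file parallelism, the sketch below maps a formatting function over several sources with a process pool. It does not use the listed tool's API: `normalize_source` is a hypothetical stand-in formatter that only strips trailing whitespace, and `format_many` and its `executor_cls` parameter are assumptions made for this example.

```python
from concurrent.futures import ProcessPoolExecutor


def normalize_source(source):
    """Hypothetical stand-in for a formatter: strip trailing whitespace per line."""
    lines = [line.rstrip() for line in source.splitlines()]
    return "\n".join(lines) + "\n"


def format_many(sources, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Format each source string in a worker and return results in input order."""
    if not sources:
        return []
    # Each source is an independent task, so no coordination between workers is needed.
    # When using the default process pool from a script, call this function
    # under an `if __name__ == "__main__":` guard.
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(normalize_source, sources))
```

Because files are formatted independently, the work splits cleanly; `pool.map` preserves input order, so results can be written back to the matching files. Passing a different executor class (for instance a thread pool) is a convenient way to exercise the same code path without spawning processes.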
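Numba is a just-in-time (JIT) compiler that translates high-level Python functions into optimized machine code at runtime. By leveraging the LLVM compiler infrastructure, it provides a framework for accelerating numerical data processing and mathematical computation, with performance comparable to statically compiled languages. The project stands out for its type-inference-based specialization, which generates machine instructions tailored to the specific data types used during execution. It uses a lazy compilation pipeline that defers translation until call time, minimizing startup overhead while maintaining consistent performance across processor architectures and operating systems. Beyond core compilation, the toolkit offers broad support for hardware acceleration by distributing iterative operations and array expressions across multiple CPU cores and graphics processors. It uses vectorization and parallelization strategies to maximize throughput on large numerical datasets, letting developers target specialized hardware directly from standard code.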
Distributes iterative operations across multiple processor cores and utilizes SIMD instructions to maximize throughput.
Rector is an automated PHP refactoring and modernization tool designed to upgrade language versions and modernize syntax using predefined rules. It functions as a static analysis engine that inspects code structures and types to identify refactoring targets without executing the code. The project provides a framework for defining custom transformation logic to automate project-specific changes. It distinguishes itself by offering specialized capabilities for migrating legacy or custom frameworks to modern alternatives and converting docblock annotations into native language attributes. The s
Reduces processing time for large codebases by running tasks simultaneously across multiple CPU cores.
Trape is a browser-based remote access tool and exploit framework designed for gathering device geolocation, hardware profiles, and network data. It functions as an open-source intelligence platform and a system for executing custom scripts and triggering browser vulnerabilities to capture credentials or monitor device activity. The project features a real-time geolocation tracker capable of retrieving precise physical coordinates and monitoring individual movement, including silent acquisition that bypasses standard location prompts. It further provides a network tunneling service to make lo
Distributes intensive computing operations across multiple CPU cores to increase processing efficiency.
This repository is a collection of reference implementations and sample projects demonstrating data-oriented design using the Unity Entities package. It provides a suite of examples for implementing an entity component system that separates game data into components and logic into systems. The collection includes specialized demonstrations for rendering large volumes of entities via graphics pipelines, implementing high-performance collision and rigid body dynamics through data-oriented physics, and managing multiplayer state synchronization using the network framework for entities. It also p
Distributes computationally expensive tasks across multiple CPU cores using a job-based system.
This project is a Python data science curriculum and programming tutorial collection. It provides a structured set of educational notebooks and scripts designed to teach data analysis, machine learning, and deep learning. The repository serves as a learning path for building and tuning predictive models, including regression, decision trees, and neural networks. It includes a data visualization guide for creating financial time-series plots and a multiprocessing reference for implementing parallel task execution and shared memory synchronization. The curriculum covers broader capability area
Implements parallel task execution using process pools and shared memory synchronization in Python.
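Shared-memory synchronization between processes can be sketched with the standard `multiprocessing` module. This is not taken from the listed repository: `add_many` and `parallel_count` are hypothetical names, plain `Process` objects are used instead of a pool to keep the example small, and the code assumes the "fork" start method is available (Linux and other POSIX systems).

```python
import multiprocessing as mp


def add_many(counter, lock, times):
    """Increment the shared counter `times` times, holding the lock each time."""
    for _ in range(times):
        with lock:                    # without the lock, increments can be lost
            counter.value += 1


def parallel_count(num_procs=4, times=1000):
    # Assumption: the "fork" start method is available on this platform.
    ctx = mp.get_context("fork")
    counter = ctx.Value("i", 0, lock=False)   # raw int stored in shared memory
    lock = ctx.Lock()                         # explicit lock guarding the counter
    procs = [ctx.Process(target=add_many, args=(counter, lock, times))
             for _ in range(num_procs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value
```

The read-modify-write in `counter.value += 1` is not atomic, which is why every increment happens while the lock is held; with it, the final count equals `num_procs * times`.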
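Coconut is a functional programming language that compiles to Python. As a source-to-source compiler, it translates high-level functional syntax into compatible Python code to preserve runtime compatibility. The language introduces a logic system for pattern matching and destructuring complex data structures. It provides a tail call optimization mechanism to prevent stack overflow errors during deeply recursive function calls, and a lazy evaluation engine that defers computation until results are explicitly needed. The project includes support for algebraic data types, pipe operators, and partial application. It also provides a framework for parallel data processing by distributing mapping operations across multiple CPU cores.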
Improves computational performance by distributing mapping tasks across multiple CPU cores.