15 repositories
Techniques that improve computational performance by distributing tasks across multiple CPU cores or utilizing vector instructions.
Explore 15 awesome GitHub repositories matching software engineering & architecture · Parallel Processing.
This project is a general-purpose command-line filter that provides an interactive interface for processing standard input streams. It enables real-time fuzzy searching, data selection, and transformation, allowing users to navigate complex information or file systems directly within their terminal. By utilizing a pipe-oriented architecture, it integrates into existing shell pipelines and workflows to facilitate efficient data exploration. What distinguishes this tool is its highly extensible, event-driven design that allows for deep integration with external processes. It supports asynchrono
Work queues distribute search tasks across multiple CPU cores to maximize computational throughput.
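As a rough illustration of distributing search tasks across cores, the sketch below splits candidate lines into chunks and filters each chunk in a separate worker process. It is not this project's implementation: the subsequence matcher and the names `is_subsequence`, `search_chunk`, and `parallel_search` are hypothetical, chosen only to show the general idea.

```python
from concurrent.futures import ProcessPoolExecutor


def is_subsequence(pattern, text):
    """Return True if the characters of pattern appear in text in order."""
    remaining = iter(text.lower())
    # "ch in remaining" consumes the iterator up to the first match,
    # so later characters must be found further along in the text.
    return all(ch in remaining for ch in pattern.lower())


def search_chunk(args):
    """Filter one chunk of candidate lines; intended to run in a worker."""
    pattern, chunk = args
    return [line for line in chunk if is_subsequence(pattern, line)]


def parallel_search(pattern, lines, workers=4):
    """Split lines into chunks and search them concurrently, keeping order."""
    lines = list(lines)
    size = max(1, -(-len(lines) // workers))  # ceiling division
    chunks = [(pattern, lines[i:i + size]) for i in range(0, len(lines), size)]
    with ProcessPoolExecutor(max_workers=workers) as executor:
        # executor.map yields results in submission order.
        results = executor.map(search_chunk, chunks)
        return [line for hits in results for line in hits]
```

With the candidates `["src/main.go", "README.md", "src/matcher.go", "Makefile"]`, `parallel_search("smg", candidates)` keeps the two `src/` paths. On platforms that start workers with the spawn method, the call should sit under an `if __name__ == "__main__":` guard.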
Tesseract is a neural network-based optical character recognition engine designed to convert scanned images and digital documents into machine-readable, searchable text. It functions as both a command-line utility for automating large-scale digitization workflows and a cross-platform library that can be embedded into desktop, mobile, or server-side applications. By utilizing long short-term memory networks, the engine provides robust text extraction across more than one hundred languages and dozens of scripts. The project distinguishes itself through a sophisticated document layout analysis f
Distribute recognition workloads across multiple CPU cores using multi-threading to accelerate large-scale document processing tasks.
This project is a comprehensive, curated directory of high-quality libraries, tools, and educational resources for C and C++ development. It serves as an ecosystem discovery index, helping developers navigate the vast landscape of third-party components, frameworks, and technical documentation available for the language. The collection is distinguished by its focus on high-performance systems programming and technical mastery. It provides deep coverage of specialized domains including SIMD-accelerated data processing, compile-time template metaprogramming, and asynchronous event-driven archit
Highlights performance-critical libraries that leverage processor-level instructions to execute parallel operations on data.
ripgrep is a command-line utility designed for searching through large file trees and source code repositories. It functions as a recursive text processor that traverses directories to locate and display matching patterns, serving as a high-performance alternative to traditional search tools. The tool distinguishes itself through a focus on execution speed and intelligent file handling. It utilizes a finite automata-based regular expression engine to ensure linear time complexity and employs hardware-level acceleration for literal byte sequence scanning. By integrating with version control sy
Distributes search workloads across multiple CPU cores to maximize throughput during intensive text processing tasks.
This is a Python facial recognition library designed to detect, encode, and identify human faces in images and video. It functions as a biometric identification tool that converts facial features into numerical encodings to compare and match identities. The library provides a computer vision command line interface for batch processing face detection and recognition tasks across image directories. It also supports a GPU accelerated vision API that utilizes CUDA and NVIDIA hardware to increase the speed of facial analysis and identification. Its capabilities cover human face detection and faci
Distributes image processing tasks across multiple CPU cores or GPU hardware to increase total throughput.
This project is a Chinese text segmentation library and tokenizer designed to split Chinese sentences into individual words. It serves as a natural language processing tool for splitting characters into words, tagging parts of speech, and extracting keywords using statistical analysis. The library distinguishes itself through support for custom dictionary configuration and vocabulary file management, allowing users to override default segmentation rules for domain-specific accuracy. It also includes a TF-IDF keyword extractor to identify significant words and core topics within documents. Th
Implements multi-process parallel execution to distribute heavy text segmentation workloads across multiple CPU cores.
Facefusion is a modular framework designed for automated image and video manipulation, specializing in tasks such as face swapping, enhancement, and restoration. It functions as a computer vision processing pipeline that chains independent machine learning modules to perform complex transformations, including facial animation, age modification, and lip synchronization. The system is built to handle both real-time interactive feeds and large-scale batch processing tasks. The platform distinguishes itself through a highly extensible architecture that supports custom processing modules and inter
Balances processing speed and system resources by adjusting concurrent execution threads.
ImageMagick is a comprehensive software suite for the creation, editing, composition, and conversion of digital images. It functions as both a command-line utility for batch processing and automation, and as a programming library that allows developers to integrate advanced image manipulation capabilities into external applications. The project is distinguished by its modular architecture, which supports hundreds of image formats through a pluggable coder system and external delegate libraries. It is designed for high-performance environments, utilizing memory-mapped pixel caching, stream-ori
Supports large-scale processing by offloading storage to remote servers and parallelizing tasks across hardware.
YAPF is a Python code formatting and style compliance tool. It operates as an AST-based reformatter that uses concrete syntax trees to ensure structural consistency and a uniform visual presentation across source files. The engine uses a penalty-based layout optimizer to determine the best line breaks by computing numeric costs for different formatting choices. It uses a multi-process code processor to distribute the formatting of multiple files across several CPU cores. The tool covers source code reformatting through in-place file modifications, diff analysis, and the processing of partial code fragments. It includes a rule-based configuration system for managing style presets, layout rules, and project-level settings. Integration capabilities include style compliance checking for continuous integration pipelines, git hook automation, and editor-based format-on-save functionality.
Improves performance by distributing the formatting of multiple Python files across several CPU cores.
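The per-file distribution described above can be sketched with Python's standard `concurrent.futures` module. This is an illustration under assumptions, not YAPF's own code: `strip_trailing_whitespace` is a hypothetical stand-in for a real formatter, and `format_file` and `format_files` are made-up names.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def strip_trailing_whitespace(source):
    """Hypothetical stand-in for a formatter: trims the end of each line."""
    return "".join(line.rstrip() + "\n" for line in source.splitlines())


def format_file(path):
    """Rewrite one file in place; return True if its contents changed."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    formatted = strip_trailing_whitespace(original)
    if formatted != original:
        path.write_text(formatted, encoding="utf-8")
        return True
    return False


def format_files(paths, workers=None):
    """Format each file as its own task; return the paths that changed."""
    paths = [str(p) for p in paths]
    # workers=None lets the executor pick a default based on the CPU count.
    with ProcessPoolExecutor(max_workers=workers) as executor:
        changed_flags = list(executor.map(format_file, paths))
    return [p for p, changed in zip(paths, changed_flags) if changed]
```

Because each file is formatted independently, the tasks need no shared state, which is what makes this kind of workload straightforward to spread across cores. As with any process pool, the call to `format_files` should sit under an `if __name__ == "__main__":` guard on platforms that spawn workers.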
Numba is a just-in-time compiler that translates high-level Python functions into optimized machine code at runtime. By using the LLVM compiler infrastructure, it provides a framework for accelerating numerical data processing and mathematical computation, enabling performance levels comparable to statically compiled languages. The project is distinguished by its ability to perform specialization based on type inference, which generates machine instructions tailored to the specific data types used during execution. It uses a lazy compilation pipeline that defers translation until the moment of invocation, minimizing startup time and maintaining consistent performance across various processor architectures and operating systems. Beyond core compilation, the toolkit offers extensive support for hardware acceleration by distributing iterative operations and array expressions across multiple CPU cores and graphics processing units. It uses vectorization and parallelization strategies to maximize throughput for large-scale numerical datasets, allowing developers to target specialized hardware directly from standard code.
Distributes iterative operations across multiple processor cores and utilizes SIMD instructions to maximize throughput.
Rector is an automated PHP refactoring and modernization tool designed to upgrade language versions and modernize syntax using predefined rules. It functions as a static analysis engine that inspects code structures and types to identify refactoring targets without executing the code. The project provides a framework for defining custom transformation logic to automate project-specific changes. It distinguishes itself by offering specialized capabilities for migrating legacy or custom frameworks to modern alternatives and converting docblock annotations into native language attributes. The s
Reduces processing time for large codebases by running tasks simultaneously across multiple CPU cores.
Trape is a browser-based remote access tool and exploit framework designed for gathering device geolocation, hardware profiles, and network data. It functions as an open-source intelligence platform and a system for executing custom scripts and triggering browser vulnerabilities to capture credentials or monitor device activity. The project features a real-time geolocation tracker capable of retrieving precise physical coordinates and monitoring individual movement, including silent acquisition that bypasses standard location prompts. It further provides a network tunneling service to make lo
Distributes intensive computing operations across multiple CPU cores to increase processing efficiency.
This repository is a collection of reference implementations and sample projects demonstrating data-oriented design using the Unity Entities package. It provides a suite of examples for implementing an entity component system that separates game data into components and logic into systems. The collection includes specialized demonstrations for rendering large volumes of entities via graphics pipelines, implementing high-performance collision and rigid body dynamics through data-oriented physics, and managing multiplayer state synchronization using the network framework for entities. It also p
Distributes computationally expensive tasks across multiple CPU cores using a job-based system.
This project is a Python data science curriculum and programming tutorial collection. It provides a structured set of educational notebooks and scripts designed to teach data analysis, machine learning, and deep learning. The repository serves as a learning path for building and tuning predictive models, including regression, decision trees, and neural networks. It includes a data visualization guide for creating financial time-series plots and a multiprocessing reference for implementing parallel task execution and shared memory synchronization. The curriculum covers broader capability area
Implements parallel task execution using process pools and shared memory synchronization in Python.
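A minimal sketch of the two ideas named above, a process pool plus a lock-protected shared value, using only the standard `multiprocessing` module. It is not taken from the repository; the names `init_worker`, `add_square`, and `sum_of_squares` are hypothetical.

```python
import multiprocessing as mp

_counter = None  # set in each worker process by the pool initializer


def init_worker(shared_counter):
    """Store the shared counter in a module global inside each worker."""
    global _counter
    _counter = shared_counter


def add_square(n):
    """Add n squared to the shared counter while holding its lock."""
    with _counter.get_lock():
        _counter.value += n * n


def sum_of_squares(numbers, workers=4):
    """Accumulate the sum of squares in shared memory across a pool."""
    counter = mp.Value("i", 0)  # shared C int with an associated lock
    with mp.Pool(processes=workers, initializer=init_worker,
                 initargs=(counter,)) as pool:
        pool.map(add_square, list(numbers))
    return counter.value
```

Called under an `if __name__ == "__main__":` guard, `sum_of_squares(range(1, 11))` returns 385. The lock matters because `+=` on a shared value is a read followed by a write, so unsynchronized workers could overwrite each other's updates. In practice, returning partial results from the workers and summing them in the parent usually avoids the lock contention entirely.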
Coconut is a functional programming language that compiles to Python. It works as a source-to-source compiler, translating high-level functional syntax into compatible Python code to preserve runtime compatibility. The language introduces a logic system for pattern matching and for destructuring complex data structures. It provides a mechanism for tail call optimization to prevent stack overflow errors during deep recursive calls and uses a lazy evaluation engine to defer computations until results are explicitly needed. The project includes support for algebraic data types, pipeline operators, and partial application. It also provides a framework for parallel data processing by distributing mapping operations across multiple CPU cores.
Improves computational performance by distributing mapping tasks across multiple CPU cores.