17 repositories
Mechanisms for executing multiple HTTP requests in parallel using capped pools to optimize throughput.
Distinct from HTTP Request Managers: Specifically covers parallel request pooling and batching, whereas the parent covers general HTTP parameter management.
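As an illustration of the capped-pool idea described above, here is a minimal Python sketch using only the standard library. The names `fetch_status`, `fetch_all`, and `max_concurrency` are hypothetical and not taken from any of the listed repositories; the pool size of 4 is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_status(url, timeout=10.0):
    """Fetch one URL and return its HTTP status code (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


def fetch_all(urls, fetch=fetch_status, max_concurrency=4):
    """Apply fetch to every URL with at most max_concurrency calls in flight.

    Results come back in the same order as urls. A failed request is
    returned as the exception it raised, so one error does not abort
    the rest of the batch.
    """
    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as exc:  # keep the batch going on any failure
            return exc

    # max_workers is the cap: no more than this many requests run at once.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(safe_fetch, urls))
```

The cap is the whole point of the pattern: raising `max_concurrency` increases throughput until the client, the network, or the remote server becomes the bottleneck, so the value is usually tuned per target rather than fixed.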
Guzzle is a PHP HTTP client used for sending synchronous and asynchronous requests to web services. It serves as a concurrent HTTP request manager, an HTTP stream handler, and a middleware-based HTTP pipeline. The project is a PSR-7 compliant client, utilizing standardized PHP interfaces for requests, responses, and streams. The library differentiates itself through a customizable functional handler stack that allows for the interception and modification of the request and response lifecycle. It features an adapter-based transport system that enables swapping between network implementations,
Manages multiple network requests in parallel using a capped pool to optimize total throughput.
This project is a comprehensive educational guide and framework for building web scrapers using Python. It provides a course-based approach to data extraction, combining a Python crawler framework with tutorials on web reverse engineering and network traffic analysis. The project distinguishes itself by covering advanced extraction challenges, including the decryption of obfuscated JavaScript and the bypass of anti-scraping measures. It specifically addresses mobile application scraping through the simulation of user interactions and the interception of network traffic. The capability surfac
Implements parallel HTTP request execution using capped pools to optimize data collection throughput.
Puma is a concurrent HTTP server for Ruby applications that implements the Rack interface. It operates as a clustered web server, using a combination of worker processes and threads to handle multiple simultaneous web connections via TCP ports or UNIX domain sockets. The server features a master-worker process model that utilizes multiple CPU cores and employs copy-on-write preloading to reduce memory usage. It supports zero-downtime restarts through socket-handover capabilities, allowing application updates without dropping pending network requests. The project includes a token-authenticate
Implements parallel HTTP request processing using a capped pool of worker threads and processes to maximize throughput.
This project is a JavaScript-based HTTP download manager and accelerator designed to increase file transfer rates within a web browser. It functions as a multi-threaded download accelerator that retrieves remote resources by splitting files into smaller segments for simultaneous downloading. The tool utilizes HTTP range requests to fetch multiple file segments in parallel, maximizing available network bandwidth. It manages this process through a client-side blob buffer and memory management system that stores binary data before assembling the segments into the final file. The system covers c
Implements capped pools for executing multiple HTTP requests in parallel to optimize throughput.
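The segmented-download approach in the entry above relies on HTTP range requests. The sketch below shows, in Python rather than the project's JavaScript, how a file of known size might be split into `Range` header values and how fetched segments could be reassembled in order. `split_ranges` and `assemble` are hypothetical names, and this is not the project's actual code.

```python
def split_ranges(total_size, segment_size):
    """Return Range header values covering total_size bytes.

    HTTP byte ranges are inclusive at both ends, so a 4-byte segment
    starting at offset 0 is written as "bytes=0-3".
    """
    if total_size < 0 or segment_size <= 0:
        raise ValueError("total_size must be >= 0 and segment_size > 0")
    ranges = []
    for start in range(0, total_size, segment_size):
        end = min(start + segment_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


def assemble(segments):
    """Join segments, given as {start_offset: bytes}, in offset order."""
    return b"".join(segments[start] for start in sorted(segments))


print(split_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each range would be sent as one request through a capped pool. Because segments can finish out of order, they are keyed by start offset and sorted before joining.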
Fetches multiple Instagram user profiles or media items in parallel using concurrency primitives.
Iron is a Rust web framework used to build concurrent web applications and APIs. It functions as a concurrent HTTP server and provides a stateless route dispatcher for mapping incoming URL paths and globs to specific handler functions. The project is centered on a middleware-based request pipeline, which allows the request-response cycle to be extended via plugins and modifiers. It uses a thread-safe state container to store shared application memory accessible across all concurrent handlers and middleware. The framework covers broad areas of functionality, including dynamic API routing, static file hosting, and web session state management. It also includes tools for logging HTTP traffic and parsing request bodies and URL parameters.
Processes multiple simultaneous HTTP requests using a concurrent architecture to ensure high scalability.
Faraday is an HTTP client library for Ruby that sends requests and processes responses through a middleware pipeline with pluggable adapters. Its core identity is built around a middleware-pipeline architecture where HTTP requests and responses flow through a chain of components that can modify, log, or transform data before reaching the backend, combined with an adapter-based backend abstraction that delegates HTTP execution to interchangeable backends like Net::HTTP or Typhoeus. The library distinguishes itself through a parallel-execution engine that dispatches multiple HTTP requests concu
Runs multiple HTTP requests concurrently to reduce total wait time.
Trafilatura is a Python library and command-line tool for extracting clean, structured text and metadata from web pages. It downloads HTML content, identifies the main body of text, and strips away navigation, ads, and other boilerplate, returning the core article content along with fields like title, author, date, and URL. The tool can also extract user comments and test whether a page contains extractable text, making it a general-purpose web text extraction library. What distinguishes Trafilatura from simpler extractors is its configurable extraction pipeline, which offers high-speed, high
Manages concurrent network requests with a configurable worker pool, deduplicating URLs and handling rate limits.
Hakrawler is a command-line web spider tool designed for security reconnaissance, built to crawl target websites and extract hyperlinks along with JavaScript file references. As a focused reconnaissance utility, it collects every discoverable URL and script source from a given domain, mapping the attack surface for penetration testing and vulnerability assessment. The tool differentiates itself through its concurrent architecture: a fixed-size goroutine pool fetches pages in parallel, while CSS selectors parse HTML to extract anchor and script references. A depth-aware recursion limiter preve
Ships a fixed-size goroutine pool that fetches pages in parallel, throttling concurrency for balanced speed and politeness.
resumable.js is a JavaScript library for managing large file uploads using the HTML5 File API. It functions as a chunked data transmitter and resumable upload manager, splitting files into smaller segments to ensure reliable delivery to a remote server. The library distinguishes itself by its ability to resume upload progress after network interruptions or browser restarts. It achieves this through resumable data transfer and server-side chunk verification, which checks whether segments already exist on the server to avoid redundant data transmission. The system manages simultaneous file uploads and request queuing to optimize bandwidth. It includes upload state control capabilities, allowing users to pause, resume, or cancel transfers, and event-driven progress tracking for individual files and batches. In addition, it provides primitives for binding file capture to HTML elements via browse buttons and drag-and-drop interfaces.
Manages a limited pool of simultaneous HTTP requests to optimize bandwidth and prevent server congestion.
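The chunk-verification step in the resumable.js entry above can be sketched as follows. This is a simplified Python illustration, not the library's JavaScript API: `plan_chunks`, `resume_upload`, `chunk_exists`, and `send_chunk` are hypothetical names, and the loop is sequential for clarity, whereas a real uploader would send chunks through a limited pool.

```python
def plan_chunks(file_size, chunk_size):
    """Return (index, start, end) tuples covering the file; end is exclusive."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (index, start, min(start + chunk_size, file_size))
        for index, start in enumerate(range(0, file_size, chunk_size))
    ]


def resume_upload(data, chunk_size, chunk_exists, send_chunk):
    """Send only the chunks the server does not already hold.

    chunk_exists(index) -> bool and send_chunk(index, payload) stand in
    for the two server calls; both are assumptions of this sketch.
    Returns the indexes that were actually transmitted.
    """
    sent = []
    for index, start, end in plan_chunks(len(data), chunk_size):
        if chunk_exists(index):
            continue  # already on the server, skip the redundant transfer
        send_chunk(index, data[start:end])
        sent.append(index)
    return sent
```

Checking each chunk before sending is what makes the transfer resumable: after an interruption, the same call can be repeated and only the missing segments go over the wire.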
Grequests is an asynchronous HTTP batcher and Gevent-based client library used to execute large sets of network requests simultaneously. It functions as a concurrent request wrapper for the Requests library, enabling non-blocking operations to reduce the total time spent waiting for server responses. The project provides a task-pool execution model to handle batch network operations, such as high-throughput web scraping and API polling. It can stream responses as they arrive via a generator, allowing for immediate data processing without waiting for the entire batch to complete. The library
Sends multiple network requests at once using Gevent and capped pools to complete large workloads faster.
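The entry above mentions streaming responses through a generator as they arrive. The sketch below shows the same idea with Python's standard library threads instead of Gevent, so it does not use or reproduce the Grequests API. `stream_results` and its `fetch` parameter are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def stream_results(urls, fetch, max_concurrency=4):
    """Yield (url, result, error) as each request finishes.

    At most max_concurrency calls to fetch run at once. Results are
    yielded in completion order, not in the order of urls, so callers
    can start processing before the whole batch is done.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # Map each future back to its URL so results can be labeled.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                yield url, future.result(), None
            except Exception as exc:
                yield url, None, exc
```

A caller would iterate with `for url, result, error in stream_results(urls, fetch): ...` and handle each response immediately, which keeps memory use flat for large batches.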
Waybackurls is a command-line OSINT tool that retrieves every known URL for a given domain from the Wayback Machine archive. It functions as a domain reconnaissance utility, discovering forgotten API endpoints, legacy pages, and hidden files by querying the public web archive API. The tool processes domains independently and statelessly, reading domain names from standard input and streaming discovered URLs line-by-line to standard output. This design enables seamless integration into Unix command pipelines, allowing users to chain waybackurls with other tools for filtering, sorting, and furt
Fetches archive data by managing a pool of parallel HTTP requests to the Wayback Machine API for faster throughput.
Typhoeus is a Ruby wrapper for libcurl that functions as a session-based HTTP client. It provides an interface for making both synchronous and asynchronous network requests. The project acts as a parallel request manager, using a managed queue to execute multiple network requests concurrently. It further distinguishes itself as a mocking tool for stubbing requests with predefined responses and as a caching layer that stores responses to avoid redundant network calls. The library covers a broad range of capabilities including session cookie management, response body streaming for large files,
Executes multiple network requests concurrently using a managed queue to optimize throughput.
ShuiZe_0x727 is an open-source intelligence gathering framework and attack surface management tool. It functions as an asset discovery engine and cyber intelligence aggregator designed to identify internet-facing assets, map network infrastructure, and visualize total network exposure. The project integrates vulnerability scanning and sensitive data leak detection to identify security weaknesses and unauthorized access points. It employs a combination of network space API queries, certificate log analysis, and public repository scanning to extract leaked credentials, API keys, and internal ad
Implements a concurrent request pipeline using pooled HTTP and DNS queries to accelerate network mapping.
Ali is an HTTP load testing tool and traffic generator used to measure system performance and stability by sending high volumes of requests to a target URL. It functions as a performance metrics exporter and real-time visualizer, simulating client behavior through configurable request bodies, custom headers, and HTTP/2 support. The project provides real-time performance monitoring via interactive charts that plot latency and request percentiles as data is collected. It identifies system bottlenecks through a combination of live performance plotting and the exportation of raw data points and a
Spawns multiple simultaneous workers to send high-volume HTTP requests for performance testing.
JobSpy is a job board scraper and listing aggregator designed to extract employment opportunities from multiple websites and compile them into a unified dataset. It functions as a job search automation tool that programmatically collects vacancies based on keywords, locations, and specific filters. The project serves as a web scraping framework that utilizes proxy routing and user-agent rotation to bypass rate limits and avoid server-side blocking during data extraction. It includes infrastructure for concurrent request aggregation and schema-based data normalization to ensure consistent form
Implements parallel HTTP request execution using capped pools to optimize data collection throughput.
Distributes hundreds of simultaneous HTTP requests across a thread pool to rapidly verify profile existence.