ProxyPool is a proxy pool manager that automatically collects, validates, and serves HTTP proxies from multiple sources through a web API. At its core, it runs scheduled background processes that scrape free and paid proxy websites, test each proxy's availability against configurable target URLs using asynchronous HTTP clients, and store the results in a Redis-backed sorted set where proxies are scored and ranked by reliability.
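The asynchronous validation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual tester: the `validate_all` function, the `check` callback signature, and the `concurrency` default are all assumptions made for the example. A real `check` would wrap an asynchronous HTTP client request routed through the proxy.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Tuple

# Hypothetical callback type: a coroutine that returns True when the
# proxy successfully reached the target URL. Not part of ProxyPool's API.
Check = Callable[[str, str], Awaitable[bool]]


async def validate_all(
    proxies: Iterable[str],
    target_url: str,
    check: Check,
    concurrency: int = 50,  # assumed default, chosen for illustration
) -> Dict[str, bool]:
    """Test every proxy concurrently and map each one to pass/fail."""
    semaphore = asyncio.Semaphore(concurrency)

    async def validate_one(proxy: str) -> Tuple[str, bool]:
        # The semaphore caps how many checks are in flight at once.
        async with semaphore:
            try:
                ok = await check(proxy, target_url)
            except Exception:
                # Timeouts and connection errors count as a failed check.
                ok = False
        return proxy, ok

    results = await asyncio.gather(*(validate_one(p) for p in proxies))
    return dict(results)
```

Passing the check in as a callback keeps the concurrency logic independent of any particular HTTP library, which also makes it easy to exercise with a fake check.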
ProxyPool's pluggable crawler architecture lets users add a new proxy source by writing a small class that declares its target URLs and a parsing method that extracts host and port pairs. A lightweight HTTP API endpoint returns a random high-scoring proxy, so client applications can fetch a verified proxy without direct database access. Collection, testing, and the API server run as separate processes; collection and testing repeat on configurable intervals, so the pool stays refreshed without manual intervention.
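A crawler plugin of the kind described above might look like the following sketch. The names `BaseCrawler`, `ExampleTableCrawler`, `fetch`, `parse`, and `crawl`, as well as the example URL, are hypothetical and chosen only for illustration; the project's real base class and method names may differ.

```python
import re
from typing import Iterator, List, Tuple


class BaseCrawler:
    """Hypothetical base class: subclasses set `urls` and implement `parse`."""

    urls: List[str] = []

    def fetch(self, url: str) -> str:
        # Placeholder: a real implementation would perform an HTTP GET here.
        raise NotImplementedError

    def parse(self, html: str) -> Iterator[Tuple[str, int]]:
        # Subclasses yield (host, port) pairs extracted from the page.
        raise NotImplementedError

    def crawl(self) -> Iterator[Tuple[str, int]]:
        # Fetch each target URL in turn and yield every proxy found on it.
        for url in self.urls:
            yield from self.parse(self.fetch(url))


class ExampleTableCrawler(BaseCrawler):
    """Illustrative source that lists proxies as plain `host:port` text."""

    urls = ["https://proxy-list.example/page/1"]  # made-up URL
    _pattern = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})")

    def parse(self, html: str) -> Iterator[Tuple[str, int]]:
        for host, port in self._pattern.findall(html):
            yield host, int(port)
```

With this shape, adding a source means writing only the `urls` list and the `parse` method; fetching and iteration stay in the shared base class.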
In summary, the project covers three areas: automated proxy collection from multiple sources, asynchronous health verification, and a sorted-set scoring model in which a proxy gains points on each successful verification and loses points on each failure. The API layer exposes a single endpoint for fetching a random working proxy, while the background scheduler manages the lifecycle of collection, testing, and serving.
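The scoring model can be illustrated with a small in-memory stand-in for the Redis sorted set. Everything specific here is an assumption made for the sketch: the `ScoredPool` class, the initial, minimum, and maximum scores, the one-point step, and the rule that a proxy is evicted once its score reaches the minimum. The source states only that scores rise on success and fall on failure.

```python
import random
from typing import Dict, Optional

# Assumed constants for illustration; the project's real values may differ.
INITIAL_SCORE = 10
MIN_SCORE = 0
MAX_SCORE = 100


class ScoredPool:
    """In-memory stand-in for a sorted set mapping proxy -> score."""

    def __init__(self) -> None:
        self._scores: Dict[str, int] = {}

    def add(self, proxy: str) -> None:
        # Newly collected proxies start at the initial score; known ones keep theirs.
        self._scores.setdefault(proxy, INITIAL_SCORE)

    def score(self, proxy: str) -> Optional[int]:
        return self._scores.get(proxy)

    def record_success(self, proxy: str) -> None:
        current = self._scores.get(proxy, INITIAL_SCORE)
        self._scores[proxy] = min(MAX_SCORE, current + 1)

    def record_failure(self, proxy: str) -> None:
        new_score = self._scores.get(proxy, INITIAL_SCORE) - 1
        if new_score <= MIN_SCORE:
            # Assumed policy: drop proxies that have exhausted their score.
            self._scores.pop(proxy, None)
        else:
            self._scores[proxy] = new_score

    def random_best(self) -> Optional[str]:
        """Return a random proxy among those sharing the highest score."""
        if not self._scores:
            return None
        best = max(self._scores.values())
        return random.choice([p for p, s in self._scores.items() if s == best])
```

In a Redis-backed version, the same operations would map naturally onto sorted-set commands for adding members, incrementing scores, and querying by score range.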