# python3webspider/proxypool

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/python3webspider-proxypool).**

6,223 stars · 2,235 forks · Python · MIT

## Links

- GitHub: https://github.com/Python3WebSpider/ProxyPool
- Homepage: https://proxypool.scrape.center
- awesome-repositories: https://awesome-repositories.com/repository/python3webspider-proxypool.md

## Topics

`flask` `http` `proxy` `proxypool` `redis` `webspider`

## Description

ProxyPool is a proxy pool manager that automatically collects, validates, and serves HTTP proxies from multiple sources through a web API. Scheduled background processes scrape free and paid proxy websites and test each proxy's availability against configurable target URLs using asynchronous HTTP clients. The results are stored in a Redis sorted set, where each proxy carries a score that ranks it by reliability.
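The asynchronous testing step described above can be sketched with `asyncio`. This is a minimal illustration, not the project's code: `check_all`, `fake_probe`, and the concurrency limit are hypothetical names and values, and the probe is a stand-in for a real HTTP request sent through each proxy to a target URL.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable

# A probe takes "host:port" and reports whether the proxy answered.
# In a real tester this would issue an HTTP request through the proxy.
Probe = Callable[[str], Awaitable[bool]]


async def check_all(
    proxies: Iterable[str], probe: Probe, concurrency: int = 50
) -> Dict[str, bool]:
    """Run the probe against every proxy concurrently, capped by a semaphore."""
    limit = asyncio.Semaphore(concurrency)

    async def check_one(proxy: str):
        async with limit:
            try:
                return proxy, await probe(proxy)
            except Exception:
                # Any error (timeout, refused connection) counts as a failure.
                return proxy, False

    results = await asyncio.gather(*(check_one(p) for p in proxies))
    return dict(results)


async def fake_probe(proxy: str) -> bool:
    """Hypothetical stand-in probe so the sketch runs without network access."""
    await asyncio.sleep(0)
    if proxy.endswith(":1"):
        raise ConnectionError("unreachable")
    return proxy.endswith(":8080")


results = asyncio.run(
    check_all(["10.0.0.1:8080", "10.0.0.2:3128", "10.0.0.3:1"], fake_probe)
)
print(results)
```

Passing the probe in as a parameter keeps the concurrency logic separate from the HTTP client, so the same loop can be exercised in tests without touching the network.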

New proxy sources plug in through a crawler architecture: a user adds a source by writing a simple class that defines its target URLs and a parsing method that extracts host and port pairs. A lightweight HTTP API endpoint returns a random high-scoring proxy, so client applications can fetch a verified proxy without direct database access. The collector, tester, and API server run as separate processes on configurable intervals, so the pool is refreshed continuously without manual intervention.
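A crawler of the kind described above might look like the following sketch. The names `BaseCrawler`, `ExampleTableCrawler`, `Proxy`, and `parse`, the placeholder URL, and the sample HTML are all assumptions made for illustration; the project's actual base class and method signatures may differ.

```python
import re
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Proxy:
    host: str
    port: int

    def __str__(self) -> str:
        return f"{self.host}:{self.port}"


class BaseCrawler:
    """Hypothetical base class: subclasses list URLs and implement parse()."""

    urls: List[str] = []

    def parse(self, html: str) -> Iterator[Proxy]:
        raise NotImplementedError


class ExampleTableCrawler(BaseCrawler):
    # Placeholder URL, not a real proxy source.
    urls = ["https://proxy-source.example/list"]

    # Matches an IPv4-looking cell followed by a port cell.
    _ROW = re.compile(r"<td>(\d{1,3}(?:\.\d{1,3}){3})</td>\s*<td>(\d{2,5})</td>")

    def parse(self, html: str) -> Iterator[Proxy]:
        for host, port in self._ROW.findall(html):
            yield Proxy(host, int(port))


SAMPLE_HTML = """
<table>
<tr><td>10.0.0.1</td><td>8080</td></tr>
<tr><td>10.0.0.2</td><td>3128</td></tr>
</table>
"""

found = [str(p) for p in ExampleTableCrawler().parse(SAMPLE_HTML)]
print(found)
```

Here the parser is fed a canned HTML string; in practice a collector would download each entry in `urls` and pass the response body to `parse`.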

Scoring ties these parts together. The asynchronous tester raises a proxy's score in the sorted set after a successful verification and lowers it after a failure, and the API's single endpoint draws a random working proxy from the pool. The background scheduler manages the whole lifecycle of collection, testing, and serving.
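The scoring model can be illustrated with a small in-memory stand-in for the Redis sorted set. Everything specific in this sketch is an assumption: the class `ScoredPool`, the initial score, the bounds, and the step sizes are hypothetical values chosen for the example, and a plain dictionary replaces Redis.

```python
import random
from typing import Dict

# Hypothetical values; the listing does not state the real ones.
SCORE_INIT, SCORE_MIN, SCORE_MAX = 10, 0, 100
SUCCESS_STEP, FAILURE_STEP = 10, 1


class ScoredPool:
    """Dictionary-backed stand-in for a sorted set of proxy -> score."""

    def __init__(self) -> None:
        self._scores: Dict[str, int] = {}

    def add(self, proxy: str) -> None:
        # Keep the existing score if the proxy is already known.
        self._scores.setdefault(proxy, SCORE_INIT)

    def mark_success(self, proxy: str) -> int:
        score = min(SCORE_MAX, self._scores[proxy] + SUCCESS_STEP)
        self._scores[proxy] = score
        return score

    def mark_failure(self, proxy: str) -> int:
        score = self._scores[proxy] - FAILURE_STEP
        if score <= SCORE_MIN:
            # Drop proxies that have failed too often.
            del self._scores[proxy]
        else:
            self._scores[proxy] = score
        return score

    def random_best(self) -> str:
        if not self._scores:
            raise LookupError("pool is empty")
        best = max(self._scores.values())
        return random.choice([p for p, s in self._scores.items() if s == best])


pool = ScoredPool()
pool.add("10.0.0.1:8080")
pool.add("10.0.0.2:3128")
good_score = pool.mark_success("10.0.0.1:8080")
bad_score = pool.mark_failure("10.0.0.2:3128")
choice = pool.random_best()
print(good_score, bad_score, choice)
```

With Redis, the same operations would map onto sorted-set commands that add members, adjust scores, and select members by score range.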

## Tags

### Networking & Communication

- [Proxy Pool Automation](https://awesome-repositories.com/f/networking-communication/proxy-pool-automation.md) — Automates the entire lifecycle of discovering, validating, and maintaining a pool of proxy servers via a web API.
- [Single Proxy Fetches](https://awesome-repositories.com/f/networking-communication/direct-url-retrievals/url-dataset-fetches/proxy-configuration-fetches/single-proxy-fetches.md) — Provides a dedicated endpoint to fetch a single validated proxy from the pool for immediate use. ([source](https://cdn.jsdelivr.net/gh/python3webspider/proxypool@master/README.md))
- [Proxy API Servers](https://awesome-repositories.com/f/networking-communication/http-proxies/proxy-api-servers.md) — Exposes a lightweight HTTP API endpoint that returns a random high-scoring proxy for client applications.
- [Proxy Directory APIs](https://awesome-repositories.com/f/networking-communication/proxy-directory-apis.md) — Exposes a lightweight HTTP endpoint that returns a random high-scoring proxy for client consumption without database access. ([source](https://cuiqingcai.com/7048.html))
- [Proxy List APIs](https://awesome-repositories.com/f/networking-communication/proxy-list-apis.md) — Provides a lightweight web endpoint that returns a single verified proxy from the pool without direct database access. ([source](https://cuiqingcai.com/7048.html))
- [Asynchronous Proxy Testers](https://awesome-repositories.com/f/networking-communication/proxy-node-aggregators/proxy-node-performance-testers/asynchronous-proxy-testers.md) — Provides an asynchronous proxy tester that validates proxy availability against configurable target URLs.
- [Proxy Source Aggregators](https://awesome-repositories.com/f/networking-communication/proxy-source-aggregators.md) — Combines diverse proxy links from multiple sources into a single stream using pluggable crawler methods. ([source](https://cuiqingcai.com/7048.html))
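
On the client side, a proxy fetched from the API endpoint has to be turned into something an HTTP library accepts. The sketch below assumes the endpoint returns a plain `host:port` string, which this listing does not confirm; `to_proxy_mapping` is a hypothetical helper name.

```python
from typing import Dict


def to_proxy_mapping(api_response_text: str) -> Dict[str, str]:
    """Convert an assumed 'host:port' response into a scheme -> proxy URL map."""
    host, sep, port = api_response_text.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"unexpected proxy format: {api_response_text!r}")
    url = f"http://{host}:{port}"
    # Route both plain and TLS traffic through the same HTTP proxy.
    return {"http": url, "https": url}


mapping = to_proxy_mapping("10.0.0.1:8080\n")
print(mapping)
```

A mapping of this shape is the form accepted by `urllib.request.ProxyHandler` and by the `proxies` argument of the `requests` library.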

### Part of an Awesome List

- [Proxy Availability Checks](https://awesome-repositories.com/f/awesome-lists/devops/tasks-and-scheduling/proxy-availability-checks.md) — Automates the verification of proxy availability against a target URL using an async HTTP client and updates scores based on response status. ([source](https://cuiqingcai.com/7048.html))

### Development Tools & Productivity

- [Pool Refresh Schedulers](https://awesome-repositories.com/f/development-tools-productivity/background-task-schedulers/continuous-process-executors/pool-refresh-schedulers.md) — Runs collector, tester, and API server as separate background processes on configurable intervals for continuous pool refresh.

### DevOps & Infrastructure

- [Reliability Scoring Systems](https://awesome-repositories.com/f/devops-infrastructure/devops/operational-reliability/performance-tuning/proxy/reliability-scoring-systems.md) — Assigns numeric scores to proxies, raising them on successful verification and lowering them on failure for priority selection. ([source](https://cuiqingcai.com/7048.html))
- [Proxy Health Verifications](https://awesome-repositories.com/f/devops-infrastructure/infrastructure/operational-observability-access/service-health-monitoring/dependency-health-verification/proxy-health-verifications.md) — Asynchronously tests proxy availability against a target URL and updates reliability scores based on response success or failure.
- [Redis Sorted Set Priority Ordering](https://awesome-repositories.com/f/devops-infrastructure/job-priority-management/redis-sorted-set-priority-ordering.md) — Stores proxies in a Redis sorted set with numeric scores that increase on success and decrease on failure for priority-based selection.
- [Proxy Retrieval Endpoints](https://awesome-repositories.com/f/devops-infrastructure/rest-api-endpoint-management/single-endpoint-apis/proxy-retrieval-endpoints.md) — Exposes a single endpoint that returns a random high-scoring proxy for client consumption without database access.

### Software Engineering & Architecture

- [Pluggable Scraper Architectures](https://awesome-repositories.com/f/software-engineering-architecture/pluggable-scraper-architectures.md) — Implements a pluggable scraper architecture where new proxy sources are added via a common registration interface.
- [Independent Process Launches](https://awesome-repositories.com/f/software-engineering-architecture/process-separation/independent-process-launches.md) — Launches scraper, verifier, and web server in independent background processes on separate schedules for automated pool maintenance. ([source](https://cuiqingcai.com/7048.html))
- [Proxy Crawler Frameworks](https://awesome-repositories.com/f/software-engineering-architecture/proxy-crawler-frameworks.md) — Ships a pluggable crawler framework that allows custom classes to extract proxy IP and port pairs from any target website.

### Testing & Quality Assurance

- [Proxy Verification Pipelines](https://awesome-repositories.com/f/testing-quality-assurance/http-request-clients/async-http-test-clients/proxy-verification-pipelines.md) — Provides an async pipeline that verifies proxy availability and updates scores based on response success or failure.

### Web Development

- [Crawler Extensions](https://awesome-repositories.com/f/web-development/crawler-extensions.md) — Allows adding new proxy sources by writing a class that defines target URLs and a parse method to extract host and port. ([source](https://cdn.jsdelivr.net/gh/python3webspider/proxypool@master/README.md))
