# wzdnzd/aggregator

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/wzdnzd-aggregator).**

6,689 stars · 5,479 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/wzdnzd/aggregator
- Homepage: https://github.com/wzdnzd/aggregator
- awesome-repositories: https://awesome-repositories.com/repository/wzdnzd-aggregator.md

## Topics

`proxypool`

## Description

This project is a proxy aggregation platform that collects and verifies free proxy server lists from web platforms, social media, and public repositories. It combines three roles: a crawler framework that gathers proxy data and subscription links, a validation tool that tests server liveness, and a synchronization service that distributes the results.

The system uses a plugin-based architecture that allows for the integration of custom Python scripts to handle diverse web source structures. It also includes utilities to transform raw proxy data into standardized configuration formats compatible with various client applications.
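As a rough illustration of how a script-based plugin can be loaded, the sketch below imports a Python file by path and calls one function from it. The names `load_plugin`, `crawl`, and `demo_source.py` are hypothetical and are not taken from the project; its actual plugin interface may differ.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_plugin(path, func_name="crawl"):
    """Import a Python file by path and return one callable defined in it."""
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    func = getattr(module, func_name, None)
    if not callable(func):
        raise AttributeError(f"{path} does not define a callable {func_name!r}")
    return func


# Hypothetical plugin body: a real one would fetch and parse a web source.
PLUGIN_SOURCE = '''
def crawl(params):
    return [f"https://example.com/sub?token={params['token']}"]
'''

with tempfile.TemporaryDirectory() as tmp:
    plugin_path = Path(tmp) / "demo_source.py"
    plugin_path.write_text(PLUGIN_SOURCE)
    crawl = load_plugin(plugin_path)
    links = crawl({"token": "abc"})

print(links)
```

Loading by file path keeps each source's parsing logic in its own script, so supporting a new site means adding a file rather than changing the crawler core.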

Core capabilities include automated account registration, which uses custom email patterns and coupons to obtain provider access, and regular-expression filtering, which removes noise and spam from collected lists. Multi-threaded batch processing increases the throughput of data collection and connectivity probes.
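The filtering and threaded probing described above can be sketched as follows. This is a minimal example under stated assumptions, not the project's implementation: the noise pattern, the dictionary fields (`name`, `server`, `port`), and the plain TCP connect used as a liveness check are all illustrative choices.

```python
import re
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical noise pattern: drops entries whose names are notices, not servers.
NOISE = re.compile(r"expire|t\.me", re.IGNORECASE)


def filter_names(proxies, pattern=NOISE):
    """Keep only entries whose name does not match the noise pattern."""
    return [p for p in proxies if not pattern.search(p["name"])]


def is_alive(proxy, timeout=3.0):
    """Assumed liveness check: can a TCP connection be opened to the server?"""
    try:
        with socket.create_connection((proxy["server"], proxy["port"]), timeout=timeout):
            return True
    except OSError:
        return False


def probe_all(proxies, check=is_alive, workers=32):
    """Run the check for every proxy on a thread pool and keep the passing ones."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(check, proxies))
    return [p for p, ok in zip(proxies, results) if ok]


sample = [
    {"name": "HK 01", "server": "198.51.100.1", "port": 443},
    {"name": "Expire: 2030-01-01", "server": "198.51.100.9", "port": 443},
    {"name": "JP 02", "server": "198.51.100.2", "port": 8080},
    {"name": "Join t.me/example", "server": "198.51.100.9", "port": 443},
]
clean = filter_names(sample)
# Stand-in check so the example runs offline; pass nothing to use is_alive.
alive = probe_all(clean, check=lambda p: p["port"] == 443)
print([p["name"] for p in clean], [p["name"] for p in alive])
```

Threads suit this workload because each probe spends most of its time waiting on the network, so many checks can overlap.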

Aggregated proxy lists can be synchronized and saved to external cloud storage and public pastebins for remote access and distribution.
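One way such a push could look, assuming a GitHub gist as the storage backend, is sketched below. It only builds the HTTP request for the GitHub REST API's gist update endpoint and does not send it. `build_gist_update`, `GIST_ID`, `TOKEN`, and `proxies.txt` are placeholders, and this listing does not state which endpoints the project actually calls.

```python
import json
import urllib.request


def build_gist_update(gist_id, filename, content, token):
    """Build a PATCH request that replaces one file's content in a gist."""
    payload = json.dumps({"files": {filename: {"content": content}}}).encode("utf-8")
    return urllib.request.Request(
        f"https://api.github.com/gists/{gist_id}",
        data=payload,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; a real run needs an existing gist and a token with gist scope.
request = build_gist_update("GIST_ID", "proxies.txt", "line1\nline2\n", "TOKEN")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
print(request.get_method(), request.full_url)
```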

## Tags

### Part of an Awesome List

- [Proxy Node Aggregators](https://awesome-repositories.com/f/awesome-lists/data/proxy-node-aggregators.md) — Crawls web platforms, social media feeds, and public repositories to automatically gather lists of free proxy servers. ([source](https://github.com/wzdnzd/aggregator/blob/main/README.md))
- [Script-Based Extensions](https://awesome-repositories.com/f/awesome-lists/devtools/python-crawling-frameworks/script-based-extensions.md) — Integrates Python scripts as plugins to implement specialized crawling logic for unique web sources. ([source](https://github.com/wzdnzd/aggregator/blob/main/README_EN.md))
- [Pattern-Based Result Filtering](https://awesome-repositories.com/f/awesome-lists/data/regular-expressions/pattern-based-result-filtering.md) — Uses regular expressions to filter proxy names and links, removing spam and noise from collected lists. ([source](https://github.com/wzdnzd/aggregator/blob/main/README_EN.md))

### Networking & Communication

- [Connectivity Verifiers](https://awesome-repositories.com/f/networking-communication/connectivity-verifiers.md) — Implements real-time network probes to verify server liveness and traffic performance for the proxy pool.
- [Proxy Connectivity Testing](https://awesome-repositories.com/f/networking-communication/proxy-connectivity-testing.md) — Tests the activity and performance of collected proxies to ensure only functional servers are retained. ([source](https://github.com/wzdnzd/aggregator/blob/main/.gitmodules))
- [Proxy Connectivity Checkers](https://awesome-repositories.com/f/networking-communication/proxy-managers/proxy-connectivity-checkers.md) — Provides utilities to verify the active status and traffic routing of collected proxy servers.
- [Proxy Node Aggregators](https://awesome-repositories.com/f/networking-communication/proxy-node-aggregators.md) — Crawls web platforms and social media to gather and maintain a comprehensive pool of free proxy servers.
- [Subscription Link Crawlers](https://awesome-repositories.com/f/networking-communication/proxy-servers/clash-configuration-managers/subscription-link-crawlers.md) — Crawls social media, search engines, and public repositories to identify and aggregate proxy server subscription links. ([source](https://github.com/wzdnzd/aggregator/blob/main/README_EN.md))
- [Format Converters](https://awesome-repositories.com/f/networking-communication/proxy-configuration-formats/format-converters.md) — Transforms raw proxy data into standardized configuration formats compatible with different client applications.

### Software Engineering & Architecture

- [Source Logic Integration](https://awesome-repositories.com/f/software-engineering-architecture/crawl-logic-orchestration/source-logic-integration.md) — Integrates custom plugins into the crawling architecture to support new sources and collection rules. ([source](https://github.com/wzdnzd/aggregator))
- [Plugin-Based Logic Extensions](https://awesome-repositories.com/f/software-engineering-architecture/plugin-based-logic-extensions.md) — Uses a plugin-based architecture to load custom Python scripts for handling diverse web source structures.
- [Proxy Crawler Frameworks](https://awesome-repositories.com/f/software-engineering-architecture/proxy-crawler-frameworks.md) — Implements a plugin-based architecture using Python scripts to automate proxy gathering from diverse web sources.
- [Multi-Threaded Request Handling](https://awesome-repositories.com/f/software-engineering-architecture/high-throughput-task-processing/network-request-processing/multi-threaded-request-handling.md) — Utilizes multi-threaded execution to increase the throughput of high-volume proxy data collection and validation.
- [Multi-Threaded Batch Processing](https://awesome-repositories.com/f/software-engineering-architecture/multi-threaded-batch-processing.md) — Utilizes multi-threaded execution to increase the throughput of high-volume proxy data collection and connectivity probes. ([source](https://github.com/wzdnzd/aggregator))

### Content Management & Publishing

- [Regex Content Filtering](https://awesome-repositories.com/f/content-management-publishing/community-content-feeds/feed-content-filtering/regex-content-filtering.md) — Applies regular expression patterns to cleanse raw crawled data and remove noise from proxy lists.
- [Proxy Configuration Converters](https://awesome-repositories.com/f/content-management-publishing/content-processing-transformation/document-processing-conversion/proxy-configuration-converters.md) — Transforms raw proxy data into standardized configuration formats compatible with various client applications. ([source](https://github.com/wzdnzd/aggregator/blob/main/README_EN.md))

### Data & Databases

- [Proxy Address Storage](https://awesome-repositories.com/f/data-databases/proxy-address-storage.md) — Syncs aggregated proxy lists to remote cloud services and external storage for distribution.
- [Remote Synchronization](https://awesome-repositories.com/f/data-databases/proxy-address-storage/remote-synchronization.md) — Pushes aggregated proxy lists to external cloud services and pastebins for remote access and distribution.
- [Remote Distribution Backends](https://awesome-repositories.com/f/data-databases/remote-distribution-backends.md) — Saves aggregated proxy lists to external storage services like gists or pastebins for remote access. ([source](https://github.com/wzdnzd/aggregator))

### DevOps & Infrastructure

- [Remote Server Synchronization](https://awesome-repositories.com/f/devops-infrastructure/remote-server-synchronization.md) — Pushes aggregated proxy results to external cloud services and pastebins via API-driven exports.

### Security & Cryptography

- [Automated Account Registrations](https://awesome-repositories.com/f/security-cryptography/automated-account-registrations.md) — Provides automated registration of provider accounts using custom email patterns and coupons to secure proxy access.
