30 open-source projects similar to proxifly/free-proxy-list, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best Free Proxy List alternative.
ProxyPool is a proxy pool manager that automatically collects, validates, and serves HTTP proxies from multiple sources through a web API. At its core, it runs scheduled background processes that scrape free and paid proxy websites, test each proxy's availability against configurable target URLs using asynchronous HTTP clients, and store the results in a Redis-backed sorted set where proxies are scored and ranked by reliability. The system distinguishes itself through a pluggable crawler architecture that allows users to add new proxy sources by writing a simple class with target URLs and a p
This project is a Python-based proxy pool manager that collects, validates, and serves free proxy IP addresses through an HTTP API. It consists of an automated scraper to gather addresses from multiple online sources, a persistent database-backed store for organization, and a delivery interface for retrieving validated proxies. The system features a pluggable scraper architecture that allows for the integration of custom discovery methods and source expansion via generator functions. It employs decorator-based validation logic, enabling the definition of custom connectivity and HTTPS criteria
PROXY-List is a public proxy aggregator that provides data structures for storing and aggregating publicly available HTTP and SOCKS proxy server addresses. It serves as a source for retrieving network traffic routing lists used to mask origin IP addresses during web requests. The project utilizes a data pipeline to automatically scrape, poll, and serialize proxy lists from multiple public websites. This infrastructure ensures the availability of active servers through scheduled periodic polling and automated content refreshes, delivering the resulting lists as plain text files. These capabil
NoMoreWalls is a proxy subscription aggregator and configuration generator designed to bypass internet censorship. It functions as a tool for collecting public proxy nodes and compiling them into subscription files compatible with the Clash network core. The project differentiates itself by providing a local system for fetching, filtering, and provisioning curated proxy server configurations from multiple sources. It includes mechanisms to exclude specific proxy protocols and limit imported nodes to improve network stability and performance. The system manages network traffic routing through
Crawlee-python is a web crawling framework for building scalable scrapers using Python. It serves as a comprehensive tool for web scraping automation, providing a system to extract structured data from websites using both lightweight HTTP requests and headless browser automation. The framework is distinguished by its anti-bot evasion capabilities, which include browser fingerprint impersonation and tiered proxy rotation to bypass detection systems and solve challenges such as Cloudflare. It also incorporates artificial intelligence for autonomous website navigation and schema-based data extra
IPProxyPool is an HTTP proxy pool manager that crawls, validates, and serves a rotating list of functional proxy addresses via a programmatic API. It integrates proxy scraping, connectivity validation, and persistent database storage to provide a managed source of IP addresses for network requests. The system uses a plugin-based scraping architecture to collect IP addresses from multiple external websites and an asynchronous validation queue to test these candidates in parallel. It differentiates its pool by assigning numeric stability scores to proxies through periodic health checks and conn
CyberScraper-2077 is an AI-powered web scraping tool that uses large language models to extract and structure data from websites into organized formats. It functions as an LLM web scraper and AI content parser, transforming unstructured raw web text into specific data schemas. The project distinguishes itself through a suite of anonymity and evasion tools, including proxy rotation, SOCKS-based identity masking, and the ability to route traffic through the Tor network to access hidden onion services. It further includes a bot detection bypass system that employs stealth parameters and custom n
ProxyBroker is a tool for scraping public HTTP and SOCKS proxy addresses, validating their connectivity, and managing a curated pool of functional proxies. It consists of a proxy scraper for discovery, a validation engine to check anonymity and response times, and a pool manager to maintain a filtered queue of servers. The project includes a local rotating proxy server that acts as a single entry point, automatically distributing incoming network traffic across a pool of validated external proxies. This infrastructure allows for the rotation of IP addresses to maintain resilience during web d
node-crawler is a programmable web crawler for Node.js that manages request queues and automates data extraction. It functions as a rate-limited HTTP client and a headless HTML parser, providing the infrastructure to visit large sets of URLs asynchronously while preventing duplicate processing through task deduplication. The project distinguishes itself through a proxy rotation manager that cycles user agents and proxy servers to bypass access restrictions. It utilizes the HTTP/2 protocol to improve request performance and server compatibility during large-scale scraping operations. The syst
This project is a specialized TikTok API scraper and data extractor. It functions as a proxy-based web scraper designed to collect user metadata, video posts, and trend feeds, while providing a webhook data pipeline to route scraped information to external URLs via HTTP requests. The tool includes a watermark-free video downloader that saves high-definition content to local storage. It employs cryptographic request signing for server authentication and utilizes session cookie authentication combined with proxy rotation to manage network traffic and avoid rate limits. Capabilities cover bulk
Sub-Store is a proxy subscription management server that aggregates multiple subscription links into a single unified stream for distribution to various clients. It functions as a transformation pipeline that filters, modifies, and reformats proxy node metadata. The system acts as a cross-platform format converter to ensure compatibility across diverse client applications. It includes an encryption/decryption gateway that uses private keys to handle age-encrypted subscription content and a cache-layered aggregator to reduce external requests. The server provides capabilities for dyn
Crawlee is a web scraping framework designed for building scalable, reliable, and distributed data extraction pipelines. It provides a unified interface for managing headless browser automation and lightweight HTTP requests, allowing developers to handle complex web navigation, dynamic content rendering, and large-scale data collection within a single, modular architecture. The project distinguishes itself through its resource-aware concurrency controller, which dynamically scales task execution based on real-time CPU and memory usage to prevent host machine exhaustion. It also features a rob
Pholcus is a distributed web crawling system designed for large-scale data scraping. It employs a master-worker distribution model to coordinate high-concurrency scraping tasks across a network of remote client nodes, enabling both horizontal and vertical data collection. The system features a hot-loadable rule engine that allows extraction and navigation logic to be updated at runtime without restarting the process. It handles dynamic content through headless browser integration and bypasses bot detection using proxy rotation, automated user authentication, and simulated human behavior. The
cloudscraper is a Python library designed to bypass Cloudflare anti-bot protections by resolving JavaScript challenges and mimicking browser fingerprints. It functions as a specialized tool for accessing websites that employ automated security systems to block scripts and headless browsers. The project differentiates itself through the use of interchangeable JavaScript runtimes, such as Node.js or V8, to execute challenge code and obtain security clearance tokens. It employs a fingerprint rotation engine and HTTP request emulation to rotate browser headers and device identifiers, mimicking hu
JobSpy is a job board scraper and listing aggregator designed to extract employment opportunities from multiple websites and compile them into a unified dataset. It functions as a job search automation tool that programmatically collects vacancies based on keywords, locations, and specific filters. The project serves as a web scraping framework that utilizes proxy routing and user-agent rotation to bypass rate limits and avoid server-side blocking during data extraction. It includes infrastructure for concurrent request aggregation and schema-based data normalization to ensure consistent form
AutoMergePublicNodes is a system for scraping public proxy nodes from across the internet and merging them into a unified list. It serves as an aggregator and converter that transforms these collected nodes into specialized subscription formats for various network routing clients. The project includes dedicated generators for creating configuration files and subscriptions compatible with Clash, Clash Meta, Sing-Box, and V2Ray. It handles the conversion of aggregated lists into Base64 and other standardized formats required by these specific network applications. The software also provides a
This project is a collection of Python scripts and tools designed for web scraping, browser automation, and large-scale data extraction. It provides a set of implementations for retrieving information from websites and private APIs, including tools for multimedia downloading and social media data archiving. The toolset includes specialized mechanisms for bypassing anti-scraping measures through IP proxy pool rotation and multi-threaded crawlers. It also features capabilities for simulating browser sessions to handle authentication, intercepting session cookies, and decrypting network payloads
This is a collection of Python scripts designed for extracting data from popular Chinese websites and mobile applications. It functions as a multi-platform data extraction toolkit, capable of automating tasks such as downloading videos from platforms like Bilibili and Douyin, scraping product reviews and images from e-commerce sites like Taobao and JD.com, and booking train tickets on the 12306 railway system. The project distinguishes itself through its focus on automating specific, high-value tasks within the Chinese internet ecosystem. It includes capabilities for solving Chinese CAPTCHA c
Piped is a privacy-focused video streaming service and self-hosted media proxy. It allows users to watch video and audio content without advertisements, user tracking, or the requirement of official accounts. The project utilizes a decentralized server network to distribute workloads and rotate outbound IP addresses, which helps bypass regional content restrictions and prevent provider blocks. It includes the ability to identify and skip sponsored segments within media files for a cleaner viewing experience. The service provides a JSON API for third-party integration to fetch video streams,
meta-rules-dat is a collection of binary-encoded network datasets used to identify and categorize traffic for routing on resource-constrained devices. It provides a structured domain categorization list and a geographic IP routing dataset to map network traffic to specific countries or service providers. The project utilizes trie-based lookup data and compact binary serialization to enable high-performance prefix matching and fast domain-to-category resolution. To minimize memory and storage overhead, it employs stripped-down GeoIP mapping that removes non-essential metadata. The datasets co
musicdl is a command line music downloader and library manager designed for searching and downloading audio tracks and playlists from streaming platforms to local storage. It functions as a tool for music library archiving, allowing for the bulk acquisition of media and the organization of local audio collections. The project includes an AI lyric transcriber that uses machine learning models to generate text lyrics from audio files, supporting synchronized playback where lyrics are highlighted based on playback timestamps. To maintain access to streaming platforms, it employs a network proxy
Translumo is an optical character recognition screen translator and multi-engine orchestrator. It extracts text from active application windows in real time to translate content into different languages, facilitating the localization of software that lacks official translation options. The system distinguishes itself by combining results from several recognition engines and using machine learning to determine the most accurate text extraction. It also functions as a proxy-rotating gateway, cycling through IP addresses to prevent translation services from blocking high-volume requests. The pr
Maskphish is a comprehensive security toolkit that integrates capabilities for digital forensics, network vulnerability scanning, open-source intelligence, penetration testing, and social engineering. It functions as a multi-purpose framework for automating reconnaissance and executing security audits across diverse network environments. The project features a specialized phishing and social engineering toolkit used for cloning websites, masking URLs, and deploying deceptive pages to capture user credentials. It also includes a remote access Trojan builder for generating platform-specific exe
This project provides a collection of structured, binary-encoded routing datasets designed for proxy software to automate network traffic management. By mapping domain names and IP addresses to specific functional categories, it enables proxy clients to make granular, policy-based connection decisions. The repository serves as a centralized source for routing metadata, ensuring that traffic steering logic remains consistent across various networking implementations. The project distinguishes itself through an automated aggregation pipeline that processes community-maintained datasets into a u
This project is a reference library of architectural blueprints, study materials, and design patterns for building scalable, high-availability distributed systems. It serves as a technical guide for scalability engineering, providing structural solutions for common engineering challenges. The repository focuses on distributed systems design, covering essential patterns for data replication, consensus algorithms, and transaction management. It distinguishes itself by offering detailed blueprints for specialized domains, including real-time data streaming, large-scale data storage, and high-ava
Subconverter is a network utility designed to translate, merge, and filter proxy subscription configurations. It functions as a service that converts proxy links between various client-specific formats, ensuring compatibility across different applications and platforms. By providing a unified interface for managing diverse connection sources, the tool enables consistent network policy application and streamlined configuration management. The project distinguishes itself through an HTTP-based transformation interface that processes subscription data dynamically. It utilizes an in-memory pipeli
This project is a network node aggregator and proxy subscription manager designed to bypass network restrictions. It functions as a tool for importing and synchronizing remote server lists and configuration files to maintain updated network node connectivity. The system includes a regional app store gateway that utilizes a rotating pool of verified accounts to download location-restricted applications from various regional stores. It also provides a curated collection of free and stable subscription links for connectivity testing and accessing blocked content. The tool provides a configurati
SSTap-Rule is a routing rule set and game traffic accelerator designed to optimize connectivity and reduce latency for online games. It provides a collection of curated network routing configurations specifically for the SSTap client to ensure game data is directed through optimized network paths. The project utilizes geo-based routing configurations and geographic datasets to balance network routing accuracy and processing efficiency. This allows for the steering of internet traffic based on location to improve connection stability and speed. The system covers broader capabilities in networ
Shadowrocket is a proxy client application for mobile devices that functions as a multi-protocol proxy manager and a rule-based traffic router. It acts as a programmable network gateway, utilizing a virtual network interface to route system-level traffic through secure tunnels. The project distinguishes itself through a programmable environment that executes JavaScript scripts and modules to automate DNS resolution and handle complex network request logic. It further provides an HTTPS traffic inspector capable of decrypting encrypted traffic using custom certificates to modify headers and rew
This project is a repository of proxy server addresses, providing a regularly updated dataset of HTTP, SOCKS4, and SOCKS5 endpoints. It functions as a directory for applications that require managed network routing, offering curated connection details to facilitate traffic management and infrastructure tasks. The system distinguishes itself through an automated harvesting pipeline that refreshes the proxy list on an hourly schedule. This process includes geospatial metadata enrichment, which appends location data to each entry, allowing for regional routing and the simulation of local user ac