What are the best open-source alternatives to Python Spider?

30 open-source projects similar to jack-cherish/python-spider, ranked by shared features. Top picks: Johnserf-Seed/TikTokDownload, drawrowfly/tiktok-scraper, NanmiCoder/CrawlerTutorial, awesome-selfhosted/awesome-selfhosted, TheSpeedX/PROXY-List, ultrafunkamsterdam/nodriver, apify/crawlee-python, testerSunshine/12306, speedyapply/JobSpy, proxifly/free-proxy-list.

Is Johnserf-Seed/TikTokDownload a good alternative to Python Spider?

TikTokDownload is a configurable batch video downloader for TikTok and Douyin that strips watermarks and supports automated downloads from user profiles, likes, and collections. It functions as a social media content archiving tool, enabling users to download videos and audio from these platforms f…

Is drawrowfly/tiktok-scraper a good alternative to Python Spider?

This project is a specialized TikTok API scraper and data extractor. It functions as a proxy-based web scraper designed to collect user metadata, video posts, and trend feeds, while providing a webhook data pipeline to route scraped information to external URLs via HTTP requests. The tool includes…

Is NanmiCoder/CrawlerTutorial a good alternative to Python Spider?

CrawlerTutorial is a comprehensive Python web scraping tutorial and framework designed for extracting data from static and dynamic websites. It functions as a web data extraction pipeline and an HTTP request orchestrator, covering the full lifecycle of scraping applications from initial fetching to…

Is awesome-selfhosted/awesome-selfhosted a good alternative to Python Spider?

This project is a community-curated directory of open-source software designed for deployment in private server environments and home labs. It serves as a comprehensive resource for discovering independent, self-hosted alternatives to mainstream cloud services, enabling users to maintain full data…

Is TheSpeedX/PROXY-List a good alternative to Python Spider?

PROXY-List is a public proxy aggregator that provides data structures for storing and aggregating publicly available HTTP and SOCKS proxy server addresses. It serves as a source for retrieving network traffic routing lists used to mask origin IP addresses during web requests. The project utilizes…

Is ultrafunkamsterdam/nodriver a good alternative to Python Spider?

nodriver is an asynchronous Chromium browser automation framework that provides headless control and web scraping capabilities. It functions as a Chrome DevTools Protocol client, allowing for granular engine control by attaching directly to the browser's debug port without the need for external dri…

Is apify/crawlee-python a good alternative to Python Spider?

Crawlee-python is a web crawling framework for building scalable scrapers using Python. It serves as a comprehensive tool for web scraping automation, providing a system to extract structured data from websites using both lightweight HTTP requests and headless browser automation. The framework is…

Is testerSunshine/12306 a good alternative to Python Spider?

This project is a railway booking automation tool designed to monitor ticket inventory and execute purchases on the 12306 platform. Its primary purpose is to secure high-demand train tickets by automating the login, booking, and checkout processes. The system utilizes automated captcha solving and…

Is speedyapply/JobSpy a good alternative to Python Spider?

JobSpy is a job board scraper and listing aggregator designed to extract employment opportunities from multiple websites and compile them into a unified dataset. It functions as a job search automation tool that programmatically collects vacancies based on keywords, locations, and specific filters.…

Is proxifly/free-proxy-list a good alternative to Python Spider?

This project is a public proxy aggregator and directory providing curated lists of validated HTTP and SOCKS proxy servers. It features a machine-readable API service and tools designed for anonymous network routing and the automated rotation of outgoing IP addresses. The system distinguishes itsel…


Open-source alternatives to Python Spider

30 open-source projects similar to jack-cherish/python-spider, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Python Spider alternative.

Johnserf-Seed/TikTokDownload
8,673 stars
TikTokDownload is a configurable batch video downloader for TikTok and Douyin that strips watermarks and supports automated downloads from user profiles, likes, and collections. It functions as a social media content archiving tool, enabling users to download videos and audio from these platforms for offline viewing or personal backup. The project distinguishes itself through a modular download pipeline that combines audio extraction, batch scheduling, config-driven workflows, cookie-based authentication, URL parsing, paginated API scraping, and watermark removal. It uses a settings file to c
Python, api, douyin, plugin
drawrowfly/tiktok-scraper
5,120 stars
This project is a specialized TikTok API scraper and data extractor. It functions as a proxy-based web scraper designed to collect user metadata, video posts, and trend feeds, while providing a webhook data pipeline to route scraped information to external URLs via HTTP requests. The tool includes a watermark-free video downloader that saves high-definition content to local storage. It employs cryptographic request signing for server authentication and utilizes session cookie authentication combined with proxy rotation to manage network traffic and avoid rate limits. Capabilities cover bulk
TypeScript
NanmiCoder/CrawlerTutorial
4,262 stars
CrawlerTutorial is a comprehensive Python web scraping tutorial and framework designed for extracting data from static and dynamic websites. It functions as a web data extraction pipeline and an HTTP request orchestrator, covering the full lifecycle of scraping applications from initial fetching to final data storage. The project provides specialized guidance on anti-bot bypass techniques and web API reverse engineering. It includes methods for evading browser detection through identity masking and proxy rotation, as well as techniques for identifying hidden API endpoints by analyzing network
Python

Open-source alternatives to Python Spider

Johnserf-Seed/TikTokDownload

drawrowfly/tiktok-scraper

NanmiCoder/CrawlerTutorial

awesome-selfhosted/awesome-selfhosted

TheSpeedX/PROXY-List

ultrafunkamsterdam/nodriver

apify/crawlee-python

testerSunshine/12306

speedyapply/JobSpy

proxifly/free-proxy-list

btjawa/BiliTools

avinashkranjan/Amazing-Python-Scripts

MikeChongCan/scylla

cv-cat/Spider_XHS

lining0806/PythonSpiderNotes

kanasimi/work_crawler

RayWangQvQ/BiliBiliToolPro

alsotang/node-lessons

jiji262/douyin-downloader

VeNoMouS/cloudscraper

apify/crawlee

hongyangAndroid/okhttputils

goldze/MVVMHabit

shengqiangzhang/examples-of-web-crawlers

jhy/jsoup

codelucas/newspaper

jdepoix/youtube-transcript-api

Integuru-AI/Integuru

nICEnnnnnnnLee/BilibiliDown

code4craft/webmagic