JobSpy

Open-source alternatives to JobSpy

Similar open-source projects, ranked by how many features they share with JobSpy.

apify/crawlee-python
8,097 stars
Crawlee-python is a web crawling framework for building scalable scrapers using Python. It serves as a comprehensive tool for web scraping automation, providing a system to extract structured data from websites using both lightweight HTTP requests and headless browser automation. The framework is distinguished by its anti-bot evasion capabilities, which include browser fingerprint impersonation and tiered proxy rotation to bypass detection systems and solve challenges such as Cloudflare. It also incorporates artificial intelligence for autonomous website navigation and schema-based data extra
Language: Python. Topics: apify, automation, beautifulsoup
oxylabs/how-to-scrape-amazon-product-data
2,511 stars
This project is an Amazon web scraper and e-commerce data extractor designed to retrieve product names, prices, and ratings. It functions as a headless browser crawler that converts unstructured web content from product listings into structured JSON and CSV formats. The tool incorporates anti-bot bypass capabilities to circumvent CAPTCHAs and security challenges. It achieves this through the use of residential proxy integration, automatic proxy rotation, and the modification of browser fingerprints to simulate human interaction patterns. The system provides broad web scraping capabilities, i
Topics: amazon, amazon-scraper, python
bda-research/node-crawler
6,785 stars
node-crawler is a programmable web crawler for Node.js that manages request queues and automates data extraction. It functions as a rate-limited HTTP client and a headless HTML parser, providing the infrastructure to visit large sets of URLs asynchronously while preventing duplicate processing through task deduplication. The project distinguishes itself through a proxy rotation manager that cycles user agents and proxy servers to bypass access restrictions. It utilizes the HTTP/2 protocol to improve request performance and server compatibility during large-scale scraping operations. The syst
Language: TypeScript. Topics: cheerio, crawler, extract-data
Jack-Cherish/python-spider
19,660 stars
This is a collection of Python scripts designed for extracting data from popular Chinese websites and mobile applications. It functions as a multi-platform data extraction toolkit, capable of automating tasks such as downloading videos from platforms like Bilibili and Douyin, scraping product reviews and images from e-commerce sites like Taobao and JD.com, and booking train tickets on the 12306 railway system. The project distinguishes itself through its focus on automating specific, high-value tasks within the Chinese internet ecosystem. It includes capabilities for solving Chinese CAPTCHA c
Language: Python. Topics: python, python-spider, python3

See all 30 alternatives to JobSpy

speedyapply/JobSpy
