What are the best open-source alternatives to Crawlee?

30 open-source projects similar to apify/crawlee, ranked by shared features. Top picks: apify/crawlee-python, omkarcloud/botasaurus, camel-ai/camel, andeya/pholcus, getmaxun/maxun, browserbase/stagehand, lorien/web-scraping, bda-research/node-crawler, gocolly/colly, mendableai/firecrawl.

Is apify/crawlee-python a good alternative to Crawlee?

Crawlee-python is a web crawling framework for building scalable scrapers using Python. It serves as a comprehensive tool for web scraping automation, providing a system to extract structured data from websites using both lightweight HTTP requests and headless browser automation. The framework is…

Is omkarcloud/botasaurus a good alternative to Crawlee?

Botasaurus is a Python web scraping framework and headless browser automation system used to build scalable data extraction tools. It functions as a web data extraction tool and OCR document parser, converting website content, images, and PDF files into structured formats such as JSON, CSV, and Exc…

Is camel-ai/camel a good alternative to Crawlee?

This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models…

Is andeya/pholcus a good alternative to Crawlee?

Pholcus is a distributed web crawling system designed for large-scale data scraping. It employs a master-worker distribution model to coordinate high-concurrency scraping tasks across a network of remote client nodes, enabling both horizontal and vertical data collection. The system features a hot…

Is getmaxun/maxun a good alternative to Crawlee?

Maxun is an open-source web scraping and automation platform designed to transform dynamic website content into structured data. By leveraging artificial intelligence to interpret natural language prompts, the system identifies page elements and extracts information without requiring manual selecto…

Is browserbase/stagehand a good alternative to Crawlee?

Stagehand is an AI-native browser automation framework that enables developers to build reliable web automations using a hybrid of natural language instructions and deterministic TypeScript code.

Is lorien/web-scraping a good alternative to Crawlee?

This project is a comprehensive resource directory for web data extraction, providing a curated collection of tools and libraries for parsing data, automating browsers, and managing network operations. It serves as a guide for extracting structured information from HTML, XML, JSON, and PDF formats.…

Is bda-research/node-crawler a good alternative to Crawlee?

node-crawler is a programmable web crawler for Node.js that manages request queues and automates data extraction. It functions as a rate-limited HTTP client and a headless HTML parser, providing the infrastructure to visit large sets of URLs asynchronously while preventing duplicate processing thro…

Is gocolly/colly a good alternative to Crawlee?

Colly is a high-performance web scraping framework designed for the automated extraction of structured data from websites. It provides a programmable toolkit that manages the complexities of large-scale data collection, including concurrent request orchestration, automatic cookie handling, and robo…

Is mendableai/firecrawl a good alternative to Crawlee?

Firecrawl is a headless browser automation tool and web crawling engine designed to extract structured data from the web. It functions as an API that transforms raw website content and documents into clean markdown and JSON formats to serve as context for large language models. The project disting…

Open-source alternatives to Crawlee

30 open-source projects similar to apify/crawlee, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Crawlee alternative.
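As a rough illustration of what "ranked by how many features they have in common" could mean, the sketch below counts the overlap between feature sets and sorts by that count. The feature tags, the project names, and the `rank_by_shared_features` helper are all hypothetical; the page does not say how the ranking is actually computed.

```python
def rank_by_shared_features(target, candidates):
    """Score each candidate by the number of features it shares with target.

    Returns (name, shared_count) pairs, highest count first; ties are
    broken alphabetically by name so the order is deterministic.
    """
    scored = [(name, len(target & features)) for name, features in candidates.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical feature tags, made up for illustration only.
reference_features = {"http-crawling", "headless-browser", "proxy-rotation", "request-queue"}

# Placeholder projects; these are not real entries from the list.
candidates = {
    "project-a": {"http-crawling", "headless-browser", "proxy-rotation"},
    "project-b": {"http-crawling", "request-queue"},
    "project-c": {"llm-agents"},
}

ranking = rank_by_shared_features(reference_features, candidates)
for name, shared in ranking:
    print(f"{name}: {shared} shared features")
```

A plain overlap count favors projects with many tags; a real ranking might normalize by the size of each feature set, but that is a design choice this sketch does not make.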

apify/crawlee-python
Stars: 8,097
Crawlee-python is a web crawling framework for building scalable scrapers using Python. It serves as a comprehensive tool for web scraping automation, providing a system to extract structured data from websites using both lightweight HTTP requests and headless browser automation. The framework is distinguished by its anti-bot evasion capabilities, which include browser fingerprint impersonation and tiered proxy rotation to bypass detection systems and solve challenges such as Cloudflare. It also incorporates artificial intelligence for autonomous website navigation and schema-based data extra…
Python, apify, automation, beautifulsoup
omkarcloud/botasaurus
Stars: 3,970
Botasaurus is a Python web scraping framework and headless browser automation system used to build scalable data extraction tools. It functions as a web data extraction tool and OCR document parser, converting website content, images, and PDF files into structured formats such as JSON, CSV, and Excel. The framework distinguishes itself by providing a scraper management interface that allows Python functions to be wrapped in a web-based UI or deployed as standalone desktop applications. This enables non-technical users to trigger extraction jobs and manage tasks via a graphical interface or RE…
Python, anti-bot, anti-detect, anti-detect-browser
camel-ai/camel
Stars: 17,253
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva…
Python, agent, ai-societies, artificial-intelligence

Open-source alternatives to Crawlee

apify/crawlee-python

omkarcloud/botasaurus

camel-ai/camel

andeya/pholcus

getmaxun/maxun

browserbase/stagehand

lorien/web-scraping

bda-research/node-crawler

gocolly/colly

mendableai/firecrawl

itsOwen/CyberScraper-2077

firecrawl/firecrawl

henrylee2cn/pholcus

binux/pyspider

code4craft/webmagic

ultrafunkamsterdam/nodriver

lining0806/PythonSpiderNotes

ultrafunkamsterdam/undetected-chromedriver

garrytan/gstack

lightpanda-io/browser

any4ai/AnyCrawl

AutomaApp/automa

projectdiscovery/katana

oxylabs/how-to-scrape-amazon-product-data

coder-hxl/x-crawl

browserbase/mcp-server-browserbase

hu17889/go_spider

Kr1s77/awesome-python-login-model

Skyvern-AI/skyvern

g1879/DrissionPage