30 open-source projects similar to segmentio/nightmare, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Nightmare alternative.
nodriver is an asynchronous Chromium browser automation framework that provides headless control and web scraping capabilities. It functions as a Chrome DevTools Protocol client, allowing for granular engine control by attaching directly to the browser's debug port without the need for external driver binaries. The framework is specifically designed as an anti-bot detection bypass tool. It modifies browser fingerprints and protocol headers to evade automated security systems, handle security warnings, and bypass common obstacles like insecure connection alerts. The system covers a broad rang
Puppeteer is a JavaScript library for programmatically controlling Chrome and Firefox through the Chrome DevTools Protocol or the WebDriver BiDi protocol. It launches and manages browser instances—typically without a visible user interface—to automate interactions with web pages, enabling navigation, clicking, typing, and data extraction entirely through code. The library distinguishes itself through deep integration with the Chromium embedding layer, allowing fine-grained process configuration with custom flags, permissions, and sandbox policies. It maintains multiple concurrent command stre
DrissionPage is a Python library designed for web automation, data scraping, and testing. It functions as a browser automation framework that communicates directly with the browser engine via the Chrome DevTools Protocol, allowing for precise control over browser instances and page states. The library distinguishes itself by providing a unified interface that combines full browser automation with raw HTTP request capabilities. This hybrid approach allows users to switch between lightweight network requests and heavy browser-based interactions within a single workflow. By wrapping asynchronous
php-webdriver is a WebDriver PHP client and browser automation framework that implements the W3C WebDriver standard. It serves as a programmatic interface for controlling web browsers, executing JavaScript, and managing browser sessions in both headed and headless environments. The library functions as a Selenium protocol implementation, allowing PHP applications to communicate with browser drivers such as ChromeDriver or GeckoDriver. It provides the ability to automate user actions, navigate pages, and validate DOM elements for web UI testing. Its capabilities cover broad areas of browser i
Taiko is a browser automation framework and web end-to-end testing library used to perform programmatic user actions and verify application behavior. It functions as a headless browser testing tool capable of simulating real interactions and asserting page states in Chromium and Firefox. The project includes a browser interaction recorder that captures live actions and exports them as executable JavaScript automation scripts. It also serves as a web accessibility auditor, analyzing pages to detect accessibility violations and ensure compliance with inclusive design standards. The framework c
Crawlee-python is a web crawling framework for building scalable scrapers using Python. It serves as a comprehensive tool for web scraping automation, providing a system to extract structured data from websites using both lightweight HTTP requests and headless browser automation. The framework is distinguished by its anti-bot evasion capabilities, which include browser fingerprint impersonation and tiered proxy rotation to bypass detection systems and solve challenges such as those posed by Cloudflare. It also incorporates artificial intelligence for autonomous website navigation and schema-based data extra
Dev-browser is a browser automation framework and headless browser controller that provides a sandboxed script runner for executing web tasks. It functions as a vision-based web automator and a specialized interface for large language models, enabling navigation of and interaction with web pages within isolated execution environments. The project distinguishes itself by converting complex web pages into simplified representations and coordinate-based maps, allowing AI agents to analyze layouts and perform actions based on pixel locations. It employs a mapping system that assigns unique identif
bb-browser is an authenticated web scraper and browser automation CLI that also functions as an MCP server for AI coding tools. It treats the browser as a programmable runtime environment, enabling AI agents to control a live Chrome instance through a standard protocol while leveraging existing login sessions for authenticated actions. The project distinguishes itself through a dual CLI and MCP interface, allowing both direct command-line control and AI-driven browser manipulation. It includes a parallel multi-platform query engine that executes simultaneous searches across multiple websites,
Playwright-cli is a command line interface for executing web tasks and automating browser interactions using the Playwright framework. It serves as a browser binary manager for downloading and installing specific browser engines and their required system dependencies, as well as a tool for running automated test suites across multiple engines to verify application behavior. The utility functions as a browser session controller, managing browser profiles and persistent storage states via the command line. It enables the execution of automation suites across different browser engines and config
BrowserMCP is a browser automation bridge that connects AI tools to a live browser session through a local proxy server. It implements a standardized protocol for sending commands like click, type, and navigate to a real browser instance running on the user's machine, while keeping all browsing data on the device. The project distinguishes itself by preserving user sessions and fingerprints across automation tasks. It attaches to the user's existing browser profile to maintain cookies, logins, and authentication state, and uses the real browser's user agent, viewport, and extension context to
Crawlee is a web scraping framework designed for building scalable, reliable, and distributed data extraction pipelines. It provides a unified interface for managing headless browser automation and lightweight HTTP requests, allowing developers to handle complex web navigation, dynamic content rendering, and large-scale data collection within a single, modular architecture. The project distinguishes itself through its resource-aware concurrency controller, which dynamically scales task execution based on real-time CPU and memory usage to prevent host machine exhaustion. It also features a rob
This project is a Model Context Protocol server that enables Large Language Models to control Playwright browsers for web automation, scraping, and end-to-end testing. It functions as a programmable interface for executing JavaScript, capturing screenshots, and interacting with web elements across multiple browser engines. The server exposes browser automation capabilities as a set of standardized tools that models can discover and invoke. It supports session-based browser isolation to ensure unique contexts for each client connection and provides a transport layer using either standard input
This project is a comprehensive educational guide and framework for building web scrapers using Python. It provides a course-based approach to data extraction, combining a Python crawler framework with tutorials on web reverse engineering and network traffic analysis. The project distinguishes itself by covering advanced extraction challenges, including the decryption of obfuscated JavaScript and the bypass of anti-scraping measures. It specifically addresses mobile application scraping through the simulation of user interactions and the interception of network traffic. The capability surfac
Steel is a cloud browser automation platform that provides a REST API for launching and controlling remote Chrome browser sessions. It enables programmatic browsing and web scraping using standard automation tools like Puppeteer, Playwright, and Selenium, connecting to cloud-hosted browser instances via WebSocket and the Chrome DevTools Protocol. The platform supports both headless and headful browser sessions, with language-specific SDKs for TypeScript and Python. The service distinguishes itself through comprehensive anti-detection capabilities, including residential proxy rotation, CAPTCHA
Puppeteer is a browser automation library that provides a programmatic interface for controlling web browsers to execute tasks, simulate user interactions, and perform end-to-end testing. It functions as a headless browser controller, managing browser lifecycles, isolated session contexts, and remote connections to facilitate stable, automated web-based workflows. The library distinguishes itself through its deep integration with the Chrome DevTools Protocol, utilizing a bidirectional message bus to execute commands and receive real-time event notifications. It supports advanced automation pa
Pageres is an automated web page capturer and command line interface that renders HTML content and websites into images. It uses a headless Chromium browser to generate screenshots of full pages or specific elements across multiple screen resolutions. The tool allows for the simulation of different device dimensions to verify responsive design and the creation of visual snapshots for UI regression testing. It supports the capture of protected pages by passing custom HTTP headers, cookies, and basic authentication credentials. The system includes capabilities for page content manipulation thr
Nightwatch is a Node.js test automation tool and W3C WebDriver test framework designed for executing functional test suites and verifying system behavior. It provides a suite of utilities for web browser automation, native mobile application testing, and REST API validation. The project includes specialized tools for visual regression testing, which compares current screenshots against baseline images to detect unexpected changes. It also features an accessibility auditing tool to check user interface elements against established standards for compliance. The framework covers a broad range o
Protractor is a WebDriver-based end-to-end testing framework and browser automation tool. It serves as a frontend integration test suite used to verify web application flows by simulating user behavior and executing JavaScript within a browser. The framework is specifically designed for testing Angular applications, providing specialized locators and synchronization tools that align with the Angular lifecycle. It distinguishes itself through automatic test step synchronization, which pauses execution until pending page tasks are completed to ensure stable browser execution. The tool covers
WebDriverIO is a Node.js test automation framework used for automating functional tests across web browsers and mobile applications. It acts as a WebDriver protocol client that manages remote browser sessions and executes commands against WebDriver and Appium servers to perform end-to-end testing. The framework is distinguished by its ability to control both native and hybrid mobile applications and its support for running automated suites across local machines, remote grids, and cloud device providers. It includes specialized capabilities for coordinating multi-browser interactions and estab
This project is a Model Context Protocol server that connects large language models to web scraping and crawling tools. It functions as a bridge, allowing LLM clients to utilize a web crawling engine and scraping utilities to extract and process web data. The server integrates a markdown web converter that transforms dynamic web pages and PDF documents into clean markdown to optimize consumption by AI models. It also provides a browser automation interface for controlling headless sessions and bypassing access restrictions. The system covers broad capabilities including large-scale website d
Obscura is a web scraping infrastructure and headless browser server designed for AI agents. It provides a system for AI models to control browser sessions, interact with websites, and extract web data using a WebSocket implementation of the Chrome DevTools Protocol. The project focuses on bot detection evasion by randomizing browser fingerprints, masking native functions, and blocking tracking scripts to mimic human behavior. It further protects client identities through a traffic layer that routes network requests via HTTP or SOCKS5 proxies. The system supports large-scale data extraction through
requests-html is a Python HTML parsing library and web scraping framework. It functions as an asynchronous HTTP client and a JavaScript rendering engine designed to fetch and parse web pages for structured data extraction. The project integrates a headless browser to execute JavaScript, allowing it to retrieve dynamically generated content that standard HTML parsers cannot see. It provides tools for automated data extraction using CSS selectors and XPath expressions to isolate specific text or attributes from HTML structures. The framework covers network operations including asynchronous pag
TabFS is a browser automation tool and virtual filesystem that maps browser tabs and page elements to files. It provides a FUSE-based bridge that allows external scripts and tools to interact with a browser's JavaScript runtime through standard file system operations. The system enables the execution of JavaScript expressions and the manipulation of DOM elements by reading and writing synthetic files. Users can control tab state, create new tabs, and inspect metadata such as URLs and titles using a file manager or shell. The project covers capabilities for browser automation, including acces
MediaCrawler is an automated web scraping framework designed to extract public posts, comments, and creator metadata from various social media platforms. It functions as a headless browser automator, utilizing real browser instances to render dynamic content and execute the client-side scripts necessary for interacting with modern web interfaces. The system distinguishes itself through a focus on session persistence and network flexibility. It supports remote debugging to reuse active browser sessions and cookies, which helps minimize the risk of triggering platform security challenges. To ma
This project is an automation framework that connects large language models to web browsers via the Chrome DevTools Protocol for autonomous task execution. It functions as a bridge between intelligent agents and browser engines, allowing for the direct control of browser sessions and profiles. The framework features a self-healing agent capable of generating and executing custom scripts during runtime to resolve failures and optimize browser tasks. It supports stealthy deployment through the use of integrated proxies and captcha solvers to bypass bot detection and security mitigations. The s
php-webdriver is a browser automation library and PHP language binding for the Selenium WebDriver protocol. It serves as a web application testing tool that allows for the programmatic control of web browsers to simulate user interactions and navigate web pages. The project implements the WebDriver protocol to manage browser sessions and execute automated functional tests. It enables integration with Selenium servers to perform automated web testing and support headless browser workflows. The library provides capabilities for configuring browser properties and managing the lifecycle of remot
OpenBrowser is an AI web agent toolkit and automation framework designed to translate natural language instructions into executable browser workflows. It functions as a headless browser controller and orchestrator, enabling the creation of autonomous agents that navigate websites, interact with elements, and extract data using plain English commands. The system features a sandboxed execution environment that utilizes domain whitelists and memory limits to ensure secure web interaction. It distinguishes itself through a command-line interface for triggering autonomous tasks with configurable m
Mechanize is a Ruby library for web browser automation and headless browser emulation. It allows for programmatically navigating websites and simulating human behavior without a graphical user interface. The library provides an automated interface for populating and submitting web forms, including text fields, checkboxes, and file uploads. It manages stateful sessions by automatically storing and sending cookies across multiple requests to maintain user authentication and identity. Additional capabilities include web data scraping, the ability to download remote web content, and the maintena
PythonSpiderNotes is a comprehensive instructional resource and framework for building web crawlers and extracting data using the Python programming language. It provides a set of methods for parsing unstructured HTML and JSON data into structured formats for persistent storage. The project includes detailed guides and tutorials on browser automation for retrieving dynamic content, as well as a framework for data extraction. It specifically covers anti-bot bypass techniques, such as rotating proxies and spoofing headers, to avoid IP blocks and detection systems. The capability surface extend