30 open-source projects similar to browser-use/browser-use, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Browser Use alternative.
This project is an agentic framework designed to enable autonomous web navigation and browser automation. It functions as a controller that translates natural language instructions into deterministic browser actions, allowing agents to interact with websites, perform data extraction, and manage complex authentication flows. By leveraging accessibility trees and semantic element resolution, the framework mimics human-like navigation, moving beyond brittle DOM selectors to interact reliably with modern web interfaces. The framework distinguishes itself through its focus on secure, scalable exec
Puppeteer is a browser automation library that provides a programmatic interface for controlling web browsers to execute tasks, simulate user interactions, and perform end-to-end testing. It functions as a headless browser controller, managing browser lifecycles, isolated session contexts, and remote connections to facilitate stable, automated web-based workflows. The library distinguishes itself through its deep integration with the Chrome DevTools Protocol, utilizing a bidirectional message bus to execute commands and receive real-time event notifications. It supports advanced automation pa
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
Skyvern is an autonomous web navigation agent and browser-based workflow orchestrator that uses large language models to execute multi-step tasks on websites. By translating natural language instructions into actionable browser commands, the framework enables the automation of complex user workflows, including data extraction and interface interaction, without manual intervention. The platform distinguishes itself through a focus on secure, self-hosted infrastructure and stealth-oriented execution. It utilizes containerized browser isolation to maintain consistent environments and employs pro
Stagehand is an AI-native browser automation framework that enables developers to build reliable web automations using a hybrid of natural language instructions and deterministic TypeScript code.
Playwright is a comprehensive browser automation framework designed for end-to-end testing and web workflow automation. It provides a unified API to drive web applications across multiple browser engines, enabling developers to simulate complex user interactions, perform web scraping, and validate application behavior in consistent, isolated environments. The framework distinguishes itself through a web-first testing paradigm that prioritizes stability and resilience. By utilizing an auto-waiting actionability engine and accessibility-tree-based locators, it eliminates common sources of test
This project is an automation framework that connects large language models to web browsers via the Chrome DevTools Protocol for autonomous task execution. It functions as a bridge between intelligent agents and browser engines, allowing for the direct control of browser sessions and profiles. The framework features a self-healing agent capable of generating and executing custom scripts during runtime to resolve failures and optimize browser tasks. It supports stealthy deployment through the use of integrated proxies and captcha solvers to bypass bot detection and security mitigations. The s
Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention. The framework distinguishes itself through its focus on observability and secure, isolated execut
chromedp is a browser automation framework and driver that controls web browsers via the Chrome DevTools Protocol. It functions as a headless browser automation tool and web browser controller, enabling the programmatic management of browser sessions, targets, and network responses through a remote debugging interface. The project provides specialized capabilities for Chrome DevTools Protocol automation, including headless browser testing, web scraping and data extraction, and mobile device emulation. It also supports browser-based visual regression by capturing precise screenshots of web pag
This project is a Model Context Protocol server that connects large language models to web scraping and crawling tools. It functions as a bridge, allowing LLM clients to utilize a web crawling engine and scraping utilities to extract and process web data. The server integrates a markdown web converter that transforms dynamic web pages and PDF documents into clean markdown to optimize consumption by AI models. It also provides a browser automation interface for controlling headless sessions and bypassing access restrictions. The system covers broad capabilities including large-scale website d
This project is an MCP browser automation server that connects large language models to headless cloud browsers. It functions as an autonomous web workflow engine and an LLM web agent interface, enabling the translation of natural language instructions into browser actions and structured data retrieval. The system distinguishes itself through a managed headless browser cloud API that supports concurrent Chromium sessions with integrated stealth modes, CAPTCHA solving, and proxy traffic routing. It utilizes self-healing element selection to maintain automation resilience when page structures c
SWE-agent is an autonomous software engineering platform designed to automate repository maintenance and issue resolution. By orchestrating language models to navigate codebases, diagnose software bugs, and apply fixes, the framework functions as an autonomous agent capable of executing shell commands, editing source code, and managing pull requests within isolated, containerized environments. The platform distinguishes itself through its focus on end-to-end task autonomy and observability. It features a robust trajectory logging system that records every thought, action, and environment obse
Crawlee is a web scraping framework designed for building scalable, reliable, and distributed data extraction pipelines. It provides a unified interface for managing headless browser automation and lightweight HTTP requests, allowing developers to handle complex web navigation, dynamic content rendering, and large-scale data collection within a single, modular architecture. The project distinguishes itself through its resource-aware concurrency controller, which dynamically scales task execution based on real-time CPU and memory usage to prevent host machine exhaustion. It also features a rob
Playwright for Python is a browser automation framework designed for end-to-end testing, web scraping, and user interaction simulation. It functions as a headless browser controller that enables programmatic navigation, data extraction, and the execution of complex workflows across multiple rendering engines. The framework distinguishes itself through an actionability-aware interaction engine that automatically verifies element readiness before performing actions, significantly reducing test flakiness. It utilizes isolated browser contexts to maintain separate storage and cookies for parallel
Magentic-UI is an agentic UI toolkit and framework that enables large language models to interface with real-time browser environments, operating systems, and virtual machines. It provides a sandbox environment where models can execute instructions to manage local files and run shell commands. The project functions as a web interaction orchestrator and browser automation framework, allowing for the execution of end-to-end web workflows and form completions. It coordinates these actions through a system that translates natural language goals into executable sequences. The toolkit covers sever
php-webdriver is a WebDriver PHP client and browser automation framework that implements the W3C WebDriver standard. It serves as a programmatic interface for controlling web browsers, executing JavaScript, and managing browser sessions in both headed and headless environments. The library functions as a Selenium protocol implementation, allowing PHP applications to communicate with browser drivers such as ChromeDriver or GeckoDriver. It provides the ability to automate user actions, navigate pages, and validate DOM elements for web UI testing. Its capabilities cover broad areas of browser i
DrissionPage is a Python library designed for web automation, data scraping, and testing. It functions as a browser automation framework that communicates directly with the browser engine via the Chrome DevTools Protocol, allowing for precise control over browser instances and page states. The library distinguishes itself by providing a unified interface that combines full browser automation with raw HTTP request capabilities. This hybrid approach allows users to switch between lightweight network requests and heavy browser-based interactions within a single workflow. By wrapping asynchronous
Scrapy is a comprehensive framework designed for automated web data extraction and large-scale crawling. It operates on an asynchronous, event-driven engine that manages non-blocking network requests and data processing tasks, allowing for the efficient retrieval of structured information from web documents using path-based selectors. The system distinguishes itself through a highly modular architecture that supports complex data collection workflows. Users can implement custom middleware and signal handlers to intercept and modify request flows, while a priority-based scheduler manages concu
Crawl4AI is an AI-powered web crawling and data extraction engine designed to transform complex web content into structured formats. It functions as a headless browser orchestrator, enabling the navigation of dynamic websites, the execution of custom scripts, and the capture of visual assets like screenshots and PDFs. By integrating language models directly into the extraction workflow, the system converts raw HTML into clean, structured data or Markdown files optimized for downstream ingestion. The platform distinguishes itself through a distributed, self-hosted infrastructure that manages l
SWE-agent is a collection of autonomous agents designed for software engineering, competitive programming, and offensive cybersecurity operations. These agents utilize large language models to navigate codebases, interact with file systems, and use terminal interfaces to resolve GitHub issues or complete technical challenges. The system employs specialized agent modes that switch prompting strategies based on whether the task is a software bug, an algorithmic programming problem, or a security vulnerability. It includes dedicated capabilities for automated repository maintenance and offensive
Nightmare is an Electron-based browser automation library and headless browser controller. It provides the infrastructure to programmatically navigate web pages, interact with DOM elements, and execute JavaScript within a background browser instance. The project distinguishes itself by integrating a full Chromium instance within an Electron shell, allowing for the management of browser sessions, network proxy settings, and persistent storage partitions. It enables the capture of page states as PNG screenshots, PDF documents, or HTML files. The tool covers a broad range of capabilities includ
AgentGPT is a browser-based platform for deploying autonomous AI agents. It serves as a web-based orchestrator and self-hosted framework that allows users to configure agents that decompose high-level goals into smaller, actionable tasks for iterative execution. The system manages the full lifecycle of autonomous agents, from defining behaviors and parameters to overseeing goal-oriented task automation. It enables the deployment of agents that use a recursive loop of planning and analysis to reach a desired outcome. The platform includes a command line interface for bootstrapping the project
MemGPT is a memory management framework and external memory layer for large language models. It functions as a platform for building stateful AI agents that maintain a persistent identity and continuous context across multiple sessions. The system enables agents to bypass fixed context window limitations by using a virtual context windowing approach. This allows models to manage their own memory through internal commands to search, update, and delete stored information within a hierarchical structure of short-term working context and long-term archival storage. The framework provides a local
Puppeteer is a JavaScript library for programmatically controlling Chrome and Firefox through the Chrome DevTools Protocol or the WebDriver BiDi protocol. It launches and manages browser instances—typically without a visible user interface—to automate interactions with web pages, enabling navigation, clicking, typing, and data extraction entirely through code. The library distinguishes itself through deep integration with the Chromium embedding layer, allowing fine-grained process configuration with custom flags, permissions, and sandbox policies. It maintains multiple concurrent command stre
Playwriter is a browser automation framework and remote controller that manages stateful sessions and executes programmatic commands via the Chrome DevTools Protocol. It provides a system for controlling web browsers to interact with pages and extract data through both programmatic APIs and a command-line interface. The project features a visual element selector that generates screenshots with accessibility labels, mapping visual interface elements to programmatic selectors to help agents navigate. It supports remote browser control through WebSocket tunneling, allowing users to manage browse
This project is a comprehensive framework for developing, orchestrating, and deploying autonomous agents. It provides a structured environment for building agents that utilize reasoning loops to perform multi-step tasks, manage state through graph-based workflows, and interact with external tools. By mapping unstructured model outputs into typed schemas, the framework ensures reliable integration with downstream application logic. The platform distinguishes itself through a focus on production-grade reliability and security. It incorporates hybrid memory systems that combine vector embeddings
LaVague is an LLM web agent framework and large action model designed to translate natural language instructions into executable browser automation scripts. It functions as a multi-modal orchestrator that reasons over web page states and HTML content to automate multi-step tasks via a Selenium-based automation engine. The framework features a modular model provider layer, allowing users to swap between different language and vision models from providers such as Anthropic, Gemini, and Azure OpenAI. It employs a multi-modal world model to process screenshots and HTML structures, utilizing retri
BrowserMCP is a browser automation bridge that connects AI tools to a live browser session through a local proxy server. It implements a standardized protocol for sending commands like click, type, and navigate to a real browser instance running on the user's machine, while keeping all browsing data on the device. The project distinguishes itself by preserving user sessions and fingerprints across automation tasks. It attaches to the user's existing browser profile to maintain cookies, logins, and authentication state, and uses the real browser's user agent, viewport, and extension context to
Firecrawl is a web data extraction platform designed to convert unstructured web content into clean, LLM-ready formats like markdown or JSON. It functions as an autonomous web crawler and scraper, capable of mapping entire domains, performing recursive navigation, and executing complex data gathering tasks. By leveraging headless browser orchestration, the system handles dynamic, JavaScript-heavy pages to ensure comprehensive data capture. The platform distinguishes itself through its focus on agentic workflows, providing a programmatic interface that allows autonomous agents to perform live
n8n is a workflow automation platform that combines a visual interface with code-based extensibility to design, orchestrate, and manage automated processes. It provides a comprehensive suite of tools for data transformation, filtering, and storage, allowing users to build complex logic through conditional branching, looping, and sub-workflow execution. The platform supports both pre-built integration nodes and custom code execution in JavaScript or Python, enabling connectivity with a wide range of external services and APIs. The platform includes a suite of generative AI capabilities, such a