30 open-source projects similar to browser-use/web-ui, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Web Ui alternative.
Steel is a cloud browser automation platform that provides a REST API for launching and controlling remote Chrome browser sessions. It enables programmatic browsing and web scraping using standard automation tools like Puppeteer, Playwright, and Selenium, connecting to cloud-hosted browser instances via WebSocket and the Chrome DevTools Protocol. The platform supports both headless and headful browser sessions, with language-specific SDKs for TypeScript and Python. The service distinguishes itself through comprehensive anti-detection capabilities, including residential proxy rotation, CAPTCHA
BrowserMCP is a browser automation bridge that connects AI tools to a live browser session through a local proxy server. It implements a standardized protocol for sending commands like click, type, and navigate to a real browser instance running on the user's machine, while keeping all browsing data on the device. The project distinguishes itself by preserving user sessions and fingerprints across automation tasks. It attaches to the user's existing browser profile to maintain cookies, logins, and authentication state, and uses the real browser's user agent, viewport, and extension context to
Dev-browser is a browser automation framework and headless browser controller that provides a sandboxed script runner for executing web tasks. It functions as a vision-based web automator and a specialized interface for large language models, enabling the navigation and interaction of web pages within isolated execution environments. The project distinguishes itself by converting complex web pages into simplified representations and coordinate-based maps, allowing AI agents to analyze layouts and perform actions based on pixel locations. It employs a mapping system that assigns unique identif
nodriver is an asynchronous Chromium browser automation framework that provides headless control and web scraping capabilities. It functions as a Chrome DevTools Protocol client, allowing for granular engine control by attaching directly to the browser's debug port without the need for external driver binaries. The framework is specifically designed as an anti-bot detection bypass tool. It modifies browser fingerprints and protocol headers to evade automated security systems, handle security warnings, and bypass common obstacles like insecure connection alerts. The system covers a broad rang
DrissionPage is a Python library designed for web automation, data scraping, and testing. It functions as a browser automation framework that communicates directly with the browser engine via the Chrome DevTools Protocol, allowing for precise control over browser instances and page states. The library distinguishes itself by providing a unified interface that combines full browser automation with raw HTTP request capabilities. This hybrid approach allows users to switch between lightweight network requests and heavy browser-based interactions within a single workflow. By wrapping asynchronous
This project is an LLM browser automation framework and AI agent browser interface. It serves as a control layer that translates natural language instructions into browser interactions using large language models, enabling AI agents to navigate and interact with web pages through standardized browser-control functions. The system functions as an RPA workflow orchestrator and headless browser management tool, capable of recording and replaying deterministic browser sequences to automate repetitive tasks. It distinguishes itself through stealth configurations, including residential proxies and
Playwriter is a browser automation framework and remote controller that manages stateful sessions and executes programmatic commands via the Chrome DevTools Protocol. It provides a system for controlling web browsers to interact with pages and extract data through both programmatic APIs and a command-line interface. The project features a visual element selector that generates screenshots with accessibility labels, mapping visual interface elements to programmatic selectors to help agents navigate. It supports remote browser control through WebSocket tunneling, allowing users to manage browse
pydoll is a Chrome DevTools Protocol automation library and headless browser controller used for web data extraction and parallel browser automation. It controls Chromium-based browsers via direct WebSocket connections, allowing it to manage isolated browser contexts and tabs while bypassing the overhead and detection associated with WebDriver. The project features an anti-bot evasion framework that mimics natural human behavior, including mouse movements generated via Bezier curves and variable typing patterns. It provides specialized stealth capabilities to bypass behavioral analysis and au
Undetected-chromedriver is a framework for automated browser navigation designed to bypass anti-bot security measures. It functions by patching browser drivers at the binary level to obscure automation signals, allowing scripts to interact with protected websites without being flagged or blocked by security services. The project distinguishes itself through its ability to maintain stealth during automated sessions, including those executed in headless mode. It achieves this by injecting custom configurations to mimic human user behavior and by hooking into low-level browser debugging protocol
MediaCrawler is an automated web scraping framework designed to extract public posts, comments, and creator metadata from various social media platforms. It functions as a headless browser automator, utilizing real browser instances to render dynamic content and execute the client-side scripts necessary for interacting with modern web interfaces. The system distinguishes itself through a focus on session persistence and network flexibility. It supports remote debugging to reuse active browser sessions and cookies, which helps minimize the risk of triggering platform security challenges. To ma
OpenBrowser is an AI web agent toolkit and automation framework designed to translate natural language instructions into executable browser workflows. It functions as a headless browser controller and orchestrator, enabling the creation of autonomous agents that navigate websites, interact with elements, and extract data using plain English commands. The system features a sandboxed execution environment that utilizes domain whitelists and memory limits to ensure secure web interaction. It distinguishes itself through a command-line interface for triggering autonomous tasks with configurable m
This project is an MCP browser automation server that connects large language models to headless cloud browsers. It functions as an autonomous web workflow engine and an LLM web agent interface, enabling the translation of natural language instructions into browser actions and structured data retrieval. The system distinguishes itself through a managed headless browser cloud API that supports concurrent Chromium sessions with integrated stealth modes, CAPTCHA solving, and proxy traffic routing. It utilizes self-healing element selection to maintain automation resilience when page structures c
Nightmare is an Electron-based browser automation library and headless browser controller. It provides the infrastructure to programmatically navigate web pages, interact with DOM elements, and execute JavaScript within a background browser instance. The project distinguishes itself by integrating a full Chromium instance within an Electron shell, allowing for the management of browser sessions, network proxy settings, and persistent storage partitions. It enables the capture of page states as PNG screenshots, PDF documents, or HTML files. The tool covers a broad range of capabilities includ
Crawlee is a web scraping framework designed for building scalable, reliable, and distributed data extraction pipelines. It provides a unified interface for managing headless browser automation and lightweight HTTP requests, allowing developers to handle complex web navigation, dynamic content rendering, and large-scale data collection within a single, modular architecture. The project distinguishes itself through its resource-aware concurrency controller, which dynamically scales task execution based on real-time CPU and memory usage to prevent host machine exhaustion. It also features a rob
This project is a Model Context Protocol server that enables Large Language Models to control Playwright browsers for web automation, scraping, and end-to-end testing. It functions as a programmable interface for executing JavaScript, capturing screenshots, and interacting with web elements across multiple browser engines. The server exposes browser automation capabilities as a set of standardized tools that models can discover and invoke. It supports session-based browser isolation to ensure unique contexts for each client connection and provides a transport layer using either standard input
Botasaurus is a Python web scraping framework and headless browser automation system used to build scalable data extraction tools. It functions as a web data extraction tool and OCR document parser, converting website content, images, and PDF files into structured formats such as JSON, CSV, and Excel. The framework distinguishes itself by providing a scraper management interface that allows Python functions to be wrapped in a web-based UI or deployed as standalone desktop applications. This enables non-technical users to trigger extraction jobs and manage tasks via a graphical interface or RE
Osmedeus is a security workflow orchestration engine that coordinates AI agents, shell commands, and scanning tools through declarative YAML pipelines. It functions as a distributed security scanner, a declarative workflow automator, and an AI agent framework for security, enabling automated multi-step security analysis with conditional branching, parallel execution, and distributed workers. The engine distinguishes itself through a hybrid runner model that executes workflow steps on the local host, inside Docker containers, or over SSH to remote machines, selected per step or module. It supp
This project is an agentic framework designed to enable autonomous web navigation and browser automation. It functions as a controller that translates natural language instructions into deterministic browser actions, allowing agents to interact with websites, perform data extraction, and manage complex authentication flows. By leveraging accessibility trees and semantic element resolution, the framework mimics human-like navigation, moving beyond brittle DOM selectors to interact reliably with modern web interfaces. The framework distinguishes itself through its focus on secure, scalable exec
This project is an automation framework that connects large language models to web browsers via the Chrome DevTools Protocol for autonomous task execution. It functions as a bridge between intelligent agents and browser engines, allowing for the direct control of browser sessions and profiles. The framework features a self-healing agent capable of generating and executing custom scripts during runtime to resolve failures and optimize browser tasks. It supports stealthy deployment through the use of integrated proxies and captcha solvers to bypass bot detection and security mitigations. The s
Playwright MCP is a browser automation server that provides a standardized interface for connecting large language models to web navigation and interaction capabilities. By operating as a Model Context Protocol server, it enables external AI agents to execute browser-based tasks, extract data, and perform complex web sequences through a unified communication protocol. The project distinguishes itself by acting as a remote controller that manages headless browser lifecycles and isolated automation contexts. It maintains session-based state isolation, allowing for distinct user profiles and per
php-webdriver is a WebDriver PHP client and browser automation framework that implements the W3C WebDriver standard. It serves as a programmatic interface for controlling web browsers, executing JavaScript, and managing browser sessions in both headed and headless environments. The library functions as a Selenium protocol implementation, allowing PHP applications to communicate with browser drivers such as ChromeDriver or GeckoDriver. It provides the ability to automate user actions, navigate pages, and validate DOM elements for web UI testing. Its capabilities cover broad areas of browser i
LaVague is an LLM web agent framework and large action model designed to translate natural language instructions into executable browser automation scripts. It functions as a multi-modal orchestrator that reasons over web page states and HTML content to automate multi-step tasks via a Selenium-based automation engine. The framework features a modular model provider layer, allowing users to swap between different language and vision models from providers such as Anthropic, Gemini, and Azure OpenAI. It employs a multi-modal world model to process screenshots and HTML structures, utilizing retri
Youtu Agent is an open-source framework for building, running, and evaluating autonomous agents powered by large language models. It provides the core infrastructure for creating agents that follow reasoning loops, use toolkits, and coordinate with other agents to solve complex tasks, all managed through YAML-driven configuration files. The framework distinguishes itself through its support for multi-agent orchestration, where a planner agent decomposes tasks and coordinates specialized worker agents, and through its integration with the Model Context Protocol for connecting to external toolk
This project provides a set of containerized images for running a distributed Selenium Grid to automate browser testing at scale. It enables containerized browser testing by executing automated web tests using ephemeral browser containers and WebDriver browser containers that scale on demand. The system allows for the deployment of a distributed test grid to spread browser workloads across multiple machines. It includes a VNC browser interface that provides a web-based visual interface for inspecting live browser sessions running inside the containers. Capabilities cover browser test debuggi
Skyvern is an autonomous web navigation agent and browser-based workflow orchestrator that uses large language models to execute multi-step tasks on websites. By translating natural language instructions into actionable browser commands, the framework enables the automation of complex user workflows, including data extraction and interface interaction, without manual intervention. The platform distinguishes itself through a focus on secure, self-hosted infrastructure and stealth-oriented execution. It utilizes containerized browser isolation to maintain consistent environments and employs pro
WebDriverIO is a Node.js test automation framework used for automating functional tests across web browsers and mobile applications. It acts as a WebDriver protocol client that manages remote browser sessions and executes commands against WebDriver and Appium servers to perform end-to-end testing. The framework is distinguished by its ability to control both native and hybrid mobile applications and its support for running automated suites across local machines, remote grids, and cloud device providers. It includes specialized capabilities for coordinating multi-browser interactions and estab
Open-claude-cowork is an LLM agent workflow orchestrator and multi-agent collaborative workspace. It serves as a SaaS tool integration framework and a real-time AI chat interface designed to connect large language models with external software applications and browser tools to automate complex business processes. The platform functions as a headless browser automation tool, enabling AI agents to navigate websites and interact with web-based interfaces automatically. It allows for the creation of shared environments where multiple agents coordinate using external tools and shared memory to com
This is an open-source Python SDK for building and orchestrating production-grade AI agents. It provides a unified framework for creating conversational agents that can use tools, maintain state, and coordinate across multiple language model providers including OpenAI, Anthropic, Google, Amazon Bedrock, and locally-hosted models. The SDK supports multi-agent orchestration through graphs, teams, and swarms, allowing several specialized agents to collaborate on complex tasks. Agents can be composed as callable tools that other agents invoke, and the framework includes policy handlers that inspe