Tools that coordinate headless browser engines to automate web interaction and content extraction.
Distinguishing note: Focuses on the orchestration layer for headless browsers rather than generic testing or scraping libraries.
ArchiveBox is a self-hosted archiving tool designed for personal digital preservation and research data management. It functions as an automated web preservation engine that monitors URL inputs from bookmarks, browser history, or manual entries to capture and store permanent, offline copies of web content. By utilizing headless browser automation, the system renders dynamic web pages to ensure that captured snapshots, PDFs, and media assets remain accurate and accessible even if the original source disappears. The project distinguishes itself through a modular extractor pipeline and a task-qu
A management layer that coordinates headless browser engines and command-line tools to render and extract complex web content for archival.