Maxun | Awesome Repository

Maxun is an open-source web scraping and automation platform designed to transform dynamic website content into structured data. By leveraging artificial intelligence to interpret natural language prompts, the system identifies page elements and extracts information without requiring manual selector configuration. It serves as a bridge between raw web content and intelligent workflows, providing structured outputs in formats optimized for large language model ingestion and agent-based applications.

The platform distinguishes itself through its ability to handle complex, authenticated, and dynamic web environments. It synchronizes local browser sessions to access password-protected content and employs proxy rotation and browser fingerprinting to bypass anti-scraping measures. Users can orchestrate multi-step browser interactions, such as clicking buttons and filling forms, to replicate human navigation. Because the platform is self-hosted, data pipelines and extraction robots remain under the operator's control.

Beyond core extraction, the platform supports a broad range of automation capabilities, including recurring task scheduling, web search integration, and visual content capture. It provides programmatic access through a command-line interface and a dedicated software development kit, and it integrates with external systems via webhooks. The platform also includes monitoring tools that track website changes and condense large volumes of information into summaries.
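
To make the idea of tracking website changes concrete, here is a minimal sketch of content-based change detection. It is a generic illustration, not Maxun's implementation: the function names `fingerprint` and `has_changed` are hypothetical, and the approach (hashing whitespace-normalized text) is an assumption chosen for simplicity.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the text after collapsing whitespace.

    Hypothetical helper for illustration; not part of any Maxun API.
    """
    # Collapse runs of whitespace so layout-only differences are ignored.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_text: str, current_text: str) -> bool:
    """Report whether two snapshots of extracted text differ in content."""
    return fingerprint(previous_text) != fingerprint(current_text)


if __name__ == "__main__":
    old_snapshot = "Price: 10 USD"
    new_snapshot = "Price: 12 USD"
    print(has_changed(old_snapshot, new_snapshot))
```

Storing only the digest from the previous run keeps the comparison cheap. A real monitor would likely also need to ignore volatile page regions such as timestamps or advertisements.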

Features

Web Scraping and Automation - Provides a platform for building and scheduling browser-based extraction workflows for AI agents.
Structured Data Extraction - Captures specific information from webpages and organizes it into structured formats for export or further processing.
Web Data Extraction - Converts raw web pages into clean, structured data formats to simplify downstream processing and automated information collection.
AI-Powered Web Crawlers - Uses language models to interpret web pages and transform unstructured content into structured formats.
