Goutte is a PHP web scraping and DOM crawling library for extracting data from websites. It wraps an HTTP client to retrieve web pages and parse their HTML content.
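As a rough illustration of the fetch-and-parse flow described above, the sketch below assumes Goutte has been installed through Composer (the `fabpot/goutte` package). The URL `https://example.com` is a placeholder, and the `h1` selector is an assumption about that page's markup.

```php
<?php
// Minimal sketch: fetch a page and read one element from the parsed DOM.
// Assumes `composer require fabpot/goutte`; URL and selector are placeholders.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;

$client = new Client();

// request() performs the HTTP call and returns a Symfony DomCrawler instance.
$crawler = $client->request('GET', 'https://example.com');

// filter() takes a CSS selector. count() guards against a missing element,
// because text() throws when the node list is empty.
$heading = $crawler->filter('h1');
if ($heading->count() > 0) {
    echo $heading->text(), PHP_EOL;
}
```

The returned crawler wraps the whole document, so further `filter()` calls can be made on it without issuing another request.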
The project includes a form automator that fills in and submits HTML forms to remote servers programmatically. It can also crawl a site automatically, following links to discover and archive web content.
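The form submission and link-following behaviour described above might look roughly like the following. This is a sketch under stated assumptions: the login URL, the `Sign in` button label, the `username` and `password` field names, and the five-page limit are all hypothetical and would need to match the target site.

```php
<?php
// Sketch: submit a form, then follow a handful of links from the result page.
// Assumes `composer require fabpot/goutte`; URL, button label and field
// names are hypothetical.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;

$client = new Client();
$crawler = $client->request('GET', 'https://example.com/login');

// Locate the form through its submit button, then submit it with values.
// form() throws if no matching button exists on the page.
$form = $crawler->selectButton('Sign in')->form();
$crawler = $client->submit($form, [
    'username' => 'demo',
    'password' => 'secret',
]);

// Follow links found on the resulting page, capped at an arbitrary limit.
$maxPages = 5;
$titles = [];
foreach ($crawler->filter('a')->links() as $link) {
    if (count($titles) >= $maxPages) {
        break;
    }
    $uri = $link->getUri(); // resolved to an absolute URI by DomCrawler
    if (isset($titles[$uri]) || strpos($uri, 'http') !== 0) {
        continue; // skip duplicates and non-HTTP schemes such as mailto:
    }
    $page = $client->request('GET', $uri);
    $title = $page->filter('title');
    $titles[$uri] = $title->count() > 0 ? $title->text() : '';
}

print_r($titles);
```

Keeping a map keyed by URI avoids fetching the same page twice; a real crawler would likely also restrict itself to one host and respect the site's crawling rules.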
The client keeps session state, persisting cookies and headers across requests. Data is extracted from the returned HTML by selecting DOM elements, typically with CSS selectors.
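To make the session handling and selector-based extraction more concrete, here is a minimal sketch. It assumes Goutte is installed through Composer; the URLs, the `ExampleBot/1.0` user agent string, and the `/account` path are placeholders rather than anything taken from the project.

```php
<?php
// Sketch: reuse one client so cookies and headers persist, then extract data.
// Assumes `composer require fabpot/goutte`; URLs and header value are placeholders.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;

$client = new Client();

// BrowserKit sends server parameters prefixed with HTTP_ as request headers,
// and keeps them for every later request made through this client.
$client->setServerParameter('HTTP_USER_AGENT', 'ExampleBot/1.0');

// Cookies set by the first response are stored in the client's cookie jar
// and sent back automatically on the second request.
$client->request('GET', 'https://example.com/');
$crawler = $client->request('GET', 'https://example.com/account');

foreach ($client->getCookieJar()->all() as $cookie) {
    echo $cookie->getName(), '=', $cookie->getValue(), PHP_EOL;
}

// each() runs the callback on every matched node and collects the results.
$links = $crawler->filter('a')->each(function ($node) {
    return [
        'text' => trim($node->text()),
        'href' => $node->attr('href'),
    ];
});

print_r($links);
```

Because the state lives on the `Client` instance, creating a new client starts a fresh session with an empty cookie jar.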