1 repo
Libraries and tools for automating data collection from websites and managing browser-based navigation.
Distinguishing note: Focuses on the domain of automated data collection, distinct from general web development.
Explore 1 awesome GitHub repository matching web development · Web Scraping Frameworks.
MediaCrawler is an automated web scraping framework designed to extract public posts, comments, and creator metadata from various social media platforms. It functions as a headless browser automator, utilizing real browser instances to render dynamic content and execute the client-side scripts necessary for interacting with modern web interfaces. The system distinguishes itself through a focus on session persistence and network flexibility. It supports remote debugging to reuse active browser sessions and cookies, which helps minimize the risk of triggering platform security challenges. To ma
Implements automated pipelines for navigating websites and collecting data at scale.
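
The session reuse through remote debugging mentioned in the MediaCrawler description can be illustrated with a minimal sketch. It does not show MediaCrawler's own code. It assumes a Chromium-based browser that was started with `--remote-debugging-port=9222`, which exposes a `/json/version` HTTP endpoint whose JSON response contains a `webSocketDebuggerUrl` field. The function names, default host, and default port are hypothetical choices made for this illustration.

```python
import json
from http.client import HTTPException
from urllib.request import urlopen


def debugger_version_url(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the URL of the DevTools HTTP endpoint that describes the browser."""
    return f"http://{host}:{port}/json/version"


def extract_ws_endpoint(version_payload: str) -> str:
    """Pull the browser-level WebSocket URL out of a /json/version response."""
    info = json.loads(version_payload)
    try:
        return info["webSocketDebuggerUrl"]
    except KeyError:
        raise ValueError("response has no webSocketDebuggerUrl field") from None


def find_running_browser(host: str = "127.0.0.1", port: int = 9222,
                         timeout: float = 2.0):
    """Return the WebSocket endpoint of an already-running browser, or None.

    Returning None lets the caller fall back to launching a fresh browser
    when no debuggable instance is listening on the given port.
    """
    try:
        with urlopen(debugger_version_url(host, port), timeout=timeout) as resp:
            return extract_ws_endpoint(resp.read().decode("utf-8"))
    except (OSError, ValueError, HTTPException):
        # Connection refused, timeout, malformed JSON, or missing field.
        return None
```

An automation library can then attach to the returned endpoint instead of launching a new browser, for example through Playwright's `connect_over_cdp`. Because the attached browser keeps its existing profile, cookies and logged-in sessions carry over without a fresh login.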