awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io

2 repos

Awesome GitHub Repositories · Web Scrapers

Tools and frameworks for extracting data from websites and social media platforms.

Distinguishing note: Focuses on the high-level capability of social media content extraction.

Explore 2 awesome GitHub repositories matching web development · Web Scrapers. Refine with filters or upvote what's useful.

Awesome Web Scrapers GitHub Repositories

  • NanmiCoder/MediaCrawler

    44,037 · View on GitHub↗

    MediaCrawler is an automated web scraping framework designed to extract public posts, comments, and creator metadata from various social media platforms. It functions as a headless browser automator, utilizing real browser instances to render dynamic content and execute the client-side scripts necessary for interacting with modern web interfaces. The system distinguishes itself through a focus on session persistence and network flexibility. It supports remote debugging to reuse active browser sessions and cookies, which helps minimize the risk of triggering platform security challenges. To ma

    Collects posts, comments, and creator details from social platforms using a unified interface.

    Python
  • DIYgod/RSSHub

    41,959 · View on GitHub↗

    RSSHub is a headless, server-side engine designed to generate standardized RSS and Atom feeds from websites that do not natively provide them. By acting as an extensible data aggregator, it enables the automated collection of web content, allowing users to monitor updates from disparate sources through centralized feed readers or workflow automation tools. The platform distinguishes itself through a route-based data extraction framework that maps specific URL patterns to custom scraping logic. This modular architecture is supported by a middleware-driven request pipeline and declarative route

    Provides routing logic and parsing tools to extract structured data from websites lacking native feeds.

    TypeScript · bilibili · douban · dribbble
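
The route-based idea in the RSSHub description (URL patterns mapped to custom scraping logic, with the results emitted as a feed) can be illustrated with a rough Python sketch. This is not RSSHub's actual API, which is written in TypeScript. The names `route`, `dispatch`, `to_rss`, and `user_posts`, the `:param` pattern syntax, and the example.com URLs are all hypothetical and chosen only for illustration. The handler returns static data where a real one would fetch and parse a page.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple
from xml.etree import ElementTree as ET

Item = Dict[str, str]
Handler = Callable[..., List[Item]]

# Hypothetical registry: compiled path pattern -> handler function.
_routes: List[Tuple["re.Pattern[str]", Handler]] = []


def route(pattern: str) -> Callable[[Handler], Handler]:
    """Register a handler for a path pattern such as '/example/user/:name'."""
    # Turn each ':param' segment into a named regex group.
    # The rest of the pattern is used as-is (not escaped) to keep the sketch short.
    regex = re.compile("^" + re.sub(r":(\w+)", r"(?P<\1>[^/]+)", pattern) + "$")

    def decorator(func: Handler) -> Handler:
        _routes.append((regex, func))
        return func

    return decorator


@route("/example/user/:name")
def user_posts(name: str) -> List[Item]:
    # Assumption: a real handler would download and parse a page here.
    return [{"title": f"Post by {name}", "link": f"https://example.com/{name}/1"}]


def dispatch(path: str) -> Optional[List[Item]]:
    """Run the first handler whose pattern matches the path, or return None."""
    for regex, handler in _routes:
        match = regex.match(path)
        if match:
            return handler(**match.groupdict())
    return None


def to_rss(title: str, link: str, items: List[Item]) -> str:
    """Serialize items as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")
```

Calling `dispatch("/example/user/alice")` runs the matching handler, and passing its result to `to_rss` yields a feed string that a reader could consume. Keeping each route as an independent function is what makes this kind of design modular: adding a site means adding one handler.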