awesome-repositories.com
Web Crawling · Awesome GitHub Repositories

2 repos

Systems designed to systematically discover, navigate, and index web content across domains for large-scale data collection.

Explore 2 awesome GitHub repositories matching Web Development · Web Crawling.

Awesome Web Crawling GitHub Repositories

  • firecrawl/firecrawl

    84,034 · GitHub

    Firecrawl is a web data extraction platform designed to convert unstructured web content into clean, LLM-ready formats like markdown or JSON. It functions as an autonomous web crawler and scraper, capable of mapping entire domains, performing recursive navigation, and executing complex data gathering tasks. By leveragi

    TypeScript · ai · ai-agents · ai-crawler
  • unclecode/crawl4ai

    60,452 · GitHub

    Crawl4AI is an AI-powered web crawling and data extraction engine designed to transform complex web content into structured formats. It functions as a headless browser orchestrator, enabling the navigation of dynamic websites, the execution of custom scripts, and the capture of visual assets like screenshots and PDFs.

    Python

Explore sub-tags

  • Adaptive Crawling Engines: Intelligent crawling systems that dynamically adjust navigation strategies based on real-time data requirements and content relevance.
  • Autonomous Web Crawlers: Recursive navigation services that map site structures and traverse domains to aggregate datasets.
  • Distributed Crawling Infrastructures: Scalable architectures for executing concurrent web data collection tasks across multiple environments.
  • Large-Scale Domain Crawlers: Infrastructure designed for comprehensive discovery and indexing of entire websites.
  • Site Map Generators: Tools that discover and index all URLs within a domain for structured data extraction.