What are the best open-source alternatives to Pipet?

30 open-source projects similar to bjesus/pipet, ranked by shared features. Top picks: psf/requests-html, snehasishroy/leetcode-companywise-interview-questions, code4craft/webmagic, binux/pyspider, apify/crawlee-python, steel-dev/steel-browser, gosom/google-maps-scraper, lapwinglabs/x-ray, yusufkaraaslan/Skill_Seekers, firecrawl/firecrawl-mcp-server.

Is psf/requests-html a good alternative to Pipet?

requests-html is a Python HTML parsing library and web scraping framework. It functions as an asynchronous HTTP client and a JavaScript rendering engine designed to fetch and parse web pages for structured data extraction. The project integrates a headless browser to execute JavaScript, allowing i…

Is snehasishroy/leetcode-companywise-interview-questions a good alternative to Pipet?

snehasishroy/leetcode-companywise-interview-questions is an open-source alternative to Pipet.

Is code4craft/webmagic a good alternative to Pipet?

Webmagic is a Java web crawling framework designed for building scalable automated crawlers to download and process large volumes of web pages. It functions as a distributed web crawler and dynamic content crawler, utilizing an XPath HTML parser to locate and extract specific data points from page…

Is binux/pyspider a good alternative to Pipet?

PySpider is a Python web crawling framework designed for automated data extraction. It provides a pipeline for periodically fetching web content, processing HTML, and persisting scraped information into database backends. The system features a web-based management interface for editing scraping sc…

Is apify/crawlee-python a good alternative to Pipet?

Crawlee-python is a web crawling framework for building scalable scrapers using Python. It serves as a comprehensive tool for web scraping automation, providing a system to extract structured data from websites using both lightweight HTTP requests and headless browser automation. The framework is…

Is steel-dev/steel-browser a good alternative to Pipet?

Steel is a cloud browser automation platform that provides a REST API for launching and controlling remote Chrome browser sessions. It enables programmatic browsing and web scraping using standard automation tools like Puppeteer, Playwright, and Selenium, connecting to cloud-hosted browser instance…

Is gosom/google-maps-scraper a good alternative to Pipet?

This project is a distributed scraping engine designed to extract business details, customer reviews, and lead information from Google Maps. It functions as a business scraper and data extractor that can be deployed as a permanent system or as on-demand serverless functions. The system utilizes a…

Is lapwinglabs/x-ray a good alternative to Pipet?

X-Ray is a web scraping framework and asynchronous web crawler designed to extract structured data from websites. It functions as an HTML data extractor that transforms raw page content into a defined schema using CSS-style selectors. The project implements a headless browser crawler capable of ex…

Is yusufkaraaslan/Skill_Seekers a good alternative to Pipet?

Skill Seekers is a toolset for generating large language model knowledge bases, featuring a multi-source content scraper and a dedicated RAG data pipeline. It extracts technical data from documentation, code, and video to create structured assets and configuration files for AI-powered IDE extension…

Is firecrawl/firecrawl-mcp-server a good alternative to Pipet?

Firecrawl MCP Server is a Model Context Protocol tool server that exposes the full suite of Firecrawl’s web scraping, crawling, and automation capabilities as tools that large language models can invoke directly. It acts as a proxy to the Firecrawl cloud platform, which manages headless browser orc…


Open-source alternatives to Pipet

30 open-source projects similar to bjesus/pipet, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Pipet alternative.

psf/requests-html
13,826 stars
requests-html is a Python HTML parsing library and web scraping framework. It functions as an asynchronous HTTP client and a JavaScript rendering engine designed to fetch and parse web pages for structured data extraction. The project integrates a headless browser to execute JavaScript, allowing it to retrieve dynamically generated content that standard HTML parsers cannot see. It provides tools for automated data extraction using CSS selectors and XPath expressions to isolate specific text or attributes from HTML structures. The framework covers network operations including asynchronous pag…
Python, beautifulsoup, css-selectors, html
snehasishroy/leetcode-companywise-interview-questions
2,656 stars
Java, amazon-interview, apple-interview, facebook-interview
code4craft/webmagic
11,680 stars
Webmagic is a Java web crawling framework designed for building scalable automated crawlers to download and process large volumes of web pages. It functions as a distributed web crawler and dynamic content crawler, utilizing an XPath HTML parser to locate and extract specific data points from page structures. The framework distinguishes itself through its ability to handle dynamic content by rendering JavaScript and executing asynchronous requests to extract data from non-static pages. It also allows users to define and execute crawler logic via scripting languages, enabling the update of col…
Java, crawler, framework, java

Open-source alternatives to Pipet

psf/requests-html

snehasishroy/leetcode-companywise-interview-questions

code4craft/webmagic

binux/pyspider

apify/crawlee-python

steel-dev/steel-browser

gosom/google-maps-scraper

lapwinglabs/x-ray

yusufkaraaslan/Skill_Seekers

firecrawl/firecrawl-mcp-server

shekhargulati/52-technologies-in-2016

freeok/so-novel

scrapy/scrapy

gsh199449/spider

crawlab-team/crawlab

segmentio/nightmare

RipMeApp/ripme

autoscrape-labs/pydoll

ariya/phantomjs

tmpvar/jsdom

cantino/huginn

Panniantong/Agent-Reach

matthewmueller/x-ray

venera-app/venera

guyueyingmu/avbook

FriendsOfPHP/Goutte

jsdom/jsdom

FlareSolverr/FlareSolverr

ChromeDevTools/chrome-devtools-mcp

Predidit/Kazumi