What are the best open-source alternatives to Scrapy Redis?

30 open-source projects similar to rmax/scrapy-redis, ranked by shared features. Top picks: rolando/scrapy-redis, taskforcesh/bullmq, contribsys/faktory, henrylee2cn/pholcus, j3ssie/osmedeus, optimalbits/bull, wistbean/learn_python3_spider, resque/resque, mperham/sidekiq, rq/rq.

Is rolando/scrapy-redis a good alternative to Scrapy Redis?

This project is a distributed web crawling framework that enables the horizontal scaling of scraping tasks. It uses Redis as a centralized request queue manager and state store to coordinate crawl progress and request metadata across multiple server instances. The system distributes crawling workl…

Is taskforcesh/bullmq a good alternative to Scrapy Redis?

BullMQ is a Redis-backed message queue library and background processor designed for distributed task queueing. It functions as a distributed queue manager and task scheduler, utilizing Redis to manage asynchronous job processing and persistence. The system distinguishes itself through its role as…

Is contribsys/faktory a good alternative to Scrapy Redis?

Faktory is an open-source work server that queues, dispatches, and manages background jobs across multiple programming languages. It stores job payloads as JSON hashes in a Redis-backed queue and provides language-specific client and worker libraries that enable any language to push jobs to the ser…

Is henrylee2cn/pholcus a good alternative to Scrapy Redis?

Pholcus is a distributed web crawler framework written in Go designed for high-concurrency data extraction. It functions as a distributed crawling orchestrator and dynamic data extraction engine, utilizing a server-client architecture to coordinate tasks across multiple nodes. The system integrate…

Is j3ssie/osmedeus a good alternative to Scrapy Redis?

Osmedeus is a security workflow orchestration engine that coordinates AI agents, shell commands, and scanning tools through declarative YAML pipelines. It functions as a distributed security scanner, a declarative workflow automator, and an AI agent framework for security, enabling automated multi-…

Is optimalbits/bull a good alternative to Scrapy Redis?

Bull is a Node.js library for managing distributed jobs and message queues using Redis as the primary data store. It functions as a distributed task worker, job scheduler, and priority queue manager designed to handle asynchronous workloads across multiple processes. The project distinguishes itse…

Is wistbean/learn_python3_spider a good alternative to Scrapy Redis?

This project is a comprehensive educational guide and framework for building web scrapers using Python. It provides a course-based approach to data extraction, combining a Python crawler framework with tutorials on web reverse engineering and network traffic analysis. The project distinguishes its…

Is resque/resque a good alternative to Scrapy Redis?

Resque is a Ruby library for enqueueing and processing asynchronous tasks using Redis as a data store. It functions as a distributed task processor and queue manager, allowing long-running work to be moved out of the main request cycle. The system executes background jobs in isolated child process…

Is mperham/sidekiq a good alternative to Scrapy Redis?

Sidekiq is a Ruby background processing framework and asynchronous task runner. It functions as a Redis-backed background job processor that offloads heavy or time-consuming work from web requests to separate worker processes to ensure the main application remains responsive. The system operates a…

Is rq/rq a good alternative to Scrapy Redis?

rq is a distributed task queue and background worker system for Python that uses a Redis backend to decouple task submission from execution. It functions as a reliable message queue and task scheduler, allowing Python functions or asyncio coroutines to be processed asynchronously across multiple wo…


Open-source alternatives to Scrapy Redis

30 open-source projects similar to rmax/scrapy-redis, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Scrapy Redis alternative.

rolando/scrapy-redis (5,639 stars)
This project is a distributed web crawling framework that enables the horizontal scaling of scraping tasks. It uses Redis as a centralized request queue manager and state store to coordinate crawl progress and request metadata across multiple server instances. The system distributes crawling workloads by sharing a single request queue and utilizes a distributed duplicate filter to prevent multiple workers from visiting the same page. It persists complex request state and metadata as JSON strings within the shared remote store. The framework also provides capabilities for distributed data pro…
Python
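The shared queue and duplicate filter described above can be illustrated with a minimal sketch. It is not scrapy-redis code: `SharedStore` is a hypothetical in-memory stand-in for the shared Redis instance, and `fingerprint`, `enqueue`, and `next_request` are names invented for this example. It assumes a request is identified by its method and URL alone.

```python
import hashlib
import json
from collections import deque


class SharedStore:
    """Hypothetical in-memory stand-in for the shared Redis instance."""

    def __init__(self):
        self.queue = deque()  # plays the role of a shared list of requests
        self.seen = set()     # plays the role of a shared set of fingerprints


def fingerprint(method, url):
    """Return a stable hash identifying a request (assumption: method + URL)."""
    return hashlib.sha1(f"{method.upper()} {url}".encode("utf-8")).hexdigest()


def enqueue(store, url, method="GET", meta=None):
    """Push a request as a JSON string unless any worker has already seen it."""
    fp = fingerprint(method, url)
    if fp in store.seen:
        return False
    store.seen.add(fp)
    payload = {"url": url, "method": method, "meta": meta or {}}
    store.queue.append(json.dumps(payload))
    return True


def next_request(store):
    """Pop the oldest queued request, or return None when the queue is empty."""
    if not store.queue:
        return None
    return json.loads(store.queue.popleft())
```

Every worker would call `enqueue` and `next_request` against the same store, so a URL discovered by two workers is only crawled once.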
taskforcesh/bullmq (8,432 stars)
BullMQ is a Redis-backed message queue library and background processor designed for distributed task queueing. It functions as a distributed queue manager and task scheduler, utilizing Redis to manage asynchronous job processing and persistence. The system distinguishes itself through its role as a job workflow orchestrator, enabling the definition of complex parent-child job dependencies and hierarchies for multi-step workflows. It provides sandboxed process execution to isolate heavy workloads and prevent event loop blocking, alongside distributed rate limiting to protect downstream servic…
TypeScript (topics: background-jobs, elixir, nodejs)
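To make the parent-child dependency idea concrete, here is a small sketch in Python. It does not use BullMQ's API; `Job` and `run_flow` are hypothetical names, and everything runs in a single process, whereas a real queue would dispatch the children to workers. The only property it shows is that a parent is processed after all of its children have finished.

```python
class Job:
    """Hypothetical job node: a name plus the child jobs it depends on."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children or [])
        self.done = False


def run_flow(job, handler, order=None):
    """Process children first, then the parent (depth-first, post-order)."""
    if order is None:
        order = []
    for child in job.children:
        run_flow(child, handler, order)
    handler(job)          # the parent runs only after every child has returned
    job.done = True
    order.append(job.name)
    return order
```

Calling `run_flow` on a parent with two children returns the two child names followed by the parent name, which is the completion order of the flow.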
contribsys/faktory (6,089 stars)
Faktory is an open-source work server that queues, dispatches, and manages background jobs across multiple programming languages. It stores job payloads as JSON hashes in a Redis-backed queue and provides language-specific client and worker libraries that enable any language to push jobs to the server or fetch and execute them. The server includes a batch workflow orchestrator that groups jobs into batches with completion tracking for coordinating multi-step asynchronous workflows. It features a configurable job uniqueness filter that prevents duplicate enqueues within a time window, an expon…
Go
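A uniqueness filter with a time window, as mentioned in the description above, might look roughly like the following. This is an illustrative sketch, not Faktory's implementation: `UniquenessFilter`, `try_enqueue`, and the injectable `clock` parameter are all assumptions made for the example, and the state lives in a local dictionary instead of on a server.

```python
import time


class UniquenessFilter:
    """Hypothetical filter that rejects a repeated key within a time window."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock     # injectable so the window can be tested
        self._expiry = {}      # job key -> time at which the lock lapses

    def try_enqueue(self, key):
        """Return True if the job may be enqueued, False if it is a duplicate."""
        now = self.clock()
        expiry = self._expiry.get(key)
        if expiry is not None and now < expiry:
            return False
        self._expiry[key] = now + self.window
        return True
```

Unlike a crawl-wide duplicate filter, the lock here lapses: once the window has passed, the same key is accepted again.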

Open-source alternatives to Scrapy Redis

rolando/scrapy-redis

taskforcesh/bullmq

contribsys/faktory

henrylee2cn/pholcus

j3ssie/Osmedeus

OptimalBits/bull

wistbean/learn_python3_spider

resque/resque

mperham/sidekiq

rq/rq

binux/pyspider

azkaban/azkaban

quartznet/quartznet

hibiken/asynq

firecrawl/firecrawl

Automattic/kue

hydro-dev/Hydro

dockersamples/example-voting-app

apache/ignite

hazelcast/hazelcast

crazyguitar/pysheeet

hakluke/hakrawler

gocd/gocd

gnemoug/distribute_crawler

coleifer/huey

databricks/learning-spark

oban-bg/oban

laravel/horizon

dora-rs/dora

agronholm/apscheduler