# speedyapply/jobspy

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/speedyapply-jobspy).**

2,798 stars · 575 forks · Python · MIT

## Links

- GitHub: https://github.com/speedyapply/JobSpy
- awesome-repositories: https://awesome-repositories.com/repository/speedyapply-jobspy.md

## Topics

`bayt` `bdjobs` `glassdoor` `google-jobs` `indeed` `internship` `job-scraper` `job-search` `jobs-scraper` `jobs-search` `jobsearch` `jobseeker` `linkedin` `linkedin-scraper` `remote-job` `remote-jobs` `remote-work` `ziprecruiter`

## Description

JobSpy is a job board scraper and aggregator that collects postings from multiple job sites and compiles them into a single dataset. It automates job searches by programmatically collecting vacancies that match given keywords, locations, and filters.

To work around rate limits and server-side blocking, the scraper routes requests through proxies and rotates user-agent strings. It queries the sources concurrently and normalizes the results against a common schema, so listings from different sites share a consistent format.

It supports automated job searching and employment trend analysis by aggregating job market data and filtering search results.

## Tags

### Business & Productivity Software

- [Job Aggregators](https://awesome-repositories.com/f/business-productivity-software/job-aggregators.md) — Aggregates job listings from multiple online boards into a single unified dataset. ([source](https://github.com/speedyapply/JobSpy#readme))
- [Job Board Scrapers](https://awesome-repositories.com/f/business-productivity-software/job-board-scrapers.md) — Extracts job postings from multiple employment websites and aggregates them into a unified dataset.
- [Job Discovery Automation](https://awesome-repositories.com/f/business-productivity-software/job-discovery-automation.md) — Automates the discovery of job opportunities across various platforms using keywords and location filters.
- [Job Market Scraping](https://awesome-repositories.com/f/business-productivity-software/job-market-scraping.md) — Automates the extraction of employment opportunities and market requirements from multiple web platforms.

### Data & Databases

- [Schema-Driven Data Normalizers](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-processing/data-normalization-schema-enforcement/schema-driven-data-normalizers.md) — Standardizes heterogeneous HTML and JSON responses from multiple job boards into a consistent data schema.
- [Search Result Filtering](https://awesome-repositories.com/f/data-databases/search-result-filtering.md) — Refines search results using specific constraints such as location, job type, and posting recency. ([source](https://github.com/speedyapply/JobSpy/blob/main/README.md))
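
The normalization and recency filtering described in the two tags above can be illustrated with a small sketch. This is not JobSpy's code: the `JobPost` fields, the two payload shapes (`board_a`, `board_b`), and the helper names are all hypothetical, chosen only to show how differently shaped responses might be mapped onto one record type and then filtered by posting date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class JobPost:
    """Hypothetical unified record; field names are illustrative only."""
    site: str
    title: str
    company: str
    location: str
    date_posted: Optional[date]


def from_board_a(raw: dict) -> JobPost:
    # Assumed payload shape: flat keys and an ISO-8601 date string.
    posted = raw.get("posted")
    return JobPost(
        site="board_a",
        title=raw["title"].strip(),
        company=raw["company"].strip(),
        location=raw.get("location", "").strip(),
        date_posted=date.fromisoformat(posted) if posted else None,
    )


def from_board_b(raw: dict) -> JobPost:
    # Assumed payload shape: nested employer object, separate city/state, no date.
    location = ", ".join(p for p in (raw.get("city"), raw.get("state")) if p)
    return JobPost(
        site="board_b",
        title=raw["jobTitle"].strip(),
        company=raw["employer"]["name"].strip(),
        location=location,
        date_posted=None,
    )


def posted_within(jobs: Iterable[JobPost], days: int, today: date) -> List[JobPost]:
    # Keep only posts with a known date no older than `days` days.
    return [
        job for job in jobs
        if job.date_posted is not None and (today - job.date_posted).days <= days
    ]
```

Once every source is mapped to the same record type, a filter such as `posted_within` only needs to be written once, whichever site a listing came from.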

### Networking & Communication

- [Proxy and Fingerprint Rotation](https://awesome-repositories.com/f/networking-communication/proxy-rotation-services/proxy-and-fingerprint-rotation.md) — Automatically rotates both proxies and user-agents to prevent IP blocking and automation detection. ([source](https://github.com/speedyapply/JobSpy/blob/main/README.md))
- [Proxy Routing](https://awesome-repositories.com/f/networking-communication/request-proxies/proxy-routing.md) — Distributes outbound requests across a pool of rotating proxies to bypass IP-based rate limits.
- [User Agent Rotation](https://awesome-repositories.com/f/networking-communication/user-agent-rotation.md) — Cycles through multiple user-agent strings to mimic different browsers and avoid detection.
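
As a rough illustration of the rotation idea in the tags above, the following sketch cycles through a proxy pool and a user-agent pool in round-robin order. The function name, the example proxy URLs, and the shape of the yielded dictionary are assumptions made for this example, not JobSpy's implementation.

```python
from itertools import cycle
from typing import Dict, Iterator, Sequence


def rotating_request_settings(
    proxies: Sequence[str], user_agents: Sequence[str]
) -> Iterator[Dict[str, Dict[str, str]]]:
    """Return an endless iterator of per-request settings (hypothetical helper)."""
    if not proxies or not user_agents:
        raise ValueError("both pools must be non-empty")
    # Each pool advances independently, so pools of different sizes
    # produce varying proxy/user-agent combinations over time.
    pairs = zip(cycle(proxies), cycle(user_agents))
    return (
        {
            "proxies": {"http": proxy, "https": proxy},
            "headers": {"User-Agent": agent},
        }
        for proxy, agent in pairs
    )


settings = rotating_request_settings(
    ["http://proxy-a.example:8080", "http://proxy-b.example:8080"],
    ["ExampleAgent/1.0", "ExampleAgent/2.0", "ExampleAgent/3.0"],
)
first_request = next(settings)
```

Each yielded dictionary could be unpacked into an HTTP client call that accepts `proxies` and `headers` keyword arguments, for example `requests.get(url, **first_request)`.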

### Web Development

- [Web Scraping](https://awesome-repositories.com/f/web-development/web-scraping.md) — Provides a robust framework for extracting structured data from websites using anti-blocking techniques.
- [Concurrent Request Pooling](https://awesome-repositories.com/f/web-development/http-request-managers/concurrent-request-pooling.md) — Implements parallel HTTP request execution using capped pools to optimize data collection throughput.
- [Web Scraping Frameworks](https://awesome-repositories.com/f/web-development/web-scraping-frameworks.md) — Implements a framework for automated data collection from websites with built-in proxy and user-agent management.
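
The capped-pool approach mentioned above can be sketched with the standard library's `ThreadPoolExecutor`. The `scrape_all` function and the per-site scraper callables are hypothetical stand-ins; the real project may organize this differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def scrape_all(
    scrapers: Dict[str, Callable[[str], List[dict]]],
    search_term: str,
    max_workers: int = 4,
) -> List[dict]:
    """Run each site's scraper in a capped thread pool and merge the rows."""
    results: List[dict] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; at most `max_workers` run at once.
        futures = {site: pool.submit(fn, search_term) for site, fn in scrapers.items()}
        # Collect in submission order so the merged output is deterministic.
        for site, future in futures.items():
            try:
                rows = future.result()
            except Exception:
                # One failing site should not discard the other sites' results.
                continue
            results.extend({**row, "site": site} for row in rows)
    return results
```

Capping `max_workers` bounds the number of simultaneous connections, which matters when the target sites rate-limit by IP address.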

### Software Engineering & Architecture

- [Job Listing Filters](https://awesome-repositories.com/f/software-engineering-architecture/filtering-engines/job-listing-filters.md) — Provides filtering capabilities to narrow job postings using keywords and company blacklists.
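
A minimal sketch of keyword and company-blacklist filtering follows, assuming each listing is a dictionary with `title` and `company` keys. The function name and record shape are illustrative and are not taken from JobSpy.

```python
from typing import Iterable, List


def filter_jobs(
    jobs: Iterable[dict],
    keywords: Iterable[str],
    blocked_companies: Iterable[str],
) -> List[dict]:
    """Drop blacklisted companies, then keep titles matching any keyword."""
    wanted = [k.casefold() for k in keywords]
    blocked = {c.casefold() for c in blocked_companies}
    kept = []
    for job in jobs:
        if job["company"].casefold() in blocked:
            continue
        title = job["title"].casefold()
        # An empty keyword list means "no keyword constraint".
        if wanted and not any(k in title for k in wanted):
            continue
        kept.append(job)
    return kept
```

Matching is case-insensitive substring matching here; a stricter variant might tokenize titles so that a keyword like "java" does not also match "javascript".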
