# s0md3v/Photon

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/s0md3v-photon).**

12,688 stars · 1,671 forks · Python · gpl-3.0

## Links

- GitHub: https://github.com/s0md3v/Photon
- awesome-repositories: https://awesome-repositories.com/repository/s0md3v-photon.md

## Topics

`crawler` `information-gathering` `osint` `python` `spider`

## Description

Photon is a command-line web crawler designed for security reconnaissance and information gathering. It systematically traverses websites to discover URLs, map domain infrastructure, and identify associated subdomains by retrieving DNS records.

Beyond link discovery, Photon analyzes page content, extracting sensitive data such as API keys and authentication tokens with user-defined regular expressions. It can also clone crawled content to the local filesystem, so a site's structure can be inspected offline without further network requests.
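To make the regex-driven extraction concrete, here is a minimal sketch of applying named patterns to page text. It is not Photon's implementation; the `PATTERNS` table, the `extract_matches` helper, and the sample page are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical patterns for illustration; real rules depend on the target.
PATTERNS = {
    "api_key": r"api_key\s*=\s*\"([A-Za-z0-9]{16,})\"",
    "bearer_token": r"Bearer\s+([A-Za-z0-9._~+/-]+=*)",
}


def extract_matches(text, patterns):
    """Return {name: sorted unique matches} for every pattern that matched."""
    found = {}
    for name, pattern in patterns.items():
        hits = set()
        for match in re.finditer(pattern, text):
            # Prefer the first capture group when the pattern defines one.
            hits.add(match.group(1) if match.groups() else match.group(0))
        if hits:
            found[name] = sorted(hits)
    return found


# Made-up page content, not a real credential.
page = 'api_key = "abcd1234efgh5678ijkl"; auth = "Bearer abc.def-123"'
result = extract_matches(page, PATTERNS)
```

Keeping the patterns in a dictionary means a new rule can be added without touching the matching loop.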

The crawler uses multiple threads to increase throughput during discovery and can route requests through proxies to control where traffic originates. It is built to fit into automated security workflows: discovered metadata and extracted patterns can be piped to standard output or exported to structured files for further processing.
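The combination of threaded fetching and line-oriented output might look roughly like the sketch below. It assumes a caller-supplied `fetch` function and uses a stand-in (`fake_fetch`) so it runs without network access; none of these names come from Photon itself.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, threads=4):
    """Run fetch(url) across a thread pool; return {url: result} in input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(fetch, urls))
    return dict(zip(urls, results))


def emit(items, stream=sys.stdout):
    """Write one item per line so the output can be piped to another tool."""
    for item in sorted(items):
        stream.write(item + "\n")


def fake_fetch(url):
    """Stand-in fetcher (hypothetical): returns the URL length instead of a page."""
    return len(url)


lengths = fetch_all(["https://example.com/a", "https://example.com/bb"], fake_fetch)
emit(lengths.keys())
```

Because `pool.map` preserves input order, results can be paired with their URLs without extra bookkeeping.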

## Tags

### Security & Cryptography

- [Infrastructure Reconnaissance](https://awesome-repositories.com/f/security-cryptography/infrastructure-reconnaissance.md) — Traverses websites to discover URLs, extract sensitive data, and map domain infrastructure for security analysis.
- [Subdomain Enumeration Tools](https://awesome-repositories.com/f/security-cryptography/subdomain-enumeration-tools.md) — Identifies associated subdomains and retrieves DNS records to expand the scope of web reconnaissance.
- [Sensitive Data Scanners](https://awesome-repositories.com/f/security-cryptography/vulnerability-scanning/sensitive-data-scanners.md) — Scans crawled web content for exposed API keys, authentication tokens, and other sensitive secrets.
- [Access Key Management](https://awesome-repositories.com/f/security-cryptography/access-key-management.md) — Scans crawled content for high-entropy strings to detect exposed authentication tokens and API keys. ([source](https://github.com/s0md3v/Photon/wiki/Usage))
- [Secret Extractors](https://awesome-repositories.com/f/security-cryptography/security/operations-and-incident-response/security-information-management/secret-extractors.md) — Applies custom regular expressions to identify exposed API keys and secrets during web crawls.
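
One common way to flag high-entropy strings is Shannon entropy over a token's characters. The sketch below illustrates that general technique under assumed thresholds (`min_length=20`, `threshold=4.0`); it is not taken from Photon's source, and the sample token is made up.

```python
import math
import re
from collections import Counter


def shannon_entropy(s):
    """Bits per character of s, computed from its character frequencies."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def high_entropy_strings(text, min_length=20, threshold=4.0):
    """Return tokens that are both long and close to random-looking."""
    tokens = re.findall(r"[A-Za-z0-9+/=_-]{%d,}" % min_length, text)
    return [t for t in tokens if shannon_entropy(t) >= threshold]


sample = 'token="q7Zp2LxV9bRtK4mNcW8dYh3F" pad="aaaaaaaaaaaaaaaaaaaaaaaa"'
candidates = high_entropy_strings(sample)
```

Entropy alone produces false positives (hashes, minified identifiers), so a filter like this is usually a first pass rather than a verdict.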

### Data & Databases

- [Security Crawlers](https://awesome-repositories.com/f/data-databases/web-scrapers/security-crawlers.md) — Provides a scriptable command-line crawler that saves content locally and pipes discovered metadata into security workflows.
- [Recursive Structure Traversers](https://awesome-repositories.com/f/data-databases/recursive-structure-processors/recursive-structure-traversers.md) — Maintains a dynamic queue of discovered URLs to systematically navigate website structures.
- [Pattern Extraction Utilities](https://awesome-repositories.com/f/data-databases/content-extraction/pattern-extraction-utilities.md) — Applies user-defined regular expressions during the crawl process to capture specific strings of interest. ([source](https://github.com/s0md3v/Photon/wiki/Usage))
- [Web Content Scrapers](https://awesome-repositories.com/f/data-databases/data-engineering-infrastructure/data-extraction-ingestion/web-extraction-engines/web-content-scrapers.md) — Clones crawled web content to the local filesystem for offline analysis and inspection. ([source](https://github.com/s0md3v/Photon/wiki/Usage))
- [Local Filesystem Storage](https://awesome-repositories.com/f/data-databases/storage-abstraction/local-filesystem-storage.md) — Saves crawled web pages and metadata to the local disk for offline inspection and structural analysis.
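
The queue-based traversal described in the entries above can be sketched as a breadth-first crawl. This is an illustrative sketch only: `LinkParser`, `crawl`, and the in-memory `SITE` mapping are hypothetical, and the fetcher is injected so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(root, fetch, max_depth=2):
    """Breadth-first traversal from root, staying on the root's host."""
    host = urlparse(root).netloc
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # recorded, but not expanded further
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            link = urljoin(url, href)
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


# In-memory stand-in for a website, so the sketch runs offline.
SITE = {
    "https://example.com/": '<a href="/about">About</a> <a href="https://other.test/">Ext</a>',
    "https://example.com/about": '<a href="/team">Team</a> <a href="/">Home</a>',
    "https://example.com/team": "",
}
found = crawl("https://example.com/", lambda u: SITE.get(u, ""))
```

The `seen` set prevents revisiting pages, and the depth counter bounds how far the queue grows from the root.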

### Operating Systems & Systems Programming

- [Security](https://awesome-repositories.com/f/operating-systems-systems-programming/terminal-command-line-environments/shells-scripting/orchestration-scripts/security.md) — Integrates web crawling and data extraction tasks into automated shell scripts and security pipelines.

### Web Development

- [Web Crawling](https://awesome-repositories.com/f/web-development/web-automation-scraping/web-scraping-automation/web-crawling.md) — Systematically traverses websites to discover URLs and metadata by following links. ([source](https://github.com/s0md3v/Photon/wiki))
- [Data Extraction](https://awesome-repositories.com/f/web-development/data-extraction.md) — Applies user-defined pattern matching to identify and capture specific strings during traversal.

### DevOps & Infrastructure

- [Security Assessment Frameworks](https://awesome-repositories.com/f/devops-infrastructure/security-automation-workflows/security-assessment-frameworks.md) — Gathers structured website data and metadata to support security auditing and vulnerability assessment workflows.

### Development Tools & Productivity

- [Domain Configuration Tools](https://awesome-repositories.com/f/development-tools-productivity/domain-configuration-tools.md) — Resolves and saves subdomain information to map domain infrastructure and DNS configuration. ([source](https://github.com/s0md3v/Photon/wiki/Usage))

### Networking & Communication

- [Proxy-Aware Network Clients](https://awesome-repositories.com/f/networking-communication/network-infrastructure-routing/network-utilities/proxy-aware-network-clients.md) — Routes outgoing HTTP requests through configurable external proxy servers to mask traffic origin.
- [Traffic Proxying](https://awesome-repositories.com/f/networking-communication/traffic-proxying.md) — Routes web requests through third-party services to distribute traffic origin during reconnaissance. ([source](https://github.com/s0md3v/Photon/wiki/Usage))
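
As a rough illustration of proxy-aware requests in Python, the standard library's `urllib.request.ProxyHandler` can route traffic through a configured proxy. The helper name `build_proxied_opener` and the proxy address are assumptions for this sketch, which does not reflect how Photon itself configures proxies.

```python
import urllib.request


def build_proxied_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


# Hypothetical local proxy address; no request is made here.
opener = build_proxied_opener("http://127.0.0.1:8080")
# opener.open("https://example.com/", timeout=10) would go through the proxy.
```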

### System Administration & Monitoring

- [Discovery Result Exporters](https://awesome-repositories.com/f/system-administration-monitoring/log-analysis-reports/discovery-result-exporters.md) — Saves discovered URLs and extracted data into structured files for further analysis. ([source](https://github.com/s0md3v/Photon/wiki))
- [Offline Analysis Engines](https://awesome-repositories.com/f/system-administration-monitoring/network-operation-logs/offline-analysis-engines.md) — Clones web content to the local filesystem to enable offline structural analysis and inspection of target websites.
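
Exporting results to structured files could be as simple as writing one JSON file per result category. The sketch below assumes a hypothetical `results` mapping of category name to a set of strings; the file layout is illustrative and not Photon's actual output format.

```python
import json
import tempfile
from pathlib import Path


def export_results(results, out_dir):
    """Write each result category to its own JSON file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, items in results.items():
        path = out / f"{name}.json"
        # Sort so repeated runs produce identical, diff-friendly files.
        path.write_text(json.dumps(sorted(items), indent=2), encoding="utf-8")
        paths.append(path)
    return paths


results = {
    "internal": {"https://example.com/about", "https://example.com/"},
    "keys": set(),
}
with tempfile.TemporaryDirectory() as tmp:
    written = export_results(results, tmp)
    names = [p.name for p in written]
    loaded = json.loads(written[0].read_text(encoding="utf-8"))
```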
