# projectdiscovery/katana

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/projectdiscovery-katana).**

15,584 stars · 946 forks · Go · MIT

## Links

- GitHub: https://github.com/projectdiscovery/katana
- awesome-repositories: https://awesome-repositories.com/repository/projectdiscovery-katana.md

## Topics

`cli` `crawler` `gocrawler` `hacktoberfest` `headless` `spider-framework` `web-spider`

## Description

Katana is a web crawler and spider designed for security reconnaissance and web application mapping. It functions as a utility for identifying endpoints, forms, and API structures across web targets by combining standard HTTP request traversal with headless browser automation to render dynamic, JavaScript-heavy content.

The tool distinguishes itself through its ability to maintain authenticated sessions and handle complex web interactions, such as automated form submission and captcha resolution. It provides granular control over the discovery process, allowing users to define specific crawl scopes, throttle request rates, and apply custom filtering logic to refine datasets based on response attributes or status codes.

Beyond navigation, the project supports data extraction and auditing. It can classify page content, store raw request and response pairs for later review, and use pattern-based matching to isolate specific information from web traffic. The software is distributed as a single, statically compiled binary so that it runs unchanged across different environments.

## Tags

### Web Development

- [Web Crawlers](https://awesome-repositories.com/f/web-development/web-automation-scraping/web-scraping-automation/web-scraping/web-crawlers.md) — Functions as a web crawler and spider for discovering and mapping application endpoints via HTTP and headless browser automation.
- [Web Crawling](https://awesome-repositories.com/f/web-development/web-automation-scraping/web-scraping-automation/web-crawling.md) — Discovers and maps endpoints by traversing web pages using standard HTTP requests or headless browser automation. ([source](https://github.com/projectdiscovery/katana#readme))
- [Headless Browser Orchestrators](https://awesome-repositories.com/f/web-development/web-automation-scraping/browser-orchestration-systems/headless-browser-orchestrators.md) — Uses a browser automation engine to render dynamic client-side content and execute JavaScript for comprehensive endpoint discovery.
- [Crawling Environment Configurations](https://awesome-repositories.com/f/web-development/web-automation-scraping/web-scraping-automation/web-crawling/crawling-environment-configurations.md) — Adjusts crawling depth, concurrency, and rate limits to balance discovery speed against target server capacity. ([source](https://github.com/projectdiscovery/katana/blob/dev/README.md))
- [Data Extraction](https://awesome-repositories.com/f/web-development/data-extraction.md) — Identifies and captures specific information from HTTP responses during the crawling process using regex-based configuration. ([source](https://github.com/projectdiscovery/katana#readme))
- [Form Submission Clients](https://awesome-repositories.com/f/web-development/form-submission-clients.md) — Populates web forms automatically during data collection by using configurable field values and dynamic data generation. ([source](https://github.com/projectdiscovery/katana#readme))

### Security & Cryptography

- [Security Reconnaissance Tools](https://awesome-repositories.com/f/security-cryptography/security-reconnaissance-tools.md) — Acts as a specialized utility for identifying sensitive information, forms, and API structures across web targets.
- [Reconnaissance Tools](https://awesome-repositories.com/f/security-cryptography/web-application-security/reconnaissance-tools.md) — Maps web application structures and endpoints to identify hidden resources during security assessments.
- [Vulnerability Assessment Frameworks](https://awesome-repositories.com/f/security-cryptography/vulnerability-assessment-testing/security-testing-auditing/security-testing-tools/reconnaissance-assessment-platforms/vulnerability-assessment-frameworks.md) — Systematically probes web interfaces and forms to identify security weaknesses and potential attack surfaces.
- [Browser Session Authentication](https://awesome-repositories.com/f/security-cryptography/identity-access-management/authentication-strategies/session-and-credential-handling/session-credential-management/browser-session-authentication.md) — Accesses protected content by injecting custom headers, cookies, or connecting directly to an active browser session. ([source](https://github.com/projectdiscovery/katana#readme))
- [Automated Captcha Solvers](https://awesome-repositories.com/f/security-cryptography/captcha-services/automated-captcha-solvers.md) — Detects and resolves common captcha challenges during headless crawling sessions by integrating with external solving services. ([source](https://github.com/projectdiscovery/katana#readme))

### Development Tools & Productivity

- [Crawl Depth Limiters](https://awesome-repositories.com/f/development-tools-productivity/search-paging-limits/crawl-depth-limiters.md) — Provides configurable depth and scope limits to control recursive link traversal during web application mapping. ([source](https://github.com/projectdiscovery/katana#readme))
- [Headless Browser Automation](https://awesome-repositories.com/f/development-tools-productivity/headless-browser-automation.md) — Uses headless browser engines to render dynamic JavaScript content and discover hidden web endpoints.

### Software Engineering & Architecture

- [Crawling Request Throttlers](https://awesome-repositories.com/f/software-engineering-architecture/request-throttling/crawling-request-throttlers.md) — Implements request rate and concurrency controls to manage network throughput and respect target server capacity. ([source](https://github.com/projectdiscovery/katana#readme))

### Data & Databases

- [Web Data Extraction](https://awesome-repositories.com/f/data-databases/web-data-extraction.md) — Extracts and isolates specific data points from web content using pattern-based matching and custom logic.

### DevOps & Infrastructure

- [Crawl Boundary Controls](https://awesome-repositories.com/f/devops-infrastructure/dependency-management/environment-scoping-controls/crawl-boundary-controls.md) — Defines boundaries for discovery using domain rules, regex patterns, or exclusion lists to prevent navigating outside intended targets. ([source](https://github.com/projectdiscovery/katana/blob/dev/README.md))

### Networking & Communication

- [HTTP Request Customization](https://awesome-repositories.com/f/networking-communication/http-request-customization.md) — Maintains authenticated state by injecting persistent cookies and custom headers into outgoing HTTP requests to access protected web resources.
- [Logic-Based Filters](https://awesome-repositories.com/f/networking-communication/http-response-processors/logic-based-filters.md) — Evaluates discovered endpoints against custom expression rules to dynamically refine datasets based on response attributes and status codes.
- [Network Traffic Management](https://awesome-repositories.com/f/networking-communication/network-infrastructure-routing/network-routing-traffic-management/network-traffic-management.md) — Manages the flow and speed of automated crawling tasks to ensure efficient data collection.

### System Administration & Monitoring

- [Page Content Classifiers](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-status-pages/page-lifecycle-trackers/page-content-classifiers.md) — Analyzes crawled pages to identify page types, extract forms, detect secrets, and categorize endpoints. ([source](https://github.com/projectdiscovery/katana#readme))

### Testing & Quality Assurance

- [Dynamic Response Filters](https://awesome-repositories.com/f/testing-quality-assurance/general-testing-utilities/test-utilities-assertions/network-api-mocking/api-response-modifiers/dynamic-response-filters.md) — Applies domain-specific language expressions to refine and match discovered endpoints based on response attributes or status codes. ([source](https://github.com/projectdiscovery/katana#readme))
