# cv-cat/spider_xhs

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/cv-cat-spider-xhs).**

4,348 stars · 773 forks · JavaScript

## Links

- GitHub: https://github.com/cv-cat/Spider_XHS
- awesome-repositories: https://awesome-repositories.com/repository/cv-cat-spider-xhs.md

## Description

Spider_XHS is a data extraction and automation tool built specifically for the Xiaohongshu social platform. It orchestrates multi-step workflows that combine comment tree traversal, cookie-based session reuse, high-resolution media retrieval, keyword search, proxy-backed retries, QR-code login, structured file export, and aggregated user profile collection into a single pipeline.

The tool also integrates authentication and publishing. It supports login via QR code scanning or phone verification codes to establish and maintain authenticated sessions. It can upload image albums and videos to creator accounts, retrieve previously published works, and scrape full post details along with watermark-free media and associated comment threads. A proxy-aware network layer routes requests through proxies and automatically retries failed requests.
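
As a rough illustration of cookie-based session reuse, the sketch below stores cookies in a JSON file and only triggers a login when no stored cookies exist. The function names, the file layout, and the `login` callable are assumptions made for this example; they are not taken from the Spider_XHS codebase.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def load_cookies(path: str) -> Optional[dict]:
    """Return cookies stored at `path`, or None if the file does not exist."""
    file = Path(path)
    if not file.exists():
        return None
    return json.loads(file.read_text(encoding="utf-8"))


def save_cookies(cookies: dict, path: str) -> None:
    """Write cookies to `path` as JSON."""
    Path(path).write_text(json.dumps(cookies), encoding="utf-8")


def get_session_cookies(path: str, login: Callable[[], dict]) -> dict:
    """Reuse stored cookies when present; otherwise log in once and store them.

    `login` is a hypothetical callable that performs the QR-code or
    phone-code login and returns the resulting cookies as a dict.
    """
    cookies = load_cookies(path)
    if cookies is None:
        cookies = login()
        save_cookies(cookies, path)
    return cookies
```

A real client would also need to detect expired cookies and log in again, which this sketch leaves out.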

Beyond core scraping, Spider_XHS exports scraped content into structured JSON and Excel files, organizing media into timestamped local directories. It aggregates user profiles by fetching notes, likes, and favorites from multiple endpoints and merging them into unified records. The tool also supports keyword-based search crawling, querying the platform’s search API to collect matching notes and user accounts with paginated results.
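
The export step described above can be sketched as follows. This is a minimal example that assumes a `YYYYMMDD_HHMMSS` directory name and a `notes.json` file name; both are illustrative choices, not the layout Spider_XHS actually uses. Excel output is omitted because it would need a third-party library.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional


def export_notes(notes: list, base_dir: str, now: Optional[datetime] = None) -> Path:
    """Write `notes` as JSON into a timestamped directory under `base_dir`.

    The directory and file names are assumptions made for this sketch.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(base_dir) / stamp
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "notes.json"
    # ensure_ascii=False keeps Chinese text readable in the output file.
    out_file.write_text(
        json.dumps(notes, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out_file
```

Passing `now` explicitly makes the output path predictable, which is convenient in tests.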

## Tags

### Business & Productivity Software

- [Xiaohongshu Content Scrapers](https://awesome-repositories.com/f/business-productivity-software/content-publishing/video-publishing-integrations/xiaohongshu-content-scrapers.md) — A tool for extracting user profiles, posts, and comments from the Xiaohongshu social platform for research or analysis.

### Artificial Intelligence & ML

- [Keyword Search Crawlers](https://awesome-repositories.com/f/artificial-intelligence-ml/bot-platforms/platform-normalization-adapters/platform-search-adapters/keyword-search-crawlers.md) — Queries the platform’s search API with user-supplied terms and paginates through results to collect matching notes and profiles. ([source](https://cdn.jsdelivr.net/gh/cv-cat/spider_xhs@master/README.md))
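
A paginated keyword crawl of this kind might look like the sketch below. The `fetch_page` callable and the `items` / `has_more` response fields are hypothetical stand-ins for the platform's search API, whose real request and response format is not described here.

```python
from typing import Callable, Iterator


def crawl_search(
    keyword: str,
    fetch_page: Callable[[str, int], dict],
    max_pages: int = 10,
) -> Iterator[dict]:
    """Yield search results page by page until no more pages are reported.

    `fetch_page(keyword, page)` is assumed to return a dict shaped like
    {"items": [...], "has_more": bool}.
    """
    for page in range(1, max_pages + 1):
        data = fetch_page(keyword, page)
        yield from data.get("items", [])
        if not data.get("has_more"):
            break
```

The `max_pages` cap keeps a crawl bounded even if the source keeps reporting more results.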

### Content Management & Publishing

- [Comment Extractors](https://awesome-repositories.com/f/content-management-publishing/content-aggregation-curation/comment-systems/comment-extractors.md) — Walks nested comment trees by following reply chains and paginating through top-level comments for each post.
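
The traversal can be pictured as two nested cursor-based loops: one over top-level comments and one over the replies of each comment. The sketch below assumes hypothetical `fetch_comments` and `fetch_replies` callables and a page shape of `{"items", "cursor", "has_more"}`; the actual endpoints and field names may differ.

```python
from typing import Callable, Iterator, List


def paginate(fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield items from a cursor-paginated source.

    `fetch(cursor)` is assumed to return
    {"items": [...], "cursor": str, "has_more": bool}.
    """
    cursor = ""
    while True:
        page = fetch(cursor)
        yield from page["items"]
        if not page.get("has_more"):
            return
        cursor = page["cursor"]


def collect_comment_tree(
    note_id: str,
    fetch_comments: Callable[[str, str], dict],
    fetch_replies: Callable[[str, str, str], dict],
) -> List[dict]:
    """Return top-level comments, each with a "replies" list attached."""
    tree = []
    for comment in paginate(lambda c: fetch_comments(note_id, c)):
        # Bind the comment id as a default argument so each lambda keeps its own.
        replies = list(
            paginate(lambda c, cid=comment["id"]: fetch_replies(note_id, cid, c))
        )
        tree.append({**comment, "replies": replies})
    return tree
```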

### Data & Databases

- [Automated Content Publishing](https://awesome-repositories.com/f/data-databases/api-upload-interfaces/content-publishing-apis/automated-content-publishing.md) — Uploads image galleries and video files to Xiaohongshu creator accounts through automated workflows.
- [High-Resolution Media Scraping](https://awesome-repositories.com/f/data-databases/data-scraping-tools/social-media-data-scraping/high-resolution-media-scraping.md) — Retrieves image and video URLs from platform responses, stripping watermark parameters to fetch original files.
- [Post & Comment Scraping](https://awesome-repositories.com/f/data-databases/data-scraping-tools/social-media-data-scraping/high-resolution-media-scraping/post-comment-scraping.md) — Fetch full post details, high-resolution media without watermarks, and associated comment threads for analysis. ([source](https://cdn.jsdelivr.net/gh/cv-cat/spider_xhs@master/README.md))
- [Structured Data Exporters](https://awesome-repositories.com/f/data-databases/data-serialization-formats/structured-data-exporters.md) — A utility that saves scraped Xiaohongshu data as JSON and Excel files in organized local directories.
- [Scraped Data Exporters](https://awesome-repositories.com/f/data-databases/structured-data-extraction/structured-data-file-extractors/scraped-data-exporters.md) — Converts scraped data into JSON and Excel files, organizing media into timestamped local directories.
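
Stripping watermark parameters from a media URL, as mentioned in the media scraping entry above, can be sketched with the standard library alone. The parameter name `wm` and the example URL are made up for illustration; the parameters the platform actually uses are not documented here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_query_params(url: str, drop: set) -> str:
    """Return `url` with every query parameter named in `drop` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# "wm" is a hypothetical watermark parameter used only for this example.
clean_url = strip_query_params("https://cdn.example.com/img/abc.jpg?wm=1&w=1080", {"wm"})
```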

### Graphics & Multimedia

- [Watermark-Free Media Retrieval](https://awesome-repositories.com/f/graphics-multimedia/watermark-free-media-retrieval.md) — A scraper that retrieves high-resolution images and videos from Xiaohongshu without watermarks.
- [Social Platform Uploaders](https://awesome-repositories.com/f/graphics-multimedia/media-production-suites/media-management-production/media-management-systems/media-file-upload-handlers/social-platform-uploaders.md) — Upload image albums and videos to the creator platform and retrieve a list of previously published works. ([source](https://cdn.jsdelivr.net/gh/cv-cat/spider_xhs@master/README.md))

### Networking & Communication

- [QR Code & Phone Verification Logins](https://awesome-repositories.com/f/networking-communication/communication-platforms-services/messaging-notification-systems/messaging-automation/account-authentication-gateways/account-authentication/qr-code-phone-verification-logins.md) — Logs in to Xiaohongshu via QR code or phone verification to access restricted data and keep sessions stable.
- [User Profile Retrieval](https://awesome-repositories.com/f/networking-communication/contact-management/user-profile-retrieval.md) — Retrieve detailed user information including profile, uploaded notes, liked and favorited content from the platform. ([source](https://cdn.jsdelivr.net/gh/cv-cat/spider_xhs@master/README.md))
- [Proxy-Aware Network Clients](https://awesome-repositories.com/f/networking-communication/network-infrastructure-routing/network-utilities/proxy-aware-network-clients.md) — Handles network requests with automatic retries and proxy support for scraping and publishing operations.
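
One way to combine proxy routing with automatic retries is sketched below. It assumes a hypothetical `request(proxy)` callable, rotates through a list of proxy addresses, and doubles the delay after each failure. The delay schedule and rotation policy are choices made for this example rather than the tool's confirmed behavior.

```python
import time
from typing import Callable, Optional, Sequence


def with_retries(
    request: Callable[[Optional[str]], object],
    proxies: Sequence[str],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call `request(proxy)` until it succeeds or attempts run out.

    Each attempt uses the next proxy in `proxies` (round-robin); network
    errors (OSError and subclasses) trigger a retry after an exponentially
    growing delay: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        proxy = proxies[attempt % len(proxies)] if proxies else None
        try:
            return request(proxy)
        except OSError as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Injecting `sleep` keeps the function testable without real waiting.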

### Security & Cryptography

- [Session-Cookie Persistences](https://awesome-repositories.com/f/security-cryptography/session-cookie-handlers/session-cookie-persistences.md) — Maintains authenticated state by storing and reusing session cookies across requests to avoid repeated logins.
- [Phone Verification Code Flows](https://awesome-repositories.com/f/security-cryptography/user-authentication-flows/qr-code-handshakes/phone-verification-code-flows.md) — Log in to the platform by scanning a QR code or entering a phone verification code for secure access. ([source](https://cdn.jsdelivr.net/gh/cv-cat/spider_xhs@master/README.md))
- [Session Token Extractors](https://awesome-repositories.com/f/security-cryptography/user-authentication-flows/qr-code-handshakes/session-token-extractors.md) — Handles login by polling a QR code endpoint until the user scans it, then extracts session tokens from the response.
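
The QR polling loop described above can be sketched as follows. The `check_status` callable, the `"confirmed"` status value, and the `cookies` field are assumptions standing in for the real endpoint and its response format.

```python
import time
from typing import Callable


def wait_for_qr_scan(
    check_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll `check_status` until the login is confirmed, then return cookies.

    `check_status()` is assumed to return a dict such as
    {"status": "waiting"} or {"status": "confirmed", "cookies": {...}}.
    Raises TimeoutError if no confirmation arrives within `timeout` seconds.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        result = check_status()
        if result.get("status") == "confirmed":
            return result["cookies"]
        sleep(interval)
    raise TimeoutError("QR code was not scanned before the timeout")
```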

### System Administration & Monitoring

- [Social Media Account Bots](https://awesome-repositories.com/f/system-administration-monitoring/account-management-apis/account-automation-frameworks/social-media-account-bots.md) — A bot that automates login via QR code or phone verification and publishes content to Xiaohongshu creator accounts.

### Web Development

- [Retry and Backoff Logic](https://awesome-repositories.com/f/web-development/http-client-wrappers/retry-and-backoff-logic.md) — Routes requests through configurable proxy pools and automatically retries failed requests with exponential backoff.
- [Multi-Endpoint Aggregators](https://awesome-repositories.com/f/web-development/user-profiles/multi-endpoint-aggregators.md) — Fetches multiple profile-related endpoints (notes, likes, favorites) and merges them into a single structured record.
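
Merging several profile endpoints into one record can be as simple as the sketch below. The endpoint names and the `fetch(user_id)` callables are hypothetical placeholders for whatever requests the tool actually issues.

```python
from typing import Callable, Dict


def aggregate_profile(
    user_id: str,
    endpoints: Dict[str, Callable[[str], object]],
) -> dict:
    """Call each endpoint fetcher and merge the results into one record.

    `endpoints` maps a field name (for example "notes", "likes",
    "favorites") to a callable that takes the user id and returns data.
    """
    record = {"user_id": user_id}
    for name, fetch in endpoints.items():
        record[name] = fetch(user_id)
    return record
```

A production version would likely also record which endpoints failed instead of letting one exception abort the whole profile.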
