Spider_XHS is a data extraction and automation tool for the Xiaohongshu social platform. It combines QR-code login, cookie-based session reuse, keyword search, comment tree traversal, high-resolution media retrieval, aggregated user profile collection, proxy-backed retries, and structured file export in a single pipeline.
Authentication and publishing are built in. The tool logs in by QR code scan or phone verification code and maintains the resulting authenticated session. It can upload image albums and videos to creator accounts, retrieve previously published works, and scrape full post details along with watermark-free media and the associated comment threads. A proxy-aware network layer routes requests through proxies and automatically retries failed ones, which makes access to platform data more reliable.
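The retry-and-proxy behaviour described above can be illustrated with a minimal sketch. It is not Spider_XHS's actual code: the function name `fetch_with_retries`, its parameters, and the injected `fetch` callable are all hypothetical, and the sketch assumes that transient failures surface as `ConnectionError` or `TimeoutError`.

```python
import itertools
import time
from typing import Callable, Optional, Sequence


def fetch_with_retries(
    fetch: Callable[[str, Optional[str]], dict],
    url: str,
    proxies: Sequence[Optional[str]],
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> dict:
    """Call fetch(url, proxy), moving to the next proxy after each failure.

    Hypothetical helper for illustration; `fetch` stands in for whatever
    HTTP client call the real tool makes.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Cycle through the proxy list; use a direct connection (None) if it is empty.
    proxy_cycle = itertools.cycle(proxies or [None])
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        proxy = next(proxy_cycle)
        try:
            return fetch(url, proxy)
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc
            if attempt < max_attempts:
                # Linear backoff: wait longer after each failed attempt.
                time.sleep(backoff_seconds * attempt)
    raise RuntimeError(f"all {max_attempts} attempts failed for {url}") from last_error
```

Passing the request function in as an argument keeps the retry logic independent of any particular HTTP library, so it can be exercised with a stub instead of live network calls.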
Beyond core scraping, Spider_XHS exports scraped content to structured JSON and Excel files and organizes media into timestamped local directories. It aggregates user profiles by fetching notes, likes, and favorites from multiple endpoints and merging them into unified records. The tool also supports keyword-based search crawling: it queries the platform’s search API and collects matching notes and user accounts from the paginated results.
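As a rough illustration of the aggregation and export steps, the sketch below merges three per-endpoint result lists into one record and writes it as JSON into a timestamped directory. The names `merge_profile` and `export_profile`, the record layout, the directory naming scheme, and the `profile.json` filename are assumptions made for this example, not the tool's actual schema. Excel export is omitted because it needs a third-party library.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional


def merge_profile(
    user_id: str,
    notes: List[dict],
    likes: List[dict],
    favorites: List[dict],
) -> Dict[str, object]:
    """Combine per-endpoint results into one unified record (hypothetical layout)."""
    return {
        "user_id": user_id,
        "notes": notes,
        "likes": likes,
        "favorites": favorites,
        "counts": {
            "notes": len(notes),
            "likes": len(likes),
            "favorites": len(favorites),
        },
    }


def export_profile(
    record: Dict[str, object],
    base_dir: str,
    now: Optional[datetime] = None,
) -> Path:
    """Write the record to <base_dir>/<user_id>_<timestamp>/profile.json."""
    # Allow the timestamp to be injected so the output path is predictable in tests.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(base_dir) / f"{record['user_id']}_{stamp}"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "profile.json"
    # ensure_ascii=False keeps Chinese text readable in the exported file.
    out_path.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out_path
```

Keeping the merge step separate from the file write means the unified record can be inspected or post-processed before anything touches the disk.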