# chyroc/wechatsogou

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/chyroc-wechatsogou).**

6,181 stars · 1,703 forks · Python · apache-2.0

## Links

- GitHub: https://github.com/chyroc/WechatSogou
- awesome-repositories: https://awesome-repositories.com/repository/chyroc-wechatsogou.md

## Topics

`crawler` `pypi` `python` `scrapy` `sogou` `wechat`

## Description

WechatSogou is a Python library that provides a programmatic bridge to WeChat content by scraping Sogou's search engine results. It resolves WeChat account profile pages to article lists, filters trending articles by predefined categories, and manages browser-like cookie sessions to maintain authenticated access. The library parses raw HTML to extract article metadata, account details, and keyword suggestions, while dynamically adjusting request frequency to avoid IP blocking.

The project searches WeChat public accounts and articles by keyword, returning account profile details such as name, ID, and authentication status, along with article titles, abstracts, and URLs. It fetches detailed account information, including introductions, QR codes, and recent posting statistics, and retrieves the most recent articles from an account's history with metadata such as title, cover image, and publish time. It also generates related keyword suggestions to refine searches and retrieves trending articles from WeChat's homepage by category.

The library can also extract paginated article URLs from account profile pages. Its documentation covers installation and usage for these data extraction tasks.
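
The HTML parsing step described above can be illustrated with a minimal sketch. This is not the library's own parser: the markup in `SAMPLE_HTML`, the class names `result`, `title`, and `abstract`, and the helper `parse_articles` are all hypothetical, chosen only to show how raw search-result HTML might be turned into structured records using the standard library.

```python
from html.parser import HTMLParser

# Hypothetical markup: real Sogou result pages use different tags and classes.
SAMPLE_HTML = """
<ul>
  <li class="result"><a class="title" href="https://example.com/a1">First post</a>
      <p class="abstract">Intro one</p></li>
  <li class="result"><a class="title" href="https://example.com/a2">Second post</a>
      <p class="abstract">Intro two</p></li>
</ul>
"""


class ArticleParser(HTMLParser):
    """Collects one dict per <li class="result"> element."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self._field = None  # name of the field the current text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class")
        if tag == "li" and cls == "result":
            # Start a new record for each result item.
            self.articles.append({"title": "", "url": "", "abstract": ""})
        elif tag == "a" and cls == "title" and self.articles:
            self.articles[-1]["url"] = attrs.get("href", "")
            self._field = "title"
        elif tag == "p" and cls == "abstract" and self.articles:
            self._field = "abstract"

    def handle_endtag(self, tag):
        if tag in ("a", "p"):
            self._field = None

    def handle_data(self, data):
        # Text outside a tracked tag (e.g. whitespace between tags) is ignored.
        if self._field and self.articles:
            self.articles[-1][self._field] += data


def parse_articles(html):
    """Return a list of {'title', 'url', 'abstract'} dicts found in html."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return [{k: v.strip() for k, v in a.items()} for a in parser.articles]
```

Calling `parse_articles(SAMPLE_HTML)` yields one dictionary per result item, which is the kind of structured output the description refers to.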

## Tags

### Web Development

- [WeChat-Sogou Scraping Bridges](https://awesome-repositories.com/f/web-development/api-bridges/wechat-sogou-scraping-bridges.md) — Bridges WeChat data by scraping Sogou's search engine results and parsing HTML responses into structured JSON objects.

### Part of an Awesome List

- [Social Media Profile Extractors](https://awesome-repositories.com/f/awesome-lists/ai/information-extraction/social-media-profile-extractors.md) — Extracting detailed information for specific public accounts, such as introductions, QR codes, and recent posting statistics.

### Business & Productivity Software

- [WeChat Account Search](https://awesome-repositories.com/f/business-productivity-software/account-wide-search/wechat-account-search.md) — Searching for WeChat public accounts and articles by keyword, retrieving profile details, article metadata, and related search suggestions.

### Content Management & Publishing

- [Account Article Link Resolvers](https://awesome-repositories.com/f/content-management-publishing/article-list-retrieval/account-article-link-resolvers.md) — Resolves WeChat account profile pages to article lists by following redirect chains and extracting paginated article URLs.
- [Account Data Scrapers](https://awesome-repositories.com/f/content-management-publishing/article-list-retrieval/account-article-link-resolvers/account-data-scrapers.md) — An API client for searching WeChat public accounts and articles, fetching account details, and retrieving trending content programmatically.
- [Account Article Metadata Retrievers](https://awesome-repositories.com/f/content-management-publishing/article-list-retrieval/account-article-metadata-retrievers.md) — Retrieving recent articles from an account's history, including metadata like title, cover image, and publish time.
- [Trending Article Retrievers](https://awesome-repositories.com/f/content-management-publishing/article-list-retrieval/trending-article-retrievers.md) — Retrieve trending articles from a platform's homepage by category, returning their titles, abstracts, and source account details. ([source](https://cdn.jsdelivr.net/gh/chyroc/wechatsogou@master/README.md))
- [Account Article History Extractors](https://awesome-repositories.com/f/content-management-publishing/web-article-extraction/account-article-history-extractors.md) — Extract the most recent articles from an account's history page, including metadata like title, cover image, and publish time. ([source](https://cdn.jsdelivr.net/gh/chyroc/wechatsogou@master/README.md))
- [WeChat Article Extraction](https://awesome-repositories.com/f/content-management-publishing/web-article-extraction/wechat-article-extraction.md) — Search for articles by keyword and return their titles, abstracts, URLs, and associated account info. ([source](https://cdn.jsdelivr.net/gh/chyroc/wechatsogou@master/README.md))
- [Article List Retrieval](https://awesome-repositories.com/f/content-management-publishing/article-list-retrieval.md) — Retrieve trending articles from a specified category, such as food. ([source](https://github.com/chyroc/WechatSogou/tree/master/docs/))

### Data & Databases

- [WeChat](https://awesome-repositories.com/f/data-databases/account-discovery/wechat.md) — Search for public accounts by keyword and return their profile details, including name, ID, and authentication status. ([source](https://cdn.jsdelivr.net/gh/chyroc/wechatsogou@master/README.md))
- [Trending Article Category Filters](https://awesome-repositories.com/f/data-databases/data-filtering/service-category-filters/trending-article-category-filters.md) — Filters trending articles by predefined categories (e.g., food, tech) using URL parameter manipulation and category ID mapping.
- [WeChat Keyword Suggestions](https://awesome-repositories.com/f/data-databases/search-suggestions/wechat-keyword-suggestions.md) — Generating related keyword suggestions based on a given query to refine searches for accounts or articles.
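
The category filter entry above mentions category ID mapping and URL parameter manipulation. A rough sketch of that idea follows. The IDs in `CATEGORY_IDS`, the `BASE_URL` endpoint, and the query parameter names are assumptions made for illustration and are not taken from the library or from Sogou.

```python
from urllib.parse import urlencode

# Hypothetical category IDs and endpoint; the library's real mapping may differ.
CATEGORY_IDS = {"hot": 0, "food": 1, "tech": 2}
BASE_URL = "https://example.com/trending"


def trending_url(category, page=1):
    """Build a trending-articles URL for a named category and page number."""
    try:
        category_id = CATEGORY_IDS[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None
    query = urlencode({"category": category_id, "page": page})
    return f"{BASE_URL}?{query}"
```

Keeping the name-to-ID mapping in one dictionary means callers pass readable names such as `"food"`, and an unknown name fails early with a `ValueError` rather than producing a malformed request.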

### Networking & Communication

- [WeChat Account Search Tools](https://awesome-repositories.com/f/networking-communication/wechat-automation-tools/wechat-account-search-tools.md) — A client that searches for WeChat public accounts by keyword and returns profile details including name, ID, and authentication status.
- [WeChat Article Search APIs](https://awesome-repositories.com/f/networking-communication/wechat-automation-tools/wechat-article-search-apis.md) — A tool that searches WeChat articles by keyword and returns titles, abstracts, URLs, and associated account metadata.

### Security & Cryptography

- [Session & Cookie Handlers](https://awesome-repositories.com/f/security-cryptography/session-cookie-handlers.md) — Manages browser-like cookie sessions to bypass anti-scraping measures and maintain authenticated access to Sogou's search pages.

### Software Engineering & Architecture

- [Client-Side Adaptive Throttling](https://awesome-repositories.com/f/software-engineering-architecture/traffic-management/request-rate-limiting/adaptive-rate-control/client-side-adaptive-throttling.md) — Adjusts request frequency dynamically based on HTTP response codes and retry-after headers to avoid IP blocking.
- [WeChat Account Detail Retrievers](https://awesome-repositories.com/f/software-engineering-architecture/component-lifecycle-management/component-detail-retrievers/content-detail-retrievers/wechat-account-detail-retrievers.md) — Fetch detailed information for a specific public account, including its introduction, QR code, and recent posting stats. ([source](https://cdn.jsdelivr.net/gh/chyroc/wechatsogou@master/README.md))
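
The client-side adaptive throttling entry above describes adjusting request frequency from HTTP response codes and retry-after headers. One way this could look is sketched below. The class `AdaptiveThrottle`, its parameters, and the choice of status codes 429 and 503 as back-off signals are assumptions, not the library's actual implementation.

```python
class AdaptiveThrottle:
    """Tracks a delay that grows on throttling responses and shrinks on success."""

    def __init__(self, base_delay=1.0, max_delay=60.0, factor=2.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.factor = factor
        self.delay = base_delay

    def update(self, status_code, retry_after=None):
        """Return the number of seconds to wait before the next request."""
        if status_code in (429, 503):
            # Back off multiplicatively, capped at max_delay.
            self.delay = min(self.delay * self.factor, self.max_delay)
            if retry_after is not None:
                try:
                    hinted = float(retry_after)
                except ValueError:
                    hinted = 0.0  # the HTTP-date form of Retry-After is not handled here
                # Honour the server hint when it is longer than our own delay.
                self.delay = min(max(self.delay, hinted), self.max_delay)
        elif 200 <= status_code < 300:
            # Recover gradually toward the base delay after a success.
            self.delay = max(self.base_delay, self.delay / self.factor)
        return self.delay
```

In a request loop, the returned value would typically be passed to `time.sleep` after each response, with `retry_after` taken from the response's `Retry-After` header when present.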

### User Interface & Experience

- [PDF and HTML Content Extraction](https://awesome-repositories.com/f/user-interface-experience/html-content-processing/pdf-and-html-content-extraction.md) — Extracts article metadata, account details, and keyword suggestions by parsing raw HTML with regex and DOM traversal.
- [Search Keyword Suggesters](https://awesome-repositories.com/f/user-interface-experience/autocomplete-suggestion-engines/search-keyword-suggesters.md) — Generates related search terms by parsing Sogou's autocomplete API responses and filtering for WeChat-specific content.

### Graphics & Multimedia

- [WeChat Trending Content](https://awesome-repositories.com/f/graphics-multimedia/video-downloaders/trending-content-retrievers/wechat-trending-content.md) — Fetching and parsing trending articles from WeChat platform categories, including titles, abstracts, and source account information.
