Sensitive-lexicon is a sensitive-word detection service and content moderation tool for identifying prohibited text. It combines a curated lexicon of thousands of categorized terms with a fuzzy-matching text scanner that detects restricted words and phrases.
The project features specialized filters for Chinese-language content across political, social, and adult domains. Its approximate string matching catches terms that are padded with noise characters or whitespace to evade standard keyword filters, for example a term whose characters are separated by spaces or punctuation.
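The noise-tolerant matching described above can be illustrated with a minimal Python sketch. It is not the project's actual scanner: the names `NOISE`, `compile_term`, and `scan`, the limit of three noise characters between letters, and the sample lexicon entries are all assumptions made for illustration. The idea is to compile each term into a regular expression that permits a short run of non-alphanumeric characters between consecutive characters of the term.

```python
import re

# Assumption: up to 3 non-alphanumeric characters (spaces, punctuation,
# underscores) may be inserted between consecutive characters of a term.
NOISE = r"[\W_]{0,3}"


def compile_term(term):
    """Build a regex that matches `term` with optional noise between characters."""
    return re.compile(NOISE.join(re.escape(ch) for ch in term), re.IGNORECASE)


def scan(text, lexicon):
    """Return (term, domain_label, matched_text) for each hit in `text`.

    `lexicon` is a hypothetical mapping of term -> domain label.
    """
    hits = []
    for term, label in lexicon.items():
        for match in compile_term(term).finditer(text):
            hits.append((term, label, match.group(0)))
    return hits


# Illustrative usage with placeholder entries (not taken from the real lexicon).
demo_lexicon = {"badword": "demo", "敏感词": "demo"}
print(scan("a b-a d_w*o rd and 敏 感*词", demo_lexicon))
```

Compiling one pattern per term on every scan is simple but slow for a lexicon of thousands of entries. A production scanner would more likely precompile the patterns or strip noise characters first and run a multi-pattern matcher over the result.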
The system includes a network interface for hosting the detection service, and the lexicon can be updated in real time without interrupting the running process. Sensitive terms are grouped under domain labels, which give context for flagged text.
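One way such a hosted service could swap its lexicon without a restart is sketched below in Python, using only the standard library. Everything here is hypothetical: the `LexiconStore` class, the `/reload` path, the JSON request shape, and the port are assumptions for illustration, not the project's real interface. For brevity the check uses exact substring matching rather than fuzzy matching.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class LexiconStore:
    """Holds a term -> domain label mapping that can be replaced at runtime."""

    def __init__(self, terms):
        self._lock = threading.Lock()
        self._terms = dict(terms)

    def reload(self, terms):
        new_terms = dict(terms)  # build the replacement outside the lock
        with self._lock:
            self._terms = new_terms  # swap the reference in one step

    def check(self, text):
        with self._lock:
            terms = self._terms  # take a reference to the current mapping
        # Exact substring match, reported as the sorted set of domain labels.
        return sorted({label for term, label in terms.items() if term in text})


# Placeholder entry; a real deployment would load the lexicon files instead.
STORE = LexiconStore({"example-term": "demo-domain"})


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/reload":  # hypothetical endpoint
            STORE.reload(body.get("terms", {}))
            result = {"status": "reloaded"}
        else:
            result = {"labels": STORE.check(body.get("text", ""))}
        payload = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the sketch server until interrupted (the port is an assumption)."""
    ThreadingHTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` starts the server. Requests that are already in flight keep using the mapping they picked up, while requests that arrive after a reload see the new one, so the process never has to stop.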