Ltp

Ltp - perform Chinese NLP tasks | Awesome Repos

Features

Chinese Natural Language Processing - Provides a comprehensive suite for Chinese natural language processing, including segmentation, tagging, and named entity recognition.
Chinese NLP Libraries - Offers a comprehensive suite of linguistic analysis tools specifically designed for the Chinese language.
Dependency Syntax Analysis - Establishes syntactic relationships and grammatical dependencies between individual words using a neural network.
Named Entity Recognition - Identifies and categorizes specific entities such as people, places, and organizations within Chinese text.
Dependency Parsers - Implements a neural dependency parser to map syntactic and semantic relationships in text.
Part-of-Speech Taggers - Assigns standardized grammatical categories to words to identify parts of speech such as nouns and verbs.
NLP Training Toolkits - Includes a machine learning training suite for creating custom linguistic models using annotated datasets.
Semantic Dependency Mapping - Maps logical relationships and meanings between words using tree or graph representations to extract sentence meaning.
Semantic Role Extraction - Assigns semantic roles such as beneficiary, location, time, and purpose to determine the meaning of a sentence.
Semantic Role Labeling - Labels constituents as agents, patients, or instruments to determine the roles of actors within a sentence.
Sentence Boundary Detection - Breaks large blocks of Chinese text into individual, logically complete sentences.
Sentence Structure Analysis - Parses dependency and semantic relationships to determine the overall grammatical structure of a sentence.
Lexicon-Based Word Segmentation - Identifies word boundaries in character sequences using lexicons and rules for URIs and English.
Syntactic Parsers - Provides neural network-based parsing to map grammatical hierarchy and dependencies in Chinese sentences.
Word Segmentation - Provides word segmentation to divide continuous Chinese text into individual words for further linguistic analysis.
Lexicon-Driven Segmenters - Combines machine learning models with user-defined dictionaries and frequency weights to identify word boundaries.
Text Processing Pipelines - Processes input through a sequential chain of modules for segmentation, tagging, entity recognition, and dependency parsing.
Native C++ Implementations - Implements core linguistic processing logic in native C++ for high-performance neural network execution.
Linguistic Analysis Pipelines - Executes a pipeline of tasks including segmentation, tagging, entity recognition, and dependency parsing.
Custom Model Training - Enables the generation of statistical models tailored for specific Chinese NLP analysis tasks.
Machine Learning Training - Includes a training suite for creating custom linguistic models using annotated datasets.
Model Pruning - Reduces memory footprint and improves performance by removing infrequently used features from trained models.
Semantic Role Training - Trains predicate prediction and role labeling components to identify sentence participants.
Model Training - Trains models to identify and categorize named entities using manually annotated datasets.
Containerized NLP Servers - Provides a dockerized server environment for deploying NLP models and tools as scalable APIs.
Model Training - Trains models to assign grammatical categories to tokens based on annotated POS data.
Text Segmentation Model Training - Provides frameworks for training word segmentation models to recognize boundaries in continuous text.
Domain Adaptations - Supports improving segmentation accuracy for specialized domains by training on small sets of labeled data.
Chinese - Provides an API to execute linguistic analysis on Chinese text using local or cloud interfaces.
Lexicon Management - Provides systems for updating word dictionaries and adding custom terms to improve analysis accuracy.
Lexicon Extensions - Supports loading user-defined dictionaries to improve the recognition of specific terms during analysis.
Containerized Deployments - Packages the toolkit and pretrained models into Docker images for consistent deployment.
Containerized Server Deployments - Enables exposing processing capabilities via a server interface to handle remote requests.
Web Service Deployments - Hosts natural language processing capabilities as a network service or Docker container to provide analysis via API.
Vocabulary Extensions - Allows adding custom words and frequency weights to the dictionary to improve word segmentation accuracy.
NLP Web Services - Exposes linguistic analysis capabilities through a web service interface returning results in XML format.
Docker Deployments - Provides containerization for the software and models to enable rapid deployment of an API server.
Service Hosting - Exposes natural language processing capabilities as a web service that accepts requests and returns XML results.
Natural Language Processing - Language technology platform for Chinese.
Chinese NLP Toolkits - C++ based language technology platform with Python bindings.
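
The sentence boundary detection, lexicon-driven segmentation, and pipeline features listed above can be illustrated with a small, self-contained sketch. This is not Ltp's API or algorithm: it uses forward maximum matching, a simple dictionary technique swapped in for illustration, and every name here (`BASE_LEXICON`, `split_sentences`, `segment`, `analyze`) is hypothetical. It only shows how a user-defined dictionary can change where word boundaries fall.

```python
import re

# Hypothetical toy lexicon; Ltp's real models and dictionaries are not used here.
BASE_LEXICON = {"我们", "喜欢", "自然", "语言", "处理"}


def split_sentences(text):
    """Split text after Chinese or ASCII sentence-final punctuation."""
    parts = re.split(r"(?<=[。！？!?])", text)
    return [p.strip() for p in parts if p.strip()]


def segment(sentence, lexicon, max_len):
    """Forward maximum matching: take the longest lexicon word at each position.

    Characters not covered by the lexicon fall back to single-character tokens.
    """
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def analyze(text, user_words=()):
    """Run the two stages in sequence: sentence splitting, then segmentation."""
    lexicon = BASE_LEXICON | set(user_words)  # user dictionary extends the base
    max_len = max(len(w) for w in lexicon)
    return [segment(s, lexicon, max_len) for s in split_sentences(text)]


if __name__ == "__main__":
    text = "我们喜欢自然语言处理。"
    print(analyze(text))
    print(analyze(text, user_words=["自然语言处理"]))
```

With the base lexicon alone, the phrase 自然语言处理 is split into three shorter words; adding it as a user word keeps it as a single token. A real toolkit would combine such a dictionary with a trained model rather than rely on matching alone.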

Open-source alternatives to Ltp

Similar open-source projects, ranked by how many features they share with Ltp.

isnowfy/snownlp
6,631 GitHub stars
SnowNLP is a Python library for Chinese natural language processing. It provides tools for text segmentation, sentiment analysis, document classification, and phonetic transliteration. The library includes capabilities for training and saving custom machine learning models for tokenization and sentiment analysis using raw training datasets. It covers a range of linguistic processing areas, including parts of speech tagging, sentence splitting, and text similarity measurement. The toolkit also provides utilities for extracting key information through text summarization and calculating word im
Python
hankcs/HanLP
36,413 GitHub stars
HanLP is a natural language processing library and deep learning framework specifically optimized for the Chinese language, while also functioning as a multilingual text processor. It serves as a toolkit for performing linguistic analysis, semantic understanding, and script conversion. The project distinguishes itself through a dedicated focus on Chinese linguistic structures, including a specialized script converter for transforming text between Simplified Chinese, Traditional Chinese, and Pinyin. It further supports domain-specific model training to improve the recognition of professional t
Python · dependency-parser · hanlp · named-entity-recognition
stanfordnlp/stanza
7,809 GitHub stars
Stanza is a Python natural language processing library designed for tokenization, lemmatization, and dependency parsing across many human languages using neural models. It provides a neural processing pipeline that converts raw text into structured linguistic data objects, alongside a specialized analyzer for extracting medical insights from clinical and biomedical language. The project includes a wrapper that connects Python scripts to Java-based natural language processing tools and remote annotation servers. This enables a bridge for extracting linguistic annotations and analysis data from
Python · artificial-intelligence · corenlp · deep-learning
lancopku/pkuseg-python
6,707 GitHub stars
pkuseg-python is a Chinese word segmentation toolkit and natural language processing library. It provides specialized models for splitting Chinese text into words across various domains, including news, medical, and web content, and includes a tool for assigning grammatical parts of speech tags to segmented words. The library allows for the training of custom segmentation models using annotated datasets and supports the integration of user-defined dictionaries to ensure specialized terminology is recognized correctly. It employs a multi-threaded execution engine to process large volumes of Ch
Python

See all 30 alternatives to Ltp

HIT-SCIR/ltp
