# hit-scir/ltp

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/hit-scir-ltp).**

5,253 stars · 1,062 forks · Python

## Links

- GitHub: https://github.com/HIT-SCIR/ltp
- Homepage: http://ltp.ai
- awesome-repositories: https://awesome-repositories.com/repository/hit-scir-ltp.md

## Topics

`chinese-nlp` `machine-learning` `natural-language-processing` `nlp`

## Description

This is a Chinese natural language processing toolkit for word segmentation, part-of-speech tagging, and named entity recognition. It includes a neural dependency parser for analyzing syntactic and semantic relationships between words, and a machine learning training suite for building custom linguistic models from annotated datasets.

The toolkit distinguishes itself through its deployment flexibility: it offers a Dockerized server and a web service interface that exposes its processing capabilities through an API. It supports pretrained models and allows external lexicons and word dictionary extensions to be integrated to improve analysis accuracy.

Broadly, the project covers a full pipeline of linguistic tasks, including sentence segmentation, syntactic dependency mapping, and semantic role labeling. These capabilities are available through a command-line interface, standalone modules, or integrated analysis pipelines.

The core logic is implemented in C++ with official language bindings for Python and Java.

## Tags

### Artificial Intelligence & ML

- [Chinese Natural Language Processing](https://awesome-repositories.com/f/artificial-intelligence-ml/chinese-natural-language-processing.md) — Provides a comprehensive suite for Chinese natural language processing, including segmentation, tagging, and named entity recognition.
- [Chinese NLP Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/chinese-nlp-libraries.md) — Offers a comprehensive suite of linguistic analysis tools specifically designed for the Chinese language.
- [Dependency Syntax Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/dependency-syntax-analysis.md) — Establishes syntactic relationships and grammatical dependencies between individual words using a neural network. ([source](http://ltp.ai/docs/ltp3.x/theory.html))
- [Named Entity Recognition](https://awesome-repositories.com/f/artificial-intelligence-ml/named-entity-recognition.md) — Identifies and categorizes specific entities such as people, places, and organizations within Chinese text. ([source](http://ltp.ai/docs/index.html))
- [Dependency Parsers](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-networks/dependency-parsers.md) — Implements a neural dependency parser to map syntactic and semantic relationships in text.
- [Part-of-Speech Taggers](https://awesome-repositories.com/f/artificial-intelligence-ml/part-of-speech-taggers.md) — Assigns standardized grammatical categories to words to identify parts of speech such as nouns and verbs. ([source](https://cdn.jsdelivr.net/gh/hit-scir/ltp@main/README.md))
- [NLP Training Toolkits](https://awesome-repositories.com/f/artificial-intelligence-ml/pytorch-training-frameworks/nlp-training-toolkits.md) — Includes a machine learning training suite for creating custom linguistic models using annotated datasets.
- [Semantic Dependency Mapping](https://awesome-repositories.com/f/artificial-intelligence-ml/semantic-dependency-mapping.md) — Maps logical relationships and meanings between words using tree or graph representations to extract sentence meaning. ([source](http://ltp.ai/docs/quickstart.html))
- [Semantic Role Extraction](https://awesome-repositories.com/f/artificial-intelligence-ml/semantic-role-extraction.md) — Assigns semantic roles such as beneficiary, location, time, and purpose to determine the meaning of a sentence. ([source](http://ltp.ai/docs/ltp3.x/appendix.html))
- [Semantic Role Labeling](https://awesome-repositories.com/f/artificial-intelligence-ml/semantic-role-labeling.md) — Labels constituents as agents, patients, or instruments to determine the roles of actors within a sentence. ([source](http://ltp.ai/docs/appendix.html))
- [Sentence Boundary Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/sentence-boundary-detection.md) — Breaks large blocks of Chinese text into individual, logically complete sentences. ([source](http://ltp.ai/docs/ltp3.x/install.html))
- [Sentence Structure Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/sentence-structure-analysis.md) — Parses dependency and semantic relationships to determine the overall grammatical structure of a sentence. ([source](http://ltp.ai/docs/index.html))
- [Lexicon-Based Word Segmentation](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-modeling/character-level-models/character-based-segmentation/lexicon-based-word-segmentation.md) — Identifies word boundaries in character sequences using lexicons and rules for URIs and English. ([source](https://cdn.jsdelivr.net/gh/hit-scir/ltp@main/README.md))
- [Syntactic Parsers](https://awesome-repositories.com/f/artificial-intelligence-ml/syntactic-parsers.md) — Provides neural network-based parsing to map grammatical hierarchy and dependencies in Chinese sentences.
- [Word Segmentation](https://awesome-repositories.com/f/artificial-intelligence-ml/word-segmentation.md) — Provides word segmentation to divide continuous Chinese text into individual words for further linguistic analysis. ([source](http://ltp.ai/docs/ltp3.x/install.html))
- [Custom Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/custom-model-training.md) — Enables the generation of statistical models tailored for specific Chinese NLP analysis tasks. ([source](http://ltp.ai/docs/introduction.html))
- [Machine Learning Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training.md) — Includes a training suite for creating custom linguistic models using annotated datasets.
- [Model Pruning](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/compression-techniques/model-pruning.md) — Reduces memory footprint and improves performance by removing infrequently used features from trained models. ([source](http://ltp.ai/docs/ltp3.x/theory.html))
- [Semantic Role Training](https://awesome-repositories.com/f/artificial-intelligence-ml/model-training-toolkits/semantic-role-training.md) — Trains predicate prediction and role labeling components to identify sentence participants. ([source](http://ltp.ai/docs/ltp3.x/train.html))
- [Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/named-entity-recognition/model-training.md) — Trains models to identify and categorize named entities using manually annotated datasets. ([source](http://ltp.ai/docs/ltp3.x/train.html))
- [Containerized NLP Servers](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing/nlp-applications/containerized-nlp-servers.md) — Provides a dockerized server environment for deploying NLP models and tools as scalable APIs.
- [Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/part-of-speech-taggers/model-training.md) — Trains models to assign grammatical categories to tokens based on annotated POS data. ([source](http://ltp.ai/docs/ltp3.x/train.html))
- [Text Segmentation Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/text-segmentation-model-training.md) — Provides frameworks for training word segmentation models to recognize boundaries in continuous text. ([source](http://ltp.ai/docs/ltp3.x/train.html))
- [Domain Adaptations](https://awesome-repositories.com/f/artificial-intelligence-ml/word-segmentation-training/domain-adaptations.md) — Supports improving segmentation accuracy for specialized domains by training on small sets of labeled data. ([source](http://ltp.ai/docs/ltp3.x/theory.html))

### Part of an Awesome List

- [Lexicon-Driven Segmenters](https://awesome-repositories.com/f/awesome-lists/devtools/word-segmentation-tools/lexicon-driven-segmenters.md) — Combines machine learning models with user-defined dictionaries and frequency weights to identify word boundaries.
- [Chinese](https://awesome-repositories.com/f/awesome-lists/ai/text-analysis-apis/chinese.md) — Provides an API to execute linguistic analysis on Chinese text using local or cloud interfaces. ([source](http://ltp.ai/docs/introduction.html))
- [Natural Language Processing](https://awesome-repositories.com/f/awesome-lists/ai/natural-language-processing.md) — Language technology platform for Chinese.
- [Chinese NLP Toolkits](https://awesome-repositories.com/f/awesome-lists/devtools/chinese-nlp-toolkits.md) — C++ based language technology platform with Python bindings.

### Data & Databases

- [Text Processing Pipelines](https://awesome-repositories.com/f/data-databases/text-processing-pipelines.md) — Processes input through a sequential chain of modules for segmentation, tagging, entity recognition, and dependency parsing.
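
The sequential chain described above can be pictured with a minimal sketch. This is not the toolkit's API: `Analysis`, `run_pipeline`, and both stage functions are hypothetical names, and the stages are deliberately trivial placeholders (whitespace splitting, a constant tag) standing in for real models. The only point is the shape: each module reads the layers written by earlier modules and adds its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Analysis:
    """Hypothetical container: raw text plus one list per analysis layer."""
    text: str
    layers: Dict[str, list] = field(default_factory=dict)


Stage = Callable[[Analysis], Analysis]


def segment(doc: Analysis) -> Analysis:
    # Placeholder: whitespace splitting stands in for a real Chinese segmenter.
    doc.layers["words"] = doc.text.split()
    return doc


def tag(doc: Analysis) -> Analysis:
    # A later stage depends on the output of an earlier one.
    if "words" not in doc.layers:
        raise ValueError("tagging requires the 'words' layer; run segment first")
    # Placeholder: every word receives the dummy tag "x".
    doc.layers["pos"] = ["x"] * len(doc.layers["words"])
    return doc


def run_pipeline(text: str, stages: List[Stage]) -> Analysis:
    """Apply each stage in order, threading one Analysis object through."""
    doc = Analysis(text)
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline("他 叫 汤姆", [segment, tag])
print(result.layers)
```

Entity recognition and dependency parsing would slot in as further stages after `tag`, each checking for the layers it needs.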

### Operating Systems & Systems Programming

- [Native C++ Implementations](https://awesome-repositories.com/f/operating-systems-systems-programming/native-c-implementations.md) — Implements core linguistic processing logic in native C++ for high-performance neural network execution.

### Software Engineering & Architecture

- [Linguistic Analysis Pipelines](https://awesome-repositories.com/f/software-engineering-architecture/compositional-transformation-pipelines/linguistic-analysis-pipelines.md) — Executes a pipeline of tasks including segmentation, tagging, entity recognition, and dependency parsing. ([source](http://ltp.ai/docs/ltp3.x/ltpserver.html))
- [NLP Web Services](https://awesome-repositories.com/f/software-engineering-architecture/api-wrappers/unified-nlp-interfaces/nlp-web-services.md) — Exposes linguistic analysis capabilities through a web service interface returning results in XML format.

### Development Tools & Productivity

- [Lexicon Management](https://awesome-repositories.com/f/development-tools-productivity/dictionary-and-translation-tools/lexicon-datasets/lexicon-driven-analysis/lexicon-management.md) — Provides systems for updating word dictionaries and adding custom terms to improve analysis accuracy.
- [Lexicon Extensions](https://awesome-repositories.com/f/development-tools-productivity/dictionary-and-translation-tools/lexicon-datasets/lexicon-extensions.md) — Supports loading user-defined dictionaries to improve the recognition of specific terms during analysis. ([source](http://ltp.ai/docs/ltp3.x/ltptest.html))

### DevOps & Infrastructure

- [Containerized Deployments](https://awesome-repositories.com/f/devops-infrastructure/containerized-deployments.md) — Packages the toolkit and pretrained models into Docker images for consistent deployment.
- [Containerized Server Deployments](https://awesome-repositories.com/f/devops-infrastructure/containerized-server-deployments.md) — Enables exposing processing capabilities via a server interface to handle remote requests. ([source](http://ltp.ai/docs/ltp3.x/index.html))
- [Web Service Deployments](https://awesome-repositories.com/f/devops-infrastructure/web-service-deployments.md) — Hosts natural language processing capabilities as a network service or Docker container to provide analysis via API.

### Education & Learning Resources

- [Vocabulary Extensions](https://awesome-repositories.com/f/education-learning-resources/word-dictionaries/vocabulary-extensions.md) — Allows adding custom words and frequency weights to the dictionary to improve word segmentation accuracy. ([source](http://ltp.ai/docs/quickstart.html))
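
To illustrate why custom words with frequency weights can change a segmentation, here is a small stand-in technique: a dictionary-only segmenter that picks the split with the highest total log-weight by dynamic programming. It is an assumption-laden sketch, not the toolkit's algorithm (which the tags above describe as combining machine learning models with user dictionaries). The function name, the toy lexicon, and all weights are invented for the example.

```python
import math
from typing import Dict, List


def segment(text: str, lexicon: Dict[str, float], max_len: int = 4) -> List[str]:
    """Return the split of `text` that maximizes the sum of log-weights."""
    total = sum(lexicon.values())
    # Single characters missing from the lexicon get a small fallback weight.
    unknown = math.log(0.5 / total)
    n = len(text)
    best = [float("-inf")] * (n + 1)  # best[i]: top score for text[:i]
    back = [0] * (n + 1)              # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in lexicon:
                score = math.log(lexicon[piece] / total)
            elif end - start == 1:
                score = unknown
            else:
                continue  # multi-character pieces must come from the lexicon
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    pos = n
    while pos > 0:
        words.append(text[back[pos]:pos])
        pos = back[pos]
    return words[::-1]


base = {"研究": 40, "研究生": 30, "生命": 40, "命": 5, "起源": 20}
extended = dict(base, **{"生命起源": 50})  # one user-added term with its weight

print(segment("研究生命起源", base))      # ['研究', '生命', '起源']
print(segment("研究生命起源", extended))  # ['研究', '生命起源']
```

Adding a single weighted entry is enough to make the longer term win, which is the effect a vocabulary extension aims for in a specialized domain.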

### Web Development

- [Docker Deployments](https://awesome-repositories.com/f/web-development/self-hosted-file-servers/docker-deployments.md) — Provides containerization for the software and models to enable rapid deployment of an API server. ([source](http://ltp.ai/docs/ltp3.x/install.html))
- [Service Hosting](https://awesome-repositories.com/f/web-development/service-hosting.md) — Exposes natural language processing capabilities as a web service that accepts requests and returns XML results. ([source](http://ltp.ai/docs/ltp3.x/ltpserver.html))
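
A client of an XML-returning analysis service typically flattens the response into rows before using it. The sketch below shows that step with Python's standard `xml.etree.ElementTree`. The payload is hypothetical: the element names (`doc`, `sent`, `word`) and attribute names (`cont`, `pos`, `parent`, `relate`) are assumptions made for illustration and should be checked against the server documentation linked above before relying on them.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical response body; the real schema may differ.
SAMPLE_RESPONSE = """
<doc>
  <sent id="0">
    <word id="0" cont="我" pos="r" parent="1" relate="SBV"/>
    <word id="1" cont="爱" pos="v" parent="-1" relate="HED"/>
    <word id="2" cont="北京" pos="ns" parent="1" relate="VOB"/>
  </sent>
</doc>
"""


def words_from_response(xml_text: str) -> List[Dict[str, object]]:
    """Flatten every <word> element into a dictionary, one per token."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for sent in root.iter("sent"):
        for word in sent.iter("word"):
            rows.append({
                "sent": int(sent.get("id")),
                "id": int(word.get("id")),
                "form": word.get("cont"),
                "pos": word.get("pos"),
                "head": int(word.get("parent")),  # index of the governing word
                "relation": word.get("relate"),
            })
    return rows


rows = words_from_response(SAMPLE_RESPONSE)
for row in rows:
    print(row["id"], row["form"], row["pos"], row["head"], row["relation"])
```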
