OpenNMT-py is a PyTorch-based framework for training and deploying neural machine translation models and large language models. It serves as a distributed training system, an inference engine, and a toolkit for fine-tuning large language models.
The framework stands out for its dedicated toolkit for adapting large language models through low-rank adaptation (LoRA), quantization, and instruction tuning. It also includes a translation server that hosts trained models and exposes them through REST API endpoints.
The project also covers data preprocessing and augmentation, model architecture configuration, and performance optimization through tensor parallelism and weight quantization. For inference it provides beam search decoding and word alignment extraction, and for evaluation it provides tools that measure translation quality and model accuracy.
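To make the beam search decoding mentioned above concrete, here is a minimal sketch in plain Python. It is not OpenNMT-py's implementation and does not use its API: `beam_search`, `toy_model`, and the probability table are hypothetical names and values invented for illustration, and a real decoder would score tokens with a trained network and usually apply length normalization.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[Tuple[str, ...], float]  # (tokens, total log-probability)


def beam_search(
    next_log_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
    beam_size: int,
    max_len: int,
    eos: str = "</s>",
) -> List[Hypothesis]:
    """Return hypotheses sorted by total log-probability, best first."""
    beams: List[Hypothesis] = [((), 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for prefix, score in beams:
            for token, logp in next_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + logp))
        # Keep only the beam_size highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for hyp, score in candidates[:beam_size]:
            if hyp[-1] == eos:
                finished.append((hyp, score))
            else:
                beams.append((hyp, score))
        if not beams:
            break
    # Hypotheses that hit max_len without emitting eos are returned as well.
    finished.extend(beams)
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained model: a fixed probability table."""
    probs = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }.get(prefix, {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy = beam_search(toy_model, beam_size=1, max_len=3)
wide = beam_search(toy_model, beam_size=2, max_len=3)
```

In this toy table, a beam of size 1 (greedy decoding) commits to "a" because it is the most probable first token, and ends with a sequence of probability 0.3. A beam of size 2 keeps "b" alive long enough to find "b z", whose total probability of 0.36 is higher. That trade of extra computation for a better-scoring output is the reason to widen the beam.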