# openmoss/moss

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/openmoss-moss).**

12,140 stars · 1,134 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/OpenMOSS/MOSS
- Homepage: https://txsun1997.github.io/blogs/moss.html
- awesome-repositories: https://awesome-repositories.com/repository/openmoss-moss.md

## Topics

`chatgpt` `deep-learning` `dialogue-systems` `large-language-models` `natural-language-processing` `text-generation`

## Description

MOSS is a conversational AI framework with an API server that tracks stateful multi-turn dialogues by session identifier, so remote clients can continue a conversation across requests. It also serves as a tool-augmented language model framework and a quantized inference engine.
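To make the session-identifier idea concrete, here is a minimal in-memory sketch. It is not MOSS's actual server code: `SessionStore`, `open`, `chat`, and `echo_model` are hypothetical names chosen for illustration, and `echo_model` merely stands in for a real model call.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SessionStore:
    """Hypothetical in-memory map from session id to that dialogue's turns."""
    sessions: dict = field(default_factory=dict)

    def open(self) -> str:
        # A fresh random id lets a remote client refer back to this dialogue.
        sid = uuid.uuid4().hex
        self.sessions[sid] = []
        return sid

    def chat(self, sid: str, user_text: str, generate) -> str:
        history = self.sessions[sid]          # raises KeyError for unknown ids
        reply = generate(history, user_text)  # the model sees earlier turns
        history.append((user_text, reply))    # persist the turn for next time
        return reply


def echo_model(history, user_text):
    # Stand-in for a language model (assumption): reports the turn number.
    return f"turn {len(history) + 1}: {user_text}"


if __name__ == "__main__":
    store = SessionStore()
    sid = store.open()
    print(store.chat(sid, "hello", echo_model))
    print(store.chat(sid, "and again", echo_model))
```

Each session keeps its own history, so two clients holding different identifiers never see each other's turns. A real deployment would also need expiry and persistent storage, which this sketch leaves out.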

The project integrates external plugins, such as search engines and calculators, to provide factual and computed data within model responses. It also includes a supervised fine-tuning toolkit for adapting base language models to specific conversational datasets and behavioral instructions.
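The plugin idea can be illustrated with a small dispatch table. This is a sketch under stated assumptions, not the repository's plugin interface: `PLUGINS`, `run_tool_call`, and `calculator` are hypothetical names, and the calculator handles only `+`, `-`, `*`, and `/` on numeric literals.

```python
import ast
import operator

# Arithmetic operators the hypothetical calculator plugin accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str):
    """Evaluate a basic arithmetic expression without calling eval()."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        # Names, calls, unary operators, etc. are rejected.
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)


# Registry mapping a tool name to its implementation (illustrative only).
PLUGINS = {"calculator": calculator}


def run_tool_call(name: str, argument: str):
    """Dispatch a tool request emitted by the model to the matching plugin."""
    if name not in PLUGINS:
        raise KeyError(f"unknown plugin: {name}")
    return PLUGINS[name](argument)


if __name__ == "__main__":
    print(run_tool_call("calculator", "2 + 3 * 4"))
```

In a tool-augmented loop, the model would emit the tool name and argument, the framework would run the plugin, and the result would be fed back into the prompt before the final answer is generated. A search plugin would slot into the same registry.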

The system supports 4-bit and 8-bit weight quantization to reduce GPU memory and computation costs during inference. It can also host the model behind an API and run interactive demos through a web or command-line interface.
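The memory saving from lower-precision weights follows from simple arithmetic, shown below alongside a toy quantizer. The quantizer uses plain symmetric round-to-nearest, chosen here for brevity; it is an assumption for illustration and not necessarily the scheme the repository uses. All function names are hypothetical.

```python
def quantize(weights, bits):
    """Symmetric round-to-nearest quantization to signed `bits`-bit codes."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8-bit, 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def weight_memory_gib(num_params, bits):
    """Storage for the weights alone, ignoring activations and scale factors."""
    return num_params * bits / 8 / 2**30


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    for bits in (8, 4):
        codes, scale = quantize(weights, bits)
        restored = dequantize(codes, scale)
        worst = max(abs(a - b) for a, b in zip(weights, restored))
        print(bits, codes, f"max error {worst:.4f}")
    # Hypothetical parameter count, used only to show the scaling.
    for bits in (16, 8, 4):
        print(bits, f"{weight_memory_gib(16_000_000_000, bits):.1f} GiB")
```

Halving the bit width halves weight storage, at the price of a coarser grid and therefore larger rounding error per weight.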

## Tags

### Artificial Intelligence & ML

- [Conversational AI APIs](https://awesome-repositories.com/f/artificial-intelligence-ml/conversational-ai-apis.md) — Functions as a web service that manages stateful multi-turn dialogues via session identifiers.
- [Conversational AI Deployments](https://awesome-repositories.com/f/artificial-intelligence-ml/conversational-ai-deployments.md) — Hosts a language model as a network service to manage stateful multi-turn dialogues.
- [Conversational Session Management](https://awesome-repositories.com/f/artificial-intelligence-ml/conversational-session-management.md) — Implements context tracking across multi-turn dialogues using unique session identifiers. ([source](https://github.com/openmoss/moss#readme))
- [External Tool Integration](https://awesome-repositories.com/f/artificial-intelligence-ml/external-tool-integration.md) — Integrates external plugins like search engines and calculators to augment model responses with factual data. ([source](https://github.com/OpenMOSS/MOSS/blob/main/README.md))
- [Tool Augmentations](https://awesome-repositories.com/f/artificial-intelligence-ml/language-model-orchestration/retrieval-augmented-generation/tool-augmentations.md) — Combines model outputs with external tool data from search engines and calculators for improved accuracy.
- [LLM Conversational AI Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/llm-conversational-ai-frameworks.md) — Provides a framework coordinating a large language model with external tool plugins for factual augmentation.
- [Model Serving APIs](https://awesome-repositories.com/f/artificial-intelligence-ml/model-serving-apis.md) — Exposes language models as network-accessible services via standard API endpoints for remote interaction.
- [Conversational Dialogue Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-interfaces/conversational-dialogue-systems.md) — Provides a system for generating human-like text across multi-turn dialogues and following behavioral instructions. ([source](https://github.com/openmoss/moss#readme))
- [Quantized Inference Runtimes](https://awesome-repositories.com/f/artificial-intelligence-ml/quantized-inference-runtimes.md) — Provides a quantized inference engine that supports 4-bit and 8-bit weight precision to reduce GPU memory and computation costs. ([source](https://github.com/OpenMOSS/MOSS/blob/main/README_en.md))
- [Inference Optimization](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-inference-serving/inference-optimization.md) — Optimizes inference by using 4-bit and 8-bit quantization to reduce GPU memory and computation.
- [Weight Quantization](https://awesome-repositories.com/f/artificial-intelligence-ml/quantized-inference-runtimes/weight-quantization.md) — Utilizes 4-bit and 8-bit precision to compress model weights and reduce GPU memory costs.
- [Supervised Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-fine-tuning.md) — Supports training base models on custom conversational datasets to adapt specific behaviors.
- [Supervised Fine-Tuning Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-fine-tuning-frameworks.md) — Ships a toolkit for aligning pretrained models to specific conversational roles using labeled datasets.

### DevOps & Infrastructure

- [ML Model Hosting](https://awesome-repositories.com/f/devops-infrastructure/ml-model-hosting.md) — Provides infrastructure for deploying and serving a language model as a dedicated network service. ([source](https://github.com/OpenMOSS/MOSS/blob/main/README.md))

### Software Engineering & Architecture

- [Plugin Execution Engines](https://awesome-repositories.com/f/software-engineering-architecture/plugin-execution-engines.md) — Executes external software modules during the inference loop for real-time data retrieval and computation.

### Part of an Awesome List

- [Natural Language Processing](https://awesome-repositories.com/f/awesome-lists/ai/natural-language-processing.md) — Listed in the “Natural Language Processing” section of the FunNLP awesome list.