2 repositories
Backend services designed to host large language models for external API requests.
Distinct from Headless Server Hosting: this category specifically targets hosting LLM inference as a service, rather than general backend services or deployment interfaces.
2 GitHub repositories matching DevOps & Infrastructure · Model API Servers.
ChatGLM3 is an open-weights large language model designed for bilingual conversation in English and Chinese. It functions as a tool-augmented system that can call external functions and execute code internally to resolve complex tasks. The model uses four-bit quantization to reduce memory requirements, enabling inference on consumer hardware and on a range of processors, including GPUs and CPUs. It features an expanded context window for processing and summarizing long documents and includes a supervised fine-tuning pipeline for adapting the model to specialized domains.
Supports deployment of the model as a backend service using standard compatible interfaces.
DeepPavlov is a deep learning conversational AI framework designed for building end-to-end dialog systems and chatbots. It functions as an NLP model training library and a pipeline system that connects multiple natural language processing models into a single operational chain. The framework provides a REST API model server to expose trained deep learning models as web endpoints. This allows conversational agents to be deployed as web services that handle incoming HTTP requests and return predictions. The system covers the full lifecycle of conversational AI development, including NLP pipeli
Serves as a backend model server that exposes trained deep learning models as web endpoints.
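The idea of exposing a trained model as a web endpoint can be illustrated with a minimal sketch built only on the Python standard library. This is not DeepPavlov's or ChatGLM3's actual server code: the `/model` path, the `{"x": [...]}` request shape, the `predictions` response key, and the `predict` stand-in are all assumptions chosen for illustration.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(texts):
    # Hypothetical stand-in for a trained model:
    # returns the whitespace-token count of each input string.
    return [len(text.split()) for text in texts]


class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only the (assumed) /model path is served.
        if self.path != "/model":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps({"predictions": predict(payload.get("x", []))}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress per-request logging to keep the example quiet.
        pass


def serve_in_background(port=0):
    # Port 0 asks the OS for any free port; the chosen port is
    # available afterwards as server.server_port.
    server = HTTPServer(("127.0.0.1", port), ModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(server, texts):
    # Send one POST request to the running server and decode the JSON reply.
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request(
            "POST",
            "/model",
            body=json.dumps({"x": texts}).encode(),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

To try it, start the server with `server = serve_in_background()`, call `query(server, ["some input text"])`, and stop it with `server.shutdown()` followed by `server.server_close()`. A production model server would replace `predict` with real inference and add batching, authentication, and error handling.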