1 repo
Runtime environments for executing machine learning models and integrating their outputs into applications.
Distinguishing note: Focuses on the execution of inference requests rather than model training or development.
Explore 1 awesome GitHub repository matching artificial intelligence & ml · Inference Execution Engines. Refine with filters or upvote what's useful.
Modular is a unified machine learning development platform designed for building, compiling, and deploying high-performance neural network models. It provides a comprehensive execution engine that supports both local and production-grade inference, enabling developers to manage the entire model lifecycle from initial architecture definition to scalable, containerized service deployment. The platform distinguishes itself through a hardware-agnostic runtime that abstracts diverse silicon architectures, allowing models to execute efficiently across varied compute environments. It includes a spec
Executes inference requests against local model endpoints to integrate responses into software applications.