Frameworks that facilitate the retrieval of model weights and configuration files from remote storage.
2 GitHub repositories in Artificial Intelligence & ML · Model Downloaders.
vLLM is a high-throughput inference engine designed for the efficient serving and execution of large language models. It functions as a production-ready distributed model server, providing standard API protocols for online serving while also supporting offline batch processing. The system is built to maximize token gen
Facilitates the retrieval and loading of model weights and configuration files from remote storage for immediate execution.
Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on
Retrieves model weights and configuration files from remote repositories for local deployment.