awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Inference Execution Models · Awesome GitHub Repositories

1 repo

Architectural approaches for managing inference tasks, including the use of sliding windows to maintain context during execution.

Explore 1 awesome GitHub repository matching Artificial Intelligence & ML · Inference Execution Models.

Awesome Inference Execution Models GitHub Repositories

    meta-llama/llama

    59,157 · GitHub

    Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on

    Maintains context within a sliding window buffer to process inference tasks independently without persistent server state.

    Python
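The sliding window buffer mentioned in the entry above can be illustrated with a minimal sketch. It is not taken from the meta-llama/llama codebase: the class name `SlidingWindowContext`, the `max_tokens` parameter, and the use of plain integer token ids are all assumptions made for illustration. The idea is only that the buffer keeps the most recent tokens and silently drops the oldest ones once it is full.

```python
from collections import deque
from typing import Deque, Iterable, List


class SlidingWindowContext:
    """Hypothetical helper: keeps only the most recent `max_tokens` token ids."""

    def __init__(self, max_tokens: int) -> None:
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        # A deque with maxlen discards items from the left when it overflows.
        self._window: Deque[int] = deque(maxlen=max_tokens)

    def extend(self, token_ids: Iterable[int]) -> None:
        """Append new token ids; the oldest ones fall out of the window."""
        self._window.extend(token_ids)

    def tokens(self) -> List[int]:
        """Return the current window contents, oldest first."""
        return list(self._window)


ctx = SlidingWindowContext(max_tokens=4)
ctx.extend([1, 2, 3])
ctx.extend([4, 5, 6])
print(ctx.tokens())  # [3, 4, 5, 6]
```

A bounded deque is used here because it makes the eviction rule explicit; a real engine would more likely slice a tensor or manage a key/value cache, which this sketch does not attempt to model.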

Explore sub-tags

  • Stateless Inference Engines: Systems that process inference requests independently without maintaining persistent server-side state between calls.
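
To make the "no persistent server-side state" idea concrete, here is a small sketch under stated assumptions. The names `handle_request` and `toy_next_token` are hypothetical, and `toy_next_token` is a deterministic stand-in for a model forward pass, not a real model. Everything the handler needs arrives in the call, and nothing is stored between calls.

```python
from typing import List, Sequence


def toy_next_token(context: Sequence[int]) -> int:
    """Stand-in for a model forward pass (assumption): depends on context only."""
    return (sum(context) + len(context)) % 100


def handle_request(prompt_ids: Sequence[int], max_new_tokens: int, window: int) -> List[int]:
    """Generate tokens for one request without reading or writing shared state."""
    if window <= 0:
        raise ValueError("window must be positive")
    # The context is local to this call and trimmed to the last `window` tokens.
    context = list(prompt_ids)[-window:]
    generated: List[int] = []
    for _ in range(max_new_tokens):
        nxt = toy_next_token(context)
        generated.append(nxt)
        context = (context + [nxt])[-window:]
    return generated


# Two identical requests give identical results because no state carries over.
first = handle_request([1, 2, 3], max_new_tokens=3, window=4)
second = handle_request([1, 2, 3], max_new_tokens=3, window=4)
print(first == second)  # True
```

Because the function is pure, any worker can serve any request, at the cost of the client resending its context each time.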