8 repos
Neural network designs utilizing stacked attention layers to process sequences and capture long-range dependencies.
Explore 8 awesome GitHub repositories matching artificial intelligence & ml · Transformer. Refine with filters or upvote what's useful.
Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering
This project is a speech recognition and translation engine that utilizes a sequence-to-sequence transformer architecture to convert audio into text. It is built upon a weakly supervised learning framework, which leverages large-scale, unlabelled audio-transcript data to create generalized speech representations capabl
This project is an open-source, interactive educational platform designed to teach deep learning through a comprehensive, code-first curriculum. It provides a structured learning path that covers foundational mathematics, modern neural network architectures, and practical optimization techniques, enabling practitioners
This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as we
This project is a comprehensive educational resource and knowledge base dedicated to the development and application of large language models and autonomous agentic systems. It provides a structured framework for understanding prompt engineering, context management, and the architectural patterns required to build task
Codex is an automated programming tool and generative code assistant designed to interpret developer intent through a natural language interface. It functions as a machine learning model trained on public code repositories to provide intelligent code completion, suggestions, and refactoring within development environme
Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on
nanoGPT is a lightweight engine for training and fine-tuning transformer-based language models from scratch. It provides a minimalist codebase designed for educational exploration and rapid experimentation with neural network architectures, utilizing self-attention and feed-forward layers to process sequences and predi