Structural designs and mathematical patterns used to define the internal connectivity and data flow of neural networks.
Explore 30 GitHub repositories matching Artificial Intelligence & ML · Architectures.
Stable Diffusion Web UI is a browser-based interface designed for managing text-to-image generation tasks. It provides a centralized dashboard for controlling generative processes, including native support for multi-stage model architectures to facilitate high-quality image refinement. The platform distinguishes itsel
Powers the execution and management of complex generative media workflows through a graphical interface.
Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering
Exposes a registry-based interface for implementing custom attention mechanisms or modifying existing model behaviors.
This project is a comprehensive, open-source educational curriculum designed to guide developers through the mastery of generative artificial intelligence. It provides a structured learning path that covers foundational concepts, prompt engineering, and the practical application of large language models. The repository
Focuses on utilizing smaller, efficient language models for practical deployment.
DeepSeek-V3 is a large language model that provides comprehensive resources for model utilization, including technical specifications, pre-trained weights, and evaluation benchmarks. The project details the core transformer architecture, including parameter counts and multi-token prediction modules, while supporting na
Standardized performance benchmarks and technical specifications allow for rigorous analysis of capabilities against industry-recognized metrics.
PyTorch is a machine learning framework centered on a GPU-ready tensor library that supports multi-dimensional array operations across both CPU and accelerator hardware. It provides a foundational infrastructure for mathematical computation and dynamic neural network construction, utilizing a tape-based automatic diffe
Organizes neural network architectures through modular base classes and container types for custom layer management.
This project is a speech recognition and translation engine that utilizes a sequence-to-sequence transformer architecture to convert audio into text. It is built upon a weakly supervised learning framework, which leverages large-scale, weakly labeled audio-transcript data to create generalized speech representations capabl
Maps variable-length audio input sequences to text output sequences using deep learning and byte-level tokenization.
This repository serves as an educational framework for building large language models from the ground up. It provides a structured curriculum that guides learners through the end-to-end lifecycle of model development, including data processing, architecture design, and optimization. By focusing on low-level implementat
Implements gradient-based optimization logic manually to clarify the mechanics of weight updates and loss minimization.
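As a rough illustration of what "manual" gradient-based optimization can look like, the sketch below fits a line to toy data with hand-derived gradients of a mean squared error loss. It is not taken from the repository; the function names (`mse_loss`, `gradients`, `train`), the toy data, and the hyperparameters are all assumptions chosen for the example.

```python
# Minimal sketch (not the repository's code): fit y = w * x + b by
# plain gradient descent, with the gradients written out by hand.

def mse_loss(w, b, xs, ys):
    """Mean squared error of the line w * x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(w, b, xs, ys):
    """Partial derivatives of the MSE loss with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def train(xs, ys, lr=0.05, steps=2000):
    """Repeat the weight update: parameter -= learning_rate * gradient."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = gradients(w, b, xs, ys)
        w -= lr * dw
        b -= lr * db
    return w, b

# Toy data generated from y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train(xs, ys)
print(w, b, mse_loss(w, b, xs, ys))
```

Each step moves the parameters against the gradient, so the loss shrinks as long as the learning rate is small enough for the data at hand.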
This repository serves as a centralized collection of state-of-the-art deep learning architectures and reference implementations designed for research and application development. It provides a comprehensive toolkit for computer vision and natural language processing, offering pre-built models and training pipelines fo
Houses a centralized library of state-of-the-art deep learning architectures and verified reference implementations.
This project is an open-source, interactive educational platform designed to teach deep learning through a comprehensive, code-first curriculum. It provides a structured learning path that covers foundational mathematics, modern neural network architectures, and practical optimization techniques, enabling practitioners
Models sequential dependencies in data through clear, code-based implementations of recurrent neural network structures.
This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as we
Details the mechanics of stacked attention layers used to process sequences and capture long-range dependencies.
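To make the idea of stacked attention layers concrete, here is a heavily simplified sketch of scaled dot-product self-attention applied in a loop. It is an illustration only and not the project's material: it omits learned query/key/value projections, multiple heads, layer normalization, and feed-forward blocks, and the names `softmax`, `attention`, and `stacked_self_attention` are assumptions.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def stacked_self_attention(x, num_layers=2):
    """Apply self-attention repeatedly with a residual connection."""
    for _ in range(num_layers):
        attended = attention(x, x, x)
        x = [[a + b for a, b in zip(row, att)]
             for row, att in zip(x, attended)]
    return x

# Three toy token vectors of dimension 2 (values are arbitrary).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(stacked_self_attention(tokens))
```

Because every position attends to every other position in a single step, distant tokens can influence each other without passing through intermediate states, which is the usual argument for attention handling long-range dependencies.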
The algorithm is a distributed recommendation engine pipeline designed to construct and serve personalized content timelines. It functions as a multi-stage orchestration layer that aggregates candidate content from diverse social graphs and high-dimensional embedding spaces, processing user interaction data to deliver
Uses shared model architectures to predict multiple engagement signals simultaneously for optimized content relevance.
Tesseract is a neural network-based optical character recognition engine designed to convert scanned images and digital documents into machine-readable, searchable text. It functions as both a command-line utility for automating large-scale digitization workflows and a cross-platform library that can be embedded into d
Models sequential dependencies in text across diverse scripts and languages using advanced neural network architectures.
LobeHub is a comprehensive multi-agent orchestration platform designed for building, configuring, and deploying specialized AI agents. It provides a unified chat-based gateway that allows users to manage autonomous agent teams across web, desktop, and mobile environments. By utilizing a framework that supports persiste
Coordinates autonomous agents to work in concert on complex, long-horizon objectives and organizational tasks.
Stable Diffusion is a generative machine learning pipeline that synthesizes high-resolution visual content by performing iterative denoising within a compressed latent space. By mapping natural language embeddings into pixel outputs through conditioned probabilistic processes, the framework enables the generation of im
Maps pixel data into compact latent spaces to facilitate the synthesis of new visual media.
This project is an artificial intelligence-powered frontend generator that translates visual design inputs into functional source code. It functions as a workflow engine that interprets graphical user interfaces, mapping layout structures and styling rules to structured markup and programming language syntax. The tool
Processes visual design inputs through neural networks to interpret layout structures and translate them into functional source code.
This project is a comprehensive, community-driven directory of machine learning resources, software libraries, and educational materials. It serves as a centralized knowledge base for developers and researchers, organizing tools and frameworks by their primary programming language and technical domain to simplify disco
Identifies frameworks dedicated to the design and deployment of networks that simulate biological spiking patterns.
This project is a technical learning resource and developer knowledge base focused on the integration of large language models into software applications. It provides a structured collection of guides and code examples designed to teach developers how to implement intelligent features using proven patterns and best pra
Implements architectural strategies like retrieval-augmented generation to connect language models with external data sources.
This project is a comprehensive educational resource and knowledge base dedicated to the development and application of large language models and autonomous agentic systems. It provides a structured framework for understanding prompt engineering, context management, and the architectural patterns required to build task
Summarizes technical architectures capable of transforming text or image inputs into sequential video frames.
Scikit-learn is a machine learning library for predictive data analysis that provides a collection of algorithms for supervised and unsupervised learning. It functions as a comprehensive toolkit for data preprocessing, dimensionality reduction, and model selection, allowing users to classify data objects, predict conti
Chains data transformation and model estimation steps into sequential, reproducible workflows using a unified interface.
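The chaining pattern described above can be sketched in plain Python. This is a toy imitation of the fit/transform/predict convention, not scikit-learn's own implementation or API surface; `Standardize`, `ThresholdClassifier`, and `SimplePipeline` are hypothetical classes invented for the example.

```python
class Standardize:
    """Hypothetical transformer: rescale values to zero mean, unit variance."""
    def fit(self, xs, ys=None):
        n = len(xs)
        self.mean = sum(xs) / n
        var = sum((x - self.mean) ** 2 for x in xs) / n
        self.std = var ** 0.5 or 1.0  # fall back to 1.0 for constant input
        return self

    def transform(self, xs):
        return [(x - self.mean) / self.std for x in xs]

class ThresholdClassifier:
    """Hypothetical estimator: split at the midpoint of the class means."""
    def fit(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]

class SimplePipeline:
    """Run transformers in order, then hand the result to the estimator."""
    def __init__(self, transformers, estimator):
        self.transformers = transformers
        self.estimator = estimator

    def fit(self, xs, ys):
        for t in self.transformers:
            xs = t.fit(xs, ys).transform(xs)
        self.estimator.fit(xs, ys)
        return self

    def predict(self, xs):
        # Reuse the statistics learned during fit; never refit on new data.
        for t in self.transformers:
            xs = t.transform(xs)
        return self.estimator.predict(xs)

pipe = SimplePipeline([Standardize()], ThresholdClassifier())
pipe.fit([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0, 0, 0, 1, 1, 1])
print(pipe.predict([0.0, 13.0]))
```

Bundling the steps behind one `fit` and one `predict` keeps preprocessing statistics tied to the training data, which is what makes the workflow reproducible.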
Keras is a high-level deep learning framework designed for constructing and training neural networks through the composition of modular, functional layers. It serves as a comprehensive modeling toolkit that provides standardized procedures for defining, evaluating, and deploying complex architectures. By utilizing a di
Composes neural networks using reusable, functional layers that perform specific mathematical transformations on input data.