30 open-source projects similar to google/seq2seq, ranked by how many features each one shares with it. Compare stars, activity, and what each one does to find the best Seq2seq alternative.
This repository is a collection of practical deep learning implementations and examples built using the TensorFlow framework. It provides a variety of neural network architectures focusing on natural language processing, recommendation systems, reinforcement learning, and time series prediction. The project features a range of specialized models, including sequence-to-sequence and transformer architectures for text processing, and factorization machines for personalized ranking and retrieval. It also includes implementations of reinforcement learning agents using actor-critic and policy gradi
This project is a neural machine translation system used to build models that automatically translate text from one language to another. It utilizes sequence-to-sequence modeling to transform variable-length input sequences into corresponding output sequences. The system implements bidirectional recurrent neural network encoding and attention mechanisms to capture contextual information and focus on specific parts of the source text during translation. To manage training and inference, it employs separate computational graphs and supports distributing model layers across multiple GPU devices.
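As a rough illustration of the attention step described above, the sketch below scores each encoder state against the current decoder state, normalizes the scores with a softmax, and returns the weighted sum as a context vector. It assumes plain dot-product scoring, which the description does not confirm for this project; the function names (`softmax`, `attend`) and the toy vectors are hypothetical and not taken from the project's code.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Dot-product attention over a list of encoder state vectors.

    Assumption: dot-product scoring; a real system may use an additive
    or otherwise parameterized scoring function instead.
    """
    # One score per source position: how well it matches the decoder state.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    # Context vector: attention-weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three source positions with 2-dimensional states (made up).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_query = [2.0, 0.0]
weights, context = attend(decoder_query, encoder_states)
print(weights)
print(context)
```

With these toy values, the first and third source positions match the decoder state equally well and receive the largest weights, while the second position receives the smallest. This is the sense in which the decoder "focuses" on parts of the source text.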
This project is a comprehensive collection of educational examples and reference implementations for building vision and language models using PyTorch. It serves as a deep learning tutorial covering the end-to-end process of developing neural networks, from initial architecture definition to final production deployment. The repository provides detailed guides on implementing a wide range of domain-specific models, including convolutional neural networks for object detection and segmentation, as well as transformer and recurrent architectures for natural language processing. It emphasizes gene
This project is an educational codebase and reference library that translates theoretical deep learning concepts into executable PyTorch code. It serves as a practical implementation of a deep learning textbook, providing a course-like structure of guided exercises and architectural examples for learning purposes. The repository includes a library of standard neural network architectures, including linear, convolutional, recurrent, and transformer models. It specifically implements a variety of deep learning patterns such as multilayer perceptrons, VGG networks, gated recurrent units, and lon
This project is a TensorFlow implementation of a transformer model, providing a text-to-text deep learning framework designed to recognize and generate sequence patterns. It functions as an attention-based sequence model and a neural machine translation framework for converting text from one language to another. The system implements the transformer network architecture, utilizing multi-head attention and positional encoding to process sequential data. It provides the necessary tools for transformer model training and machine translation inference, allowing for the execution of trained models
ESPnet is a comprehensive speech processing toolkit and PyTorch-based trainer designed for building end-to-end speech recognition, synthesis, and translation models. It provides a structured framework for developing automatic speech recognition systems using transducer and encoder-decoder architectures, alongside engines for text-to-speech synthesis and speech translation pipelines. The project distinguishes itself through a recipe-based workflow execution system that ensures experimental reproducibility by running standardized sequences of scripts for data preparation and model training. It
Practical PyTorch is a collection of deep learning tutorials and guides focused on implementing recurrent neural networks. The project provides practical code for building sequence models and sequence-to-sequence architectures using the PyTorch framework. The repository covers the implementation of models for neural machine translation, character-level text generation, and text classification. It includes examples for transforming input sequences into output sequences for machine translation and synthesizing new text. The project also extends to sequence data prediction and time series analy
Fairseq is a deep learning research toolkit and sequence-to-sequence framework built on PyTorch. It provides a system for training and deploying models that map input sequences to output sequences, with a primary focus on neural machine translation and speech recognition. The toolkit allows for the generation of text sequences through search algorithms such as beam search and nucleus sampling. It includes capabilities for producing synthetic parallel training data by translating monolingual text using reverse sequence models. The framework supports large scale model training through multi-de
This project is a collection of structured study notes and notebooks serving as an educational resource for deep learning and neural network fundamentals. It provides a technical reference for implementing machine learning theory, covering everything from basic network design to the construction of advanced architectures. The material specifically focuses on the implementation of convolutional neural networks for computer vision and sequence models for natural language processing. It includes detailed guidance on building object detection systems, face recognition, and speech transcription mo
Neuraltalk is an automated image captioning system that generates natural language descriptions for images. It utilizes a deep learning model that integrates a pretrained convolutional neural network for visual feature extraction with a recurrent neural network decoder to produce text sequences. The project provides a full workflow for training and evaluating captioning models, including weight optimization via backpropagation and gradient descent. It includes tools for measuring caption accuracy by comparing generated text against reference descriptions. The system covers data preprocessing
This project is a collection of PyTorch learning resources and educational guides designed to teach the construction and training of neural networks. It serves as a comprehensive deep learning tutorial covering various model architectures and practical implementation strategies. The resources provide specific guidance on implementing computer vision tasks, such as image classification and synthetic imagery generation, as well as reinforcement learning agents using value networks and experience replay. It also covers sequential data modeling through recurrent networks and generative modeling u
This project is a collection of educational resources and instructional guides for learning deep learning and neural network implementation using TensorFlow. It provides a structured set of tutorials and notebooks written in Chinese, covering supervised and unsupervised learning tasks. The material focuses on practical implementations of diverse neural network architectures, including convolutional, recurrent, and autoencoder networks. It includes specific training content for computer vision, natural language processing, and generative models. The coverage extends to specialized network arc
ParlAI is a conversational AI research framework designed for training, evaluating, and sharing dialogue models using a unified interface for datasets and agents. It functions as a PyTorch-based training platform and a dialogue data collection system, providing a centralized model zoo for the distribution of versioned pretrained agents. The project distinguishes itself through a knowledge-grounded retrieval system that combines dense and sparse indexing to ground responses in external information. It also provides a comprehensive infrastructure for gathering human-AI interaction data via inte
Fairseq is a PyTorch toolkit for sequence-to-sequence modeling, specializing in neural machine translation, automatic speech recognition, and large-scale language model training. It provides a framework for processing and aligning diverse data sources, including text, audio, and video, to support tasks such as speech-to-text conversion and multimodal sequence learning. The project is distinguished by its distributed training capabilities, which utilize parameter sharding, mixed-precision training, and CPU offloading to handle models that exceed single-device memory. It also includes specializ
This repository serves as an educational resource for learning the foundational architectures of natural language processing through concise code implementations. It provides a structured collection of deep learning models designed to process and understand human language, focusing on the core mechanics of neural network sequence modeling and text analysis. The project distinguishes itself by offering direct, hands-on implementations of complex architectures, including Transformers, attention mechanisms, and word embedding generation. By utilizing tensor-based computational graphs and gradien
OpenNMT-py is a PyTorch neural machine translation framework used for training and deploying neural machine translation and large language models. It functions as a distributed model training system, an inference engine, and a toolkit for fine-tuning large language models. The framework distinguishes itself with a dedicated toolkit for adapting large language models through low-rank adaptation, quantization, and instruction tuning. It also includes a neural machine translation server that allows trained models to be hosted and exposed via REST API endpoints. The project covers a broad range
The TensorFlow Cookbook is a collection of code examples and recipes for building, training, and deploying machine learning models using TensorFlow. It covers the full model lifecycle, from constructing neural networks and training them with configurable parameters to packaging trained models for production deployment with unit tests and multi-device support. The project also integrates TensorBoard for logging and visualizing computational graphs, scalar summaries, and histograms during training. The cookbook demonstrates a wide range of machine learning techniques, including convolutional ne
This project is a collection of deep learning research implementations and a reproduction kit designed to translate theoretical AI papers into working code. It provides a library of neural network architectures and reference implementations for reproducing seminal research concepts through interactive notebooks. The repository distinguishes itself through the implementation of AI theory and scaling laws, covering complexity dynamics, information theory, and the simulation of universal AI agents. It also includes a benchmarking suite for synthetic reasoning, allowing for the evaluation of mode
This project is a neural script generator that uses a recurrent neural network to synthesize human-like handwriting. It maps ASCII text characters to realistic pen stroke coordinates through an attention mechanism to mimic natural writing patterns. The system allows for handwriting style customization by adjusting priming and biasing parameters to control the neatness and stylistic characteristics of the generated text. Users can also define output formatting, including stroke colors and line widths, for the resulting digital scripts. The project includes a full neural network training workf
This project is a collection of TensorFlow machine learning examples providing reference implementations for various neural network paradigms. It covers supervised, unsupervised, reinforcement, and sequential learning models. The repository includes implementations for convolutional neural networks focused on image classification and ranking, as well as recurrent neural networks for time-series forecasting and sequence-to-sequence translation. It further provides examples of reinforcement learning agents trained via reward optimization and unsupervised learning techniques such as autoencoders
This is a collection of educational Jupyter Notebook tutorials that teach sequence-to-sequence modeling using PyTorch and TorchText, focused on neural machine translation. The project provides hands-on guides for building and training encoder-decoder architectures with recurrent neural networks like LSTM and GRU, implementing attention mechanisms that allow the decoder to focus on relevant input tokens during sequence generation. The tutorials cover the full pipeline of machine translation, from tokenizing multilingual text using language-specific tokenizers to training multi-layer encoder-de
This project is a comprehensive educational resource and tutorial handbook for building, training, and deploying machine learning models using TensorFlow 2. It serves as a structured learning guide covering core deep learning concepts, including neural network architectures, automatic differentiation, and tensor operations. The handbook provides technical guidance on optimizing execution efficiency through GPU memory management, distributed training, and model quantization. It also includes detailed manuals for constructing high-performance data pipelines and exporting models for production s
This repository is a deep learning educational resource and a neural network project suite. It provides a collection of practical TensorFlow implementations and coding projects designed to demonstrate the application of various neural network architectures to real-world data. The project includes specific samples for generative adversarial networks, focusing on synthetic image generation and style translation. It also provides examples of deep learning model construction across different learning paradigms. The codebase covers a broad range of capabilities, including computer vision for imag
The Annotated Transformer is an educational resource that provides annotated code implementations of the Transformer architecture for sequence-to-sequence tasks, built with PyTorch. It serves as a learning tool for understanding attention mechanisms, multi-head parallel attention, and scaled dot-product attention through executable examples that walk through each component of the model. The project covers the full Transformer pipeline, including stacked encoder-decoder layers with residual connections and layer normalization, sinusoidal positional encoding for order-aware representation, and
This project is a deep learning poetry generator designed to create traditional Chinese couplets. It utilizes a sequence-to-sequence neural network architecture to map input text sequences to matching output sequences, functioning as a text generation model and an inference web service. The system features a neural text ranking mechanism that evaluates candidate outputs based on length consistency and character patterns to ensure structural alignment. It also includes a content filtering process that scans generated text against forbidden word lists to remove sensitive or inappropriate materi
This project is a comprehensive educational curriculum and structured learning path covering the full lifecycle of large language models. It provides a guided progression through the theory, architecture, training, and deployment of these models. The curriculum includes specialized guides on transformer architecture, model training tutorials, and frameworks for designing autonomous agents. It also provides dedicated resources for studying model safety and ethics. The material covers a wide range of technical capabilities, including distributed training strategies, parameter-efficient fine-tu
This project is a collection of educational examples and code for implementing deep learning architectures using the PyTorch framework. It serves as a tutorial and implementation guide for building various neural network architectures for machine learning tasks. The project provides practical implementations for computer vision, including image classification and neural style transfer, as well as natural language processing examples for building sequence models and language predictors. It also covers generative models using adversarial and variational networks to synthesize or transform visua
This project is a TensorFlow-based supervised text categorizer designed for Chinese natural language processing. It utilizes a hybrid neural network architecture that combines convolutional and recurrent layers to map raw Chinese text to predefined categories. The system integrates convolutional neural networks for local feature extraction and recurrent neural networks for analyzing sequential dependencies. It employs character-level tokenization and word embeddings to represent text as numerical tensors. The implementation covers the end-to-end machine learning pipeline, including text prep
CTranslate2 is a C++ inference engine and runtime for Transformer models, designed to execute models on both CPU and GPU with optimizations for speed and memory efficiency. It functions as a model format converter, quantization tool, and REST API server, enabling deployment of neural machine translation, automatic speech recognition, and text generation models. The engine distinguishes itself through a suite of runtime optimizations including layer fusion, weight-matrix quantization, batch-by-length grouping, and a caching allocator that reuses GPU memory. It supports tensor-parallel model di
This is a library of generative model architectures built using the TensorFlow framework. It provides implementations for producing synthetic data and realistic images, specifically focusing on Variational Autoencoders and various Generative Adversarial Network variants. The collection includes specific GAN architectures such as WGAN-GP, LSGAN, InfoGAN, and EBGAN. It also features Variational Autoencoders designed to learn latent representations and synthesize new samples from learned distributions. The project covers image processing pipelines for normalizing and cropping data, as well as a