30 open-source projects similar to utkuozbulak/pytorch-cnn-visualizations, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best pytorch-cnn-visualizations alternative.
This project is a computer vision explainable AI library and framework for PyTorch, providing a suite of tools to visualize and audit the internal decision-making processes of deep neural networks. It serves as a neural network attribution tool and debugging utility to identify which image regions drive model predictions. The library is distinguished by its support for both gradient-based and gradient-free attribution methods, allowing for the generation of visual heatmaps and attribution maps without requiring modifications to the original model source code. It further differentiates itself
Tensorpack is a high-level TensorFlow neural network framework and research library designed for building and training deep learning models. It provides a collection of reproducible neural network architectures for computer vision, generative tasks, reinforcement learning, and natural language processing. The project distinguishes itself through a specialized deep learning data pipeline that uses pure Python for parallel data loading and streaming. It includes a multi-GPU training orchestrator for distributing workloads via data-parallel strategies and a dedicated interpretability toolkit for
Lit is a machine learning interpretability framework and model debugging tool designed to analyze model behavior and performance. It serves as an interpretability dashboard for large language models and a general performance analyzer for text, image, and tabular datasets. The project distinguishes itself through a comprehensive suite of interpretability tools, including salience map generation for feature attribution, the creation of synthetic and counterfactual examples to test robustness, and the projection of high-dimensional embeddings into visual spaces via UMAP or PCA. It further enable
This project is a static educational website and comprehensive curriculum focused on computer vision and deep learning. It serves as a public repository of instructional materials, lecture notes, and technical guides specifically detailing convolutional neural networks and visual recognition. The site is developed using static-site generation to host course documentation and student project directories. It provides structured academic resources that guide learners through image classification, generative modeling, and the implementation of various neural network architectures. The curriculum
This project is a comprehensive educational resource and technical manual focused on interpretable machine learning and explainable AI. It serves as a textbook and reference for implementing techniques that make complex machine learning models transparent and understandable to humans. The resource provides guidance on both building inherently transparent models, such as decision trees and sparse linear models, and applying post-hoc explanation methods to black-box systems. It details specific methodologies for quantifying feature importance, generating rationales for individual predictions, a
Pretrained ConvNets for PyTorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResnetV2, Xception, DPN, etc.
Quickly and easily create / train a custom DeepDream model
Visualization toolkit for neural networks in PyTorch!
PyTorch and TensorFlow functional model definitions
This repo is used to research convolutional networks primarily for computer vision tasks. For this purpose, the repo contains (re)implementations of various classification, segmentation, detection, and pose estimation models and scripts for training/evaluating/converting.
This project is a library of pretrained computer vision architectures and backbones for image classification and feature extraction. It serves as a comprehensive model zoo and collection of standardized image encoders, including ResNet, Vision Transformers, and EfficientNet, for use in visual analysis and as backbones for object detection and image segmentation. The library provides a framework for distributed training and evaluation of image models using advanced data augmentation and optimization scripts. It includes a dedicated toolset for converting trained PyTorch vision models into the
This is a PyTorch-based implementation of diffusion models for synthesizing photorealistic images and video. It provides a framework for text-to-image and text-to-video generation, as well as unconditional image synthesis. The system utilizes a cascading diffusion pipeline to produce high-resolution imagery by passing low-resolution outputs through a sequence of super-resolution models. It also includes capabilities for image inpainting, allowing the reconstruction of masked or missing regions of visual media guided by surrounding context and text prompts. The project includes tools for diff
This project is a PyTorch implementation of a discrete variational autoencoder designed to compress high-resolution imagery into discrete latent representations. It encodes images into discrete codes and decodes those codes back into reconstructed images. Moving between raw pixels and discrete code sequences supports generative image synthesis and the analysis of image compression. The implementation covers latent space visuali
pykan is a library for implementing Kolmogorov-Arnold Networks, replacing fixed node activation functions with learnable spline functions located on the network edges. It serves as an interpretable AI framework and symbolic regression tool designed to derive transparent mathematical rules from complex data. The project focuses on converting learned numerical functions into human-readable symbolic expressions through library matching and formula conversion. It utilizes additive-compositional topologies and learnable piecewise polynomial segments to approximate non-linear mappings. The framewo
This project is a TensorFlow-based neural style transfer tool and deep learning image processor. It uses convolutional neural networks to apply the artistic style of one image to the content of another through neural image synthesis. The system supports multi-style blending to combine artistic characteristics from several different images into a single output. It also includes color-preserving stylization, which maintains the original color palette of the source image by merging source color data with the luminance of the stylized result. The tool provides capabilities for style abstraction
This project is an unsupervised image restoration tool that uses a convolutional neural network as a structural prior to reconstruct images from noisy or incomplete data. It functions as a neural network image prior, utilizing the inherent biases of the network architecture to restore pixels without the need for a pre-trained dataset or external learning. The system performs zero-shot image restoration by treating the network architecture itself as a regularization term. It uses a randomly initialized encoder-decoder structure and iterative gradient descent to minimize pixel-wise loss, recove
nlp-recipes is a collection of implementation guides and reference templates for applying natural language processing techniques to real-world tasks. It provides standardized workflows and code examples for developing NLP pipelines, from dataset preparation and model training to performance evaluation. The project focuses on the practical application of transformer-based models, offering patterns for fine-tuning pretrained architectures for tasks such as text classification, named entity recognition, and question answering. It also includes a toolkit for model interpretability, allowing users
This project is a deep learning style transfer framework designed to apply artistic styles to photographs. It functions as a photorealistic image stylizer that merges the content of one image with the visual characteristics of another while maintaining the original geometry and structural details. The system distinguishes itself through the use of matting Laplacian matrices and semantic segmentation masks to prevent distortion and preserve edge fidelity. These capabilities allow for region-specific styling, where different aesthetics can be applied to distinct objects or areas within a single
qrbtf is an AI QR code generator and image synthesis system that blends machine-readable data with artistic imagery. It uses a latent diffusion model and spatial control networks to produce functional QR codes that incorporate visual art generated from descriptive text prompts. The system provides a dedicated interface and programmatic API for tuning visual output, allowing for the adjustment of control strength, padding ratios, and error correction levels. It supports deterministic sampling via random seeds and the use of negative prompts to refine the final aesthetic of the generated assets
Z-Image is an AI image editing engine and generation framework designed for photorealistic synthesis and the refinement of diffusion models. It functions as a multilingual text-to-image renderer and a system for training custom foundation models to generate and edit visuals using natural language instructions. The project distinguishes itself through a reasoning-based prompt enhancer that expands simple descriptions into detailed visual instructions using a structured reasoning chain. It also features specialized capabilities for rendering high-quality Chinese and English typography within ge
OutfitAnyone is a diffusion-based virtual try-on system and AI person-garment integration tool. It functions as an image-to-image clothing transfer model designed to visualize how specific clothing items look on any person regardless of their pose. The system adapts garment textures and shapes to a person's body and pose to produce photorealistic results. It specifically focuses on adjusting clothing deformation based on body shape to maintain high fidelity and detail consistency during the fitting process. The project covers AI fashion visualization and virtual garment fitting, providing ca
This project is a diffusion model training framework and image synthesis pipeline. It provides the tools necessary to train generative models to learn image data distributions through an iterative denoising process. The framework includes a generative model evaluation tool consisting of automated scripts used to measure the quality and accuracy of produced samples. The system covers model training pipelines and performance evaluation for generative diffusion models.
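As a rough illustration of the iterative denoising idea mentioned in the entry above, the sketch below shows how a training example might be noised and scored. It is not taken from the project: the linear schedule, the noise-prediction loss, and every function name (`linear_beta_schedule`, `cumulative_alphas`, `add_noise`, `denoising_loss`) are assumptions in the style of common DDPM write-ups, and it uses only the Python standard library.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule (an assumption, not the project's)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), often written alpha_bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0: sqrt(a) * x0 + sqrt(1 - a) * eps."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


def denoising_loss(predicted_eps, true_eps):
    """Mean squared error between predicted noise and the noise actually added."""
    total = sum((p - e) ** 2 for p, e in zip(predicted_eps, true_eps))
    return total / len(true_eps)


# Usage: noise a toy three-pixel "image" at step 10 of 1000.
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
noisy, noise = add_noise([0.5, -0.5, 0.0], 10, alpha_bars, random.Random(0))
```

In a real training loop, a neural network would receive `noisy` and the step index, predict the noise, and be updated to reduce `denoising_loss`; sampling then runs the learned denoiser step by step from pure noise.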
DeepDream is a deep learning image processor and convolutional neural network art generator designed to synthesize psychedelic imagery and visualize how neural networks interpret visual data. It generates AI art by amplifying patterns recognized by a pre-trained model to produce dream-like effects. The project utilizes a TensorFlow image visualizer to explore how different layers of a neural network perceive images. This is achieved through algorithmic image manipulation and deep learning visualization techniques that transform standard photographs into sty
ComfyUI is a modular generative AI workflow orchestrator and node-based GUI for designing and executing complex diffusion model pipelines. It functions as both a visual interface for building generative logic graphs and a programmable backend API that exposes diffusion model operations for external integration. The system distinguishes itself through a graph-based execution model that supports differential workflow execution, re-running only modified nodes to reduce computation. It features dynamic model offloading to manage memory between system RAM and GPU VRAM and utilizes metadata-embedde
Text to image synthesis using thought vectors
ICML'23 StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis