This is a PyTorch CNN visualization toolkit designed for neural network interpretability. It provides a set of tools to explain model decisions and analyze the internal behavior of convolutional neural networks through the visualization of activations, gradients, and filters. The project implements specialized techniques for synthesizing representative images, including Deep Dream optimizations to amplify patterns and class-specific image generation via input optimization. It also features a saliency map generator that produces gradient-based heatmaps to identify the specific image regions in
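
The gradient-based saliency idea mentioned in this description can be illustrated with a minimal, dependency-free sketch. It is not the toolkit's own code: `class_score`, `saliency_map`, the toy weights, and the 2x2 "image" are all hypothetical names and values chosen for illustration, and the gradient is approximated with central finite differences. A PyTorch version would instead obtain the gradient of the class score with respect to the input from autograd.

```python
import math

def class_score(pixels, weights, bias=0.0):
    """Toy stand-in for a network's class score: tanh of a weighted pixel sum."""
    z = sum(w * p for w, p in zip(weights, pixels)) + bias
    return math.tanh(z)

def saliency_map(pixels, score_fn, eps=1e-4):
    """Approximate |d score / d pixel| for every pixel with central differences."""
    saliency = []
    for i in range(len(pixels)):
        up = list(pixels)
        down = list(pixels)
        up[i] += eps
        down[i] -= eps
        grad = (score_fn(up) - score_fn(down)) / (2 * eps)
        saliency.append(abs(grad))
    # Normalize to [0, 1] so the values can be drawn as a heatmap.
    peak = max(saliency)
    return [s / peak if peak > 0 else 0.0 for s in saliency]

# A flattened 2x2 "image" with hypothetical weights: the last pixel
# influences the score most, the second pixel not at all.
image = [0.2, 0.5, 0.1, 0.3]
weights = [0.5, 0.0, -0.25, 1.0]
heatmap = saliency_map(image, lambda px: class_score(px, weights))
```

Pixels with larger normalized values are those where a small change moves the class score the most, which is what a saliency heatmap highlights.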
YOLO-World is a vision-language framework and open-vocabulary object detection model. It identifies objects in images and video based on free-form text prompts without requiring predefined category labels. The system enables the identification of arbitrary objects by fusing image features with text embeddings. It includes a specialized tool for automated image labeling, which generates bounding box annotations for custom datasets using text-based prompts. The project provides a deployment pipeline for converting models into quantized ONNX and TFLite formats, supporting real-time inference on
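
To make the open-vocabulary idea concrete, here is a simplified sketch of labeling image regions by comparing their features with text-prompt embeddings. It is an assumption-laden illustration, not YOLO-World's actual fusion mechanism or API: `cosine`, `label_regions`, the threshold, and the 3-dimensional embeddings are hypothetical, and a real system would obtain the vectors from trained image and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def label_regions(region_feats, prompt_embeds, threshold=0.5):
    """Assign each region its best-matching prompt, or None below the threshold."""
    labels = []
    for feat in region_feats:
        best_prompt, best_score = None, threshold
        for prompt, emb in prompt_embeds.items():
            score = cosine(feat, emb)
            if score >= best_score:
                best_prompt, best_score = prompt, score
        labels.append(best_prompt)
    return labels

# Hypothetical embeddings for two free-form prompts and three candidate regions.
prompts = {"red backpack": [1.0, 0.0, 0.0], "traffic cone": [0.0, 1.0, 0.0]}
regions = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.0, 1.0]]
labels = label_regions(regions, prompts)
```

Because the label set is just the dictionary of prompts, new categories can be added by supplying new text rather than retraining on a fixed class list; the third region matches neither prompt and is left unlabeled.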