Why is exacity/deeplearningbook-chinese recommended for Convolutional Kernel Optimizations?

Explains how to represent multi-dimensional kernels as series of 1D convolutions to reduce computation.

Why is tile-ai/tilelang recommended for Convolutional Kernel Optimizations?

Defines matrix-matrix convolution computations with configurable dimensions, data types, and optional bias.

Why is ai-dawang/PlugNPlay-Modules recommended for Convolutional Kernel Optimizations?

Provides Poly Kernel Inception Blocks that process features through parallel depthwise convolutions with varying dilations.

Why is TingsongYu/PyTorch-Tutorial-2nd recommended for Convolutional Kernel Optimizations?

Renders weight tensors from convolutional layers as images to analyze learned patterns.

4 repositories

Awesome GitHub Repositories: Convolutional Kernel Optimizations

Methods for reducing the computational cost of convolutions by decomposing high-dimensional kernels.

Distinct from One-Dimensional Convolutions: Existing candidates focus on 1D convolutions or hardware caching, not mathematical kernel decomposition.
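The decomposition described above can be illustrated with a separable kernel. The following is a minimal pure-Python sketch, assuming the 2D kernel is rank-1 (an outer product of a column vector and a row vector); the function names and the sample image are illustrative and do not come from any of the listed repositories. It checks that two 1D passes give the same result as one direct 2D pass.

```python
def conv1d_valid(signal, kernel):
    """'Valid' 1D cross-correlation of two lists (no padding, no kernel flip)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv2d_valid(image, kernel):
    """Direct 2D cross-correlation: kh * kw multiplications per output pixel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def separable_conv2d_valid(image, col_vec, row_vec):
    """Same result for kernel = outer(col_vec, row_vec), using two 1D passes."""
    # Pass 1: filter every row with the horizontal 1D kernel.
    rows_done = [conv1d_valid(row, row_vec) for row in image]
    # Pass 2: filter every column of the intermediate result with the vertical kernel.
    cols = [conv1d_valid(list(col), col_vec) for col in zip(*rows_done)]
    # Transpose back to row-major layout.
    return [list(row) for row in zip(*cols)]


col_vec = [1, 2, 1]    # vertical smoothing (illustrative values)
row_vec = [-1, 0, 1]   # horizontal difference (illustrative values)
kernel_2d = [[a * b for b in row_vec] for a in col_vec]  # rank-1 2D kernel

# Small deterministic test image, 5 rows by 6 columns.
image = [[(r * 7 + c * 3) % 11 for c in range(6)] for r in range(5)]

direct = conv2d_valid(image, kernel_2d)
separable = separable_conv2d_valid(image, col_vec, row_vec)
print(direct == separable)
```

For a kh x kw kernel, the direct pass multiplies kh * kw values per output pixel, while the two 1D passes need roughly kh + kw. The saving only applies when the kernel factorizes this way; a general kernel would need a low-rank approximation first.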

Explore 4 awesome GitHub repositories matching Artificial Intelligence & ML · Convolutional Kernel Optimizations.


exacity/deeplearningbook-chinese
37,285 · View on GitHub
This project is a comprehensive Chinese translation of a technical deep learning textbook, providing an educational resource on the theory and implementation of neural networks. It functions as a collaborative technical translation project designed to make complex academic AI literature accessible to non-English speakers. The project utilizes a community-driven translation model that integrates external suggestions and pull requests to refine linguistic accuracy and reduce bias. It employs standardized terminology mapping to ensure a uniform vocabulary throughout the translated content. To i
Explains how to represent multi-dimensional kernels as series of 1D convolutions to reduce computation.
TeX
tile-ai/tilelang
5,226 · View on GitHub
TileLang is a Python-embedded domain-specific language compiler that JIT-compiles and autotunes GPU kernels. It uses a tile-based DSL, automatic software pipelining, and parallel autotuning to generate optimized GPU kernels at runtime. It supports tensor core operations with Pythonic syntax, automatic memory management, and thread mapping. The compiler searches over tile sizes, thread counts, and scheduling policies, compiling and benchmarking candidates in parallel to find the fastest kernel. It also caches compiled binaries and tuning results to disk for reuse across sessions. TileLang inc
Defines matrix-matrix convolution computations with configurable dimensions, data types, and optional bias.
Python
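One common way to express a convolution as a matrix-matrix computation is to unroll image patches into rows (often called im2col) and multiply by a flattened filter matrix, adding a per-filter bias if one is given. The sketch below is plain Python under that assumption; it is not TileLang code and does not use the TileLang API, and all function names are hypothetical.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2D image into one row of a matrix."""
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [image[r + i][c + j] for i in range(kh) for j in range(kw)]
        for r in range(out_h) for c in range(out_w)
    ]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def conv2d_as_gemm(image, filters, bias=None):
    """Apply several filters to one single-channel image via one matrix product.

    filters: list of kh x kw kernels; bias: optional list, one value per filter.
    Returns a matrix of shape (num_patches, num_filters).
    """
    kh, kw = len(filters[0]), len(filters[0][0])
    patches = im2col(image, kh, kw)                             # (P, kh*kw)
    weight = [[v for row in f for v in row] for f in filters]   # (F, kh*kw)
    weight_t = [list(col) for col in zip(*weight)]              # (kh*kw, F)
    out = matmul(patches, weight_t)                             # (P, F)
    if bias is not None:
        # Add each filter's bias to its output column.
        out = [[v + b for v, b in zip(row, bias)] for row in out]
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filters = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
result = conv2d_as_gemm(image, filters, bias=[10, 0])
print(result)  # [[16, 6], [18, 8], [22, 12], [24, 14]]
```

Casting the operation as one large matrix product is what lets tiled GPU matrix-multiply kernels be reused for convolution; the cost is the extra memory for the unrolled patch matrix.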
ai-dawang/PlugNPlay-Modules
4,968 · View on GitHub
PlugNPlay-Modules is a collection of reusable PyTorch computer vision modules and deep learning architectural components. It provides a library of standardized building blocks for constructing neural networks, focusing on attention mechanisms, signal processing layers, and feature fusion modules. The project is distinguished by its extensive variety of attention primitives, covering spatial, channel, and temporal weighting, as well as specialized variants like deformable, frequency-enhanced, and linear-complexity attention. It also implements advanced signal processing tools within the neural
Provides Poly Kernel Inception Blocks that process features through parallel depthwise convolutions with varying dilations.
Python
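The idea of parallel depthwise convolutions with varying dilations can be sketched in a simplified 1D form. The code below is a pure-Python illustration under stated assumptions: each channel is filtered only by its own kernel (depthwise), one branch runs per dilation, and the branches are fused by summation. It is not the repository's Poly Kernel Inception Block, and the function names and fusion rule are hypothetical.

```python
def dilated_conv1d_same(signal, kernel, dilation):
    """1D cross-correlation with zero padding so output length == input length."""
    k = len(kernel)
    center = k // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            idx = i + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < n:                   # out-of-range taps read as zero
                acc += signal[idx] * kernel[j]
        out.append(acc)
    return out


def multi_dilation_depthwise(channels, kernels, dilations):
    """Filter each channel with its own kernel at several dilations, then sum.

    channels: list of 1D signals, one per channel.
    kernels: kernels[d][ch] is the kernel for dilation branch d and channel ch.
    """
    fused = []
    for ch, signal in enumerate(channels):
        branches = [
            dilated_conv1d_same(signal, kernels[d][ch], dilation)
            for d, dilation in enumerate(dilations)
        ]
        # Element-wise sum across the parallel branches (assumed fusion rule).
        fused.append([sum(vals) for vals in zip(*branches)])
    return fused


channels = [[0, 0, 0, 1, 0, 0, 0], [1, 1, 1, 1, 1, 1, 1]]
dilations = [1, 2]
ones = [1, 1, 1]
kernels = [[ones, ones], [ones, ones]]  # same 3-tap kernel in every branch

fused = multi_dilation_depthwise(channels, kernels, dilations)
print(fused[0])  # [0.0, 1.0, 1.0, 2.0, 1.0, 1.0, 0.0]
```

The impulse in the first channel shows the effect: the dilation-2 branch reaches positions two steps away using the same three weights, so the receptive field widens without adding parameters.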
TingsongYu/PyTorch-Tutorial-2nd
4,555 · View on GitHub
This project is a comprehensive educational resource and training course for building neural networks with PyTorch. It covers the fundamental building blocks of deep learning, including tensor manipulation, automatic differentiation, and the construction of modular neural network components. The repository serves as a technical guide to several specialized domains. It provides implementation details for computer vision tasks such as image classification, object detection, and semantic segmentation, as well as natural language processing workflows involving transformers, recurrent networks, and generative models. It also includes a reference for generative AI, with a particular focus on image synthesis via diffusion models and adversarial networks. The material extends to model optimization and deployment pipelines. It covers techniques for reducing model size and increasing inference speed through quantization and exporting models to formats such as ONNX and TensorRT. Other areas covered include data engineering for parallel loading, model evaluation with custom metrics, and deployment of open-source large language models. The project is presented primarily as a series of Jupyter notebooks.
Renders weight tensors from convolutional layers as images to analyze learned patterns.
Jupyter Notebook · computer-vision · deepsort · diffusion-models
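Rendering convolutional weights as images usually comes down to rescaling each kernel to a displayable pixel range. The sketch below is a dependency-free illustration, assuming min-max scaling to 8-bit grayscale and the plain-text PGM format as output; the weights are hypothetical stand-ins for one learned filter, and this is not the tutorial's own code.

```python
def kernel_to_pixels(kernel):
    """Min-max scale a 2D weight matrix to integer grayscale values in 0..255."""
    flat = [w for row in kernel for w in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant kernel has no contrast; render it as mid gray.
        return [[128 for _ in row] for row in kernel]
    return [[round((w - lo) / span * 255) for w in row] for row in kernel]


def pixels_to_pgm(pixels):
    """Serialize a grayscale image as plain-text PGM (P2)."""
    height, width = len(pixels), len(pixels[0])
    lines = ["P2", f"{width} {height}", "255"]
    lines += [" ".join(str(p) for p in row) for row in pixels]
    return "\n".join(lines) + "\n"


# Hypothetical 3x3 weights standing in for one learned filter.
weights = [[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]]
pixels = kernel_to_pixels(weights)
pgm_text = pixels_to_pgm(pixels)
print(pgm_text)
```

The smallest weight maps to black and the largest to white, so the left-to-right gradient of this edge-like filter becomes visible. Scaling each kernel independently makes patterns easy to see but hides differences in magnitude between kernels.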

