# Tencent/ncnn

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/tencent-ncnn).**

22,811 stars · 4,396 forks · C++ · other

## Links

- GitHub: https://github.com/Tencent/ncnn
- awesome-repositories: https://awesome-repositories.com/repository/tencent-ncnn.md

## Topics

`android` `arm-neon` `artificial-intelligence` `caffe` `darknet` `deep-learning` `high-preformance` `inference` `ios` `keras` `mlir` `mxnet` `ncnn` `neural-network` `onnx` `pytorch` `riscv` `simd` `tensorflow` `vulkan`

## Description

ncnn is a high-performance neural network inference framework for running deep learning models locally on mobile and desktop hardware. It executes inference directly on resource-constrained devices, so applications need neither network connectivity nor cloud-based processing services.

The framework provides tools for model optimization that convert and quantize machine learning models into a specialized binary format. Static model graph compilation and zero-copy memory management keep the memory footprint small and reduce data movement during execution. A platform-agnostic hardware abstraction maps neural network operations to the local accelerators that are available, including CPUs, GPUs, and specialized neural processing units.
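
To make the quantization idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain C++. It illustrates the general technique only; it does not use or reproduce ncnn's own tools or API, and the helper names `compute_scale`, `quantize`, and `dot_int8` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an ncnn function): per-tensor symmetric scale.
// The largest absolute value in `values` maps to 127; an all-zero tensor
// falls back to a scale of 1.
float compute_scale(const std::vector<float>& values) {
    float absmax = 0.0f;
    for (float v : values) absmax = std::max(absmax, std::fabs(v));
    return absmax > 0.0f ? 127.0f / absmax : 1.0f;
}

// Quantize floats to int8: multiply by the scale, round to nearest,
// and clamp to [-127, 127].
std::vector<std::int8_t> quantize(const std::vector<float>& values, float scale) {
    std::vector<std::int8_t> out;
    out.reserve(values.size());
    for (float v : values) {
        long q = std::lround(v * scale);
        q = std::max(-127L, std::min(127L, q));
        out.push_back(static_cast<std::int8_t>(q));
    }
    return out;
}

// Dot product computed entirely in integers with a 32-bit accumulator,
// then rescaled back to a float using both tensors' scales.
float dot_int8(const std::vector<std::int8_t>& a, float scale_a,
               const std::vector<std::int8_t>& b, float scale_b) {
    assert(a.size() == b.size());
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
    }
    return static_cast<float>(acc) / (scale_a * scale_b);
}
```

The integer result only approximates the floating-point dot product. Production quantizers typically calibrate scales on representative data rather than taking the raw maximum, which this sketch leaves out.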

The library supports complex, multi-branch neural network architectures for tasks such as image recognition and audio analysis. Layer-specific kernel optimizations and graph-level operator fusion keep execution efficient across diverse hardware architectures. The project is distributed as a C++ library that provides a unified interface for cross-platform inference deployment.

## Tags

### Artificial Intelligence & ML

- [High-Performance AI Inference](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/inference-runtimes/high-performance-ai-inference.md) — Provides a high-performance library for executing deep learning models on mobile and desktop hardware by optimizing execution speed and memory usage.
- [Neural Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-networks.md) — Executes deep learning models directly on local hardware by leveraging processor acceleration for fast, offline inference. ([source](https://cdn.jsdelivr.net/gh/Tencent/ncnn@master/README.md))
- [On-Device Inference Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/on-device-inference-engines.md) — Runs pre-trained neural networks directly on mobile devices to achieve fast performance without relying on cloud-based processing services.
- [Model Quantization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/quantization/model-quantization.md) — Converts and quantizes machine learning models into specialized structures that accelerate performance on local hardware processors.
- [Cross-Platform Inference Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/cross-platform-inference-frameworks.md) — Standardizes the execution of deep learning models across diverse desktop and mobile hardware architectures using a unified high-performance framework.
- [Model Quantization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-quantization.md) — Reduces model size and increases processing speed by performing calculations using lower-precision integer arithmetic instead of standard floating-point numbers.
- [Hardware Acceleration Abstractions](https://awesome-repositories.com/f/artificial-intelligence-ml/hardware-acceleration-abstractions.md) — Provides a unified interface that maps neural network operations to available local hardware accelerators like CPUs, GPUs, and specialized neural processing units.
- [Model Performance Optimization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/profiling-and-benchmarking/model-performance-optimization.md) — Applies quantization and graph adjustments to reduce memory consumption and increase processing speed for applications on resource-constrained devices. ([source](https://cdn.jsdelivr.net/gh/Tencent/ncnn@master/README.md))
- [Kernel Optimizations](https://awesome-repositories.com/f/artificial-intelligence-ml/kernel-optimizations.md) — Implements hand-tuned assembly and intrinsic instructions for individual neural network operations to maximize performance on specific mobile processor architectures.
- [Model Inference](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-inference-serving/model-integration-pipelines/model-inference.md) — Provides tools to convert machine learning models into specialized structures for efficient deployment on resource-constrained hardware. ([source](https://cdn.jsdelivr.net/gh/Tencent/ncnn@master/README.md))
- [Multi-Branch Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/multi-branch-networks.md) — Supports advanced multi-input and multi-branch network structures for sophisticated tasks like image recognition and audio analysis. ([source](https://cdn.jsdelivr.net/gh/Tencent/ncnn@master/README.md))

### DevOps & Infrastructure

- [Edge Computing](https://awesome-repositories.com/f/devops-infrastructure/edge-computing.md) — Enables complex artificial intelligence tasks like image and audio analysis to run locally on hardware for privacy and offline functionality.

### Hardware & IoT

- [Embedded System Optimizations](https://awesome-repositories.com/f/hardware-iot/embedded-system-optimizations.md) — Converts and compresses machine learning models to minimize memory footprint and maximize execution speed on resource-constrained hardware platforms.

### Programming Languages & Runtimes

- [Static Graph Execution](https://awesome-repositories.com/f/programming-languages-runtimes/runtime-execution-environments/runtime-environments/execution-engines/static-graph-execution.md) — Transforms high-level neural network definitions into a memory-efficient binary format optimized for rapid loading and execution on target hardware.
- [Kernel Fusion Operations](https://awesome-repositories.com/f/programming-languages-runtimes/runtime-execution-environments/runtime-environments/runtimes/graph-symbolic-execution-engines/operation-kernels/kernel-fusion-operations.md) — Combines multiple sequential neural network operations into single compute kernels to minimize memory overhead and improve cache locality.
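
The kernel fusion entry above can be illustrated with a small, self-contained C++ sketch. It compares an unfused scale, bias, and ReLU sequence (three passes and two intermediate buffers) with a fused single pass. The function names are hypothetical and the code is not taken from ncnn; it only shows the general idea of combining sequential operations into one loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused: each operation is its own pass over memory and writes a
// full intermediate buffer that the next pass has to read back.
std::vector<float> scale_bias_relu_unfused(const std::vector<float>& x,
                                           float scale,
                                           const std::vector<float>& bias) {
    assert(x.size() == bias.size());
    std::vector<float> scaled(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) scaled[i] = x[i] * scale;

    std::vector<float> shifted(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) shifted[i] = scaled[i] + bias[i];

    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) out[i] = std::max(shifted[i], 0.0f);
    return out;
}

// Fused: the same arithmetic in a single pass. Each element is loaded
// once, stays in a register through all three steps, and is stored once.
std::vector<float> scale_bias_relu_fused(const std::vector<float>& x,
                                         float scale,
                                         const std::vector<float>& bias) {
    assert(x.size() == bias.size());
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = std::max(x[i] * scale + bias[i], 0.0f);
    }
    return out;
}
```

Both functions compute the same values for ordinary inputs; the fused version simply avoids the two intermediate buffers and the extra memory traffic, which is where the cache-locality benefit comes from.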

### Software Engineering & Architecture

- [Zero-Copy Mechanisms](https://awesome-repositories.com/f/software-engineering-architecture/performance-reliability/performance-optimization/data-handling-throughput/zero-copy-mechanisms.md) — Minimizes data movement by reusing pre-allocated memory buffers across different layers of the neural network during the inference process.
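
As a rough illustration of buffer reuse across layers, the following C++ sketch runs a sequential chain of layers over two pre-allocated buffers that swap roles after every layer. The `Layer` type and `run_chain` function are hypothetical names introduced for this example, and the sketch assumes every layer preserves the tensor size; it is not ncnn's allocator or API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical layer type: reads `in` and writes an equally sized `out`.
using Layer = std::function<void(const std::vector<float>& in,
                                 std::vector<float>& out)>;

// Runs the layers in order using exactly two buffers. After each layer
// the buffers swap roles through pointers, so no tensor data is copied
// between layers and no buffer is allocated inside the loop.
std::vector<float> run_chain(const std::vector<Layer>& layers,
                             std::vector<float> input) {
    std::vector<float> a = std::move(input);  // take ownership of the input
    std::vector<float> b(a.size());           // the only extra allocation
    std::vector<float>* src = &a;
    std::vector<float>* dst = &b;
    for (const Layer& layer : layers) {
        layer(*src, *dst);
        assert(dst->size() == src->size());   // assumption: size-preserving layers
        std::swap(src, dst);                  // swap pointers, not elements
    }
    return std::move(*src);                   // last written buffer holds the result
}
```

A caller would pass lambdas such as an element-wise doubling step followed by an element-wise increment. Real networks with branches or changing tensor shapes need a more general memory planner than this two-buffer scheme.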
