This document provides an overview of the model conversion pipeline in ncnn, explaining how neural network models from different frameworks are converted into the ncnn format for optimized inference. It covers the supported input formats, the conversion workflow, and the ncnn model representation.
For detailed information about specific converters and tools, see the source files cited throughout this page.
ncnn supports model conversion from four major deep learning frameworks through dedicated conversion tools:
| Framework | Converter Tool | Primary Use Case |
|---|---|---|
| PyTorch | pnnx | Modern PyTorch models, TorchScript |
| ONNX | onnx2ncnn | ONNX format models from various frameworks |
| Caffe | caffe2ncnn | Legacy Caffe models |
| MXNet | mxnet2ncnn | MXNet models |
Sources: tools/onnx/onnx2ncnn.cpp:1-100 tools/caffe/caffe2ncnn.cpp:1-100 tools/mxnet/mxnet2ncnn.cpp:1-100
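As a concrete illustration of how the tools in the table are invoked, the following Python helper is a sketch (not part of ncnn) that assembles the command line for each converter; the argument orders follow the tools' usage strings, and `convert_command` is a hypothetical name of our own.

```python
# Illustrative sketch (not from the ncnn sources) of the typical
# command lines for each converter tool.
def convert_command(framework, model_files, out_prefix):
    param, bin_ = out_prefix + ".param", out_prefix + ".bin"
    if framework == "onnx":             # single .onnx protobuf
        return ["onnx2ncnn", model_files[0], param, bin_]
    if framework == "caffe":            # prototxt + caffemodel
        return ["caffe2ncnn", model_files[0], model_files[1], param, bin_]
    if framework == "mxnet":            # symbol json + params
        return ["mxnet2ncnn", model_files[0], model_files[1], param, bin_]
    if framework == "pytorch":          # pnnx derives output names itself
        return ["pnnx", model_files[0]]
    raise ValueError("unsupported framework: " + framework)

print(convert_command("onnx", ["model.onnx"], "model"))
```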
ncnn uses a dual-file format for model representation:
| File Extension | Content | Format |
|---|---|---|
| `.param` | Network topology and layer parameters | Human-readable text |
| `.bin` | Weight data and learned parameters | Binary floating-point data |
The .param file contains:
- The magic number 7767517 identifying the format
- The layer count and blob count
- One line per layer: layer type, layer name, input/output blob names, and layer-specific parameters

The .bin file contains:
- Raw weight data for each layer, stored in the order the layers are declared
- Optional per-blob flag tags indicating the stored data type
Sources: tools/ncnn2mem.cpp:153-197 tools/caffe/caffe2ncnn.cpp:96-100
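The text layout described above can be read with a few lines of code. This is a minimal sketch, assuming the documented structure (magic number on line 1, layer/blob counts on line 2, one layer per subsequent line); `parse_param_header` is our own illustrative name, not an ncnn API.

```python
# Minimal sketch of reading the human-readable .param header.
def parse_param_header(text):
    lines = text.strip().splitlines()
    magic = int(lines[0])
    assert magic == 7767517, "not an ncnn .param file"
    layer_count, blob_count = map(int, lines[1].split())
    layers = []
    for line in lines[2:2 + layer_count]:
        fields = line.split()
        # fields: type, name, input count, output count, blobs, params
        layers.append({"type": fields[0], "name": fields[1]})
    return magic, layer_count, blob_count, layers

example = """7767517
2 2
Input            data  0 1 data
Convolution      conv1 1 1 data conv1 0=64 1=3
"""
print(parse_param_header(example)[1])  # number of layers declared
```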
The conversion process follows a multi-stage pipeline: parse the source model, translate operators and apply graph-level fusions, then serialize the result to the .param/.bin pair.
Sources: tools/ncnnoptimize.cpp:36-78 tools/ncnn2mem.cpp:609-629
Each converter follows a similar pattern but adapts to framework-specific requirements.
Sources: tools/caffe/caffe2ncnn.cpp:23-63 tools/mxnet/mxnet2ncnn.cpp:342-583 tools/onnx/onnx2ncnn.cpp:19-42
Each converter implements framework-specific optimizations during conversion:
The ONNX converter `onnx2ncnn.cpp` performs multiple fusion passes to simplify the graph:
| Fusion Pass | Pattern | Purpose |
|---|---|---|
| `fuse_weight_reshape` | Reshape(weight) | Reshape weight tensors directly |
| `fuse_weight_transpose` | Transpose(weight) | Transpose weight matrices for efficiency |
| `fuse_shufflechannel` | Reshape-Transpose-Reshape | Recognize the ShuffleChannel pattern |
| `fuse_hardswish` | Add(+3)-Clip(0,6)-Mul-Div(/6) | Fuse HardSwish activation |
| `fuse_hardsigmoid` | Add(+3)-Clip(0,6)-Div(/6) | Fuse HardSigmoid activation |
| `fuse_swish` | Sigmoid-Mul | Fuse Swish activation |
| `fuse_batchnorm1d_squeeze_unsqueeze` | Unsqueeze-BatchNorm-Squeeze | Remove dimension manipulations |
Sources: tools/onnx/onnx2ncnn.cpp:404-451 tools/onnx/onnx2ncnn.cpp:524-650 tools/onnx/onnx2ncnn.cpp:729-902
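These fusion passes are pattern matchers over the graph. As a deliberately simplified illustration (the real onnx2ncnn passes match data-flow edges and node attributes, not just adjacency), here is the Sigmoid-Mul → Swish idea applied to a linear op sequence:

```python
# Toy sketch of the Sigmoid-Mul fusion idea: scan an op list and
# collapse each adjacent Sigmoid-Mul pair into a single Swish op.
# Real fusion also verifies that Mul consumes both the Sigmoid's
# input and output; that check is omitted here for brevity.
def fuse_swish(ops):
    fused = []
    i = 0
    while i < len(ops):
        if i + 1 < len(ops) and ops[i] == "Sigmoid" and ops[i + 1] == "Mul":
            fused.append("Swish")
            i += 2
        else:
            fused.append(ops[i])
            i += 1
    return fused

print(fuse_swish(["Conv", "Sigmoid", "Mul", "Relu"]))
```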
The MXNet converter `mxnet2ncnn.cpp` includes:
| Fusion Pass | Pattern | Purpose |
|---|---|---|
| `fuse_shufflechannel` | Reshape-SwapAxis-Reshape | Recognize ShuffleChannel from MXNet |
| `fuse_hardsigmoid_hardswish` | _plus_scalar(+3)-clip(0,6)-_div_scalar(/6) | Fuse HardSigmoid/HardSwish from MXNet operators |
Sources: tools/mxnet/mxnet2ncnn.cpp:795-871 tools/mxnet/mxnet2ncnn.cpp:873-957
The Caffe converter `caffe2ncnn.cpp` directly maps Caffe layer types to ncnn layer types:
Sources: tools/caffe/caffe2ncnn.cpp:178-226
After initial conversion, the model typically goes through optimization with `ncnnoptimize`:
Sources: tools/ncnnoptimize.cpp:42-78 tools/ncnnoptimize.cpp:85-144 tools/ncnnoptimize.cpp:146-227
This common fusion combines convolution and batch normalization into a single operation:
```
Before: Conv(weight, bias) -> BatchNorm(mean, var, slope, bias_bn)
After:  Conv(weight_fused, bias_fused)

where:
  b = slope / sqrt(var + eps)
  a = bias_bn - slope * mean / sqrt(var + eps)
  weight_fused = weight * b
  bias_fused   = bias * b + a
```
Sources: tools/ncnnoptimize.cpp:146-227
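The algebra above can be checked numerically. This standalone sketch (values chosen arbitrarily) verifies on a single scalar channel that the fused convolution matches convolution followed by batch normalization:

```python
import math

# Numeric check of the Conv+BatchNorm fusion formulas for one channel.
eps = 1e-5
weight, bias = 0.5, 0.1          # convolution parameters
mean, var = 0.2, 0.9             # batchnorm running statistics
slope, bias_bn = 1.5, -0.3       # batchnorm scale (gamma) and shift (beta)

b = slope / math.sqrt(var + eps)
a = bias_bn - slope * mean / math.sqrt(var + eps)
weight_fused = weight * b
bias_fused = bias * b + a

x = 2.0
conv = weight * x + bias                                    # Conv
bn = slope * (conv - mean) / math.sqrt(var + eps) + bias_bn # BatchNorm
fused = weight_fused * x + bias_fused                       # fused Conv
assert abs(bn - fused) < 1e-9
```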
Fuses activation functions directly into convolution layers via `fuse_convolution_activation`:
Supported activations:
- ReLU: activation_type=1
- LeakyReLU: activation_type=2, with the negative slope in activation_params
- Clip: activation_type=3, with min/max bounds in activation_params
- Sigmoid: activation_type=4
- Mish: activation_type=5
Sources: tools/ncnnoptimize.cpp:1195-1435
The ncnn2mem tool converts ncnn models into C++ header files for embedding directly in application binaries:
The .id.h file provides compile-time constants for accessing model elements:
Sources: tools/ncnn2mem.cpp:153-525 tools/ncnn2mem.cpp:527-607
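As an illustration of the idea (a simplified sketch, not ncnn2mem's exact output), generating such a header amounts to emitting one integer constant per named model element; `emit_id_header` is a hypothetical helper of our own.

```python
# Sketch of generating compile-time blob-index constants, in the
# spirit of the .id.h file: one const int per blob name.
def emit_id_header(model_name, blob_names):
    lines = [
        f"#ifndef NCNN_INCLUDE_GUARD_{model_name}_id_h",
        f"#define NCNN_INCLUDE_GUARD_{model_name}_id_h",
        f"namespace {model_name}_param_id {{",
    ]
    for i, name in enumerate(blob_names):
        lines.append(f"const int BLOB_{name} = {i};")
    lines += ["} // namespace", "#endif"]
    return "\n".join(lines)

print(emit_id_header("net", ["data", "prob"]))
```

Application code can then index blobs by these constants instead of string names, which avoids string lookups at runtime.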
Different frameworks use different layer naming conventions. Converters normalize these to ncnn layer types:
| Caffe Type | MXNet Type | ONNX Type | NCNN Type |
|---|---|---|---|
| Convolution | Convolution | Conv | Convolution |
| Pooling | Pooling | MaxPool/AveragePool | Pooling |
| InnerProduct | FullyConnected | Gemm/MatMul | InnerProduct |
| BatchNorm | BatchNorm | BatchNormalization | BatchNorm |
| ReLU | Activation (act_type=relu) | Relu | ReLU |
| Dropout | Dropout | Dropout | Dropout |
| Eltwise | elemwise_add/elemwise_mul | Add/Mul | Eltwise/BinaryOp |
Weight data layout differs by converter but follows these patterns:
Convolution weights:
- Caffe: `[num_output, channels, kernel_h, kernel_w]`
- MXNet: `[num_output, channels, kernel_h, kernel_w]` (same as Caffe)
- ONNX: some weights are transposed from `[in_ch, out_ch]` to `[out_ch, in_ch]`

InnerProduct weights:
- `[num_output, num_input]`

Sources: tools/caffe/caffe2ncnn.cpp:340-413 tools/caffe/caffe2ncnn.cpp:455-531 tools/caffe/caffe2ncnn.cpp:591-621
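The ONNX case above is the one layout change that actually reorders data. A minimal sketch of that `[in_ch, out_ch]` to `[out_ch, in_ch]` transpose on plain nested lists:

```python
# Transpose a 2-D weight matrix stored as nested lists, turning an
# [in_ch, out_ch] layout into the [out_ch, in_ch] layout ncnn expects.
def transpose2d(w):
    rows, cols = len(w), len(w[0])
    return [[w[r][c] for r in range(rows)] for c in range(cols)]

w_in_out = [[1, 2, 3], [4, 5, 6]]   # 2 inputs x 3 outputs
w_out_in = transpose2d(w_in_out)    # 3 outputs x 2 inputs
print(w_out_in)
```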
After conversion, models should be validated to ensure correctness:
- Verify that the .param file starts with the magic number 7767517
- Check that the .bin file contains the expected amount of weight data

| Issue | Symptom | Solution |
|---|---|---|
| Unsupported layer | Layer type not found | Implement custom layer or use alternative pattern |
| Weight mismatch | Incorrect output values | Check weight transpose/reshape operations |
| Missing bias | Different results from original | Verify bias_term parameter during conversion |
| Shape mismatch | Runtime shape errors | Validate dynamic shape handling in converter |
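The first two sanity checks can be automated. This is a hedged sketch (the function name is ours, not an ncnn API) of the quick validation described above:

```python
import os

# Quick sanity checks on a converted model: the .param magic number
# and a non-empty .bin weight file.
def quick_validate(param_path, bin_path):
    with open(param_path) as f:
        if f.readline().strip() != "7767517":
            return "bad magic: not an ncnn .param file"
    if os.path.getsize(bin_path) == 0:
        return "empty .bin: weights missing"
    return "ok"
```

A full validation would additionally run the converted model and compare outputs against the original framework on the same input.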
After understanding the conversion overview, consult the converter source files below for implementation details.
Sources: tools/onnx/onnx2ncnn.cpp:1-5000 tools/caffe/caffe2ncnn.cpp:1-1000 tools/mxnet/mxnet2ncnn.cpp:1-1500 tools/ncnnoptimize.cpp:1-2500 tools/ncnn2mem.cpp:1-630