This document provides an overview of the model conversion pipeline in ncnn, explaining how neural network models from different frameworks are converted into the ncnn format for optimized inference. It covers the supported input formats, the conversion workflow, and the ncnn model representation.
For detailed information about specific converters and tools, see the source files cited throughout this page.
ncnn supports model conversion from four major deep learning frameworks through dedicated conversion tools:
| Framework | Converter Tool | Primary Use Case |
|---|---|---|
| PyTorch | pnnx | Modern PyTorch models, TorchScript |
| ONNX | onnx2ncnn | ONNX format models from various frameworks |
| Caffe | caffe2ncnn | Legacy Caffe models |
| MXNet | mxnet2ncnn | MXNet models |
Sources: tools/onnx/onnx2ncnn.cpp:1-100 tools/caffe/caffe2ncnn.cpp:1-100 tools/mxnet/mxnet2ncnn.cpp:1-100
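As a concrete illustration of how the tools in the table are invoked, the following Python helper is a sketch (not part of ncnn) that assembles the command line for each converter; the argument orders follow the tools' usage strings, and `convert_command` is a hypothetical name of our own.

```python
# Illustrative sketch (not from the ncnn sources) of the typical
# command lines for each converter tool.
def convert_command(framework, model_files, out_prefix):
    param, bin_ = out_prefix + ".param", out_prefix + ".bin"
    if framework == "onnx":             # single .onnx protobuf
        return ["onnx2ncnn", model_files[0], param, bin_]
    if framework == "caffe":            # prototxt + caffemodel
        return ["caffe2ncnn", model_files[0], model_files[1], param, bin_]
    if framework == "mxnet":            # symbol json + params
        return ["mxnet2ncnn", model_files[0], model_files[1], param, bin_]
    if framework == "pytorch":          # pnnx derives output names itself
        return ["pnnx", model_files[0]]
    raise ValueError("unsupported framework: " + framework)

print(convert_command("onnx", ["model.onnx"], "model"))
```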
ncnn uses a dual-file format for model representation:
| File Extension | Content | Format |
|---|---|---|
| `.param` | Network topology and layer parameters | Human-readable text |
| `.bin` | Weight data and learned parameters | Binary floating-point data |
The .param file contains:
- The magic number 7767517 identifying the format
- The layer count and blob count
- One line per layer: layer type, layer name, input/output blob names, and layer-specific parameters

The .bin file contains:
- Raw weight data for each layer, stored in the order the layers are declared
- Optional per-blob flag tags indicating the stored data type
Sources: tools/ncnn2mem.cpp:153-197 tools/caffe/caffe2ncnn.cpp:96-100
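The text layout described above can be read with a few lines of code. This is a minimal sketch, assuming the documented structure (magic number on line 1, layer/blob counts on line 2, one layer per subsequent line); `parse_param_header` is our own illustrative name, not an ncnn API.

```python
# Minimal sketch of reading the human-readable .param header.
def parse_param_header(text):
    lines = text.strip().splitlines()
    magic = int(lines[0])
    assert magic == 7767517, "not an ncnn .param file"
    layer_count, blob_count = map(int, lines[1].split())
    layers = []
    for line in lines[2:2 + layer_count]:
        fields = line.split()
        # fields: type, name, input count, output count, blobs, params
        layers.append({"type": fields[0], "name": fields[1]})
    return magic, layer_count, blob_count, layers

example = """7767517
2 2
Input            data  0 1 data
Convolution      conv1 1 1 data conv1 0=64 1=3
"""
print(parse_param_header(example)[1])  # number of layers declared
```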
The conversion process follows a multi-stage pipeline: parse the source model, translate operators and apply graph-level fusions, then serialize the result to the .param/.bin pair.
Sources: tools/ncnnoptimize.cpp:36-78 tools/ncnn2mem.cpp:609-629
Each converter follows a similar pattern but adapts to framework-specific requirements.
Sources: tools/caffe/caffe2ncnn.cpp:23-63 tools/mxnet/mxnet2ncnn.cpp:342-583 tools/onnx/onnx2ncnn.cpp:19-42
Each converter implements framework-specific optimizations during conversion:
The ONNX converter `onnx2ncnn.cpp` performs multiple fusion passes to simplify the graph:
| Fusion Pass | Pattern | Purpose |
|---|---|---|
| `fuse_weight_reshape` | Reshape(weight) | Reshape weight tensors directly |
| `fuse_weight_transpose` | Transpose(weight) | Transpose weight matrices for efficiency |
| `fuse_shufflechannel` | Reshape-Transpose-Reshape | Recognize the ShuffleChannel pattern |
| `fuse_hardswish` | Add(+3)-Clip(0,6)-Mul-Div(/6) | Fuse HardSwish activation |
| `fuse_hardsigmoid` | Add(+3)-Clip(0,6)-Div(/6) | Fuse HardSigmoid activation |
| `fuse_swish` | Sigmoid-Mul | Fuse Swish activation |
| `fuse_batchnorm1d_squeeze_unsqueeze` | Unsqueeze-BatchNorm-Squeeze | Remove dimension manipulations |
Sources: tools/onnx/onnx2ncnn.cpp:404-451 tools/onnx/onnx2ncnn.cpp:524-650 tools/onnx/onnx2ncnn.cpp:729-902
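These fusion passes are pattern matchers over the graph. As a deliberately simplified illustration (the real onnx2ncnn passes match data-flow edges and node attributes, not just adjacency), here is the Sigmoid-Mul → Swish idea applied to a linear op sequence:

```python
# Toy sketch of the Sigmoid-Mul fusion idea: scan an op list and
# collapse each adjacent Sigmoid-Mul pair into a single Swish op.
# Real fusion also verifies that Mul consumes both the Sigmoid's
# input and output; that check is omitted here for brevity.
def fuse_swish(ops):
    fused = []
    i = 0
    while i < len(ops):
        if i + 1 < len(ops) and ops[i] == "Sigmoid" and ops[i + 1] == "Mul":
            fused.append("Swish")
            i += 2
        else:
            fused.append(ops[i])
            i += 1
    return fused

print(fuse_swish(["Conv", "Sigmoid", "Mul", "Relu"]))
```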
The MXNet converter `mxnet2ncnn.cpp` includes:
| Fusion Pass | Pattern | Purpose |
|---|---|---|
| `fuse_shufflechannel` | Reshape-SwapAxis-Reshape | Recognize ShuffleChannel from MXNet |
| `fuse_hardsigmoid_hardswish` | _plus_scalar(+3)-clip(0,6)-_div_scalar(/6) | Fuse HardSigmoid/HardSwish from MXNet operators |
Sources: tools/mxnet/mxnet2ncnn.cpp:795-871 tools/mxnet/mxnet2ncnn.cpp:873-957
The Caffe converter `caffe2ncnn.cpp` directly maps Caffe layer types to ncnn layer types:
Sources: tools/caffe/caffe2ncnn.cpp:178-226
After initial conversion, the model typically goes through optimization with `ncnnoptimize`:
Sources: tools/ncnnoptimize.cpp:42-78 tools/ncnnoptimize.cpp:85-144 tools/ncnnoptimize.cpp:146-227
This common fusion combines convolution and batch normalization into a single operation:
```
Before: Conv(weight, bias) -> BatchNorm(mean, var, slope, bias_bn)
After:  Conv(weight_fused, bias_fused)

where:
  b = slope / sqrt(var + eps)
  a = bias_bn - slope * mean / sqrt(var + eps)
  weight_fused = weight * b
  bias_fused   = bias * b + a
```
Sources: tools/ncnnoptimize.cpp:146-227
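The algebra above can be checked numerically. This standalone sketch (values chosen arbitrarily) verifies on a single scalar channel that the fused convolution matches convolution followed by batch normalization:

```python
import math

# Numeric check of the Conv+BatchNorm fusion formulas for one channel.
eps = 1e-5
weight, bias = 0.5, 0.1          # convolution parameters
mean, var = 0.2, 0.9             # batchnorm running statistics
slope, bias_bn = 1.5, -0.3       # batchnorm scale (gamma) and shift (beta)

b = slope / math.sqrt(var + eps)
a = bias_bn - slope * mean / math.sqrt(var + eps)
weight_fused = weight * b
bias_fused = bias * b + a

x = 2.0
conv = weight * x + bias                                    # Conv
bn = slope * (conv - mean) / math.sqrt(var + eps) + bias_bn # BatchNorm
fused = weight_fused * x + bias_fused                       # fused Conv
assert abs(bn - fused) < 1e-9
```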
Fuses activation functions directly into convolution layers via `fuse_convolution_activation`:
Supported activations:
- ReLU: activation_type=1
- LeakyReLU: activation_type=2, with the negative slope in activation_params
- Clip: activation_type=3, with min/max bounds in activation_params
- Sigmoid: activation_type=4
- Mish: activation_type=5
Sources: tools/ncnnoptimize.cpp:1195-1435
The ncnn2mem tool converts ncnn models into C++ header files for embedding directly in application binaries:
The .id.h file provides compile-time constants for accessing model elements:
Sources: tools/ncnn2mem.cpp:153-525 tools/ncnn2mem.cpp:527-607
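As an illustration of the idea (a simplified sketch, not ncnn2mem's exact output), generating such a header amounts to emitting one integer constant per named model element; `emit_id_header` is a hypothetical helper of our own.

```python
# Sketch of generating compile-time blob-index constants, in the
# spirit of the .id.h file: one const int per blob name.
def emit_id_header(model_name, blob_names):
    lines = [
        f"#ifndef NCNN_INCLUDE_GUARD_{model_name}_id_h",
        f"#define NCNN_INCLUDE_GUARD_{model_name}_id_h",
        f"namespace {model_name}_param_id {{",
    ]
    for i, name in enumerate(blob_names):
        lines.append(f"const int BLOB_{name} = {i};")
    lines += ["} // namespace", "#endif"]
    return "\n".join(lines)

print(emit_id_header("net", ["data", "prob"]))
```

Application code can then index blobs by these constants instead of string names, which avoids string lookups at runtime.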
Different frameworks use different layer naming conventions. Converters normalize these to ncnn layer types:
| Caffe Type | MXNet Type | ONNX Type | NCNN Type |
|---|---|---|---|
| Convolution | Convolution | Conv | Convolution |
| Pooling | Pooling | MaxPool/AveragePool | Pooling |
| InnerProduct | FullyConnected | Gemm/MatMul | InnerProduct |
| BatchNorm | BatchNorm | BatchNormalization | BatchNorm |
| ReLU | Activation (act_type=relu) | Relu | ReLU |
| Dropout | Dropout | Dropout | Dropout |
| Eltwise | elemwise_add/elemwise_mul | Add/Mul | Eltwise/BinaryOp |
Weight data layout differs by converter but follows these patterns:
Convolution weights:
- Caffe: `[num_output, channels, kernel_h, kernel_w]`
- MXNet: `[num_output, channels, kernel_h, kernel_w]` (same as Caffe)
- ONNX: some weights are transposed from `[in_ch, out_ch]` to `[out_ch, in_ch]`

InnerProduct weights:
- `[num_output, num_input]`

Sources: tools/caffe/caffe2ncnn.cpp:340-413 tools/caffe/caffe2ncnn.cpp:455-531 tools/caffe/caffe2ncnn.cpp:591-621
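The ONNX case above is the one layout change that actually reorders data. A minimal sketch of that `[in_ch, out_ch]` to `[out_ch, in_ch]` transpose on plain nested lists:

```python
# Transpose a 2-D weight matrix stored as nested lists, turning an
# [in_ch, out_ch] layout into the [out_ch, in_ch] layout ncnn expects.
def transpose2d(w):
    rows, cols = len(w), len(w[0])
    return [[w[r][c] for r in range(rows)] for c in range(cols)]

w_in_out = [[1, 2, 3], [4, 5, 6]]   # 2 inputs x 3 outputs
w_out_in = transpose2d(w_in_out)    # 3 outputs x 2 inputs
print(w_out_in)
```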
After conversion, models should be validated to ensure correctness:
- Verify that the .param file starts with the magic number 7767517
- Check that the .bin file contains the expected amount of weight data

| Issue | Symptom | Solution |
|---|---|---|
| Unsupported layer | Layer type not found | Implement custom layer or use alternative pattern |
| Weight mismatch | Incorrect output values | Check weight transpose/reshape operations |
| Missing bias | Different results from original | Verify bias_term parameter during conversion |
| Shape mismatch | Runtime shape errors | Validate dynamic shape handling in converter |
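The first two sanity checks can be automated. This is a hedged sketch (the function name is ours, not an ncnn API) of the quick validation described above:

```python
import os

# Quick sanity checks on a converted model: the .param magic number
# and a non-empty .bin weight file.
def quick_validate(param_path, bin_path):
    with open(param_path) as f:
        if f.readline().strip() != "7767517":
            return "bad magic: not an ncnn .param file"
    if os.path.getsize(bin_path) == 0:
        return "empty .bin: weights missing"
    return "ok"
```

A full validation would additionally run the converted model and compare outputs against the original framework on the same input.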
After understanding the conversion overview, consult the converter source files below for implementation details.
Sources: tools/onnx/onnx2ncnn.cpp:1-5000 tools/caffe/caffe2ncnn.cpp:1-1000 tools/mxnet/mxnet2ncnn.cpp:1-1500 tools/ncnnoptimize.cpp:1-2500 tools/ncnn2mem.cpp:1-630