This page documents the Layer base class and its virtual interface, the build-time layer registry generated by the ncnn_add_layer CMake macro, the runtime dispatch mechanism that selects the best platform-specific implementation, and the set of layer flags that control inference behavior.
For how Net uses layers during model loading and forward-pass execution, see 2.1. For the Mat data structure that layers operate on, see 2.3. For the Option struct that is passed into layer methods, see 2.5. For Vulkan-specific layer implementations, see 3.4.

Layer Base Class

Every computational operation in ncnn is a subclass of Layer, declared in src/layer.h. The class acts as an abstract interface: every virtual method has a default no-op or error-returning implementation in src/layer.cpp, and concrete layer classes override only the ones they need.
Diagram: Layer Virtual Method Lifecycle
Sources: src/layer.h29-44 src/layer.cpp44-62
The four lifecycle methods are:
| Method | Signature | Purpose |
|---|---|---|
| `load_param` | `virtual int load_param(const ParamDict& pd)` | Read integer/float/array hyperparameters from the parsed .param file |
| `load_model` | `virtual int load_model(const ModelBin& mb)` | Load weight tensors from the .bin file |
| `create_pipeline` | `virtual int create_pipeline(const Option& opt)` | One-time setup: weight packing, pipeline allocation. Called after `load_model`. |
| `destroy_pipeline` | `virtual int destroy_pipeline(const Option& opt)` | Release resources allocated in `create_pipeline` |
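
The call order implied by the table can be sketched as follows. This is a minimal, self-contained illustration rather than ncnn code: `ParamDict`, `ModelBin`, `Option` and `Layer` are empty stand-ins, `TracingLayer` and `run_lifecycle` are hypothetical names, and running `load_param` before `load_model` is an assumption taken from the table order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for ncnn's ParamDict, ModelBin and Option.
struct ParamDict {};
struct ModelBin {};
struct Option {};

// Reduced Layer interface: in this sketch every hook defaults to a no-op returning 0.
struct Layer
{
    virtual ~Layer() {}
    virtual int load_param(const ParamDict&) { return 0; }
    virtual int load_model(const ModelBin&) { return 0; }
    virtual int create_pipeline(const Option&) { return 0; }
    virtual int destroy_pipeline(const Option&) { return 0; }
};

// Hypothetical layer that records which hooks ran, in order.
struct TracingLayer : Layer
{
    std::vector<std::string> calls;
    bool model_loaded;

    TracingLayer() : model_loaded(false) {}

    int load_param(const ParamDict&) override
    {
        calls.push_back("load_param");
        return 0;
    }
    int load_model(const ModelBin&) override
    {
        model_loaded = true;
        calls.push_back("load_model");
        return 0;
    }
    int create_pipeline(const Option&) override
    {
        assert(model_loaded); // weights must be loaded before they can be packed
        calls.push_back("create_pipeline");
        return 0;
    }
    int destroy_pipeline(const Option&) override
    {
        calls.push_back("destroy_pipeline");
        return 0;
    }
};

// Drives a layer through the four hooks; stops at the first non-zero return code.
int run_lifecycle(Layer& layer)
{
    ParamDict pd;
    ModelBin mb;
    Option opt;
    if (int ret = layer.load_param(pd)) return ret;
    if (int ret = layer.load_model(mb)) return ret;
    if (int ret = layer.create_pipeline(opt)) return ret;
    return layer.destroy_pipeline(opt);
}
```

Because every hook has a default, a layer without weights could override `load_param` alone and still pass through the same driver.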
The forward methods come in four overloads for the CPU path:
| Method | Use case |
|---|---|
| `forward(const vector<Mat>& bottom, vector<Mat>& top, const Option& opt)` | Multiple inputs → multiple outputs |
| `forward(const Mat& bottom, Mat& top, const Option& opt)` | Single input → single output (used when `one_blob_only` is true) |
| `forward_inplace(vector<Mat>& bottom_top, const Option& opt)` | In-place multiple blobs |
| `forward_inplace(Mat& bottom_top, const Option& opt)` | In-place single blob |
When support_inplace is set but forward is not overridden, the base Layer::forward implementation in src/layer.cpp64-90 clones the input and delegates to forward_inplace.
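
A minimal sketch of that fallback, assuming a toy `Mat` that owns a `std::vector<float>`. The `Mat`, `Option`, `Layer` and `ReLULike` types here are simplified stand-ins written for illustration, not the real ncnn classes.

```cpp
#include <cassert>
#include <vector>

// Toy blob type: the real ncnn Mat is reference-counted; this one simply copies.
struct Mat
{
    std::vector<float> data;
    Mat clone() const { return *this; }
};
struct Option {};

struct Layer
{
    bool support_inplace;

    Layer() : support_inplace(false) {}
    virtual ~Layer() {}

    // Default out-of-place forward: copy the input, then run the in-place kernel on the copy.
    virtual int forward(const Mat& bottom, Mat& top, const Option& opt) const
    {
        assert(&top != &bottom); // out-of-place call needs a distinct output blob
        if (!support_inplace) return -1;
        top = bottom.clone();
        return forward_inplace(top, opt);
    }

    // Default in-place forward: not implemented.
    virtual int forward_inplace(Mat&, const Option&) const { return -1; }
};

// Hypothetical layer that only implements the in-place overload.
struct ReLULike : Layer
{
    ReLULike() { support_inplace = true; }

    int forward_inplace(Mat& m, const Option&) const override
    {
        for (float& v : m.data)
            if (v < 0.f) v = 0.f;
        return 0;
    }
};
```

With this arrangement a caller that needs to keep the input intact can still use `forward`, while the layer author writes the kernel only once.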
When NCNN_VULKAN is enabled, four parallel overloads take VkMat and VkCompute& arguments (src/layer.h102-116), and upload_model(VkTransfer&, const Option&) uploads weights to device memory.
Each Layer instance carries a set of bool flags that the runtime uses to select execution paths and apply automatic format conversions.
Diagram: Layer Flag Fields on the Layer Class
Sources: src/layer.h44-90 src/layer.cpp14-38
All flags default to false in Layer::Layer() (src/layer.cpp14-38). Concrete subclasses set them in their constructors. For example, Flatten sets one_blob_only = true (src/layer/flatten.cpp11-13), and Packing sets one_blob_only = true and support_inplace = false (src/layer/packing.cpp9-12).
| Flag | Effect when true |
|---|---|
| `one_blob_only` | Net calls `forward(Mat, Mat, Option)` instead of the vector overload |
| `support_inplace` | Net may skip allocating a separate output blob |
| `support_packing` | Net will not insert a Packing layer to unpack blobs before this layer |
| `support_vulkan` | Net will use the GPU path for this layer if `opt.use_vulkan_compute` is set |
| `support_bf16_storage` | Net will not insert a Cast layer to convert from BF16 |
| `support_fp16_storage` | Net will not insert a Cast layer to convert from FP16 |
| `support_any_packing` | Layer can handle any elempack value without a preceding Packing conversion |
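
As a rough illustration of how the first two flags could steer a caller toward one of the four CPU overloads, consider the sketch below. `LayerFlags`, `EntryPoint`, `choose_entry_point` and the `caller_allows_inplace` parameter are hypothetical names invented for this example; the actual decision logic lives in Net and is not reproduced here.

```cpp
#include <cassert>

// Only the two flags relevant to overload selection are modeled.
struct LayerFlags
{
    bool one_blob_only;
    bool support_inplace;
};

enum EntryPoint
{
    FORWARD_VECTOR,
    FORWARD_SINGLE,
    INPLACE_VECTOR,
    INPLACE_SINGLE
};

// Picks an overload from the flags. The caller may decline in-place execution,
// for example when the input blob is still needed elsewhere.
EntryPoint choose_entry_point(const LayerFlags& flags, bool caller_allows_inplace)
{
    const bool inplace = flags.support_inplace && caller_allows_inplace;
    if (flags.one_blob_only)
        return inplace ? INPLACE_SINGLE : FORWARD_SINGLE;
    return inplace ? INPLACE_VECTOR : FORWARD_VECTOR;
}

// Maps each choice back to the overload listed in the forward-method table.
const char* entry_point_signature(EntryPoint e)
{
    switch (e)
    {
    case FORWARD_VECTOR: return "forward(const vector<Mat>&, vector<Mat>&, const Option&)";
    case FORWARD_SINGLE: return "forward(const Mat&, Mat&, const Option&)";
    case INPLACE_VECTOR: return "forward_inplace(vector<Mat>&, const Option&)";
    case INPLACE_SINGLE: return "forward_inplace(Mat&, const Option&)";
    }
    assert(false); // every enumerator is handled above
    return "";
}
```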
Other instance fields on Layer:
| Field | Type | Meaning |
|---|---|---|
| `typeindex` | `int` | Integer index into the layer registry |
| `type` | `std::string` | Layer class name (e.g. "Convolution"); only present when NCNN_STRING is on |
| `name` | `std::string` | Instance name from .param (e.g. "conv1") |
| `bottoms` | `vector<int>` | Indices into the Net's blob array for input blobs |
| `tops` | `vector<int>` | Indices into the Net's blob array for output blobs |
| `bottom_shapes` / `top_shapes` | `vector<Mat>` | Shape hints (populated during graph construction) |
| `userdata` | `void*` | Passed through to the factory function; used by custom layer registrations |
The registry maps integer type indices (and optionally string names) to factory functions that create Layer instances. It is generated at build time from CMake variables.
layer_registry_entry and Factory Functions

src/layer.h143-165 defines the core types:

```cpp
typedef Layer* (*layer_creator_func)(void*);
typedef void (*layer_destroyer_func)(Layer*, void*);

struct layer_registry_entry {
    const char* name; // only with NCNN_STRING
    layer_creator_func creator;
};

#define DEFINE_LAYER_CREATOR(name) \
    ::ncnn::Layer* name##_layer_creator(void*) { return new name; }
```
The DEFINE_LAYER_CREATOR macro generates a plain factory function named <Class>_layer_creator for each layer class. These are emitted into layer_declaration.h, which is #include-d at the top of src/layer.cpp (line 10).
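
To make the expansion concrete, the sketch below applies the macro to two toy classes and looks them up through a small registry array. The `ncnn` namespace here is a stand-in defined inside the example so that `::ncnn::Layer` resolves; `id()` and `create_by_index` are hypothetical helpers, not part of the library.

```cpp
#include <cassert>

// Stand-in namespace for illustration only; not the real ncnn headers.
namespace ncnn {

struct Layer
{
    virtual ~Layer() {}
    virtual int id() const { return 0; }
};

typedef Layer* (*layer_creator_func)(void*);

struct layer_registry_entry
{
    const char* name;
    layer_creator_func creator;
};

#define DEFINE_LAYER_CREATOR(name) \
    ::ncnn::Layer* name##_layer_creator(void* /*userdata*/) { return new name; }

// Two toy layer classes.
struct AbsVal : Layer
{
    int id() const override { return 1; }
};
struct ReLU : Layer
{
    int id() const override { return 2; }
};

// Each line expands to a function such as AbsVal_layer_creator(void*).
DEFINE_LAYER_CREATOR(AbsVal)
DEFINE_LAYER_CREATOR(ReLU)

static const layer_registry_entry layer_registry[] = {
    {"AbsVal", AbsVal_layer_creator},
    {"ReLU", ReLU_layer_creator},
};
static const int layer_registry_entry_count = sizeof(layer_registry) / sizeof(layer_registry[0]);

// Hypothetical helper: instantiate a layer from its integer type index.
Layer* create_by_index(int index)
{
    if (index < 0 || index >= layer_registry_entry_count) return 0;
    assert(layer_registry[index].creator != 0);
    return layer_registry[index].creator(0);
}

} // namespace ncnn
```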
The generated file layer_registry.h (from src/layer_registry.h.in) contains multiple static arrays of layer_registry_entry:
| Array name | Contents |
|---|---|
| `layer_registry` | Base (portable) implementations |
| `layer_registry_arch` | Architecture-specific implementations (ARM, x86, RISC-V, MIPS, LoongArch) |
| `layer_registry_avx512` | x86 AVX-512 optimized variants |
| `layer_registry_fma` | x86 FMA optimized variants |
| `layer_registry_avx` | x86 AVX optimized variants |
| `layer_registry_arm82` | ARMv8.2 FP16 variants |
| `layer_registry_arm82dot` | ARMv8.2 dot-product variants |
| `layer_registry_arm84bf16` | ARMv8.4 BF16 variants |
| `layer_registry_arm84i8mm` | ARMv8.4 INT8 matrix-multiply variants |
| `layer_registry_arm86sve` | ARMv8.6 SVE variants |
| `layer_registry_msa` | MIPS MSA variants |
| `layer_registry_lsx` / `layer_registry_lasx` | LoongArch LSX/LASX variants |
| `layer_registry_rvv` | RISC-V Vector variants |
Each array has one entry per registered layer type, at the same integer index. If a layer has no ISA-specific override in a given array, the entry in that array still points to the base layer_creator_func.
Sources: src/layer_registry.h.in1-60 src/layer.cpp143-145
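
The parallel-array layout can be illustrated with a reduced model. In this sketch the creators return a label instead of a Layer so the selection is easy to inspect; the array names, the `select_creator` helper and its boolean capability parameters are all hypothetical, and the fallback entries follow the description above.

```cpp
#include <cassert>
#include <cstring>

// Stand-in creator type: returns a label naming the implementation it represents.
typedef const char* (*creator_func)();

const char* conv_base() { return "Convolution"; }
const char* conv_x86() { return "Convolution_x86"; }
const char* conv_x86_avx512() { return "Convolution_x86_avx512"; }
const char* flatten_base() { return "Flatten"; }

enum { kConvolution = 0, kFlatten = 1, kLayerCount = 2 };

// One array per ISA level; the same index always refers to the same layer type.
// Flatten has no specialized variant here, so every array holds its base creator.
static const creator_func registry_base[kLayerCount] = {conv_base, flatten_base};
static const creator_func registry_arch[kLayerCount] = {conv_x86, flatten_base};
static const creator_func registry_avx512[kLayerCount] = {conv_x86_avx512, flatten_base};

// Picks the most specific array the (assumed) CPU capabilities allow.
creator_func select_creator(int index, bool cpu_has_avx512, bool use_arch)
{
    if (index < 0 || index >= kLayerCount) return 0;

    creator_func chosen = registry_base[index];
    if (cpu_has_avx512)
        chosen = registry_avx512[index];
    else if (use_arch)
        chosen = registry_arch[index];

    assert(chosen != 0); // every array is fully populated in this sketch
    return chosen;
}
```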
ncnn_add_layer CMake Macro

All layer registration is driven by ncnn_add_layer calls in src/CMakeLists.txt66-176. The macro is defined in cmake/ncnn_add_layer.cmake82-160.
Diagram: ncnn_add_layer(Convolution) Expansion
Sources: cmake/ncnn_add_layer.cmake82-160 src/CMakeLists.txt56-190
For ISA variants on x86 (AVX, FMA, AVX-512, etc.) and ARM (ARMv8.2, ARMv8.4, etc.), the helper macro ncnn_add_arch_opt_layer generates additional source files by running a CMake script that wraps the arch source in a new class name (e.g., Convolution_x86_avx512) and sets per-file compile flags such as -mavx512f. Entries for each ISA variant go into the corresponding layer_registry_avx512, layer_registry_fma, etc. arrays.
The final step in src/CMakeLists.txt184-190 writes the collected CMake string variables into concrete header files via configure_file:
configure_file(layer_declaration.h.in → layer_declaration.h)
configure_file(layer_registry.h.in → layer_registry.h)
configure_file(layer_type_enum.h.in → layer_type_enum.h)
WITH_LAYER_xxx Options

Every ncnn_add_layer(Foo) call creates a CMake option WITH_LAYER_foo defaulting to ON. Passing ncnn_add_layer(ArgMax OFF) (as seen at src/CMakeLists.txt67) sets the default to OFF. This lets integrators strip unused layers from the build to reduce binary size.

create_layer and Layer_final

Diagram: create_layer Runtime Dispatch Chain
Sources: src/layer.cpp143-450 src/layer_registry.h.in1-60
Layer_final (src/layer.cpp199) is an internal wrapper class that holds two pointers:

- layer_cpu: the best available CPU implementation selected from the ISA-specific registries
- layer_vulkan: the Vulkan counterpart (when NCNN_VULKAN is compiled in and a Vulkan class exists for this layer)

Layer_final overrides all virtual methods and delegates to layer_cpu or layer_vulkan as appropriate. After construction, get_layer_properties() copies the flag values (one_blob_only, support_inplace, support_packing, etc.) from layer_cpu into the Layer_final instance itself so that callers observe the correct flags through the base Layer* pointer.
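
The wrapper pattern can be sketched in a few lines. This example models only the CPU side and two flags; `LayerFinalSketch` and `DoubleCpu` are hypothetical classes, and the Vulkan pointer and the remaining virtual methods are left out for brevity.

```cpp
#include <cassert>

struct Mat
{
    float value;
};
struct Option
{
    bool use_vulkan_compute;
};

struct Layer
{
    bool one_blob_only;
    bool support_inplace;

    Layer() : one_blob_only(false), support_inplace(false) {}
    virtual ~Layer() {}
    virtual int forward_inplace(Mat&, const Option&) const { return -1; }
};

// Hypothetical CPU implementation that doubles its input in place.
struct DoubleCpu : Layer
{
    DoubleCpu()
    {
        one_blob_only = true;
        support_inplace = true;
    }
    int forward_inplace(Mat& m, const Option&) const override
    {
        m.value *= 2.f;
        return 0;
    }
};

// Wrapper in the spirit of Layer_final: owns the selected implementation,
// mirrors its flags, and forwards calls to it.
class LayerFinalSketch : public Layer
{
public:
    explicit LayerFinalSketch(Layer* cpu) : layer_cpu(cpu)
    {
        assert(layer_cpu != 0);
        get_layer_properties();
    }
    ~LayerFinalSketch() override { delete layer_cpu; }

    int forward_inplace(Mat& m, const Option& opt) const override
    {
        return layer_cpu->forward_inplace(m, opt);
    }

private:
    // Copy flags so callers holding a base Layer* see the wrapped layer's capabilities.
    void get_layer_properties()
    {
        one_blob_only = layer_cpu->one_blob_only;
        support_inplace = layer_cpu->support_inplace;
    }

    Layer* layer_cpu;
};
```

Without the flag copy, the wrapper would report the base-class defaults (all false) and callers would pick the wrong overload.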
Besides create_layer itself, three additional factory functions are available for specific use cases. The table lists all four:
| Function | Description |
|---|---|
| `create_layer(int index)` | Creates a Layer_final wrapping both CPU and Vulkan instances |
| `create_layer_naive(int index)` | Creates a plain base implementation from `layer_registry` only |
| `create_layer_cpu(int index)` | Creates the best CPU implementation only, without the Layer_final wrapper |
| `create_layer_vulkan(int index)` | Creates the Vulkan implementation only |
String-based overloads (create_layer(const char* type)) are available when NCNN_STRING is enabled; they use layer_to_index() (src/layer.cpp148-157) to convert the name to an integer index before dispatching.
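
A name-to-index lookup of this kind can be as simple as a linear scan. The sketch below assumes a three-entry registry and a return value of -1 for unknown names; both are illustrative choices, and the real registry contents and error convention should be checked against src/layer.cpp.

```cpp
#include <cassert>
#include <cstring>

// Reduced registry entry: only the name matters for the lookup.
struct layer_registry_entry
{
    const char* name;
};

static const layer_registry_entry layer_registry[] = {
    {"AbsVal"},
    {"BatchNorm"},
    {"Convolution"},
};
static const int layer_registry_entry_count = sizeof(layer_registry) / sizeof(layer_registry[0]);

// Returns the position of the first entry whose name matches, or -1 (assumed convention).
int layer_to_index(const char* type)
{
    assert(type != 0);
    for (int i = 0; i < layer_registry_entry_count; i++)
    {
        if (std::strcmp(type, layer_registry[i].name) == 0) return i;
    }
    return -1;
}
```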
Net provides register_custom_layer methods (src/net.h52-57) to register user-defined layers at runtime.
Custom layers override entries in an internal per-Net custom registry that is consulted before the global layer_registry during load_param. The layer_creator_func signature is Layer* (*)(void* userdata), matching the DEFINE_LAYER_CREATOR expansion.
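
The lookup order described above (custom registry first, built-in registry second) can be modeled as follows. `NetSketch` is a hypothetical class with a deliberately simplified `register_custom_layer` signature, and the single built-in "ReLU" entry is a placeholder for the global registry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

struct Layer
{
    virtual ~Layer() {}
    virtual const char* origin() const { return "builtin"; }
};

typedef Layer* (*layer_creator_func)(void* userdata);

// Placeholder for a creator found in the global registry.
Layer* builtin_relu_creator(void*) { return new Layer; }

// User-defined replacement, with a creator matching the DEFINE_LAYER_CREATOR shape.
struct MyReLU : Layer
{
    const char* origin() const override { return "custom"; }
};
Layer* MyReLU_layer_creator(void* /*userdata*/) { return new MyReLU; }

struct CustomEntry
{
    std::string type;
    layer_creator_func creator;
    void* userdata;
};

class NetSketch
{
public:
    // Adds a custom creator, replacing an earlier registration for the same type.
    int register_custom_layer(const char* type, layer_creator_func creator, void* userdata = 0)
    {
        assert(type != 0 && creator != 0);
        for (std::size_t i = 0; i < custom_.size(); i++)
        {
            if (custom_[i].type == type)
            {
                custom_[i].creator = creator;
                custom_[i].userdata = userdata;
                return 0;
            }
        }
        CustomEntry e;
        e.type = type;
        e.creator = creator;
        e.userdata = userdata;
        custom_.push_back(e);
        return 0;
    }

    // Custom entries win; the built-in registry is the fallback.
    Layer* create(const char* type) const
    {
        for (std::size_t i = 0; i < custom_.size(); i++)
        {
            if (custom_[i].type == type) return custom_[i].creator(custom_[i].userdata);
        }
        if (std::strcmp(type, "ReLU") == 0) return builtin_relu_creator(0);
        return 0;
    }

private:
    std::vector<CustomEntry> custom_;
};
```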
The Python bindings python/src/main.cpp39-104 implement this via a fixed pool of 10 pre-allocated LayerCreator/LayerDestroyer function slots (g_layer_factroys), each of which calls back into a std::function<Layer*()> stored in a LayerFactory struct.
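
The reason for a fixed pool is that a plain function pointer cannot carry captured state, so each slot pairs one static trampoline with one stored std::function. The sketch below shows the idea with four slots instead of ten and uses a template to generate the trampolines; the template, `bind_factory`, and the names `g_factories` and `g_slot_table` are choices made for this example and may differ from how python/src/main.cpp spells things out.

```cpp
#include <cassert>
#include <functional>

struct Layer
{
    int tag;
    explicit Layer(int t) : tag(t) {}
};

typedef Layer* (*layer_creator_func)(void*);

// One stored callable per slot.
struct LayerFactory
{
    std::function<Layer*()> creator;
};

static const int kSlots = 4; // the bindings described above use 10
static LayerFactory g_factories[kSlots];

// Trampoline N is an ordinary function, so it converts to layer_creator_func,
// yet it reaches captured state through its dedicated slot.
template<int N>
Layer* slot_creator(void* /*userdata*/)
{
    assert(g_factories[N].creator);
    return g_factories[N].creator();
}

static const layer_creator_func g_slot_table[kSlots] = {
    slot_creator<0>,
    slot_creator<1>,
    slot_creator<2>,
    slot_creator<3>,
};

// Stores f in the first free slot and returns the matching trampoline,
// or 0 when the pool is exhausted.
layer_creator_func bind_factory(const std::function<Layer*()>& f)
{
    for (int i = 0; i < kSlots; i++)
    {
        if (!g_factories[i].creator)
        {
            g_factories[i].creator = f;
            return g_slot_table[i];
        }
    }
    return 0;
}
```

The trade-off is a hard cap on the number of custom layers that can be registered this way, in exchange for not needing any dynamic code generation.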
As of the current src/CMakeLists.txt, the following layers are registered by default (layers marked OFF are excluded from standard builds):
AbsVal, BatchNorm, Bias, BNLL, Concat, Convolution, Crop, Deconvolution, Dropout, Eltwise, ELU, Embed, Exp, Flatten, InnerProduct, Input, Log, LRN, MemoryData, MVN, Pooling, Power, PReLU, Reduction, ReLU, Reshape, Scale, Sigmoid, Slice, Softmax, Split, TanH, Threshold, Tile, RNN, LSTM, BinaryOp, UnaryOp, ConvolutionDepthWise, Padding, Squeeze, ExpandDims, Normalize, Permute, PriorBox, DetectionOutput, Interp, DeconvolutionDepthWise, ShuffleChannel, InstanceNorm, Clip, Quantize, Dequantize, Packing, Requantize, Cast, HardSigmoid, SELU, HardSwish, Noop, Mish, Swish, Gemm, GroupNorm, LayerNorm, GRU, MultiHeadAttention, GELU, Convolution1D, Pooling1D, MatMul, GridSample, RMSNorm, SDPA, RotaryEmbed, and many more.
Disabled by default: ArgMax, SPP, Proposal, ROIPooling, YoloDetectionOutput, Yolov3DetectionOutput, PSROIPooling, ROIAlign, DeepCopy, StatisticsPooling.
Sources: src/CMakeLists.txt66-176