This document explains how to deploy PaddleOCR models on mobile devices and edge platforms using Paddle-Lite. It covers model optimization, C++ inference implementation, build configuration, and deployment workflows for ARM-based devices (ARMv7/ARMv8).
For Python-based inference on servers or desktops, see 5.1 Python Inference System. For C++ deployment on x86/GPU servers, see 5.3 C++ Inference and Build System.
Paddle-Lite is a lightweight inference engine that enables efficient OCR execution on resource-constrained mobile and IoT devices. The mobile deployment system provides C++ implementations of detection, recognition, and classification pipelines optimized for ARM processors.
Sources: deploy/lite/readme.md1-299 deploy/lite/ocr_db_crnn.cc1-679
Paddle-Lite provides optimized inference for mobile and embedded devices through:
| Component | File Format | Purpose |
|---|---|---|
| MobileConfig | C++ API | Runtime configuration for model loading |
| PaddlePredictor | C++ API | Inference execution interface |
| Optimized Models | .nb files | Serialized models in naive_buffer format |
| Prediction Library | .so/.a | Shared/static libraries for ARM |
Sources: deploy/lite/readme.md12-88 deploy/lite/ocr_db_crnn.cc334-343
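The MobileConfig / PaddlePredictor flow from the table above can be sketched as follows. This is a sketch against the Paddle-Lite 2.10 C++ API, not a standalone program; the helper name and thread count are illustrative:

```cpp
#include <memory>
#include <string>
#include "paddle_api.h"  // Paddle-Lite C++ API header

using namespace paddle::lite_api;

// Load an optimized .nb model and create a predictor (helper name is
// illustrative; this mirrors the pattern used in ocr_db_crnn.cc).
std::shared_ptr<PaddlePredictor> LoadModel(const std::string& nb_path) {
  MobileConfig config;
  config.set_model_from_file(nb_path);     // .nb in naive_buffer format
  config.set_threads(4);                   // CPU threads (example value)
  config.set_power_mode(LITE_POWER_HIGH);  // prefer big cores
  return CreatePaddlePredictor<MobileConfig>(config);
}
```

Inference then follows the usual tensor flow: `GetInput(0)`, `Resize` and `mutable_data<float>()` to fill the input, `Run()`, then `GetOutput(0)` to read results.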
Converting PaddlePaddle inference models to Paddle-Lite format requires the paddle_lite_opt tool, which performs graph optimization and model serialization.
| Parameter | Description | Example Value |
|---|---|---|
| --model_file | Path to .pdmodel file | ./inference.pdmodel |
| --param_file | Path to .pdiparams file | ./inference.pdiparams |
| --optimize_out | Output .nb model path | ./model_opt |
| --valid_targets | Target hardware backend | arm |
| --optimize_out_type | Serialization format | naive_buffer |
Sources: deploy/lite/readme.md92-151 deploy/lite/readme.md111-149
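Putting the parameters above together, a conversion run might look like this (paths and output names are examples):

```shell
# Convert a PaddlePaddle inference model to an optimized .nb model;
# repeat once per model (detection, classification, recognition).
paddle_lite_opt \
  --model_file=./ch_PP-OCRv3_det_infer/inference.pdmodel \
  --param_file=./ch_PP-OCRv3_det_infer/inference.pdiparams \
  --optimize_out=./ch_PP-OCRv3_det_opt \
  --optimize_out_type=naive_buffer \
  --valid_targets=arm
```

The tool writes `ch_PP-OCRv3_det_opt.nb`, which can then be loaded by MobileConfig at runtime.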
The mobile deployment system is implemented in C++ with modular components for detection, recognition, and classification. The main entry point orchestrates the complete OCR pipeline.
| Source File | Key Functions | Purpose |
|---|---|---|
| ocr_db_crnn.cc662-679 | main() | Program entry, mode selection |
| ocr_db_crnn.cc438-524 | system() | Full detection + recognition pipeline |
| ocr_db_crnn.cc526-586 | det() | Detection-only mode |
| ocr_db_crnn.cc588-660 | rec() | Recognition-only mode |
| ocr_db_crnn.cc252-332 | RunDetModel() | Text detection inference |
| ocr_db_crnn.cc158-250 | RunRecModel() | Text recognition inference |
| ocr_db_crnn.cc111-156 | RunClsModel() | Text orientation classification |
| ocr_db_crnn.cc28-69 | NeonMeanScale() | ARM NEON preprocessing |
Sources: deploy/lite/ocr_db_crnn.cc1-679
The detection module implements DB (Differentiable Binarization) post-processing:
Sources: deploy/lite/ocr_db_crnn.cc252-332 deploy/lite/db_post_process.cc1-352
The recognition module extracts text from detected regions:
Sources: deploy/lite/ocr_db_crnn.cc158-250 deploy/lite/crnn_process.cc1-119 deploy/lite/cls_process.cc1-44
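The greedy CTC decoding step at the end of recognition can be sketched as follows. This is a simplified stand-in, assuming the PaddleOCR convention that index 0 is the CTC blank and dictionary entries map to labels 1..N; the real RunRecModel also accumulates a confidence score:

```cpp
#include <string>
#include <vector>

// Greedy CTC decode: argmax per time step, collapse consecutive
// repeats, drop the blank label (index 0), map labels to characters.
std::string CtcGreedyDecode(const std::vector<std::vector<float>>& logits,
                            const std::vector<std::string>& dict) {
  std::string text;
  int prev = 0;  // previous label, initialized to blank
  for (const auto& step : logits) {
    int best = 0;
    for (int i = 1; i < static_cast<int>(step.size()); ++i)
      if (step[i] > step[best]) best = i;
    if (best != 0 && best != prev) text += dict[best - 1];
    prev = best;
  }
  return text;
}
```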
The NeonMeanScale function uses ARM NEON intrinsics (float32x4_t vectors, four floats per operation) to subtract per-channel means and apply scales while converting interleaved HWC input into the planar CHW layout expected by the models.
Sources: deploy/lite/ocr_db_crnn.cc28-69
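The idea can be illustrated with a self-contained re-sketch (function name and signature are illustrative, not the exact ones in ocr_db_crnn.cc); the NEON path is guarded so the scalar loop serves as a portable fallback and as the tail handler:

```cpp
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Normalize an interleaved RGB image (HWC, float) into planar CHW
// output, applying per-channel mean subtraction and scaling.
void MeanScale(const float* src, float* dst, int hw,
               const float mean[3], const float scale[3]) {
  float* c0 = dst;
  float* c1 = dst + hw;
  float* c2 = dst + hw * 2;
  int i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  float32x4_t m0 = vdupq_n_f32(mean[0]), s0 = vdupq_n_f32(scale[0]);
  float32x4_t m1 = vdupq_n_f32(mean[1]), s1 = vdupq_n_f32(scale[1]);
  float32x4_t m2 = vdupq_n_f32(mean[2]), s2 = vdupq_n_f32(scale[2]);
  for (; i + 4 <= hw; i += 4) {
    float32x4x3_t pix = vld3q_f32(src + i * 3);  // de-interleave RGB
    vst1q_f32(c0 + i, vmulq_f32(vsubq_f32(pix.val[0], m0), s0));
    vst1q_f32(c1 + i, vmulq_f32(vsubq_f32(pix.val[1], m1), s1));
    vst1q_f32(c2 + i, vmulq_f32(vsubq_f32(pix.val[2], m2), s2));
  }
#endif
  for (; i < hw; ++i) {  // scalar tail (and non-NEON fallback)
    c0[i] = (src[i * 3 + 0] - mean[0]) * scale[0];
    c1[i] = (src[i * 3 + 1] - mean[1]) * scale[1];
    c2[i] = (src[i * 3 + 2] - mean[2]) * scale[2];
  }
}
```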
The mobile deployment uses a Makefile-based build system for cross-compilation to ARM targets.
| Dependency | Version | Purpose | Download Link |
|---|---|---|---|
| Paddle-Lite | 2.10 | Inference runtime | v2.10 ARMv7/ARMv8 |
| OpenCV | 4.1.0 | Image processing | Auto-downloaded by Makefile |
| Clipper | Latest | Polygon clipping for DB | Auto-downloaded by Makefile |
Sources: deploy/lite/Makefile1-81 deploy/lite/readme.md34-88
1. Environment Setup
2. Download Paddle-Lite Library
3. Prepare Demo Environment
4. Compile
5. Deploy to Android Device
Sources: deploy/lite/readme.md154-274 deploy/lite/prepare.sh1-10
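The steps above can be sketched as a command sequence. Paths, the demo directory layout, and the exact prepare.sh arguments are illustrative and should be checked against the readme:

```shell
# 1-2) assume the ARMv8 prediction library is downloaded and unpacked
# 3) stage the demo sources, models, config, and dictionary
sh prepare.sh ./inference_lite_lib.android.armv8
# 4) cross-compile the demo (Android NDK toolchain on PATH)
cd ./inference_lite_lib.android.armv8/demo/cxx/ocr
make -j
# 5) deploy the debug/ folder to a connected Android device
adb push debug /data/local/tmp/
adb shell chmod +x /data/local/tmp/debug/ocr_db_crnn
```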
The runtime behavior is controlled by deploy/lite/config.txt1-9, which specifies parameters for detection and recognition.
| Parameter | Type | Default | Description | Usage |
|---|---|---|---|---|
| max_side_len | int | 960 | Maximum dimension for input image resize | ocr_db_crnn.cc256 |
| det_db_thresh | float | 0.3 | Threshold for binarizing DB probability map | ocr_db_crnn.cc305 |
| det_db_box_thresh | float | 0.5 | Minimum confidence score for detected boxes | db_post_process.cc243 |
| det_db_unclip_ratio | float | 1.6 | Expansion ratio for text boxes (higher = looser fit) | db_post_process.cc244 |
| det_db_use_dilate | int | 0 | Apply morphological dilation to binary map | ocr_db_crnn.cc257-314 |
| det_use_polygon_score | int | 1 | Use polygon-based scoring vs. fast box scoring | db_post_process.cc245-281 |
| use_direction_classify | int | 1 | Enable text orientation classification (0° or 180°) | ocr_db_crnn.cc180-460 |
| rec_image_height | int | 48 | Input height for recognition model (PP-OCRv3=48, v2=32) | ocr_db_crnn.cc186-604 |
Sources: deploy/lite/config.txt1-9 deploy/lite/ocr_db_crnn.cc388-397
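With the defaults from the table above, config.txt might look like the following (one key/value pair per line; the exact separator and ordering in the shipped file should be checked against deploy/lite/config.txt):

```
max_side_len  960
det_db_thresh  0.3
det_db_box_thresh  0.5
det_db_unclip_ratio  1.6
det_db_use_dilate  0
det_use_polygon_score  1
use_direction_classify  1
rec_image_height  48
```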
PaddleOCR provides pre-converted .nb models for immediate deployment:
| Version | Type | Size | Detection Model | Orientation Model | Recognition Model | Paddle-Lite Version |
|---|---|---|---|---|---|---|
| PP-OCRv3 | Standard | 16.2M | ch_PP-OCRv3_det.nb | ch_ppocr_v2.0_cls.nb | ch_PP-OCRv3_rec.nb | v2.10 |
| PP-OCRv3 | Slim | 5.9M | ch_PP-OCRv3_det_slim.nb | ch_ppocr_v2.0_cls_slim.nb | ch_PP-OCRv3_rec_slim.nb | v2.10 |
| PP-OCRv2 | Standard | 11M | ch_PP-OCRv2_det.nb | ch_ppocr_v2.0_cls.nb | ch_PP-OCRv2_rec.nb | v2.10 |
| PP-OCRv2 | Slim | 4.6M | ch_PP-OCRv2_det_slim.nb | ch_ppocr_v2.0_cls_slim.nb | ch_PP-OCRv2_rec_slim.nb | v2.10 |
Note: Slim models use INT8 quantization for reduced size and faster inference at minimal accuracy cost.
Sources: deploy/lite/readme.md100-106
| Platform | Architecture | Bits | Library | ABI Setting |
|---|---|---|---|---|
| Android ARMv8 | ARM Cortex-A53/A72/A73 | 64-bit | inference_lite_lib.android.armv8 | ARM_ABI=arm8 |
| Android ARMv7 | ARM Cortex-A7/A9/A15 | 32-bit | inference_lite_lib.android.armv7 | ARM_ABI=arm7 |
| iOS ARMv8 | Apple A-series | 64-bit | inference_lite_lib.ios.armv8 | iOS specific |
| iOS ARMv7 | Apple A5/A6 | 32-bit | inference_lite_lib.ios.armv7 | iOS specific |
Sources: deploy/lite/readme.md37-46
The system supports multiple languages through dictionary files:
| Language | Dictionary File | Location |
|---|---|---|
| Chinese | ppocr_keys_v1.txt | ppocr/utils/ppocr_keys_v1.txt |
| English | ic15_dict.txt | ppocr/utils/ic15_dict.txt |
| French | french_dict.txt | ppocr/utils/dict/french_dict.txt |
| German | german_dict.txt | ppocr/utils/dict/german_dict.txt |
| Japanese | japan_dict.txt | ppocr/utils/dict/japan_dict.txt |
| Korean | korean_dict.txt | ppocr/utils/dict/korean_dict.txt |
Sources: deploy/lite/readme.md227-235
The system supports two precision modes, FP32 and INT8; INT8 is used with the quantized slim models. The number of threads can be configured via a command-line argument (default: 10 threads).
Sources: deploy/lite/ocr_db_crnn.cc339-444
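A full-pipeline run on the device then looks roughly like this. The positional-argument order and file names follow the readme's example but are illustrative; check them against main() in ocr_db_crnn.cc:

```shell
adb shell 'cd /data/local/tmp/debug && export LD_LIBRARY_PATH=${PWD} && \
  ./ocr_db_crnn system \
    ch_PP-OCRv3_det_slim_opt.nb ch_PP-OCRv3_rec_slim_opt.nb \
    ch_ppocr_mobile_v2.0_cls_slim_opt.nb \
    arm8 INT8 10 1 \
    ./11.jpg config.txt ppocr_keys_v1.txt True'
```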
Q1: Model compatibility error
Error: This model is not supported, because kernel for 'io_copy' is not supported by Paddle-Lite.
Solution: Ensure the paddlelite (paddle_lite_opt) version matches the prediction library version (both v2.10).
Q2: Model replacement
To use different models, simply replace the .nb files in the debug/ directory and re-push to device.
Q3: Testing with different images
Replace the test image in debug/ and push the updated folder:
Q4: Integration into mobile app
The C++ code in deploy/lite/ provides the core OCR algorithm. For a complete Android app example, see deploy/android_demo/.
Sources: deploy/lite/readme.md282-299
Mobile and edge deployment with Paddle-Lite enables efficient OCR inference on resource-constrained ARM devices through optimized .nb models, ARM NEON preprocessing, and a modular C++ detection/classification/recognition pipeline. The deployment workflow involves model optimization with paddle_lite_opt, C++ compilation, file preparation, and execution on target devices via ADB commands.