This page covers the PP-OCRv5 General OCR pipeline: its module structure, supported text types, available models, and invocation methods. PP-OCRv5 is the primary pipeline for extracting text from images and documents in PaddleOCR 3.x.
PP-OCRv5 is the fifth generation of PaddlePaddle's OCR model series, released as part of PaddleOCR 3.0. Its primary distinction from prior versions is that a single recognition model handles five text types simultaneously: Simplified Chinese, Traditional Chinese, English, Japanese, and Pinyin. It also substantially improves recognition of handwriting, vertical text, and rare characters.
Key metrics over PP-OCRv4:
The pipeline comprises up to five modules: two mandatory modules and three optional preprocessing modules.
Sources: README.md67-69 docs/version3.x/pipeline_usage/OCR.en.md1-22
Pipeline Module Composition
Sources: docs/version3.x/pipeline_usage/OCR.en.md15-21
The two mandatory modules are Text Detection and Text Recognition. The three optional modules handle document-level preprocessing and can be disabled for performance when the input is already upright and unwarped.
The document orientation classification module classifies page rotation into four classes (0°, 90°, 180°, 270°) and corrects it before downstream processing.
| Model | Top-1 Acc (%) | GPU Inference ms [Std / HP] | CPU Inference ms [Std / HP] | Size (MB) |
|---|---|---|---|---|
| PP-LCNet_x1_0_doc_ori | 99.06 | 2.62 / 0.59 | 3.24 / 1.19 | 7 |
Enabled via the pipeline-level parameter use_doc_orientation_classify=True.
The text image unwarping module corrects document curvature and perspective distortion before OCR.
| Model | CER | GPU Inference ms [Std / HP] | CPU Inference ms [Std / HP] | Size (MB) |
|---|---|---|---|---|
| UVDoc | 0.179 | 19.05 / 19.05 | — / 869.82 | 30.3 |
Enabled via parameter use_doc_unwarping=True. Use only when input contains curved or warped documents; it adds significant latency.
The text line orientation classification module classifies the orientation of each text line (0° or 180°) so that upside-down lines are corrected before recognition.
| Model | Top-1 Acc (%) | GPU Inference ms [Std / HP] | CPU Inference ms [Std / HP] | Size (MB) |
|---|---|---|---|---|
| PP-LCNet_x0_25_textline_ori | 98.85 | 2.16 / 0.41 | 2.37 / 0.73 | 0.96 |
| PP-LCNet_x1_0_textline_ori | 99.42 | — / — | 2.98 / 2.98 | 6.5 |
Default since PaddleOCR 3.0.1: PP-LCNet_x1_0_textline_ori. Enabled via use_textline_orientation=True.
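The three optional modules above are each switched by a constructor flag, so they can be turned on only when the input needs them. The following is a minimal sketch, assuming the `paddleocr` 3.x package is installed. The helper names `preprocessing_flags` and `build_pipeline` and their argument names are hypothetical; only the returned keys are the pipeline parameters named in this page.

```python
def preprocessing_flags(rotated_pages=False, warped_pages=False, flipped_lines=False):
    """Map what is known about the input to the three optional-module flags.

    Hypothetical helper: the argument names are illustrative; only the
    returned keys are pipeline parameters named in this page.
    """
    return {
        "use_doc_orientation_classify": rotated_pages,  # 0/90/180/270 degree page rotation
        "use_doc_unwarping": warped_pages,              # curved or perspective-distorted pages
        "use_textline_orientation": flipped_lines,      # upside-down (180 degree) text lines
    }


def build_pipeline(**input_traits):
    """Create a PaddleOCR pipeline with only the needed preprocessing enabled."""
    from paddleocr import PaddleOCR  # imported lazily so the helper above works without it

    return PaddleOCR(**preprocessing_flags(**input_traits))


# Flat, upright scans: every optional module is skipped.
flat_scan_flags = preprocessing_flags()
# Photos of curved book pages: enable unwarping only.
book_photo_flags = preprocessing_flags(warped_pages=True)
```

Passing explicit `False` values skips the corresponding models entirely, which matters most for unwarping given its latency in the table above.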
The text detection module locates text regions in the image and outputs bounding boxes.
| Model | Detection Hmean (%) | GPU ms [Std / HP] | CPU ms [Std / HP] | Size (MB) |
|---|---|---|---|---|
| PP-OCRv5_server_det | 83.8 | 89.55 / 70.19 | 383.15 / 383.15 | 84.3 |
| PP-OCRv5_mobile_det | 79.0 | 10.67 / 6.36 | 57.77 / 28.15 | 4.7 |
| PP-OCRv4_server_det | 69.2 | 127.82 / 98.87 | 585.95 / 489.77 | 109 |
| PP-OCRv4_mobile_det | 63.8 | 9.87 / 4.17 | 56.60 / 20.79 | 4.7 |
Default: PP-OCRv5_server_det (changed in 3.0.1 from mobile to server).
The text recognition module reads the text content from the cropped text line images produced by the detector.
PP-OCRv5 Multi-Scenario Models (primary models):
| Model | Avg Acc CH (%) | Avg Acc EN (%) | Avg Acc CHT (%) | Avg Acc JA (%) | GPU ms [Std / HP] | CPU ms [Std / HP] | Size (MB) |
|---|---|---|---|---|---|---|---|
| PP-OCRv5_server_rec | 86.38 | 64.70 | 93.29 | 60.35 | 8.46 / 2.36 | 31.21 / 31.21 | 81 |
| PP-OCRv5_mobile_rec | 81.29 | 66.00 | 83.55 | 54.65 | 5.43 / 1.46 | 21.20 / 5.32 | 16 |
Default: PP-OCRv5_server_rec (changed in 3.0.1 from mobile to server).
Sources: docs/version3.x/pipeline_usage/OCR.en.md27-240
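To get a feel for the server versus mobile trade-off, the standard-mode GPU timings from the detection and recognition tables can be combined into a rough per-image figure. This is only a back-of-the-envelope sketch. It assumes (this is not stated in the source) that the recognition timing applies per text line, that lines are processed one at a time without batching, and that no optional modules run. `GPU_MS` and `estimate_gpu_ms` are hypothetical names.

```python
# Standard-mode GPU latencies in ms, copied from the tables above.
GPU_MS = {
    "server": {"det": 89.55, "rec": 8.46},
    "mobile": {"det": 10.67, "rec": 5.43},
}


def estimate_gpu_ms(tier, n_lines):
    """Rough per-image estimate: one detection pass plus one recognition pass per line.

    Assumption (not from the source): recognition time is per text line and
    lines are processed sequentially, with no batching and no optional modules.
    """
    timings = GPU_MS[tier]
    return timings["det"] + n_lines * timings["rec"]


for tier in GPU_MS:
    print(f"{tier}: {estimate_gpu_ms(tier, 20):.2f} ms for 20 lines")
```

Under these assumptions, a 20-line image comes to roughly 259 ms with the server models and 119 ms with the mobile models. Real throughput depends on batching and hardware, so the figures are useful only for relative comparison.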
The PP-OCRv5_server_rec and PP-OCRv5_mobile_rec models cover these text categories in a single model:
| Text Type | Description |
|---|---|
| Simplified Chinese | Standard printed and handwritten simplified characters |
| Traditional Chinese | Traditional character forms (e.g., used in Taiwan, Hong Kong) |
| English | Latin-script English text including numerals |
| Japanese | Hiragana, Katakana, Kanji |
| Pinyin | Romanized phonetic notation for Chinese characters |
Additional supported scenarios within the same model include handwriting, vertical text, and rare characters.
Sources: docs/version3.x/pipeline_usage/OCR.en.md188-200 README.md236-238
For languages and scripts beyond those covered by the multi-scenario models, dedicated language-specific recognition models based on the PP-OCRv5 architecture are available. A lightweight English-only model is also provided.
| Model | Language / Script | Avg Acc (%) | Size (MB) |
|---|---|---|---|
| en_PP-OCRv5_mobile_rec | English | 85.25 | 7.5 |
| korean_PP-OCRv5_mobile_rec | Korean + English | 88.0 | 14 |
| latin_PP-OCRv5_mobile_rec | Latin-script languages | 84.7 | 14 |
| eslav_PP-OCRv5_mobile_rec | East Slavic (Cyrillic) | 81.6 | 14 |
| cyrillic_PP-OCRv5_mobile_rec | Cyrillic script | 80.27 | 7.7 |
| arabic_PP-OCRv5_mobile_rec | Arabic script | 81.27 | 7.6 |
| devanagari_PP-OCRv5_mobile_rec | Devanagari (Hindi, Sanskrit) | 84.96 | 7.5 |
| te_PP-OCRv5_mobile_rec | Telugu | 87.65 | 7.5 |
| ta_PP-OCRv5_mobile_rec | Tamil | 94.2 | 7.5 |
| th_PP-OCRv5_mobile_rec | Thai | 82.68 | 7.5 |
| el_PP-OCRv5_mobile_rec | Greek | 89.28 | 7.5 |
In total, PP-OCRv5-based models cover 109 languages. For a complete model list, see Model Selection and Language Support.
Sources: docs/version3.x/pipeline_usage/OCR.en.md241-530
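The dedicated models in the table above share one naming pattern: a language or script prefix followed by `_PP-OCRv5_mobile_rec`. The sketch below turns that pattern into a small lookup. `DEDICATED_REC_PREFIXES` and `recognition_model_for` are hypothetical names introduced for illustration, and the set contains only the prefixes listed on this page, not the full 109-language catalogue.

```python
# Name prefixes of the dedicated recognition models listed in the table above.
DEDICATED_REC_PREFIXES = {
    "en", "korean", "latin", "eslav", "cyrillic", "arabic",
    "devanagari", "te", "ta", "th", "el",
}


def recognition_model_for(prefix=None):
    """Return a recognition model name taken from the tables on this page.

    With no prefix, the multi-scenario server model is returned; otherwise the
    dedicated mobile model for that prefix. Unknown prefixes raise ValueError
    rather than guessing a model name.
    """
    if prefix is None:
        return "PP-OCRv5_server_rec"
    if prefix not in DEDICATED_REC_PREFIXES:
        raise ValueError(f"no dedicated model listed for prefix {prefix!r}")
    return f"{prefix}_PP-OCRv5_mobile_rec"


print(recognition_model_for("korean"))
print(recognition_model_for())
```

Assuming the `PaddleOCR` constructor accepts a `text_recognition_model_name` argument, as described in the 3.x pipeline documentation, the returned name could be passed there to pin a specific model.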
PP-OCRv5 Pipeline Invocation Paths
Sources: docs/quick_start.en.md38-73 docs/version3.x/pipeline_usage/OCR.en.md1-50
Key constructor parameters for PaddleOCR:
| Parameter | Type | Default | Description |
|---|---|---|---|
| use_doc_orientation_classify | bool | True | Enable document orientation classification |
| use_doc_unwarping | bool | True | Enable image unwarping |
| use_textline_orientation | bool | True | Enable text line orientation classification |
| lang | str | 'ch' | Target language for model selection |
| ocr_version | str | 'PP-OCRv5' | OCR model version to use |
Sources: docs/quick_start.en.md38-90 docs/version3.x/pipeline_usage/OCR.en.md1-50
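A minimal Python invocation sketch based on the parameter table above. It assumes the `paddleocr` 3.x package is installed and that `predict` yields result objects exposing `print`, `save_to_img`, and `save_to_json`, as in the PaddleOCR quick start. `TABLE_DEFAULTS`, `constructor_kwargs`, and `run_ocr` are hypothetical names introduced for illustration.

```python
# Defaults as listed in the parameter table above.
TABLE_DEFAULTS = {
    "use_doc_orientation_classify": True,
    "use_doc_unwarping": True,
    "use_textline_orientation": True,
    "lang": "ch",
    "ocr_version": "PP-OCRv5",
}


def constructor_kwargs(**overrides):
    """Merge caller overrides into the table defaults, rejecting unknown names."""
    unknown = set(overrides) - set(TABLE_DEFAULTS)
    if unknown:
        raise TypeError(f"not in the parameter table: {sorted(unknown)}")
    return {**TABLE_DEFAULTS, **overrides}


def run_ocr(image_path, output_dir="output", **overrides):
    """Run the pipeline on one image and save the visualisation and JSON results."""
    from paddleocr import PaddleOCR  # lazy import: requires the paddleocr 3.x package

    ocr = PaddleOCR(**constructor_kwargs(**overrides))
    for res in ocr.predict(image_path):
        res.print()                   # dump recognised text and boxes to stdout
        res.save_to_img(output_dir)   # annotated image
        res.save_to_json(output_dir)  # structured result


# Example: a flat, upright English scan needs no document-level preprocessing.
english_scan = constructor_kwargs(
    lang="en", use_doc_orientation_classify=False, use_doc_unwarping=False
)
```

Calling `run_ocr("page.png", lang="en")` would then run the full pipeline on a local image; the file name here is a placeholder.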
The General OCR pipeline is a sub-pipeline reused by several higher-level pipelines:
OCR Pipeline Reuse Across PaddleOCR
Sources: docs/version3.x/pipeline_usage/PP-StructureV3.en.md13-19 docs/version3.x/pipeline_usage/PP-ChatOCRv4.en.md13-24 docs/version3.x/pipeline_usage/seal_recognition.en.md14-21
| Use Case | Recommended Detection | Recommended Recognition |
|---|---|---|
| Highest accuracy, GPU server | PP-OCRv5_server_det | PP-OCRv5_server_rec |
| Mobile / edge device | PP-OCRv5_mobile_det | PP-OCRv5_mobile_rec |
| English-only, lightweight | PP-OCRv5_mobile_det | en_PP-OCRv5_mobile_rec |
| Korean text | PP-OCRv5_mobile_det | korean_PP-OCRv5_mobile_rec |
| Arabic / RTL scripts | PP-OCRv5_mobile_det | arabic_PP-OCRv5_mobile_rec |
| Legacy compatibility (v4) | PP-OCRv4_server_det | PP-OCRv4_server_rec_doc |
When a lang parameter is specified without an explicit ocr_version, PaddleOCR 3.0.2+ automatically selects the latest model version supporting that language.
Sources: README.md119-121 docs/version3.x/pipeline_usage/OCR.en.md241-530
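The recommendation table above can be kept as a small lookup so that a deployment profile selects both models at once. This is a sketch only: the dictionary keys and the helper `model_kwargs` are hypothetical, and it assumes the `PaddleOCR` constructor accepts `text_detection_model_name` and `text_recognition_model_name`, as described in the 3.x pipeline documentation.

```python
# (detection, recognition) pairs copied from the recommendation table above.
# The dictionary keys are illustrative labels, not PaddleOCR identifiers.
RECOMMENDED_MODELS = {
    "gpu_server": ("PP-OCRv5_server_det", "PP-OCRv5_server_rec"),
    "mobile_edge": ("PP-OCRv5_mobile_det", "PP-OCRv5_mobile_rec"),
    "english_light": ("PP-OCRv5_mobile_det", "en_PP-OCRv5_mobile_rec"),
    "korean": ("PP-OCRv5_mobile_det", "korean_PP-OCRv5_mobile_rec"),
    "arabic_rtl": ("PP-OCRv5_mobile_det", "arabic_PP-OCRv5_mobile_rec"),
    "legacy_v4": ("PP-OCRv4_server_det", "PP-OCRv4_server_rec_doc"),
}


def model_kwargs(use_case):
    """Return constructor keyword arguments that pin both models for a use case."""
    if use_case not in RECOMMENDED_MODELS:
        raise KeyError(f"unknown use case {use_case!r}; choose from {sorted(RECOMMENDED_MODELS)}")
    det, rec = RECOMMENDED_MODELS[use_case]
    return {
        "text_detection_model_name": det,
        "text_recognition_model_name": rec,
    }


print(model_kwargs("english_light"))
```

Pinning model names explicitly sidesteps the automatic version selection described above, which can be useful when results need to stay reproducible across PaddleOCR upgrades.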
The PP-OCRv5 pipeline supports the following deployment modes:
| Mode | Description |
|---|---|
| Python inference | Direct use via PaddleOCR class; standard and high-performance (TensorRT/ONNX) backends |
| C++ inference | Full feature parity with Python; supports Linux and Windows (added in 3.2.0) |
| Service deployment | HTTP service with SDK; supports multi-language clients (C++, Java, Go, C#, Node.js, PHP) |
| On-device (mobile) | Android deployment via Paddle-Lite; example added in 3.0.2 |
| High-performance mode | CUDA 12 support; ONNX Runtime backend; Paddle MKL-DNN acceleration on CPU |
For high-performance inference configuration, see High-Performance Inference.
For C++ deployment, see C++ Inference and Build System.
For service deployment, see Service Deployment.
Sources: README.md120-128 README.md200
| PaddleOCR Version | PP-OCRv5 Changes |
|---|---|
| 3.0.0 (2025-05-20) | Initial PP-OCRv5 release; server+mobile det+rec for CH/TC/EN/JA/Pinyin |
| 3.0.1 (2025-06-05) | Default models changed from mobile to server; PP-LCNet_x1_0_textline_ori added as default |
| 3.1.0 (2025-06-29) | Added multilingual recognition (37 languages); avg accuracy +30% |
| 3.2.0 (2025-08-21) | English, Thai, Greek dedicated models; C++ deployment; single-character coordinates |
| 3.3.0 (2025-10-16) | 109-language coverage; Cyrillic, Arabic, Devanagari, Telugu, Tamil |
Sources: README.md85-146