PP-OCRv5 Universal Text Recognition


This page covers the PP-OCRv5 General OCR pipeline: its module structure, supported text types, available models, and invocation methods. PP-OCRv5 is the primary pipeline for extracting text from images and documents in PaddleOCR 3.x.

For document-level structured parsing (tables, formulas, layout), see PP-StructureV3 Document Parsing.
For key information extraction with LLM integration, see PP-ChatOCRv4 Intelligent Document Understanding.
For vision-language model-based recognition, see PaddleOCR-VL Vision-Language Model.
For Python API class details, see Python API Usage.

Overview

PP-OCRv5 is the fifth generation of PaddlePaddle's OCR model series, released as part of PaddleOCR 3.0. Its primary distinction from prior versions is that a single recognition model handles five text types simultaneously: Simplified Chinese, Traditional Chinese, English, Japanese, and Pinyin. It also substantially improves recognition of handwriting, vertical text, and rare characters.

Key improvements over PP-OCRv4:

Accuracy improves by 13 percentage points across mixed-script scenarios
A single recognition model weight covers all five text types
Multilingual recognition models extend coverage to 109 languages across script families (Latin, Cyrillic, Arabic, Devanagari, Telugu, Tamil, Korean, etc.)

The pipeline comprises up to five modules: two are mandatory, and three are optional preprocessing steps.

Sources: README.md:67-69, docs/version3.x/pipeline_usage/OCR.en.md:1-22

Pipeline Structure

Pipeline Module Composition

Sources: docs/version3.x/pipeline_usage/OCR.en.md:15-21

The two mandatory modules are Text Detection and Text Recognition. The three optional modules handle document-level preprocessing and can be disabled for performance when the input is already upright and unwarped.

Module Details and Default Models

Optional: Document Image Orientation Classification

Classifies page rotation into four classes (0°, 90°, 180°, 270°) and corrects it before downstream processing.
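The correction step can be pictured as applying the inverse of the predicted rotation. The sketch below does this on a tiny pixel grid in plain Python. It is not the pipeline's implementation: the helper names `rotate_cw_90` and `correct_orientation` are hypothetical, and it assumes the predicted class is the clockwise angle by which the page appears rotated.

```python
from typing import List

Grid = List[List[int]]  # row-major image stand-in


def rotate_cw_90(img: Grid) -> Grid:
    """Rotate a row-major grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def correct_orientation(img: Grid, predicted_angle: int) -> Grid:
    """Undo a clockwise rotation of `predicted_angle` degrees.

    Assumption: the classifier label is the clockwise rotation of the input.
    """
    if predicted_angle not in (0, 90, 180, 270):
        raise ValueError(f"unsupported rotation class: {predicted_angle}")
    # Undoing a clockwise rotation by a equals rotating clockwise by 360 - a.
    turns = ((360 - predicted_angle) % 360) // 90
    for _ in range(turns):
        img = rotate_cw_90(img)
    return img


upright = [[1, 2, 3],
           [4, 5, 6]]
rotated = rotate_cw_90(upright)          # simulate a page scanned sideways
restored = correct_orientation(rotated, 90)
```

If the actual model uses the opposite sign convention, only the `turns` computation would need to change.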

Model	Top-1 Acc (%)	GPU Inference ms [Std / HP]	CPU Inference ms [Std / HP]	Size (MB)
`PP-LCNet_x1_0_doc_ori`	99.06	2.62 / 0.59	3.24 / 1.19	7

Enabled via the parameter `use_doc_orientation_classify=True`, which is set on the pipeline as a whole.

Optional: Text Image Unwarping

Corrects document curvature and perspective distortion before OCR.

Model	CER	GPU Inference ms [Std / HP]	CPU Inference ms [Std / HP]	Size (MB)
`UVDoc`	0.179	19.05 / 19.05	— / 869.82	30.3

Enabled via the parameter `use_doc_unwarping=True`. Use it only when the input contains curved or warped documents; it adds significant latency.

Optional: Text Line Orientation Classification

Classifies individual text line orientation (0° or 180°) to correct upside-down lines before recognition.

Model	Top-1 Acc (%)	GPU Inference ms [Std / HP]	CPU Inference ms [Std / HP]	Size (MB)
`PP-LCNet_x0_25_textline_ori`	98.85	2.16 / 0.41	2.37 / 0.73	0.96
`PP-LCNet_x1_0_textline_ori`	99.42	— / —	2.98 / 2.98	6.5

Default since PaddleOCR 3.0.1: `PP-LCNet_x1_0_textline_ori`. Enabled via `use_textline_orientation=True`.

Text Detection

Locates text regions in the image and outputs bounding boxes.

Model	Detection Hmean (%)	GPU ms [Std / HP]	CPU ms [Std / HP]	Size (MB)
`PP-OCRv5_server_det`	83.8	89.55 / 70.19	383.15 / 383.15	84.3
`PP-OCRv5_mobile_det`	79.0	10.67 / 6.36	57.77 / 28.15	4.7
`PP-OCRv4_server_det`	69.2	127.82 / 98.87	585.95 / 489.77	109
`PP-OCRv4_mobile_det`	63.8	9.87 / 4.17	56.60 / 20.79	4.7

Default: `PP-OCRv5_server_det` (the default changed from the mobile to the server model in 3.0.1).

Text Recognition

Reads the text content from cropped text line images produced by the detector.
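The hand-off between the two mandatory modules can be sketched as cropping each detected region and passing the crops on to the recognizer. The example below uses axis-aligned bounding rectangles on a nested-list image. The function names and the four-point polygon format are illustrative assumptions; the real pipeline has its own handling for rotated and curved regions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]   # (x, y) pixel coordinates
Image = List[List[int]]   # row-major grayscale stand-in


def bounding_rect(polygon: Sequence[Point]) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) enclosing the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def crop_text_lines(image: Image, polygons: Sequence[Sequence[Point]]) -> List[Image]:
    """Cut one crop per detected polygon, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    crops = []
    for polygon in polygons:
        x0, y0, x1, y1 = bounding_rect(polygon)
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width - 1, x1), min(height - 1, y1)
        crops.append([row[x0:x1 + 1] for row in image[y0:y1 + 1]])
    return crops


# A 4x6 dummy image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(6)] for r in range(4)]
polygons = [[(1, 0), (3, 0), (3, 1), (1, 1)]]   # one hypothetical detection
crops = crop_text_lines(image, polygons)
```

Each element of `crops` stands in for one text line image that a recognition model would then read.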

PP-OCRv5 Multi-Scenario Models (primary models):

Model	Avg Acc CH (%)	Avg Acc EN (%)	Avg Acc CHT (%)	Avg Acc JA (%)	GPU ms [Std / HP]	CPU ms [Std / HP]	Size (MB)
`PP-OCRv5_server_rec`	86.38	64.70	93.29	60.35	8.46 / 2.36	31.21 / 31.21	81
`PP-OCRv5_mobile_rec`	81.29	66.00	83.55	54.65	5.43 / 1.46	21.20 / 5.32	16

Default: `PP-OCRv5_server_rec` (changed from the mobile model in 3.0.1).

Sources: docs/version3.x/pipeline_usage/OCR.en.md:27-240

Supported Text Types

The `PP-OCRv5_server_rec` and `PP-OCRv5_mobile_rec` models cover these text categories in a single model:

Text Type	Description
Simplified Chinese	Standard printed and handwritten simplified characters
Traditional Chinese	Traditional character forms (e.g., used in Taiwan, Hong Kong)
English	Latin-script English text including numerals
Japanese	Hiragana, Katakana, Kanji
Pinyin	Romanized phonetic notation for Chinese characters

Additional supported scenarios within the same model:

Handwriting: Improved handwritten text recognition
Vertical text: Top-to-bottom character layouts
Rare characters: Extended CJK character sets
Mixed-script documents: Combinations of the above types within the same image

Sources: docs/version3.x/pipeline_usage/OCR.en.md:188-200, README.md:236-238

Multilingual Recognition Models

For text outside the five types above, and for lightweight English-only use, dedicated language-specific recognition models are available. These are based on the PP-OCRv5 architecture.

Model	Language / Script	Avg Acc (%)	Size (MB)
`en_PP-OCRv5_mobile_rec`	English	85.25	7.5
`korean_PP-OCRv5_mobile_rec`	Korean + English	88.0	14
`latin_PP-OCRv5_mobile_rec`	Latin-script languages	84.7	14
`eslav_PP-OCRv5_mobile_rec`	East Slavic (Cyrillic)	81.6	14
`cyrillic_PP-OCRv5_mobile_rec`	Cyrillic script	80.27	7.7
`arabic_PP-OCRv5_mobile_rec`	Arabic script	81.27	7.6
`devanagari_PP-OCRv5_mobile_rec`	Devanagari (Hindi, Sanskrit)	84.96	7.5
`te_PP-OCRv5_mobile_rec`	Telugu	87.65	7.5
`ta_PP-OCRv5_mobile_rec`	Tamil	94.2	7.5
`th_PP-OCRv5_mobile_rec`	Thai	82.68	7.5
`el_PP-OCRv5_mobile_rec`	Greek	89.28	7.5

In total, PP-OCRv5-based models cover 109 languages. For a complete model list, see Model Selection and Language Support.

Sources: docs/version3.x/pipeline_usage/OCR.en.md:241-530

Pipeline Invocation

PP-OCRv5 Pipeline Invocation Paths

Sources: docs/quick_start.en.md:38-73, docs/version3.x/pipeline_usage/OCR.en.md:1-50

Command-Line Interface
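The command-line entry point is the `paddleocr` executable with the `ocr` subcommand. To keep this page's examples in one language, the sketch below assembles that command as an argument list in Python rather than showing a shell session. The helper `build_ocr_command` and the file name `sample.png` are hypothetical, and the flags are assumed to mirror the constructor parameters of the same names; `paddleocr ocr --help` is the authoritative list.

```python
import shlex
from typing import List


def build_ocr_command(
    input_path: str,
    *,
    use_doc_orientation_classify: bool = False,
    use_doc_unwarping: bool = False,
    use_textline_orientation: bool = False,
) -> List[str]:
    """Build an argv list for the `paddleocr ocr` subcommand."""
    return [
        "paddleocr", "ocr",
        "-i", input_path,
        "--use_doc_orientation_classify", str(use_doc_orientation_classify),
        "--use_doc_unwarping", str(use_doc_unwarping),
        "--use_textline_orientation", str(use_textline_orientation),
    ]


command = build_ocr_command("sample.png")
print(shlex.join(command))   # the equivalent shell command line

# To actually run it (requires PaddleOCR to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting problems with input paths that contain spaces.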

Python API

Key constructor parameters for `PaddleOCR`:

Parameter	Type	Default	Description
`use_doc_orientation_classify`	bool	True	Enable document orientation classification
`use_doc_unwarping`	bool	True	Enable image unwarping
`use_textline_orientation`	bool	True	Enable text line orientation classification
`lang`	str	`'ch'`	Target language for model selection
`ocr_version`	str	`'PP-OCRv5'`	OCR model version to use
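A minimal usage sketch built from the parameters in the table, assuming the `paddleocr` package is installed. The helpers `ocr_kwargs` and `run_ocr`, the `preprocess` switch, and the `output` directory are hypothetical names introduced for illustration; `predict`, `print`, and `save_to_json` are used as in the PaddleOCR 3.x quick start, and their exact behavior should be checked against the installed version.

```python
from typing import Any, Dict


def ocr_kwargs(
    *,
    preprocess: bool = False,
    lang: str = "ch",
    ocr_version: str = "PP-OCRv5",
) -> Dict[str, Any]:
    """Collect constructor arguments; `preprocess` toggles all three optional modules."""
    return {
        "use_doc_orientation_classify": preprocess,
        "use_doc_unwarping": preprocess,
        "use_textline_orientation": preprocess,
        "lang": lang,
        "ocr_version": ocr_version,
    }


def run_ocr(image_path: str, **overrides: Any):
    """Run the pipeline on one image and save a JSON result per page."""
    # Imported lazily so this module still loads where PaddleOCR is not installed.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(**ocr_kwargs(**overrides))
    results = ocr.predict(image_path)
    for res in results:
        res.print()                  # dump the recognized text and boxes
        res.save_to_json("output")   # hypothetical output directory
    return results


# Upright, flat inputs: skip the optional preprocessing modules for speed.
fast_path = ocr_kwargs(preprocess=False)
```

Calling `run_ocr("sample.png", preprocess=True)` would enable all three optional modules for scanned or photographed pages.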

Sources: docs/quick_start.en.md:38-90, docs/version3.x/pipeline_usage/OCR.en.md:1-50

Pipeline in the Broader System

The General OCR pipeline is a sub-pipeline reused by several higher-level pipelines:

OCR Pipeline Reuse Across PaddleOCR

Sources: docs/version3.x/pipeline_usage/PP-StructureV3.en.md:13-19, docs/version3.x/pipeline_usage/PP-ChatOCRv4.en.md:13-24, docs/version3.x/pipeline_usage/seal_recognition.en.md:14-21

Model Selection Guide

Use Case	Recommended Detection	Recommended Recognition
Highest accuracy, GPU server	`PP-OCRv5_server_det`	`PP-OCRv5_server_rec`
Mobile / edge device	`PP-OCRv5_mobile_det`	`PP-OCRv5_mobile_rec`
English-only, lightweight	`PP-OCRv5_mobile_det`	`en_PP-OCRv5_mobile_rec`
Korean text	`PP-OCRv5_mobile_det`	`korean_PP-OCRv5_mobile_rec`
Arabic / RTL scripts	`PP-OCRv5_mobile_det`	`arabic_PP-OCRv5_mobile_rec`
Legacy compatibility (v4)	`PP-OCRv4_server_det`	`PP-OCRv4_server_rec_doc`

When a `lang` parameter is specified without an explicit `ocr_version`, PaddleOCR 3.0.2+ automatically selects the latest model version supporting that language.
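The recommendations in the table can also be applied by naming the models explicitly. The sketch below encodes the table as a lookup and turns a use case into constructor arguments. The use-case keys and the helper `model_kwargs` are hypothetical, and the keyword names `text_detection_model_name` and `text_recognition_model_name` are assumed to match the installed PaddleOCR 3.x constructor.

```python
from typing import Dict, Tuple

# (detection model, recognition model) per use case, copied from the table above.
RECOMMENDED: Dict[str, Tuple[str, str]] = {
    "server": ("PP-OCRv5_server_det", "PP-OCRv5_server_rec"),
    "mobile": ("PP-OCRv5_mobile_det", "PP-OCRv5_mobile_rec"),
    "english": ("PP-OCRv5_mobile_det", "en_PP-OCRv5_mobile_rec"),
    "korean": ("PP-OCRv5_mobile_det", "korean_PP-OCRv5_mobile_rec"),
    "arabic": ("PP-OCRv5_mobile_det", "arabic_PP-OCRv5_mobile_rec"),
    "legacy_v4": ("PP-OCRv4_server_det", "PP-OCRv4_server_rec_doc"),
}


def model_kwargs(use_case: str) -> Dict[str, str]:
    """Translate a use case into explicit model-name arguments."""
    try:
        det, rec = RECOMMENDED[use_case]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED))
        raise ValueError(f"unknown use case {use_case!r}; expected one of: {known}")
    return {
        "text_detection_model_name": det,
        "text_recognition_model_name": rec,
    }


edge_config = model_kwargs("mobile")
# With PaddleOCR installed: PaddleOCR(**edge_config)
```

Naming models explicitly pins the configuration, whereas relying on `lang` alone lets the library choose the model version.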

Sources: README.md:119-121, docs/version3.x/pipeline_usage/OCR.en.md:241-530

Deployment Options

The PP-OCRv5 pipeline supports the following deployment modes:

Mode	Description
Python inference	Direct use via `PaddleOCR` class; standard and high-performance (TensorRT/ONNX) backends
C++ inference	Full feature parity with Python; supports Linux and Windows (added in 3.2.0)
Service deployment	HTTP service with SDK; supports multi-language clients (C++, Java, Go, C#, Node.js, PHP)
On-device (mobile)	Android deployment via Paddle-Lite; example added in 3.0.2
High-performance mode	CUDA 12 support; ONNX Runtime backend; Paddle MKL-DNN acceleration on CPU

For high-performance inference configuration, see High-Performance Inference.
For C++ deployment, see C++ Inference and Build System.
For service deployment, see Service Deployment.

Sources: README.md:120-128, README.md:200

Version History Summary

PaddleOCR Version	PP-OCRv5 Changes
3.0.0 (2025-05-20)	Initial PP-OCRv5 release; server+mobile det+rec for CH/TC/EN/JA/Pinyin
3.0.1 (2025-06-05)	Default models changed from mobile to server; `PP-LCNet_x1_0_textline_ori` added as default
3.1.0 (2025-06-29)	Added multilingual recognition (37 languages); avg accuracy +30%
3.2.0 (2025-08-21)	English, Thai, Greek dedicated models; C++ deployment; single-character coordinates
3.3.0 (2025-10-16)	109-language coverage; Cyrillic, Arabic, Devanagari, Telugu, Tamil

Sources: README.md:85-146
