This document guides users through selecting appropriate OCR models based on three criteria: language requirements, OCR version (PP-OCRv5, PP-OCRv4, PP-OCRv3), and deployment target (server vs. mobile). Model selection involves trade-offs between accuracy, inference speed, model size, and language coverage.
Key Selection Dimensions:
Model selection directly affects pipeline configuration through the text_detection_model_name, text_recognition_model_name, ocr_version, and lang parameters of the PaddleOCR class (paddleocr/__init__.py) and of the paddleocr ocr CLI command (paddleocr/__main__.py).
Sources: docs/version3.x/pipeline_usage/OCR.md1-241 docs/version3.x/pipeline_usage/OCR.en.md1-240 pyproject.toml6-61
PaddleOCR provides three major versions with distinct model registry names used in text_detection_model_name and text_recognition_model_name parameters:
Title: PP-OCR Version Evolution and Model Names
Version Selection Criteria:
| Version | When to Use | Model Registry Names |
|---|---|---|
| PP-OCRv5 | Default choice; multilingual support (Chinese, English, Traditional Chinese, Japanese); highest accuracy | PP-OCRv5_server_det, PP-OCRv5_mobile_det, PP-OCRv5_server_rec, PP-OCRv5_mobile_rec |
| PP-OCRv4 | Need 15K+ character dictionary (_doc variant); existing deployment compatibility | PP-OCRv4_server_det, PP-OCRv4_mobile_det, PP-OCRv4_server_rec, PP-OCRv4_server_rec_doc, PP-OCRv4_mobile_rec |
| PP-OCRv3 | Legacy system integration only | PP-OCRv3_mobile_rec |
Configuration Examples:
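A version can be pinned either through ocr_version or by naming the models explicitly. The following is a minimal sketch, assuming the PaddleOCR class accepts the parameters named above as keyword arguments. The helper names version_kwargs and build_ocr are hypothetical, and the import is deferred so the snippet loads without PaddleOCR installed.

```python
# Registry names copied from the version table above.
VERSION_MODELS = {
    "PP-OCRv5": {"det": "PP-OCRv5_server_det", "rec": "PP-OCRv5_server_rec"},
    "PP-OCRv4": {"det": "PP-OCRv4_server_det", "rec": "PP-OCRv4_server_rec_doc"},
}


def version_kwargs(version, explicit=False):
    """Return keyword arguments that pin an OCR version (hypothetical helper)."""
    if not explicit:
        # Let the library derive the model names from the version string.
        return {"ocr_version": version}
    models = VERSION_MODELS[version]
    return {
        "text_detection_model_name": models["det"],
        "text_recognition_model_name": models["rec"],
    }


def build_ocr(**kwargs):
    """Create the pipeline; requires the paddleocr package at call time."""
    from paddleocr import PaddleOCR

    return PaddleOCR(**kwargs)
```

For example, build_ocr(**version_kwargs("PP-OCRv4", explicit=True)) would request the document-oriented PP-OCRv4 recognizer.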
Sources: docs/version3.x/pipeline_usage/OCR.md11-240 docs/version3.x/pipeline_usage/OCR.en.md11-239 docs/version3.x/pipeline_usage/OCR.md132-169
Sources: README.md67-70 docs/version3.x/pipeline_usage/OCR.md183-240 docs/version3.x/pipeline_usage/OCR.en.md183-239
PaddleOCR provides two scale variants for most model versions:
| Characteristic | Server Models | Mobile Models |
|---|---|---|
| Primary Goal | Maximum accuracy | Deployment efficiency |
| Model Size | 80-180 MB (recognition) | 10-16 MB (recognition) |
| Inference Speed (GPU) | 8-9 ms | 5-6 ms |
| Inference Speed (CPU) | 30-40 ms | 17-21 ms |
| Accuracy | Higher (86%+) | Good (81%+) |
| Target Deployment | Servers, cloud | Edge devices, mobile |
| Memory Requirements | Higher | Lower |
Detection Model Comparison:
| Model | Hmean (%) | Model Size (MB) | GPU Time (ms), normal / high-performance mode | CPU Time (ms), normal / high-performance mode |
|---|---|---|---|---|
| PP-OCRv5_server_det | 83.8 | 84.3 | 89.55 / 70.19 | 383.15 / 383.15 |
| PP-OCRv5_mobile_det | 79.0 | 4.7 | 10.67 / 6.36 | 57.77 / 28.15 |
Recognition Model Comparison:
| Model | Avg Acc (%) | Model Size (MB) | GPU Time (ms), normal / high-performance mode | CPU Time (ms), normal / high-performance mode |
|---|---|---|---|---|
| PP-OCRv5_server_rec | 86.38 | 81 | 8.46 / 2.36 | 31.21 / 31.21 |
| PP-OCRv5_mobile_rec | 81.29 | 16 | 5.43 / 1.46 | 21.20 / 5.32 |
Sources: docs/version3.x/pipeline_usage/OCR.md118-169 docs/version3.x/pipeline_usage/OCR.en.md118-168
Configuration Examples:
Server deployment (maximum accuracy):
Mobile deployment (efficiency optimized):
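The two deployment targets differ only in which registry names are passed. This is a sketch under the assumption that both names are accepted as keyword arguments by PaddleOCR and that predict takes an image path. config_for and run_ocr are illustrative names, not part of the library.

```python
# Model pairs taken from the detection and recognition comparison tables.
SERVER_CONFIG = {
    "text_detection_model_name": "PP-OCRv5_server_det",
    "text_recognition_model_name": "PP-OCRv5_server_rec",
}
MOBILE_CONFIG = {
    "text_detection_model_name": "PP-OCRv5_mobile_det",
    "text_recognition_model_name": "PP-OCRv5_mobile_rec",
}


def config_for(target):
    """Return a copy of the model pair for 'server' or 'mobile'."""
    configs = {"server": SERVER_CONFIG, "mobile": MOBILE_CONFIG}
    if target not in configs:
        raise ValueError(f"unknown deployment target: {target!r}")
    return dict(configs[target])


def run_ocr(image_path, target="server"):
    """Run the pipeline on one image; requires paddleocr at call time."""
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(**config_for(target))
    return ocr.predict(image_path)
```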
Sources: docs/version3.x/pipeline_usage/OCR.md132-200 docs/quick_start.en.md68-80
PaddleOCR provides three-tiered language support with distinct model naming conventions:
Title: Language Support Architecture and Model Registry Names
Model Naming Convention:
{lang}_PP-OCRv{version}_{scale}_rec
- PP-OCRv5_server_rec (default multilingual)
- korean_PP-OCRv5_mobile_rec (Korean-specific)
- en_PP-OCRv4_mobile_rec (English-specific PP-OCRv4)
Language Selection Strategy:
| Language(s) | Recommended Model | Registry Name | Size |
|---|---|---|---|
| Chinese + English | PP-OCRv5 default | PP-OCRv5_server_rec or PP-OCRv5_mobile_rec | 81MB / 16MB |
| Korean | Korean-specific | korean_PP-OCRv5_mobile_rec | 14MB |
| Latin languages (French, Spanish, Portuguese, etc.) | Latin model | latin_PP-OCRv5_mobile_rec | 14MB |
| Arabic script | Arabic model | arabic_PP-OCRv5_mobile_rec | 7.6MB |
| Thai | Thai model | th_PP-OCRv5_mobile_rec | 7.5MB |
| Multiple diverse languages | Vision-language model | PaddleOCR-VL pipeline | 900MB |
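The naming convention can be expressed as a small string builder. This is only an illustration of the pattern: rec_model_name is a hypothetical helper, and a name it composes is not guaranteed to exist in the model registry, so results should be checked against the tables in this document.

```python
def rec_model_name(version, scale, lang=None):
    """Compose a recognition name of the form [{lang}_]PP-OCRv{version}_{scale}_rec."""
    if scale not in ("server", "mobile"):
        raise ValueError(f"scale must be 'server' or 'mobile', got {scale!r}")
    base = f"PP-OCRv{version}_{scale}_rec"
    # Language-specific models carry the language code as a prefix.
    return f"{lang}_{base}" if lang else base


print(rec_model_name(5, "server"))
print(rec_model_name(5, "mobile", lang="korean"))
print(rec_model_name(4, "mobile", lang="en"))
```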
Sources: docs/version3.x/pipeline_usage/OCR.md242-632 docs/version3.x/pipeline_usage/OCR.en.md241-631 docs/version3.x/pipeline_usage/OCR.md432-531
The default PP-OCRv5_rec models support five text types within a single model, eliminating the need to switch models for mixed-language documents:
| Model | Chinese | English | Traditional Chinese | Japanese | Model Size |
|---|---|---|---|---|---|
| PP-OCRv5_server_rec | 86.38% | 64.70% | 93.29% | 60.35% | 81 MB |
| PP-OCRv5_mobile_rec | 81.29% | 66.00% | 83.55% | 54.65% | 16 MB |
Use Cases:
Configuration:
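A minimal sketch of using the default multilingual recognizer and handling its results. It assumes the 3.x result objects expose print and save_to_json, as in the pipeline usage docs cited below. The function name recognize and the output directory are placeholders.

```python
# The default multilingual recognizer named in the table above.
DEFAULT_MULTILINGUAL = {"text_recognition_model_name": "PP-OCRv5_server_rec"}


def recognize(image_path, save_dir="output"):
    """Run OCR on one image and save each result as JSON (needs paddleocr)."""
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(**DEFAULT_MULTILINGUAL)
    results = ocr.predict(image_path)
    for res in results:
        res.print()                 # log the recognized text and boxes
        res.save_to_json(save_dir)  # one JSON file per input image
    return results
```

No model switch is needed when a page mixes the supported text types, since the same recognizer is used throughout.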
Sources: docs/version3.x/pipeline_usage/OCR.md246-285 docs/version3.x/pipeline_usage/OCR.en.md245-284
For languages beyond the five types in PP-OCRv5, PaddleOCR provides specialized recognition models optimized for specific language families:
PP-OCRv5 Series (Latest):
| Language/Script | Model Name | Accuracy | Languages Covered |
|---|---|---|---|
| Korean | korean_PP-OCRv5_mobile_rec | 88.0% | Korean, English, Numbers |
| Latin | latin_PP-OCRv5_mobile_rec | 84.7% | Most Latin-based languages |
| Eastern Slavic | eslav_PP-OCRv5_mobile_rec | 81.6% | Russian, Ukrainian, Belarusian |
| Thai | th_PP-OCRv5_mobile_rec | 82.68% | Thai, English, Numbers |
| Greek | el_PP-OCRv5_mobile_rec | 89.28% | Greek, English, Numbers |
| Arabic | arabic_PP-OCRv5_mobile_rec | 81.27% | Arabic script languages |
| Cyrillic | cyrillic_PP-OCRv5_mobile_rec | 80.27% | All Cyrillic-based languages |
| Devanagari | devanagari_PP-OCRv5_mobile_rec | 84.96% | Hindi, Sanskrit, etc. |
| Telugu | te_PP-OCRv5_mobile_rec | 87.65% | Telugu, Numbers |
| Tamil | ta_PP-OCRv5_mobile_rec | 94.2% | Tamil, Numbers |
| English | en_PP-OCRv5_mobile_rec | 85.25% | English (improved accuracy) |
PP-OCRv3 Series (Legacy):
Additional languages available through PP-OCRv3 models include Japanese (japan_PP-OCRv3_mobile_rec), Kannada (ka_PP-OCRv3_mobile_rec), and others with model sizes around 8-10 MB.
Sources: docs/version3.x/pipeline_usage/OCR.md422-632 docs/version3.x/pipeline_usage/OCR.en.md422-631
Configuration Example:
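The PP-OCRv5 table above can be turned into a lookup from registry prefix to model name. In this sketch the dictionary keys are simply the prefixes of the listed model names. Whether each prefix is also accepted as a lang value is not confirmed here, so the example sets text_recognition_model_name explicitly and leaves detection at its default.

```python
# Prefix -> registry name, copied from the PP-OCRv5 language table.
LANGUAGE_MODELS = {
    "korean": "korean_PP-OCRv5_mobile_rec",
    "latin": "latin_PP-OCRv5_mobile_rec",
    "eslav": "eslav_PP-OCRv5_mobile_rec",
    "th": "th_PP-OCRv5_mobile_rec",
    "el": "el_PP-OCRv5_mobile_rec",
    "arabic": "arabic_PP-OCRv5_mobile_rec",
    "cyrillic": "cyrillic_PP-OCRv5_mobile_rec",
    "devanagari": "devanagari_PP-OCRv5_mobile_rec",
    "te": "te_PP-OCRv5_mobile_rec",
    "ta": "ta_PP-OCRv5_mobile_rec",
    "en": "en_PP-OCRv5_mobile_rec",
}


def language_config(prefix):
    """Return kwargs selecting the language-specific recognizer (hypothetical helper)."""
    try:
        return {"text_recognition_model_name": LANGUAGE_MODELS[prefix]}
    except KeyError:
        raise ValueError(f"no PP-OCRv5 language model listed for {prefix!r}") from None


def build_language_ocr(prefix):
    """Create the pipeline for one language; requires paddleocr at call time."""
    from paddleocr import PaddleOCR

    return PaddleOCR(**language_config(prefix))
```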
Sources: docs/version3.x/pipeline_usage/OCR.md432-531 docs/version3.x/pipeline_usage/OCR.en.md388-531
For comprehensive multilingual support beyond PP-OCRv5's capabilities, PaddleOCR-VL provides a unified 0.9B parameter vision-language model supporting 111 languages:
Key Characteristics:
When to Use PaddleOCR-VL:
Configuration:
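A hedged sketch of invoking the vision-language pipeline. It assumes the package exports a PaddleOCRVL class with a predict method, and result objects with save_to_json and save_to_markdown. The wrapper name parse_with_vl and the output directory are placeholders.

```python
def parse_with_vl(image_path, save_dir="output"):
    """Parse one document with PaddleOCR-VL and save the results (needs paddleocr)."""
    from paddleocr import PaddleOCRVL

    pipeline = PaddleOCRVL()
    results = pipeline.predict(image_path)
    for res in results:
        res.save_to_json(save_path=save_dir)      # structured output
        res.save_to_markdown(save_path=save_dir)  # readable document output
    return results
```

Because the model has 0.9B parameters, this path costs far more memory and compute than the 7-16 MB language-specific recognizers.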
Sources: README.md61-66 README.md89-98 docs/index.en.md27-29
The PaddleOCR class paddleocr/__init__.py exposes model selection through these parameters:
Title: PaddleOCR Configuration Parameters for Model Selection
Configuration Parameter Details:
| Parameter | Type | Valid Values | Default | Resolution Logic |
|---|---|---|---|---|
| ocr_version | str | "PP-OCRv5", "PP-OCRv4", "PP-OCRv3" | "PP-OCRv5" | Sets version prefix for model names |
| text_detection_model_name | str | Model registry name (e.g., "PP-OCRv5_server_det") | "PP-OCRv5_server_det" | Direct model name lookup |
| text_recognition_model_name | str | Model registry name (e.g., "korean_PP-OCRv5_mobile_rec") | "PP-OCRv5_server_rec" | Direct model name lookup |
| lang | str | Language code (e.g., "korean", "arabic") | "ch" | Maps to language-specific model if available |
Automatic Model Selection Logic:
When lang is specified without explicit text_recognition_model_name, the system performs automatic model resolution:
Manual Model Selection (highest priority):
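The precedence described here (an explicit model name first, then lang combined with ocr_version, then the version default) can be illustrated as follows. This is not the library's actual resolution code. It is a sketch built only from the parameter table above, with a hypothetical function name and a deliberately small set of known models.

```python
# Per-version fallbacks, taken from the registry names listed in this document.
VERSION_DEFAULTS = {
    "PP-OCRv5": "PP-OCRv5_server_rec",
    "PP-OCRv4": "PP-OCRv4_server_rec",
    "PP-OCRv3": "PP-OCRv3_mobile_rec",
}
# A small sample of language-specific models, for illustration only.
KNOWN_LANGUAGE_MODELS = frozenset({
    "korean_PP-OCRv5_mobile_rec",
    "arabic_PP-OCRv5_mobile_rec",
    "latin_PP-OCRv5_mobile_rec",
})


def resolve_recognition_model(text_recognition_model_name=None, lang="ch",
                              ocr_version="PP-OCRv5"):
    """Illustrate the precedence: explicit name > lang mapping > version default."""
    if text_recognition_model_name:
        # Manual selection has the highest priority.
        return text_recognition_model_name
    candidate = f"{lang}_{ocr_version}_mobile_rec"
    if candidate in KNOWN_LANGUAGE_MODELS:
        # A language-specific model is available for this version.
        return candidate
    return VERSION_DEFAULTS[ocr_version]
```

With these assumptions, passing lang="korean" alone resolves to korean_PP-OCRv5_mobile_rec, while also passing an explicit name overrides it.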
Model Download and Caching:
Models are downloaded on first use to:
- ~/.paddleocr/ or PADDLE_PDX_MODEL_SOURCE environment variable path
- .pdmodel (model structure) + .pdiparams (weights) + config files
- https://paddle-model-ecology.bj.bcebos.com/paddlex/ on first invocation
Sources: docs/version3.x/pipeline_usage/OCR.md746-1048 docs/version3.x/pipeline_usage/OCR.en.md766-1048 pyproject.toml41-45
The paddleocr command-line interface paddleocr/__main__.py exposes model selection through these flags:
Command: paddleocr ocr [OPTIONS]
Model Selection Flags:
| CLI Flag | Type | Example Value | Equivalent Python Parameter |
|---|---|---|---|
| --ocr_version | str | PP-OCRv5, PP-OCRv4, PP-OCRv3 | ocr_version |
| --text_detection_model_name | str | PP-OCRv5_mobile_det | text_detection_model_name |
| --text_recognition_model_name | str | korean_PP-OCRv5_mobile_rec | text_recognition_model_name |
| --lang | str | korean, arabic, ch | lang |
CLI Usage Examples:
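To keep the examples in one language, the command lines are assembled in Python and can be handed to subprocess. The flags come from the table above. The -i input flag and the file name doc.png are assumptions, and build_cli and run_cli are hypothetical helpers.

```python
import shlex
import subprocess


def build_cli(image, **flags):
    """Build the argument list for `paddleocr ocr` from keyword flags."""
    cmd = ["paddleocr", "ocr", "-i", image]
    for name, value in flags.items():
        # Each keyword becomes a `--name value` pair.
        cmd += [f"--{name}", str(value)]
    return cmd


def run_cli(image, **flags):
    """Execute the command; requires the paddleocr CLI on PATH."""
    return subprocess.run(build_cli(image, **flags), check=True)


# Print the equivalent shell commands without running them.
print(shlex.join(build_cli("doc.png", lang="korean")))
print(shlex.join(build_cli(
    "doc.png",
    text_detection_model_name="PP-OCRv5_mobile_det",
    text_recognition_model_name="PP-OCRv5_mobile_rec",
    save_path="./output",
)))
```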
Output Format (saved to --save_path if specified):
- .txt files
Sources: docs/version3.x/pipeline_usage/OCR.md761-810 docs/version3.x/pipeline_usage/OCR.en.md766-825 paddleocr/__main__.py
Decision Guidelines:
- PP-OCRv5_server_det + PP-OCRv5_server_rec
Sources: docs/version3.x/pipeline_usage/OCR.md701 README.md67-75
Detection models across versions (server variants):
| Version | Model | Hmean (%) | GPU Time (ms) | Size (MB) |
|---|---|---|---|---|
| v5 | PP-OCRv5_server_det | 83.8 | 89.55 | 84.3 |
| v4 | PP-OCRv4_server_det | 69.2 | 127.82 | 109 |
| v3 | N/A (mobile only) | - | - | - |
Recognition models across versions (mobile variants):
| Version | Model | Avg Acc (%) | GPU Time (ms) | Size (MB) |
|---|---|---|---|---|
| v5 | PP-OCRv5_mobile_rec | 81.29 | 5.43 | 16 |
| v4 | PP-OCRv4_mobile_rec | 78.74 | 5.26 | 10.5 |
| v3 | PP-OCRv3_mobile_rec | 72.96 | 3.89 | 10.3 |
Key Improvements in PP-OCRv5:
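The accuracy gains can be read directly off the two comparison tables. The short calculation below uses only the figures listed there, and the helper name gain is illustrative.

```python
# Values copied from the cross-version comparison tables above.
DET_SERVER_HMEAN = {"v5": 83.8, "v4": 69.2}
REC_MOBILE_ACC = {"v5": 81.29, "v4": 78.74, "v3": 72.96}


def gain(table, newer, older):
    """Difference in percentage points between two versions, rounded to 2 places."""
    return round(table[newer] - table[older], 2)


print("server det, v5 vs v4:", gain(DET_SERVER_HMEAN, "v5", "v4"))  # 14.6
print("mobile rec, v5 vs v4:", gain(REC_MOBILE_ACC, "v5", "v4"))    # 2.55
print("mobile rec, v5 vs v3:", gain(REC_MOBILE_ACC, "v5", "v3"))    # 8.33
```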
Sources: docs/version3.x/pipeline_usage/OCR.md132-169 docs/version3.x/pipeline_usage/OCR.md287-334
Performance of language-specific PP-OCRv5 models (all mobile variants, 5.43ms GPU inference):
| Language | Model | Accuracy (%) | Model Size (MB) |
|---|---|---|---|
| Tamil | ta_PP-OCRv5_mobile_rec | 94.2 | 7.5 |
| Greek | el_PP-OCRv5_mobile_rec | 89.28 | 7.5 |
| Korean | korean_PP-OCRv5_mobile_rec | 88.0 | 14 |
| Telugu | te_PP-OCRv5_mobile_rec | 87.65 | 7.5 |
| English | en_PP-OCRv5_mobile_rec | 85.25 | 7.5 |
| Devanagari | devanagari_PP-OCRv5_mobile_rec | 84.96 | 7.5 |
| Latin | latin_PP-OCRv5_mobile_rec | 84.7 | 14 |
| Thai | th_PP-OCRv5_mobile_rec | 82.68 | 7.5 |
| Eastern Slavic | eslav_PP-OCRv5_mobile_rec | 81.6 | 14 |
| Arabic | arabic_PP-OCRv5_mobile_rec | 81.27 | 7.6 |
| Cyrillic | cyrillic_PP-OCRv5_mobile_rec | 80.27 | 7.7 |
All models maintain efficient inference speeds (5.43ms GPU, 21.20ms CPU) with compact sizes (7-14 MB).
Sources: docs/version3.x/pipeline_usage/OCR.md432-531
Title: Model Selection Decision Tree with Registry Names
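One way to express such a decision procedure in code is sketched below. The branching is an assumption assembled from the tables in this document (default multilingual models, language-specific mobile recognizers, PaddleOCR-VL for broad coverage) and is not taken from the original diagram. select_models and its arguments are hypothetical.

```python
# Prefixes of the language-specific PP-OCRv5 recognizers listed earlier.
LANGUAGE_PREFIXES = {
    "korean", "latin", "eslav", "th", "el", "arabic",
    "cyrillic", "devanagari", "te", "ta", "en",
}


def select_models(target="server", language_prefix=None, many_languages=False):
    """Pick a pipeline and registry names from three simple questions."""
    if many_languages:
        # Broad multilingual coverage is delegated to the vision-language model.
        return {"pipeline": "PaddleOCR-VL"}
    if target not in ("server", "mobile"):
        raise ValueError(f"unknown deployment target: {target!r}")
    if language_prefix is None:
        rec = f"PP-OCRv5_{target}_rec"
    elif language_prefix in LANGUAGE_PREFIXES:
        # Language-specific PP-OCRv5 recognizers are listed as mobile variants only.
        rec = f"{language_prefix}_PP-OCRv5_mobile_rec"
    else:
        raise ValueError(f"no language-specific model for {language_prefix!r}")
    return {
        "pipeline": "OCR",
        "text_detection_model_name": f"PP-OCRv5_{target}_det",
        "text_recognition_model_name": rec,
    }
```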
Model Selection Checklist:
- _server_ vs. _mobile_
Sources: docs/version3.x/pipeline_usage/OCR.md701-1048 docs/version3.x/pipeline_usage/OCR.en.md766-1048
Pattern 1: Default Server Deployment (Chinese-English Mixed Documents)
Pattern 2: Mobile Deployment (Korean Text, Resource-Constrained)
Pattern 3: High-Accuracy Server with TensorRT Acceleration
Pattern 4: Document-Optimized Recognition (15K+ Characters)
Pattern 5: Multi-Language Global Support (111 Languages)
Pattern 6: Latin Languages (French, Spanish, Portuguese, etc.)
Pattern 7: CLI Batch Processing with Language Auto-Selection
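The seven patterns above can be summarized as sets of keyword arguments. This is a sketch rather than the original examples: the device, use_tensorrt, and precision settings in Pattern 3, the pairing of detection and recognition models, and the ./images and ./output paths in Pattern 7 are assumptions. build_pipeline is a hypothetical helper.

```python
SERVER_PAIR = {
    "text_detection_model_name": "PP-OCRv5_server_det",
    "text_recognition_model_name": "PP-OCRv5_server_rec",
}

PATTERNS = {
    1: {"entry": "PaddleOCR", "kwargs": dict(SERVER_PAIR)},
    2: {"entry": "PaddleOCR", "kwargs": {
        "text_detection_model_name": "PP-OCRv5_mobile_det",
        "text_recognition_model_name": "korean_PP-OCRv5_mobile_rec",
    }},
    # Assumed acceleration flags; check them against the installed version.
    3: {"entry": "PaddleOCR", "kwargs": {
        **SERVER_PAIR, "device": "gpu", "use_tensorrt": True, "precision": "fp16",
    }},
    4: {"entry": "PaddleOCR", "kwargs": {
        "text_detection_model_name": "PP-OCRv4_server_det",
        "text_recognition_model_name": "PP-OCRv4_server_rec_doc",
    }},
    5: {"entry": "PaddleOCRVL", "kwargs": {}},
    6: {"entry": "PaddleOCR", "kwargs": {
        "text_recognition_model_name": "latin_PP-OCRv5_mobile_rec",
    }},
    # Pattern 7 is a command line, not a Python object.
    7: {"entry": "cli", "argv": [
        "paddleocr", "ocr", "-i", "./images", "--lang", "korean",
        "--save_path", "./output",
    ]},
}


def build_pipeline(pattern_id):
    """Instantiate a Python-API pattern; requires paddleocr at call time."""
    spec = PATTERNS[pattern_id]
    if spec["entry"] == "cli":
        raise ValueError("this pattern is a CLI invocation; run spec['argv'] instead")
    import paddleocr

    return getattr(paddleocr, spec["entry"])(**spec["kwargs"])
```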
Sources: docs/version3.x/pipeline_usage/OCR.md746-1048 docs/version3.x/pipeline_usage/OCR.en.md766-1048 docs/quick_start.en.md66-94
For Maximum Accuracy:
- PP-OCRv5_server_det + PP-OCRv5_server_rec
- enable_hpi=True
For Maximum Speed:
- PP-OCRv5_mobile_det + PP-OCRv5_mobile_rec
- use_doc_orientation_classify=False, use_doc_unwarping=False
- limit_side_len parameter for faster processing
For Minimal Memory Footprint:
For Mixed-Language Documents:
- PP-OCRv5_server_rec (supports 5 text types)
Sources: docs/version3.x/pipeline_usage/OCR.md707 README.md191-199
API Parameter Changes:
| PaddleOCR 2.x Parameter | PaddleOCR 3.x Parameter | Notes |
|---|---|---|
| det_model_dir | text_detection_model_name | Now uses model registry names instead of paths |
| rec_model_dir | text_recognition_model_name | Automatic download from model hub |
| cls_model_dir | textline_orientation_model_name | For text line orientation |
| use_angle_cls | use_textline_orientation | Renamed for clarity |
| lang | lang | Unchanged, but now supports 20+ languages |
Migration Examples:
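A sketch of translating a 2.x keyword dictionary to 3.x names. Because 2.x values for det_model_dir and similar parameters are directory paths, this example maps them to the *_model_dir parameters rather than to registry names. Only text_detection_model_dir is named in this document, so the other two *_model_dir names are assumed by analogy. migrate_kwargs is a hypothetical helper.

```python
# 2.x name -> 3.x name. Path-valued parameters keep pointing at directories.
RENAMED = {
    "use_angle_cls": "use_textline_orientation",
    "det_model_dir": "text_detection_model_dir",
    "rec_model_dir": "text_recognition_model_dir",         # assumed by analogy
    "cls_model_dir": "textline_orientation_model_dir",     # assumed by analogy
}


def migrate_kwargs(old_kwargs):
    """Rename 2.x keyword arguments to their 3.x counterparts; others pass through."""
    return {RENAMED.get(name, name): value for name, value in old_kwargs.items()}


old = {"use_angle_cls": True, "lang": "en", "det_model_dir": "./models/det"}
print(migrate_kwargs(old))
```

To adopt downloadable models instead of local directories, the path entry would be replaced by a registry name passed as text_detection_model_name, as in the table above.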
Key Migration Benefits:
- PP-OCRv5_server_rec handles Chinese, English, Traditional Chinese, Japanese
PaddleX Dependency (Transparent to Users):
PaddleOCR 3.x uses PaddleX for underlying inference pyproject.toml42-45:
- paddlex[ocr-core]>=3.4.0,<3.5.0
- paddlex[doc-parser], paddlex[ie], paddlex[trans]
Backward Compatibility Notes:
- Old parameter names (det_model_dir, etc.) are deprecated but may still work with warnings
- Local model files (*.pdmodel, *.pdiparams) can still be loaded via the text_detection_model_dir parameter (not text_detection_model_name)
Sources: docs/version3.x/paddleocr_and_paddlex.md1-50 docs/version3.x/paddleocr_and_paddlex.en.md1-50 pyproject.toml41-46
Sources for entire document: README.md1-246 docs/version3.x/pipeline_usage/OCR.md1-810 docs/version3.x/pipeline_usage/OCR.en.md1-810 docs/version3.x/pipeline_usage/PP-StructureV3.md1-320 docs/version3.x/pipeline_usage/PP-StructureV3.en.md1-320 docs/index.en.md1-92 docs/quick_start.en.md1-140