Model Selection and Language Support

1. Purpose and Scope

This document guides users through selecting appropriate OCR models based on three criteria: language requirements, OCR version (PP-OCRv5, PP-OCRv4, PP-OCRv3), and deployment target (server vs. mobile). Model selection involves trade-offs between accuracy, inference speed, model size, and language coverage.

Key Selection Dimensions:

Language: Single-model multilingual (PP-OCRv5), language-specific models (20+ languages), or a universal vision-language model (PaddleOCR-VL, 111 languages)
Version: PP-OCRv5 (default, 13% accuracy improvement), PP-OCRv4 (15K+ character support), PP-OCRv3 (legacy)
Scale: Server models (higher accuracy, 80-180 MB) vs. mobile models (efficiency-optimized, 4-16 MB)

Model selection directly affects pipeline configuration through the text_detection_model_name, text_recognition_model_name, ocr_version, and lang parameters of the PaddleOCR class (paddleocr/__init__.py) and the matching flags of the paddleocr ocr CLI command (paddleocr/__main__.py).

Sources: docs/version3.x/pipeline_usage/OCR.md1-241 docs/version3.x/pipeline_usage/OCR.en.md1-240 pyproject.toml6-61

2. Model Version Selection

2.1 Version Overview and Model Registry Names

PaddleOCR provides three major versions with distinct model registry names used in text_detection_model_name and text_recognition_model_name parameters:

Title: PP-OCR Version Evolution and Model Names

Version Selection Criteria:

Version	When to Use	Model Registry Names
PP-OCRv5	Default choice; multilingual support (Chinese, English, Traditional Chinese, Japanese); highest accuracy	`PP-OCRv5_server_det`, `PP-OCRv5_mobile_det`, `PP-OCRv5_server_rec`, `PP-OCRv5_mobile_rec`
PP-OCRv4	Need 15K+ character dictionary (`_doc` variant); existing deployment compatibility	`PP-OCRv4_server_det`, `PP-OCRv4_mobile_det`, `PP-OCRv4_server_rec`, `PP-OCRv4_server_rec_doc`, `PP-OCRv4_mobile_rec`
PP-OCRv3	Legacy system integration only	`PP-OCRv3_mobile_rec`

Configuration Examples:
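
A minimal sketch of pinning a version explicitly, assuming PaddleOCR 3.x is installed at the point where the pipeline is actually built. `MODELS_BY_VERSION` and `version_kwargs` are hypothetical names introduced for illustration; the registry names are copied from the table above. PP-OCRv3 is left out because only a recognition model is listed for it.

```python
# Registry names copied from the version table above.
MODELS_BY_VERSION = {
    "PP-OCRv5": {
        "det": {"server": "PP-OCRv5_server_det", "mobile": "PP-OCRv5_mobile_det"},
        "rec": {"server": "PP-OCRv5_server_rec", "mobile": "PP-OCRv5_mobile_rec"},
    },
    "PP-OCRv4": {
        "det": {"server": "PP-OCRv4_server_det", "mobile": "PP-OCRv4_mobile_det"},
        "rec": {"server": "PP-OCRv4_server_rec", "mobile": "PP-OCRv4_mobile_rec"},
    },
}


def version_kwargs(version: str, scale: str = "server") -> dict:
    """Hypothetical helper: keyword arguments for PaddleOCR(...)."""
    models = MODELS_BY_VERSION[version]
    return {
        "text_detection_model_name": models["det"][scale],
        "text_recognition_model_name": models["rec"][scale],
    }


if __name__ == "__main__":
    kwargs = version_kwargs("PP-OCRv4", scale="mobile")
    print(kwargs)
    # With PaddleOCR 3.x installed, the pipeline would then be created as:
    #   from paddleocr import PaddleOCR
    #   ocr = PaddleOCR(**kwargs)
```

Passing both names explicitly keeps the detection and recognition models on the same version, which is easier to audit than relying on defaults.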

Sources: docs/version3.x/pipeline_usage/OCR.md11-240 docs/version3.x/pipeline_usage/OCR.en.md11-239 docs/version3.x/pipeline_usage/OCR.md132-169

2.2 Version Selection Decision Criteria

Sources: README.md67-70 docs/version3.x/pipeline_usage/OCR.md183-240 docs/version3.x/pipeline_usage/OCR.en.md183-239

3. Model Scale Selection: Server vs. Mobile

3.1 Scale Variant Characteristics

PaddleOCR provides two scale variants for most model versions:

Characteristic	Server Models	Mobile Models
Primary Goal	Maximum accuracy	Deployment efficiency
Model Size	80-180 MB (recognition)	10-16 MB (recognition)
Inference Speed (GPU)	8-9 ms	5-6 ms
Inference Speed (CPU)	30-40 ms	17-21 ms
Accuracy	Higher (86%+)	Good (81%+)
Target Deployment	Servers, cloud	Edge devices, mobile
Memory Requirements	Higher	Lower

Detection Model Comparison:

Model	Hmean (%)	Model Size (MB)	GPU Time (ms)	CPU Time (ms)
`PP-OCRv5_server_det`	83.8	84.3	89.55 / 70.19	383.15 / 383.15
`PP-OCRv5_mobile_det`	79.0	4.7	10.67 / 6.36	57.77 / 28.15

Recognition Model Comparison:

Model	Avg Acc (%)	Model Size (MB)	GPU Time (ms)	CPU Time (ms)
`PP-OCRv5_server_rec`	86.38	81	8.46 / 2.36	31.21 / 31.21
`PP-OCRv5_mobile_rec`	81.29	16	5.43 / 1.46	21.20 / 5.32

Sources: docs/version3.x/pipeline_usage/OCR.md118-169 docs/version3.x/pipeline_usage/OCR.en.md118-168

3.2 Scale Selection Strategy

Configuration Examples:

Server deployment (maximum accuracy):

Mobile deployment (efficiency optimized):
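
Both deployments named above can be sketched as two keyword-argument sets. This is an illustration under assumptions, not the project's own example: `SERVER_CONFIG`, `MOBILE_CONFIG`, and `run_ocr` are hypothetical names, and the import is deferred so the module loads even where PaddleOCR is not installed.

```python
# Server deployment: larger models, higher accuracy.
SERVER_CONFIG = {
    "text_detection_model_name": "PP-OCRv5_server_det",
    "text_recognition_model_name": "PP-OCRv5_server_rec",
}

# Mobile deployment: smaller models, lower latency.
MOBILE_CONFIG = {
    "text_detection_model_name": "PP-OCRv5_mobile_det",
    "text_recognition_model_name": "PP-OCRv5_mobile_rec",
}


def run_ocr(image_path: str, config: dict) -> None:
    """Build the pipeline for one configuration and print its results."""
    from paddleocr import PaddleOCR  # requires PaddleOCR 3.x

    ocr = PaddleOCR(**config)
    for res in ocr.predict(image_path):
        res.print()


# Example call (hypothetical image path):
#   run_ocr("sample.png", MOBILE_CONFIG)
```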

Sources: docs/version3.x/pipeline_usage/OCR.md132-200 docs/quick_start.en.md68-80

4. Language Support

4.1 Language Coverage and Model Registry Architecture

PaddleOCR provides three-tiered language support with distinct model naming conventions:

Title: Language Support Architecture and Model Registry Names

Model Naming Convention:

Pattern: [{lang}_]PP-OCRv{version}_{scale}_rec (the language prefix is omitted for the default multilingual models)
Examples:
- PP-OCRv5_server_rec (default multilingual)
- korean_PP-OCRv5_mobile_rec (Korean-specific)
- en_PP-OCRv4_mobile_rec (English-specific PP-OCRv4)
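
The naming convention can be checked mechanically. The sketch below builds and parses recognition model names; `RecModelName`, `parse_rec_name`, and `build_rec_name` are hypothetical helpers, and the regular expression is inferred only from the three examples above, so names with extra suffixes (such as `PP-OCRv4_server_rec_doc`) are deliberately rejected.

```python
import re
from typing import NamedTuple, Optional


class RecModelName(NamedTuple):
    lang: Optional[str]  # None for the default multilingual models
    version: int
    scale: str


# Optional language prefix, then version, scale, and the "_rec" suffix.
_PATTERN = re.compile(
    r"^(?:(?P<lang>[a-z]+)_)?PP-OCRv(?P<version>\d+)_(?P<scale>server|mobile)_rec$"
)


def parse_rec_name(name: str) -> RecModelName:
    """Split a recognition model registry name into its parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a plain recognition model name: {name!r}")
    return RecModelName(match["lang"], int(match["version"]), match["scale"])


def build_rec_name(version: int, scale: str, lang: Optional[str] = None) -> str:
    """Assemble a registry name from its parts."""
    prefix = f"{lang}_" if lang else ""
    return f"{prefix}PP-OCRv{version}_{scale}_rec"


if __name__ == "__main__":
    print(parse_rec_name("korean_PP-OCRv5_mobile_rec"))
    print(build_rec_name(5, "server"))
```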

Language Selection Strategy:

Language(s)	Recommended Model	Registry Name	Size
Chinese + English	PP-OCRv5 default	`PP-OCRv5_server_rec` or `PP-OCRv5_mobile_rec`	81MB / 16MB
Korean	Korean-specific	`korean_PP-OCRv5_mobile_rec`	14MB
Latin languages (French, Spanish, Portuguese, etc.)	Latin model	`latin_PP-OCRv5_mobile_rec`	14MB
Arabic script	Arabic model	`arabic_PP-OCRv5_mobile_rec`	7.6MB
Thai	Thai model	`th_PP-OCRv5_mobile_rec`	7.5MB
Multiple diverse languages	Vision-language model	PaddleOCR-VL pipeline	0.9B parameters

Sources: docs/version3.x/pipeline_usage/OCR.md242-632 docs/version3.x/pipeline_usage/OCR.en.md241-631 docs/version3.x/pipeline_usage/OCR.md432-531

4.2 PP-OCRv5 Multi-Type Single Model

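The default PP-OCRv5 recognition models (`PP-OCRv5_server_rec` and `PP-OCRv5_mobile_rec`) support five text types (Simplified Chinese, Traditional Chinese, English, Japanese, and Pinyin) within a single model, eliminating the need to switch models for mixed-language documents. The table reports accuracy for four of them: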

Model	Chinese	English	Traditional Chinese	Japanese	Model Size
`PP-OCRv5_server_rec`	86.38%	64.70%	93.29%	60.35%	81 MB
`PP-OCRv5_mobile_rec`	81.29%	66.00%	83.55%	54.65%	16 MB

Use Cases:

Documents with mixed Chinese-English text
Japanese documents containing Chinese characters
Pinyin annotations in educational materials
Traditional Chinese documents with modern annotations

Configuration:
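
A hedged sketch of the single-model setup: one pipeline instance is built once and reused for every image, whatever mix of supported text types it contains. `DEFAULT_TEXT_TYPES`, `covered_by_default`, and `recognize_all` are hypothetical names; the five labels are an assumption based on the table columns plus the Pinyin use case listed above.

```python
from typing import Iterable

# Assumed labels for the text types handled by the default recognition model.
DEFAULT_TEXT_TYPES = frozenset(
    {"simplified_chinese", "traditional_chinese", "english", "japanese", "pinyin"}
)

MULTI_TYPE_CONFIG = {"text_recognition_model_name": "PP-OCRv5_server_rec"}


def covered_by_default(text_types: Iterable[str]) -> bool:
    """True when every requested text type is handled by the default model."""
    return set(text_types) <= DEFAULT_TEXT_TYPES


def recognize_all(image_paths: Iterable[str], config: dict = MULTI_TYPE_CONFIG) -> int:
    """Run one shared pipeline over all images; returns the number of results."""
    from paddleocr import PaddleOCR  # requires PaddleOCR 3.x

    ocr = PaddleOCR(**config)  # built once, no per-language switching
    count = 0
    for path in image_paths:
        for res in ocr.predict(path):
            res.print()
            count += 1
    return count
```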

Sources: docs/version3.x/pipeline_usage/OCR.md246-285 docs/version3.x/pipeline_usage/OCR.en.md245-284

4.3 Language-Specific Recognition Models

For languages beyond the five types in PP-OCRv5, PaddleOCR provides specialized recognition models optimized for specific language families:

4.3.1 Available Language Models

PP-OCRv5 Series (Latest):

Language/Script	Model Name	Accuracy	Languages Covered
Korean	`korean_PP-OCRv5_mobile_rec`	88.0%	Korean, English, Numbers
Latin	`latin_PP-OCRv5_mobile_rec`	84.7%	Most Latin-based languages
Eastern Slavic	`eslav_PP-OCRv5_mobile_rec`	81.6%	Russian, Ukrainian, Belarusian
Thai	`th_PP-OCRv5_mobile_rec`	82.68%	Thai, English, Numbers
Greek	`el_PP-OCRv5_mobile_rec`	89.28%	Greek, English, Numbers
Arabic	`arabic_PP-OCRv5_mobile_rec`	81.27%	Arabic script languages
Cyrillic	`cyrillic_PP-OCRv5_mobile_rec`	80.27%	All Cyrillic-based languages
Devanagari	`devanagari_PP-OCRv5_mobile_rec`	84.96%	Hindi, Sanskrit, etc.
Telugu	`te_PP-OCRv5_mobile_rec`	87.65%	Telugu, Numbers
Tamil	`ta_PP-OCRv5_mobile_rec`	94.2%	Tamil, Numbers
English	`en_PP-OCRv5_mobile_rec`	85.25%	English (improved accuracy)

PP-OCRv3 Series (Legacy):

Additional languages available through PP-OCRv3 models include Japanese (japan_PP-OCRv3_mobile_rec), Kannada (ka_PP-OCRv3_mobile_rec), and others with model sizes around 8-10 MB.

Sources: docs/version3.x/pipeline_usage/OCR.md422-632 docs/version3.x/pipeline_usage/OCR.en.md422-631

4.3.2 Language Model Selection

Configuration Example:
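
One way to select a language-specific model is a plain lookup keyed by script, as sketched below. `LANGUAGE_MODELS` and `language_kwargs` are hypothetical names; the registry names are copied from the PP-OCRv5 table in section 4.3.1, and the keys are labels chosen for this example rather than official language codes.

```python
# Script label -> recognition model registry name (from the PP-OCRv5 table).
LANGUAGE_MODELS = {
    "korean": "korean_PP-OCRv5_mobile_rec",
    "latin": "latin_PP-OCRv5_mobile_rec",
    "eastern_slavic": "eslav_PP-OCRv5_mobile_rec",
    "thai": "th_PP-OCRv5_mobile_rec",
    "greek": "el_PP-OCRv5_mobile_rec",
    "arabic": "arabic_PP-OCRv5_mobile_rec",
    "cyrillic": "cyrillic_PP-OCRv5_mobile_rec",
    "devanagari": "devanagari_PP-OCRv5_mobile_rec",
    "telugu": "te_PP-OCRv5_mobile_rec",
    "tamil": "ta_PP-OCRv5_mobile_rec",
    "english": "en_PP-OCRv5_mobile_rec",
}


def language_kwargs(script: str) -> dict:
    """Keyword arguments that pin the recognition model for one script."""
    try:
        name = LANGUAGE_MODELS[script]
    except KeyError:
        raise ValueError(f"no language-specific model listed for {script!r}") from None
    return {"text_recognition_model_name": name}


if __name__ == "__main__":
    print(language_kwargs("greek"))
    # With PaddleOCR 3.x installed:
    #   from paddleocr import PaddleOCR
    #   ocr = PaddleOCR(**language_kwargs("greek"))
```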

Sources: docs/version3.x/pipeline_usage/OCR.md432-531 docs/version3.x/pipeline_usage/OCR.en.md388-531

4.4 PaddleOCR-VL: Universal Language Support

For comprehensive multilingual support beyond PP-OCRv5's capabilities, PaddleOCR-VL provides a unified 0.9B parameter vision-language model supporting 111 languages:

Key Characteristics:

Single model for 111 languages including rare languages (Tibetan, Bengali)
Unified architecture: NaViT visual encoder + ERNIE-4.5-0.3B LLM
Supports complex elements: text, tables, formulas, charts
Document parsing optimized for real-world scenarios

When to Use PaddleOCR-VL:

Documents with rare or unsupported languages
Complex document parsing requirements
Need for integrated layout and content understanding
Willingness to use a much larger model (0.9B parameters, compared with 16-81 MB model files for PP-OCRv5 recognition)

Configuration:
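
A minimal sketch, assuming the installed PaddleOCR release exports a `PaddleOCRVL` pipeline class with the same `predict` and result-saving methods as the other pipelines. `should_use_vl` and `parse_document` are hypothetical helpers, and the output directory is a placeholder.

```python
from typing import Iterable


def should_use_vl(
    required_scripts: Iterable[str],
    available_scripts: Iterable[str],
    needs_layout: bool = False,
) -> bool:
    """Prefer PaddleOCR-VL when a script is uncovered or layout parsing is needed."""
    missing = set(required_scripts) - set(available_scripts)
    return needs_layout or bool(missing)


def parse_document(path: str, output_dir: str = "output") -> None:
    """Parse one document with PaddleOCR-VL and save JSON and Markdown results."""
    from paddleocr import PaddleOCRVL  # assumption: available in this release

    pipeline = PaddleOCRVL()
    for res in pipeline.predict(path):
        res.print()
        res.save_to_json(save_path=output_dir)
        res.save_to_markdown(save_path=output_dir)
```

The helper only encodes the first three criteria in the list above; the model-size trade-off still has to be judged per deployment.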

Sources: README.md61-66 README.md89-98 docs/index.en.md27-29

5. Model Configuration and Selection API

5.1 Configuration Parameters and Model Registry

The PaddleOCR class (paddleocr/__init__.py) exposes model selection through these parameters:

Title: PaddleOCR Configuration Parameters for Model Selection

Configuration Parameter Details:

Parameter	Type	Valid Values	Default	Resolution Logic
`ocr_version`	str	`"PP-OCRv5"`, `"PP-OCRv4"`, `"PP-OCRv3"`	`"PP-OCRv5"`	Sets version prefix for model names
`text_detection_model_name`	str	Model registry name (e.g., `"PP-OCRv5_server_det"`)	`"PP-OCRv5_server_det"`	Direct model name lookup
`text_recognition_model_name`	str	Model registry name (e.g., `"korean_PP-OCRv5_mobile_rec"`)	`"PP-OCRv5_server_rec"`	Direct model name lookup
`lang`	str	Language code (e.g., `"korean"`, `"arabic"`)	`"ch"`	Maps to language-specific model if available

Automatic Model Selection Logic:

When lang is specified without explicit text_recognition_model_name, the system performs automatic model resolution:

Manual Model Selection (highest priority):
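
The precedence described here (an explicit model name wins, then `lang`, then the default) can be illustrated with a small resolver. This is an approximation for explanation only, not the library's actual resolution code: `resolve_recognition_model` and `LANG_TO_REC` are hypothetical, and the mapping contains just the two language codes that appear as examples in the parameter table.

```python
from typing import Optional

DEFAULT_REC = "PP-OCRv5_server_rec"

# Assumed mapping for illustration; the real table is larger.
LANG_TO_REC = {
    "korean": "korean_PP-OCRv5_mobile_rec",
    "arabic": "arabic_PP-OCRv5_mobile_rec",
}


def resolve_recognition_model(
    lang: Optional[str] = None,
    text_recognition_model_name: Optional[str] = None,
) -> str:
    """Return the recognition model that would be used for these arguments."""
    # 1. Manual selection has the highest priority.
    if text_recognition_model_name is not None:
        return text_recognition_model_name
    # 2. Otherwise map the language code to a language-specific model.
    if lang in LANG_TO_REC:
        return LANG_TO_REC[lang]
    # 3. Fall back to the default multilingual model (covers lang="ch").
    return DEFAULT_REC


if __name__ == "__main__":
    print(resolve_recognition_model(lang="korean"))
    print(resolve_recognition_model(lang="korean",
                                    text_recognition_model_name="PP-OCRv5_mobile_rec"))
```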

Model Download and Caching:

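Models are downloaded on first use:

Location: the local model cache (~/.paddlex/official_models/ by default, since downloads are handled by PaddleX); the PADDLE_PDX_MODEL_SOURCE environment variable selects the download source, not the cache path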
Format: .pdmodel (model structure) + .pdiparams (weights) + config files
Automatic: Downloaded from https://paddle-model-ecology.bj.bcebos.com/paddlex/ on first invocation

Sources: docs/version3.x/pipeline_usage/OCR.md746-1048 docs/version3.x/pipeline_usage/OCR.en.md766-1048 pyproject.toml41-45

5.2 Command Line Interface (CLI) Model Selection

The paddleocr command-line interface (paddleocr/__main__.py) exposes model selection through these flags:

Command: paddleocr ocr [OPTIONS]

Model Selection Flags:

CLI Flag	Type	Example Value	Equivalent Python Parameter
`--ocr_version`	str	`PP-OCRv5`, `PP-OCRv4`, `PP-OCRv3`	`ocr_version`
`--text_detection_model_name`	str	`PP-OCRv5_mobile_det`	`text_detection_model_name`
`--text_recognition_model_name`	str	`korean_PP-OCRv5_mobile_rec`	`text_recognition_model_name`
`--lang`	str	`korean`, `arabic`, `ch`	`lang`

CLI Usage Examples:
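
To keep the examples in one language, the sketch below assembles `paddleocr ocr` command lines in Python from the flags in the table. `ocr_command` is a hypothetical helper, the image path and output directory are placeholders, and `-i` is assumed to be the input flag.

```python
import shlex
from typing import List, Optional


def ocr_command(
    image: str,
    lang: Optional[str] = None,
    ocr_version: Optional[str] = None,
    text_detection_model_name: Optional[str] = None,
    text_recognition_model_name: Optional[str] = None,
    save_path: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a `paddleocr ocr` invocation."""
    argv = ["paddleocr", "ocr", "-i", image]
    optional = [
        ("--lang", lang),
        ("--ocr_version", ocr_version),
        ("--text_detection_model_name", text_detection_model_name),
        ("--text_recognition_model_name", text_recognition_model_name),
        ("--save_path", save_path),
    ]
    for flag, value in optional:
        if value is not None:  # only emit flags that were set
            argv += [flag, value]
    return argv


if __name__ == "__main__":
    # Language auto-selection.
    print(shlex.join(ocr_command("sample.png", lang="korean", save_path="./output")))
    # Explicit mobile models.
    print(shlex.join(ocr_command(
        "sample.png",
        text_detection_model_name="PP-OCRv5_mobile_det",
        text_recognition_model_name="PP-OCRv5_mobile_rec",
    )))
    # To execute a command: subprocess.run(argv, check=True)
```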

Output Format (saved to --save_path if specified):

Text results in .txt files
Detection boxes visualization
Full pipeline results in JSON format

Sources: docs/version3.x/pipeline_usage/OCR.md761-810 docs/version3.x/pipeline_usage/OCR.en.md766-825 paddleocr/__main__.py

5.3 Model Selection Decision Matrix

Decision Guidelines:

Start with the PP-OCRv5 defaults: PP-OCRv5_server_det + PP-OCRv5_server_rec
Switch to mobile: if you deploy on edge devices or need < 20 ms latency
Use language-specific models: if the primary language is not Chinese, English, or Japanese
Use PaddleOCR-VL: if you need 100+ languages or complex document parsing
Use PP-OCRv4: only for legacy compatibility or for the 15K+ character set of the _doc variant
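
These guidelines can be condensed into a small decision function. The sketch is illustrative and covers only the first four rules (the PP-OCRv4 case is a manual override); `recommend`, `V5_DEFAULT_SCRIPTS`, and `LANGUAGE_SPECIFIC` are hypothetical names, and the script labels are chosen for this example.

```python
V5_DEFAULT_SCRIPTS = {"chinese", "english", "japanese"}

# Subset of the language-specific models listed in section 4.1.
LANGUAGE_SPECIFIC = {
    "korean": "korean_PP-OCRv5_mobile_rec",
    "latin": "latin_PP-OCRv5_mobile_rec",
    "arabic": "arabic_PP-OCRv5_mobile_rec",
    "thai": "th_PP-OCRv5_mobile_rec",
}


def recommend(script: str, edge_device: bool = False, complex_parsing: bool = False) -> dict:
    """Suggest a pipeline and model names for one primary script."""
    known = script in V5_DEFAULT_SCRIPTS or script in LANGUAGE_SPECIFIC
    if complex_parsing or not known:
        return {"pipeline": "PaddleOCR-VL"}

    scale = "mobile" if edge_device else "server"
    if script in V5_DEFAULT_SCRIPTS:
        rec = f"PP-OCRv5_{scale}_rec"
    else:
        rec = LANGUAGE_SPECIFIC[script]  # language models are mobile-only
    return {"pipeline": "OCR", "det": f"PP-OCRv5_{scale}_det", "rec": rec}


if __name__ == "__main__":
    for args in [("chinese",), ("korean", True), ("tibetan",)]:
        print(args, "->", recommend(*args))
```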

Sources: docs/version3.x/pipeline_usage/OCR.md701 README.md67-75

6. Model Performance Comparison

6.1 Cross-Version Performance

Detection models across versions (server variants):

Version	Model	Hmean (%)	GPU Time (ms)	Size (MB)
v5	`PP-OCRv5_server_det`	83.8	89.55	84.3
v4	`PP-OCRv4_server_det`	69.2	127.82	109
v3	N/A (mobile only)	-	-	-

Recognition models across versions (mobile variants):

Version	Model	Avg Acc (%)	GPU Time (ms)	Size (MB)
v5	`PP-OCRv5_mobile_rec`	81.29	5.43	16
v4	`PP-OCRv4_mobile_rec`	78.74	5.26	10.5
v3	`PP-OCRv3_mobile_rec`	72.96	3.89	10.3

Key Improvements in PP-OCRv5:

14.6 percentage-point detection Hmean improvement for the server detector (69.2 → 83.8, v4 → v5)
13% recognition accuracy improvement across multiple scenarios
Single model for 5 text types (previously required separate models)
Better handling of handwriting, vertical text, and rare characters
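
The gains can be recomputed from the two tables in this section. The short script below distinguishes percentage-point gains from relative gains; the numbers are copied from the tables, and `point_gain` and `relative_gain` are hypothetical helper names.

```python
DET_HMEAN = {"PP-OCRv5_server_det": 83.8, "PP-OCRv4_server_det": 69.2}
REC_ACC = {
    "PP-OCRv5_mobile_rec": 81.29,
    "PP-OCRv4_mobile_rec": 78.74,
    "PP-OCRv3_mobile_rec": 72.96,
}


def point_gain(new: float, old: float) -> float:
    """Absolute difference in percentage points."""
    return round(new - old, 2)


def relative_gain(new: float, old: float) -> float:
    """Improvement relative to the old score, in percent."""
    return round(100 * (new - old) / old, 1)


if __name__ == "__main__":
    det_new, det_old = DET_HMEAN["PP-OCRv5_server_det"], DET_HMEAN["PP-OCRv4_server_det"]
    print("detection v4 -> v5:", point_gain(det_new, det_old), "points,",
          relative_gain(det_new, det_old), "% relative")
    rec_new = REC_ACC["PP-OCRv5_mobile_rec"]
    print("mobile rec v4 -> v5:", point_gain(rec_new, REC_ACC["PP-OCRv4_mobile_rec"]), "points")
    print("mobile rec v3 -> v5:", point_gain(rec_new, REC_ACC["PP-OCRv3_mobile_rec"]), "points")
```

The detection figure of 14.6 is therefore a difference in Hmean points; expressed relative to the v4 score, the same change is about 21%.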

Sources: docs/version3.x/pipeline_usage/OCR.md132-169 docs/version3.x/pipeline_usage/OCR.md287-334

6.2 Language Model Performance

Performance of language-specific PP-OCRv5 models (all mobile variants, 5.43ms GPU inference):

Language	Model	Accuracy (%)	Model Size (MB)
Tamil	`ta_PP-OCRv5_mobile_rec`	94.2	7.5
Greek	`el_PP-OCRv5_mobile_rec`	89.28	7.5
Korean	`korean_PP-OCRv5_mobile_rec`	88.0	14
Telugu	`te_PP-OCRv5_mobile_rec`	87.65	7.5
English	`en_PP-OCRv5_mobile_rec`	85.25	7.5
Devanagari	`devanagari_PP-OCRv5_mobile_rec`	84.96	7.5
Latin	`latin_PP-OCRv5_mobile_rec`	84.7	14
Thai	`th_PP-OCRv5_mobile_rec`	82.68	7.5
Eastern Slavic	`eslav_PP-OCRv5_mobile_rec`	81.6	14
Arabic	`arabic_PP-OCRv5_mobile_rec`	81.27	7.6
Cyrillic	`cyrillic_PP-OCRv5_mobile_rec`	80.27	7.7

All models maintain efficient inference speeds (5.43ms GPU, 21.20ms CPU) with compact sizes (7-14 MB).

Sources: docs/version3.x/pipeline_usage/OCR.md432-531

7. Best Practices and Model Selection Workflows

7.1 Model Selection Decision Tree

Title: Model Selection Decision Tree with Registry Names

Model Selection Checklist:

Language Coverage → Select model tier (PP-OCRv5 default, language-specific, or PaddleOCR-VL)
Deployment Environment → Choose scale (_server_ vs. _mobile_)
Resource Budget → Validate model size fits memory constraints
Accuracy and Latency Requirements → Prefer server models for accuracy; enable high-performance inference if latency matters
Validation → Benchmark on representative dataset before production

Sources: docs/version3.x/pipeline_usage/OCR.md701-1048 docs/version3.x/pipeline_usage/OCR.en.md766-1048

7.2 Common Configuration Patterns with Code Examples

Pattern 1: Default Server Deployment (Chinese-English Mixed Documents)

Pattern 2: Mobile Deployment (Korean Text, Resource-Constrained)

Pattern 3: High-Accuracy Server with TensorRT Acceleration

Pattern 4: Document-Optimized Recognition (15K+ Characters)

Pattern 5: Multi-Language Global Support (111 Languages)

Pattern 6: Latin Languages (French, Spanish, Portuguese, etc.)

Pattern 7: CLI Batch Processing with Language Auto-Selection
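
The seven patterns above can be sketched as one table of configurations. This is a hedged reconstruction, not the original examples: `PATTERNS` and `build` are hypothetical names, paths are placeholders, `use_tensorrt`, `precision`, and `device` are assumed to be accepted by the PaddleOCR constructor, and Pattern 4 pins only the recognition model because the source names only the `_doc` recognition variant.

```python
SERVER = {"text_detection_model_name": "PP-OCRv5_server_det",
          "text_recognition_model_name": "PP-OCRv5_server_rec"}

PATTERNS = {
    # Pattern 1: default server deployment
    "default_server": {"pipeline": "PaddleOCR", "kwargs": dict(SERVER)},
    # Pattern 2: Korean text on a resource-constrained device
    "mobile_korean": {"pipeline": "PaddleOCR", "kwargs": {
        "text_detection_model_name": "PP-OCRv5_mobile_det",
        "text_recognition_model_name": "korean_PP-OCRv5_mobile_rec"}},
    # Pattern 3: server models with TensorRT acceleration
    "server_tensorrt": {"pipeline": "PaddleOCR", "kwargs": {
        **SERVER, "device": "gpu", "use_tensorrt": True, "precision": "fp16"}},
    # Pattern 4: 15K+ character document recognition
    "document_15k": {"pipeline": "PaddleOCR", "kwargs": {
        "text_recognition_model_name": "PP-OCRv4_server_rec_doc"}},
    # Pattern 5: 111-language support through the VL pipeline
    "global_vl": {"pipeline": "PaddleOCRVL", "kwargs": {}},
    # Pattern 6: Latin-script languages
    "latin": {"pipeline": "PaddleOCR", "kwargs": {
        "text_recognition_model_name": "latin_PP-OCRv5_mobile_rec"}},
    # Pattern 7: CLI batch run with language auto-selection
    "cli_batch": {"pipeline": "cli", "argv": [
        "paddleocr", "ocr", "-i", "./images", "--lang", "korean",
        "--save_path", "./output"]},
}


def build(name: str):
    """Instantiate the Python pipeline for a pattern (requires PaddleOCR 3.x)."""
    pattern = PATTERNS[name]
    if pattern["pipeline"] == "cli":
        raise ValueError("CLI patterns are run from a shell, not built in Python")
    import paddleocr

    pipeline_cls = getattr(paddleocr, pattern["pipeline"])
    return pipeline_cls(**pattern["kwargs"])
```

A typical call would be `build("mobile_korean").predict("sample.png")`, with the image path replaced by a real file.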

Sources: docs/version3.x/pipeline_usage/OCR.md746-1048 docs/version3.x/pipeline_usage/OCR.en.md766-1048 docs/quick_start.en.md66-94

7.3 Performance Optimization Guidelines

For Maximum Accuracy:
- Use PP-OCRv5_server_det + PP-OCRv5_server_rec
- Consider the optional preprocessing modules for challenging images
- Enable high-performance inference (enable_hpi=True) or TensorRT on supported GPUs to offset the server models' latency; these speed up inference and do not change accuracy

For Maximum Speed:
- Use PP-OCRv5_mobile_det + PP-OCRv5_mobile_rec
- Disable optional preprocessing (use_doc_orientation_classify=False, use_doc_unwarping=False)
- Reduce the detection side-length limit (text_det_limit_side_len) so that smaller images are processed
- Deploy with MKL-DNN on CPU or TensorRT on GPU

For Minimal Memory Footprint:
- Use mobile variants (4.7 MB detection + 16 MB recognition vs. 84.3 MB + 81 MB for the PP-OCRv5 server models)
- Use a language-specific model when a single language is needed (7-14 MB)
- Disable unused optional modules

For Mixed-Language Documents:
- First try PP-OCRv5_server_rec (supports 5 text types)
- If languages are not covered, use PaddleOCR-VL for comprehensive support
- Avoid switching between language-specific models (slower)
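
For the footprint guidance, a rough budget check can be done by summing the model file sizes from the comparison tables. `MODEL_SIZE_MB`, `footprint_mb`, and `fits` are hypothetical names, and the result is download size on disk, which is only a lower bound on runtime memory.

```python
# Model file sizes in MB, copied from the tables in sections 3.1 and 6.2.
MODEL_SIZE_MB = {
    "PP-OCRv5_server_det": 84.3,
    "PP-OCRv5_mobile_det": 4.7,
    "PP-OCRv5_server_rec": 81.0,
    "PP-OCRv5_mobile_rec": 16.0,
    "korean_PP-OCRv5_mobile_rec": 14.0,
    "th_PP-OCRv5_mobile_rec": 7.5,
}


def footprint_mb(det: str, rec: str) -> float:
    """Combined size of one detection and one recognition model."""
    return round(MODEL_SIZE_MB[det] + MODEL_SIZE_MB[rec], 1)


def fits(det: str, rec: str, budget_mb: float) -> bool:
    """True when the model files fit within the given size budget."""
    return footprint_mb(det, rec) <= budget_mb


if __name__ == "__main__":
    print("server:", footprint_mb("PP-OCRv5_server_det", "PP-OCRv5_server_rec"), "MB")
    print("mobile:", footprint_mb("PP-OCRv5_mobile_det", "PP-OCRv5_mobile_rec"), "MB")
    print("thai on mobile det:", footprint_mb("PP-OCRv5_mobile_det", "th_PP-OCRv5_mobile_rec"), "MB")
```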

Sources: docs/version3.x/pipeline_usage/OCR.md707 README.md191-199

7.4 Migration from Legacy Versions (PaddleOCR 2.x to 3.x)

API Parameter Changes:

PaddleOCR 2.x Parameter	PaddleOCR 3.x Parameter	Notes
`det_model_dir`	`text_detection_model_name`	Registry name instead of a path; use `text_detection_model_dir` to keep loading a local directory
`rec_model_dir`	`text_recognition_model_name`	Registry name with automatic download from the model hub
`cls_model_dir`	`textline_orientation_model_name`	Text line orientation classifier
`use_angle_cls`	`use_textline_orientation`	Renamed for clarity
`lang`	`lang`	Unchanged, but now supports 20+ languages

Migration Examples:
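
A sketch of a mechanical migration for code that still passes local model directories. `RENAMED` and `migrate_kwargs` are hypothetical; the `text_detection_model_dir` target follows the backward-compatibility note in this section, while `text_recognition_model_dir` and `textline_orientation_model_dir` are assumed by analogy. Switching to registry names (`*_model_name`) remains a separate, manual step.

```python
# 2.x keyword -> 3.x keyword for arguments that keep their meaning.
RENAMED = {
    "det_model_dir": "text_detection_model_dir",
    "rec_model_dir": "text_recognition_model_dir",        # assumed by analogy
    "cls_model_dir": "textline_orientation_model_dir",    # assumed by analogy
    "use_angle_cls": "use_textline_orientation",
}


def migrate_kwargs(old_kwargs: dict) -> dict:
    """Rename 2.x keyword arguments; unknown keys (such as lang) pass through."""
    return {RENAMED.get(key, key): value for key, value in old_kwargs.items()}


if __name__ == "__main__":
    # PaddleOCR 2.x style arguments (placeholder paths).
    legacy = {"det_model_dir": "./models/det", "use_angle_cls": True, "lang": "ch"}
    print(migrate_kwargs(legacy))

    # Registry-based 3.x equivalent, with no local paths:
    modern = {
        "text_detection_model_name": "PP-OCRv5_server_det",
        "text_recognition_model_name": "PP-OCRv5_server_rec",
        "use_textline_orientation": True,
    }
    print(modern)
```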

Key Migration Benefits:

Automatic Model Management: Models downloaded from central registry (no manual download)
Unified Multilingual Support: Single PP-OCRv5_server_rec handles Chinese, English, Traditional Chinese, Japanese
Improved Accuracy: 13% average accuracy improvement with PP-OCRv5
Simplified Configuration: Model names replace paths

PaddleX Dependency (Transparent to Users):

PaddleOCR 3.x uses PaddleX for underlying inference (pyproject.toml, lines 42-45):

Dependency: paddlex[ocr-core]>=3.4.0,<3.5.0
Optional Features: paddlex[doc-parser], paddlex[ie], paddlex[trans]
User Impact: Transparent; no direct PaddleX knowledge required for basic usage

Backward Compatibility Notes:

Old parameter names (det_model_dir, etc.) are deprecated but may still work with warnings
Model files from 2.x (*.pdmodel, *.pdiparams) can still be loaded via the text_detection_model_dir parameter (not text_detection_model_name)
Migrating to the 3.x registry-based model names is recommended for maintainability

Sources: docs/version3.x/paddleocr_and_paddlex.md1-50 docs/version3.x/paddleocr_and_paddlex.en.md1-50 pyproject.toml41-46

Sources for entire document: README.md1-246 docs/version3.x/pipeline_usage/OCR.md1-810 docs/version3.x/pipeline_usage/OCR.en.md1-810 docs/version3.x/pipeline_usage/PP-StructureV3.md1-320 docs/version3.x/pipeline_usage/PP-StructureV3.en.md1-320 docs/index.en.md1-92 docs/quick_start.en.md1-140
