This page documents the development and testing tools provided by PaddleOCR to support model training, benchmarking, validation, and data annotation. These tools form the infrastructure that enables developers to train custom models, validate model performance across different configurations, and prepare training datasets.
For information about model training itself, see Model Training System. For deployment-related testing, see Deployment and Inference. This page focuses specifically on the tooling infrastructure that supports the development workflow.
PaddleOCR provides three primary development tool categories:
TIPC Testing System Architecture with Code Entities
The diagram maps shell scripts to their key functions and shows how configuration files flow through the testing pipeline. Function names like func_parser_value(), func_inference(), and add_profiler_step() are the actual implementations in the codebase.
Sources: test_tipc/prepare.sh1-202 test_tipc/test_train_inference_python.sh1-344 test_tipc/benchmark_train.sh1-295 test_tipc/common_func.sh ppocr/utils/profiler.py1-131
PaddleOCR integrates with Paddle's profiler API through the ppocr/utils/profiler.py module.
ProfilerOptions Class (ppocr/utils/profiler.py27-85)
The ProfilerOptions class parses semicolon-separated key-value configuration strings:
Configuration options stored in self._options dict:
| Option | Type | Default | Description |
|---|---|---|---|
| batch_range | list[int] | [10, 20] | Profiling range (start, end) iterations |
| state | str | "All" | Profiling scope: CPU, GPU, or All |
| sorted_key | str | "total" | Sort metric for summary: calls, total, max, min, ave |
| tracer_option | str | "Default" | Detail level: Default, OpDetail, AllOpDetail |
| profile_path | str | "/tmp/profile" | Output path for serialized profile data |
| timer_only | bool | True | If True, only throughput; if False, detailed operator statistics |
| exit_on_finished | bool | True | Exit after profiling completes |
The _parse_from_string() method (ppocr/utils/profiler.py62-79) splits on semicolons and equals signs to populate the options dictionary.
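The option-string format can be illustrated with a simplified reimplementation. This sketch mirrors the defaults in the table above but is not the actual `ProfilerOptions` class; `parse_profiler_options` is an illustrative name.

```python
# Simplified sketch of ProfilerOptions-style parsing (illustrative, not the
# actual PaddleOCR class). Splits "key=value" pairs on semicolons; the
# batch_range value is itself a bracketed list like "[10,20]".
def parse_profiler_options(options_str):
    options = {
        "batch_range": [10, 20],
        "state": "All",
        "sorted_key": "total",
        "tracer_option": "Default",
        "profile_path": "/tmp/profile",
        "timer_only": True,
        "exit_on_finished": True,
    }
    if not options_str:
        return options
    for kv in options_str.replace(" ", "").split(";"):
        key, _, value = kv.partition("=")
        if key == "batch_range":
            # "[10,20]" -> [10, 20]
            options[key] = [int(v) for v in value.strip("[]").split(",")]
        elif key in ("timer_only", "exit_on_finished"):
            options[key] = value.lower() == "true"
        elif key in options:
            options[key] = value
    return options
```

For example, `parse_profiler_options("batch_range=[10,20];state=GPU;timer_only=True")` overrides `state` and keeps the remaining defaults.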
add_profiler_step() Function (ppocr/utils/profiler.py87-131)
Global function called in training loops to enable profiling:
Implementation details:
- Maintains module-level globals `_prof`, `_profiler_step_id`, `_profiler_options`
- Creates `profiler.Profiler(scheduler=(...), on_trace_ready=profiler.export_chrome_tracing(...), timer_only=...)`
- Calls `_prof.start()` when the profiling range begins
- Calls `_prof.step()` each iteration to advance the profiler
- Calls `_prof.stop()` when the range ends
- Prints `_prof.summary(op_detail=True, thread_sep=False, time_unit="ms")`
- Writes Chrome tracing output to the `./profiler_log/` directory
- Exits the process after profiling when `exit_on_finished=True`

Sources: ppocr/utils/profiler.py1-131
TIPC Testing Workflow with Code-Level Details
This diagram shows the precise command-line invocations and parameter names used by TIPC scripts. Each box contains actual shell variables, command flags, and file paths used in the implementation.
Sources: test_tipc/prepare.sh10-277 test_tipc/test_train_inference_python.sh14-343 test_tipc/benchmark_train.sh67-294
TIPC (Training and Inference Pipeline Criterion) is an automated testing framework that validates PaddleOCR models across multiple dimensions: training modes, inference configurations, hardware backends, and precision levels. It provides reproducible testing workflows and performance benchmarking capabilities.
TIPC Testing Workflow and Configuration Processing
This diagram shows how TIPC processes configuration files through three main scripts: prepare.sh for environment setup, test_train_inference_python.sh for comprehensive testing, and benchmark_train.sh for performance profiling. Each component extracts parameters from the .txt configuration files and executes different testing phases.
Sources: test_tipc/prepare.sh1-50 test_tipc/test_train_inference_python.sh1-100 test_tipc/benchmark_train.sh1-150
TIPC uses structured text configuration files (.txt) that define the complete testing matrix. The file is divided into sections separated by ## markers:
| Section | Purpose | Key Parameters |
|---|---|---|
| train_params | Training configuration | model_name, python, gpu_list, epoch_num, batch_size_per_card |
| eval_params | Evaluation settings | eval_py, evaluation script parameters |
| infer_params | Inference testing matrix | use_gpu_list, use_mkldnn_list, cpu_threads_list, precision_list |
| train_benchmark_params | Benchmarking configuration | batch_size, fp_items, epoch, profile_option |
| to_static_train_benchmark_params | Dynamic-to-static training | Special trainer configuration |
Example configuration structure from test_tipc/configs/layoutxlm_ser/train_infer_python.txt1-60:
```
===========================train_params===========================
model_name:layoutxlm_ser
python:python3.7
gpu_list:0|0,1
Global.use_gpu:True|True
Global.auto_cast:fp32
Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=17
...
trainer:norm_train
norm_train:tools/train.py -c config.yml -o Global.print_batch_step=1
```
The configuration uses special syntax:
- `key:value` pairs for simple parameters
- `key:mode1=value1|mode2=value2` for mode-dependent values
- `func_parser_value()` and `func_parser_key()` functions extract these values in the shell scripts

Sources: test_tipc/configs/layoutxlm_ser/train_infer_python.txt1-60
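A mode-dependent line resolves as follows; the real parsing is done by the shell helpers in `test_tipc/common_func.sh`, and `parse_config_line` here is a hypothetical Python equivalent for illustration.

```python
# Hypothetical Python equivalent of the shell-side config-line parsing done by
# func_parser_key()/func_parser_value(). A bare alternative list such as
# "gpu_list:0|0,1" has no "=", so it is returned as-is (the scripts iterate it).
def parse_config_line(line, mode):
    key, _, raw = line.partition(":")
    value = raw
    for candidate in raw.split("|"):
        if "=" in candidate:
            cand_mode, _, cand_value = candidate.partition("=")
            if cand_mode == mode:
                value = cand_value
                break
    return key, value
```

For example, parsing `Global.epoch_num:lite_train_lite_infer=1|whole_train_whole_infer=17` under mode `whole_train_whole_infer` yields `("Global.epoch_num", "17")`.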
TIPC supports multiple testing modes defined by the MODE variable:
- `benchmark_train` - performance benchmarking; raw training logs go to `benchmark_log/train_log/`, speed metrics to `benchmark_log/index/`
- `lite_train_lite_infer` - fast training on a small dataset followed by lightweight inference checks
- `lite_train_whole_infer` - fast training followed by the full inference matrix
- `whole_train_whole_infer` - full training and full inference testing
- `whole_infer` - inference-only testing
Sources: test_tipc/prepare.sh6-11 test_tipc/test_train_inference_python.sh5-6
prepare.sh - Environment Setup
Key functions and workflow:
Mode parsing (test_tipc/prepare.sh10-22):
Conditional dataset preparation - Uses model name pattern matching to download the appropriate datasets from paddleocr.bj.bcebos.com/dataset/

Pretrained model download - Fetches model-specific pretrained weights
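The dispatch can be sketched as follows; the pattern-to-dataset mapping here is hypothetical, chosen only to illustrate the substring matching that `prepare.sh` performs on model names.

```python
# Illustrative sketch of prepare.sh-style resource selection: the model name
# is matched against substring patterns to decide which dataset archive to
# fetch. The pattern -> dataset table below is hypothetical, not the real one.
def select_dataset(model_name):
    patterns = [
        ("det", "icdar2015_lite.tar"),
        ("rec", "ic15_data.tar"),
        ("layoutxlm", "XFUND.tar"),
    ]
    for pattern, dataset in patterns:
        if pattern in model_name:
            return dataset
    return None  # no matching resource for this model
```

In the shell script this is a `case`/pattern-match block rather than a lookup table, but the effect is the same: the first matching pattern decides the download.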
test_train_inference_python.sh - Testing Execution
Main execution flow (test_tipc/test_train_inference_python.sh216-343):
Training phase:
- Multi-GPU runs launch via `paddle.distributed.launch`
- The command is assembled as `${python} ${run_train} ${set_use_gpu} ${set_save_model} ...`
- Output is logged to `${LOG_PATH}/${trainer}_gpus_${gpu}_autocast_${autocast}/train.log`

Evaluation phase (test_tipc/test_train_inference_python.sh308-316):
- Runs only when `eval_py != "null"`

Export phase (test_tipc/test_train_inference_python.sh318-326):
- Calls `tools/export_model.py` to convert the training checkpoint to an inference model
- Saves the result under the `${save_log}/` directory

Inference phase - Calls the `func_inference()` function (test_tipc/test_train_inference_python.sh99-179):
- CPU test matrix: `use_mkldnn × cpu_threads × batch_size × precision`, run as `${python} ${inference_py} use_gpu=False --enable_mkldnn=${use_mkldnn} --cpu_threads=${threads} ...`
- GPU test matrix: `use_trt × precision × batch_size`

Status checking - After each phase, `status_check()` verifies the exit status and logs results to `results_python.log`
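The CPU-side matrix expansion can be sketched with `itertools.product`; the value lists and the batch-size flag are illustrative, not the real configuration values.

```python
# Illustrative enumeration of the CPU inference test matrix
# (use_mkldnn x cpu_threads x batch_size x precision), mirroring the nested
# loops in func_inference(). The concrete value lists are made up here.
import itertools


def cpu_inference_commands(inference_py="tools/infer/predict_det.py"):
    use_mkldnn_list = ["True", "False"]
    cpu_threads_list = ["1", "6"]
    batch_size_list = ["1"]
    precision_list = ["fp32"]
    commands = []
    for mkldnn, threads, bs, prec in itertools.product(
        use_mkldnn_list, cpu_threads_list, batch_size_list, precision_list
    ):
        commands.append(
            f"python {inference_py} --use_gpu=False --enable_mkldnn={mkldnn} "
            f"--cpu_threads={threads} --batch_size={bs} --precision={prec}"
        )
    return commands
```

With the lists above this expands to 2 × 2 × 1 × 1 = 4 command lines; the GPU matrix (`use_trt × precision × batch_size`) is enumerated the same way.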
benchmark_train.sh - Performance Profiling
Specialized workflow for benchmarking (test_tipc/benchmark_train.sh1-295):
Configuration modification (test_tipc/benchmark_train.sh145-148):
Dynamic epoch calculation (test_tipc/benchmark_train.sh18-28) - Adjusts epoch count based on device number (e.g., 4 GPUs = 4× epochs)
Profiling execution (test_tipc/benchmark_train.sh204-220):
- Passes the `profile_option` parameter, e.g. `batch_range=[10,20];state=GPU;timer_only=True`
- Runs `timeout 5m bash test_tipc/test_train_inference_python.sh ...`
- Writes profiler output to `profiling_log/`

Speed measurement (test_tipc/benchmark_train.sh221-236) - Runs again without the profiler for accurate throughput numbers

Log parsing (test_tipc/benchmark_train.sh238-252) - Calls an external analysis script to extract metrics
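The dynamic epoch calculation above can be sketched as a one-liner; this is a simplified model of the shell arithmetic, assuming the GPU list is comma-separated as in the `.txt` configs.

```python
# Simplified sketch of benchmark_train.sh's dynamic epoch scaling: the
# configured epoch count is multiplied by the number of devices so each run
# processes a comparable per-device workload.
def scale_epochs(base_epoch, gpu_list):
    """gpu_list like "0" or "0,1,2,3"; the device count scales the epochs."""
    device_num = len(gpu_list.split(","))
    return base_epoch * device_num
```

So a config with `epoch=1` runs one epoch on a single GPU but four epochs on `gpu_list="0,1,2,3"`, matching the "4 GPUs = 4× epochs" behavior described above.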
Sources: test_tipc/prepare.sh1-202 test_tipc/test_train_inference_python.sh1-344 test_tipc/benchmark_train.sh1-295
PDF2Word is a Windows application developed by PaddleOCR community member whjdark that converts PDF documents to editable Word format using PP-StructureV2 layout analysis and recovery models.
PDF2Word Processing Pipeline
The diagram shows how PDF2Word processes documents through PP-StructureV2's layout analysis, OCR, and table recognition modules before reconstructing formatted Word output.
Three distribution versions are available (ppstructure/pdf2word/README.md8-9):

Windows Application (ppstructure/pdf2word/README.md8-12) - Download and run the packaged `pdf2word.exe`.

Python Script (ppstructure/pdf2word/README.md24-31) - Launches the GUI interface from a Python environment.

PaddleOCR whl Package - For Linux/Mac users or those with Python environments, directly use the paddleocr whl package, which includes PDF2Word functionality as documented in the PP-StructureV3 usage guide (ppstructure/pdf2word/README.md33-35).
PDF2Word is built on PP-StructureV2 (see PP-StructureV2 System for details):
Processing Pipeline: layout analysis, OCR, and table recognition results are reconstructed into a `.docx` file.

Packaging: Uses the QPT (Quick Python Tools) framework to package the Python application as a Windows executable.

Notes:
- Linux/Mac users should use the `paddleocr` whl package directly instead (ppstructure/pdf2word/README.md17-22)
- Models are downloaded from paddleocr.bj.bcebos.com on first run

Sources: ppstructure/pdf2word/README.md1-50
Running benchmark tests:
Running lite training and inference tests:
Inference-only testing:
Sources: test_tipc/docs/benchmark_train.md10-28
TIPC generates structured performance reports in JSON format. Example benchmark output from test_tipc/docs/benchmark_train.md38-40:
Log Directory Structure (test_tipc/docs/benchmark_train.md42-53):
```
benchmark_log/
├── index/            # Speed metric files
│   ├── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_speed
│   └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C4_speed
├── profiling_log/    # Profiler output
│   └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_profiling
└── train_log/        # Raw training logs
    ├── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_log
    └── PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C4_log
```
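The index file names encode the model, batch size, precision, and device topology. A hedged parser sketch follows; the field meanings are inferred from the naming pattern shown above, not from TIPC source.

```python
# Illustrative parser for TIPC speed-metric file names such as
# "PaddleOCR_det_mv3_db_v2_0_bs8_fp32_SingleP_DP_N1C1_speed".
# Field positions are inferred from the naming pattern, e.g. "bs8" carries the
# batch size and "N1C1" encodes 1 node x 1 card.
def parse_speed_filename(name):
    parts = name.split("_")
    info = {}
    for i, part in enumerate(parts):
        if part.startswith("bs") and part[2:].isdigit():
            info["batch_size"] = int(part[2:])
            info["model"] = "_".join(parts[1:i])  # between repo prefix and bsN
            info["precision"] = parts[i + 1]
        if part.startswith("N") and "C" in part[1:]:
            nodes, cards = part[1:].split("C")
            info["nodes"] = int(nodes)
            info["cards"] = int(cards)
    return info
```

Parsing the single-card name above yields model `det_mv3_db_v2_0`, batch size 8, precision `fp32`, and a 1-node/1-card topology.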
Performance Benchmarks
Sample performance data for various models on single NVIDIA V100 16G GPU (test_tipc/docs/benchmark_train.md59-76):
| Model | Config File | Large Dataset FP32 FPS | Small Dataset FP32 FPS | Large Dataset FP16 FPS | Small Dataset FP16 FPS |
|---|---|---|---|---|---|
| ch_ppocr_mobile_v2.0_det | config | 53.836 | 53.343 / 53.914 / 52.785 | 45.574 | 45.57 / 46.292 / 46.213 |
| ch_ppocr_mobile_v2.0_rec | config | 2083.311 | 2043.194 / 2066.372 / 2093.317 | 2153.261 | 2167.561 / 2165.726 / 2155.614 |
| ch_PP-OCRv2_det | config | 13.87 | 13.386 / 13.529 / 13.428 | 17.847 | 17.746 / 17.908 / 17.96 |
| det_mv3_db_v2.0 | config | 61.802 | 62.078 / 61.802 / 62.008 | 82.947 | 84.294 / 84.457 / 84.005 |
Sources: test_tipc/docs/benchmark_train.md1-77
PPOCRLabel is a GUI-based data annotation tool specifically designed for OCR tasks. It provides visual annotation capabilities with keyboard shortcuts and auto-labeling features powered by pre-trained PaddleOCR models.
For detailed documentation on PPOCRLabel functionality, see PPOCRLabel Annotation Tool.
Key features:
Integration with Training: PPOCRLabel generates annotation files compatible with PaddleOCR's SimpleDataSet loader, allowing seamless transition from annotation to training (test_tipc/configs/det_r50_dcn_fce_ctw_v2_0/det_r50_vd_dcn_fce_ctw.yml64-68).
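A PPOCRLabel detection label line pairs an image path with a JSON list of `{"transcription", "points"}` dicts, separated by a tab; a minimal reader sketch (assuming that tab-separated format):

```python
import json


# Sketch of reading a PPOCRLabel/SimpleDataSet detection label line:
# "<image_path>\t<json list of {transcription, points} dicts>".
def parse_label_line(line):
    img_path, _, annotation = line.strip().partition("\t")
    boxes = json.loads(annotation)
    return img_path, boxes
```

Each `points` entry holds the four corner coordinates of one text box, and `transcription` holds its text content, which is what the detection and recognition data loaders consume.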
Sources: General knowledge from system overview
Beyond the primary tools, PaddleOCR provides additional utilities for document processing:
Tools for converting structured OCR results back into formatted documents.
Training and inference configurations use YAML files with specific structure (test_tipc/configs/layoutxlm_ser/ser_layoutxlm_xfund_zh.yml1-15):
These YAML files define the global settings, model architecture, loss, optimizer, post-processing, metrics, and dataset loaders.
For more details on model configuration structure, see Model Configuration Files.
Sources: test_tipc/configs/layoutxlm_ser/ser_layoutxlm_xfund_zh.yml1-123 test_tipc/configs/det_r50_dcn_fce_ctw_v2_0/det_r50_vd_dcn_fce_ctw.yml1-140
PaddleOCR's development tools provide comprehensive infrastructure for the complete development lifecycle:
These tools integrate seamlessly with the training system (see Model Training System) and deployment infrastructure (see Deployment and Inference) to form a complete development workflow from data annotation through training validation to deployment testing.