TIPC (Training-Inference Pipeline Consistency) is the automated testing and benchmarking framework used in PaddleOCR to validate that models train correctly, export cleanly, and run inference reproducibly across hardware configurations. This page covers the TIPC shell scripts, config file format, supported test modes, and benchmark output structure.
For information about the Python inference system being exercised by TIPC tests, see Python Inference System. For C++ inference testing, see C++ Inference and Build System. For service deployment testing, see Service Deployment.
TIPC tests are driven by three coordinating shell scripts and a per-model plain-text configuration file. The scripts share common parsing utilities in test_tipc/common_func.sh.
TIPC Script Architecture
Sources: test_tipc/prepare.sh1-10 test_tipc/test_train_inference_python.sh1-10 test_tipc/benchmark_train.sh1-10
Every invocation of prepare.sh or test_train_inference_python.sh requires a MODE argument. The mode controls which data subset is fetched and which test paths are executed.
| Mode | Training data scale | Inference data scale | Typical use |
|---|---|---|---|
| lite_train_lite_infer | Small (e.g. icdar2015_lite) | Small | CI smoke test |
| lite_train_whole_infer | Small | Full | Inference quality gate |
| whole_train_whole_infer | Full | Full | Full accuracy test |
| whole_infer | None | Full | Inference-only validation |
| klquant_whole_infer | None | Full | KL quantization inference test |
| cpp_infer | None | Full | C++ inference path |
| serving_infer | None | Full | Serving inference path |
| benchmark_train | Benchmark dataset | N/A | Throughput benchmarking |
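A typical run passes the same config path and MODE to prepare.sh and then to the test script. The sketch below only assembles the two commands; the config path is an assumption for illustration, and any test_tipc/configs/<model_name>/train_infer_python.txt follows the same pattern:

```python
# Hypothetical driver assembling the usual two-step TIPC invocation.
# The config path is an illustrative assumption, not taken from the source.
import shlex

config = "test_tipc/configs/ch_ppocr_mobile_v2_0_det/train_infer_python.txt"
mode = "lite_train_lite_infer"

prepare_cmd = f"bash test_tipc/prepare.sh {config} {mode}"
test_cmd = f"bash test_tipc/test_train_inference_python.sh {config} {mode}"

# Both scripts receive MODE as their second positional argument.
shlex.split(prepare_cmd)[3]
# → 'lite_train_lite_infer'
```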
Sources: test_tipc/prepare.sh4-10 test_tipc/test_train_inference_python.sh4-6
Each model has a config file under test_tipc/configs/<model_name>/train_infer_python.txt. The file is a numbered plain-text file; scripts parse specific line numbers using func_parser_value and func_parser_key from common_func.sh.
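The key/value parsing that func_parser_key and func_parser_value perform can be sketched in Python as follows (a hedged re-implementation for illustration, not the actual shell code; the helper names parse_config_line and get_line are hypothetical):

```python
# Hypothetical Python equivalent of func_parser_key/func_parser_value
# from common_func.sh: each non-section line is "key:value", and the
# scripts address lines by fixed 1-based offsets.

def parse_config_line(line):
    """Split a 'key:value' config line on the first colon."""
    key, _, value = line.strip().partition(":")
    return key, value

def get_line(config_lines, lineno):
    """Fetch a config line by its 1-based offset, as the shell scripts do."""
    return parse_config_line(config_lines[lineno - 1])

# Example lines in the style of a train_infer_python.txt file
lines = [
    "model_name:layoutxlm_ser",
    "python:python3.7",
    "gpu_list:0|0,1",
]
key, value = get_line(lines, 3)
# key == 'gpu_list', value == '0|0,1'
```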
The file is divided into named sections separated by marker lines of = characters containing the section name:
===========================train_params===========================
===========================eval_params===========================
===========================infer_params===========================
===========================infer_benchmark_params==========================
===========================train_benchmark_params==========================
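Grouping lines under these section markers can be sketched like this (an illustrative helper, not code from the repository; the function name split_sections is hypothetical):

```python
# Hypothetical helper: group config lines into their named sections by
# scanning for the "===...name===" separator lines shown above.
import re

SECTION_RE = re.compile(r"^=+([a-z_]+)=+$")

def split_sections(lines):
    sections, current = {}, None
    for line in lines:
        m = SECTION_RE.match(line.strip())
        if m:
            current = m.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line.strip())
    return sections

text = [
    "===========================train_params===========================",
    "model_name:layoutxlm_ser",
    "===========================infer_params===========================",
    "inference:tools/infer/predict_det.py",
]
secs = split_sections(text)
# secs['train_params'] == ['model_name:layoutxlm_ser']
```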
The shell script test_train_inference_python.sh reads lines by fixed offsets:
| Line # | Key | Example value |
|---|---|---|
| 1 | model_name | layoutxlm_ser |
| 2 | python | python3.7 |
| 3 | gpu_list | 0\|0,1 |
| 4 | Global.use_gpu | True\|True |
| 5 | Global.auto_cast | fp32 |
| 6 | Global.epoch_num | lite_train_lite_infer=1\|whole_train_whole_infer=17 |
| 7 | Global.save_model_dir | ./output/ |
| 8 | Train.loader.batch_size_per_card | lite_train_lite_infer=4\|... |
| 9 | Architecture.Backbone.checkpoints | null |
| 10 | train_model_name | latest |
| 11 | train_infer_img_dir | path to inference image |
| 14 | trainer | norm_train |
| 15 | norm_train | tools/train.py -c <yml> -o ... |
| 16 | pact_train | quantization-aware training command or null |
| 17 | fpgm_train | FPGM pruning training command or null |
| 18 | distill_train | distillation training command or null |
| 23 | eval | eval script path or null |
| 27 | Global.save_inference_dir | ./output/ |
| 28 | Architecture.Backbone.checkpoints | (export weight key) |
| 29-32 | norm_export, quant_export, fpgm_export, distill_export | tools/export_model.py -c <yml> |
| 36 | infer_model | model dir(s) for inference |
| 37 | infer_export | export commands for inference models |
| 38 | infer_quant | False or True |
| 39 | inference | inference script (e.g. tools/infer/predict_det.py) |
| 40-50 | inference flags | --use_gpu, --enable_mkldnn, --cpu_threads, --use_tensorrt, --precision, ... |
Multi-value fields use a vertical bar (|) as a separator (the alternatives are tested in sequence), and mode-specific values use mode=value syntax.
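Resolving the two value syntaxes can be sketched as follows (a hedged illustration of the behavior described above, not the actual shell parsing; resolve_value is a hypothetical name):

```python
# Hypothetical resolver for the two value syntaxes: plain multi-value
# fields ("0|0,1") yield every alternative in sequence, while
# mode-qualified fields ("mode=value|mode=value") select by MODE.

def resolve_value(raw, mode):
    parts = raw.split("|")
    if "=" in parts[0]:
        for part in parts:
            k, _, v = part.partition("=")
            if k == mode:
                return [v]
        return []
    return parts  # every alternative is exercised in turn

resolve_value("lite_train_lite_infer=1|whole_train_whole_infer=17",
              "lite_train_lite_infer")
# → ['1']
resolve_value("0|0,1", "lite_train_lite_infer")
# → ['0', '0,1']
```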
Sources: test_tipc/test_train_inference_python.sh15-90 test_tipc/configs/layoutxlm_ser/train_infer_python.txt1-60
prepare.sh — downloads pretrained models and datasets and installs Python dependencies. Behavior branches on MODE:

- benchmark_train — downloads full benchmark datasets for each named model (icdar2015_benckmark, ic15_data_benckmark, pubtabnet, XFUND, etc.)
- lite_train_lite_infer — downloads icdar2015_lite, ic15_data, and slim inference archives; installs auto_log and paddleslim
- whole_train_whole_infer — downloads full icdar2015, ic15_data, pubtabnet
- whole_infer — downloads inference models and test images

Model-specific branches within each mode handle special pretrained checkpoints (e.g. MobileNetV3_large_x0_5_pretrained.pdparams, PPLCNetV3_x0_75_ocr_det.pdparams).
Sources: test_tipc/prepare.sh23-202
test_train_inference_python.sh — main test orchestrator. It parses the config file, then iterates over:

- gpu_list
- autocast_list (fp32, amp)
- trainer_list (norm_train, pact_train, fpgm_train, distill_train, to_static_train)

It calls status_check on every command exit code and writes pass/fail to ${LOG_PATH}/results_python.log. The func_inference function test_tipc/test_train_inference_python.sh99-179 loops over all combinations of:

- CPU path: use_mkldnn × cpu_threads × batch_size × precision (fp32, int8)
- GPU path: use_trt × precision (fp32, fp16, int8) × batch_size

benchmark_train.sh — wraps test_train_inference_python.sh for throughput benchmarking. Takes an optional third parameter to select a specific configuration:
dynamic_bs8_fp32_DP_N1C1
Parameter format: {modeltype}_bs{N}_{fp_item}_{run_mode}_{device_num}
| Field | Examples |
|---|---|
| modeltype | dynamic, dynamicTostatic |
| batch_size | bs8, bs16 |
| fp_item | fp32, fp16, amp |
| run_mode | DP (DataParallel) |
| device_num | N1C1, N1C4, N1C8 (nodes × cards) |
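Splitting a parameter string like dynamic_bs8_fp32_DP_N1C1 into these fields can be sketched in Python (the real benchmark_train.sh does the equivalent with shell string operations; parse_benchmark_params is a hypothetical name):

```python
# Hypothetical parser for the benchmark parameter string format.

def parse_benchmark_params(s):
    modeltype, bs, fp_item, run_mode, device_num = s.split("_")
    return {
        "modeltype": modeltype,      # dynamic / dynamicTostatic
        "batch_size": int(bs[2:]),   # strip the "bs" prefix
        "fp_item": fp_item,          # fp32 / fp16 / amp
        "run_mode": run_mode,        # DP
        "device_num": device_num,    # e.g. N1C1 = 1 node, 1 card
    }

parse_benchmark_params("dynamic_bs8_fp32_DP_N1C1")
# → {'modeltype': 'dynamic', 'batch_size': 8, 'fp_item': 'fp32',
#    'run_mode': 'DP', 'device_num': 'N1C1'}
```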
When dynamicTostatic is detected, the script rewrites the config file to replace trainer:norm_train with trainer:to_static_train before running.
Sources: test_tipc/benchmark_train.sh67-184
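The inference sweep that func_inference performs amounts to a Cartesian product of flag values. This illustrative sketch enumerates a grid of the same shape; the specific flag values are examples, not the exact lists hard-coded in the script:

```python
# Illustrative sketch (not the actual shell loop) of the combinations
# func_inference iterates: one nested sweep for CPU, one for GPU.
from itertools import product

cpu_runs = list(product(
    [True, False],      # --enable_mkldnn
    [1, 6],             # --cpu_threads (example values)
    [1],                # batch size (example value)
    ["fp32"],           # precision (int8 only for quantized models)
))
gpu_runs = list(product(
    [False, True],      # --use_tensorrt
    ["fp32", "fp16"],   # precision (int8 with TensorRT + quant models)
    [1],                # batch size (example value)
))
# Each tuple becomes one predict_*.py invocation with those flags.
len(cpu_runs), len(gpu_runs)
# → (4, 4)
```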
Training + Inference Pipeline (non-benchmark modes)
Sources: test_tipc/test_train_inference_python.sh216-343
The ppocr/utils/profiler.py module integrates PaddlePaddle's built-in profiler for operator-level timing during training benchmarks.
ProfilerOptions parses a semicolon-delimited string of options ppocr/utils/profiler.py48-84:
| Option | Default | Description |
|---|---|---|
| batch_range | [10, 20] | Batch steps to profile |
| state | All | CPU, GPU, or All |
| sorted_key | total | Sort dimension for summary |
| tracer_option | Default | Default, OpDetail, AllOpDetail |
| profile_path | /tmp/profile | Chrome tracing output path |
| timer_only | True | If True, only throughput is shown; if False, full operator breakdown |
add_profiler_step(options_str) ppocr/utils/profiler.py87-130 is called once per training step. It starts the profiler on the first call and stops it after batch_range[1] steps, printing a summary and optionally exiting.
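Parsing the semicolon-delimited option string into these defaults can be sketched as follows (a hedged re-implementation for illustration; the real parsing lives in ProfilerOptions in ppocr/utils/profiler.py and may differ in detail):

```python
# Hypothetical parser for a profiler option string such as
# "batch_range=[50,60];state=GPU;timer_only=False".

DEFAULTS = {
    "batch_range": [10, 20],
    "state": "All",
    "sorted_key": "total",
    "tracer_option": "Default",
    "profile_path": "/tmp/profile",
    "timer_only": True,
}

def parse_profiler_options(options_str):
    opts = dict(DEFAULTS)
    for item in filter(None, options_str.split(";")):
        key, _, value = item.partition("=")
        if key == "batch_range":
            opts[key] = [int(x) for x in value.strip("[]").split(",")]
        elif key == "timer_only":
            opts[key] = value == "True"
        else:
            opts[key] = value
    return opts

parse_profiler_options("batch_range=[50,60];state=GPU;timer_only=False")
# → batch_range [50, 60], state 'GPU', timer_only False; rest defaulted
```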
In benchmark_train.sh, a profiling run is executed first with profile_option set; a clean timing run is then executed with the profile option set to null. test_tipc/benchmark_train.sh204-234
Sources: ppocr/utils/profiler.py1-131 test_tipc/benchmark_train.sh133-234
After a benchmark run, logs are written to $BENCHMARK_LOG_DIR/benchmark_log/ with the following layout:
benchmark_log/
├── index/
│ └── PaddleOCR_{model}_{bs}_{precision}_{run_mode}_{device_num}_speed
├── profiling_log/
│ └── PaddleOCR_{model}_{bs}_{precision}_{run_mode}_{device_num}_profiling
└── train_log/
└── PaddleOCR_{model}_{bs}_{precision}_{run_mode}_{device_num}_log
The analysis.py script (from BENCHMARK_ROOT) parses each train_log file and writes a JSON speed record to index/.
Sources: test_tipc/benchmark_train.sh239-290 test_tipc/docs/benchmark_train.md34-53
analysis.py Invocation Parameters

The log analysis script is called with these arguments (from benchmark_train.sh):
| Parameter | Value source | Description |
|---|---|---|
| --filename | ${log_path}/${log_name} | Raw training log |
| --speed_log_file | ${speed_log_path}/${speed_log_name} | Output JSON path |
| --model_name | {model}_{bs}_{precision}_{run_mode} | Model identifier |
| --base_batch_size | batch_size from config | Batch size used |
| --run_mode | DP | Data parallel mode |
| --fp_item | fp32 or fp16 | Precision |
| --keyword | ips: | Log line keyword for throughput |
| --skip_steps | 2 | Warmup steps to skip |
| --device_num | N1C1, N1C4, etc. | Device topology |
| --speed_unit | samples/s or images/s | Unit label |
| --convergence_key | loss: | Keyword for convergence value |
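Assembling this argument list can be sketched as follows; the paths and the model identifier here are placeholders for illustration, not values taken from a real run:

```python
# Illustrative assembly of the analysis.py argument list from the
# parameter table above. build_analysis_cmd is a hypothetical helper.

def build_analysis_cmd(log_file, speed_log_file, model_name, batch_size,
                       fp_item, device_num):
    return [
        "python", "analysis.py",
        "--filename", log_file,
        "--speed_log_file", speed_log_file,
        "--model_name", model_name,
        "--base_batch_size", str(batch_size),
        "--run_mode", "DP",
        "--fp_item", fp_item,
        "--keyword", "ips:",
        "--skip_steps", "2",
        "--device_num", device_num,
        "--speed_unit", "samples/s",
        "--convergence_key", "loss:",
    ]

cmd = build_analysis_cmd("path/to/train.log", "path/to/speed.json",
                         "det_mv3_db_bs8_fp32_DP", 8, "fp32", "N1C1")
```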
Sources: test_tipc/benchmark_train.sh239-255
test_train_inference_python.sh selects the training command and export command based on the trainer value in the config's trainer_list:
Sources: test_tipc/test_train_inference_python.sh247-270
Results are written to test_tipc/output/<model_name>/<mode>/results_python.log. A line prefixed with PASS or FAIL is recorded for each command executed.
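A CI gate over that log can be sketched in a few lines, assuming each line begins with PASS or FAIL as described above (summarize_results is a hypothetical helper, and the log lines shown are illustrative):

```python
# Minimal sketch of gating on results_python.log.

def summarize_results(log_lines):
    passed = sum(1 for l in log_lines if l.startswith("PASS"))
    failed = sum(1 for l in log_lines if l.startswith("FAIL"))
    return passed, failed

lines = [
    "PASS: python3.7 tools/train.py -c configs/det/det_mv3_db.yml",
    "FAIL: python3.7 tools/infer/predict_det.py --use_gpu=True",
]
summarize_results(lines)
# → (1, 1)
```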
Sources: test_tipc/docs/benchmark_train.md6-31 test_tipc/test_train_inference_python.sh95-97
The following are representative throughput values measured on a single NVIDIA V100 (16 GB) GPU (from test_tipc/docs/benchmark_train.md):
| Model | fp32 fps (full dataset) | fp16 fps (full dataset) |
|---|---|---|
| ch_ppocr_mobile_v2.0_det | 53.836 | 45.574 |
| ch_ppocr_mobile_v2.0_rec | 2083.311 | 2153.261 |
| ch_ppocr_server_v2.0_det | 20.716 | 20.592 |
| ch_ppocr_server_v2.0_rec | 528.56 | 1189.788 |
| det_mv3_db_v2.0 | 61.802 | 82.947 |
| det_r50_vd_db_v2.0 | 29.955 | 51.097 |
| det_r50_vd_east_v2.0 | 42.485 | 67.610 |
| ch_PP-OCRv2_det | 13.87 | 17.847 |
| layoutxlm_ser | 18.001 | 21.982 |
| PP-Structure-table | 14.151 | 16.285 |
| ch_PP-OCRv3_det | 8.622 | 14.203 |