⚠️ CRITICAL DEPRECATION NOTICE: PP-StructureV2 is the deprecated second-generation document structure analysis pipeline in PaddleOCR. This system is no longer maintained and will be removed in a future release. This documentation is preserved for reference only.
Users must migrate to:
- PP-StructureV3 (2.3) for document parsing
- PP-ChatOCRv4 (2.4) for KIE tasks
PP-StructureV2 provides basic layout analysis, table recognition, OCR, formula recognition, and KIE capabilities through the StructureSystem class ppstructure/predict_system.py44-272. However, it has limitations that PP-StructureV3 resolves.
Related Documentation:
Sources: ppstructure/README.md1-17 ppstructure/predict_system.py44-272
Before diving into PP-StructureV2 details, understand why migration to PP-StructureV3 is essential:
| Aspect | PP-StructureV2 | PP-StructureV3 | Impact |
|---|---|---|---|
| Layout Categories | 5-10 categories (PubLayNet/CDLA) | 20 categories (PP-DocLayout) | V3 handles more document types |
| Table Recognition | 3-model pipeline (DB+CRNN+SLANet) | SLANeXt with wired/wireless classification | 10-15% accuracy improvement |
| Chart Support | None | PP-Chart2Table module | V3 extracts data from charts |
| Document Preprocessing | Manual orientation correction only | Built-in orientation + unwarping | Better handling of scanned docs |
| Reading Order | Basic top-to-bottom sort | Advanced multi-column detection | More accurate document recovery |
| API Integration | Standalone scripts | PaddleOCR 3.x wheel package | Easier installation and usage |
| Module Coordination | Manual StructureSystem configuration | Automatic pipeline management | Simplified development |
| Maintenance Status | Deprecated | Active development | V3 receives updates and bug fixes |
Architecture Comparison:
Migration Urgency: PP-StructureV2 code in ppstructure/ directory will be completely removed in a future PaddleOCR release. All production systems must migrate to V3.
Sources: ppstructure/README.md1-17 README.md28-76
PP-StructureV2 implements a modular pipeline architecture in which documents pass through sequential stages: layout detection, specialized recognition (OCR, tables, formulas), and optional post-processing for layout recovery. This architecture is replaced by PP-StructureV3's integrated pipeline (2.3).
Key Code Entities:
- StructureSystem ppstructure/predict_system.py44-272: Main orchestrator class
- LayoutPredictor: Detects document regions (text, table, figure, title, etc.)
- TextSystem: OCR detection + recognition from tools/infer/predict_system.py
- TableSystem ppstructure/table/predict_table.py: Three-stage table recognition
- TextRecognizer: Formula recognition with LaTeX output

Sources: ppstructure/predict_system.py44-272 ppstructure/utility.py28-156
The StructureSystem class is the central orchestrator that coordinates all recognition modules based on layout analysis results.
Initialization Logic:
- Pipeline mode: structure (document parsing) or kie (key information extraction)

Key Configuration Parameters ppstructure/utility.py28-156:
| Parameter | Type | Default | Description |
|---|---|---|---|
| mode | str | "structure" | Pipeline mode: structure or kie |
| layout | bool | True | Enable layout analysis |
| table | bool | True | Enable table recognition |
| ocr | bool | True | Enable OCR for text regions |
| formula | bool | False | Enable formula recognition |
| image_orientation | bool | False | Enable orientation correction |
| recovery | bool | False | Enable layout recovery to DOCX |
Sources: ppstructure/predict_system.py44-96 ppstructure/utility.py28-156
Key Optimizations ppstructure/predict_system.py137-148:
- The _filter_text_res method checks bbox overlap using _has_intersection

Output Structure:
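As a rough illustration of the result shape, the sketch below writes out a small result list by hand. The key names (type, bbox, img, res, img_idx) and the nested text fields are assumptions based on reading ppstructure/predict_system.py and should be checked against the installed version; all values are invented placeholders.

```python
# Hypothetical StructureSystem result list (all values are made up).
# Key names are assumptions; verify them against ppstructure/predict_system.py.
result_list = [
    {
        "type": "text",              # lower-cased layout label
        "bbox": [20, 30, 400, 120],  # [x1, y1, x2, y2] in page pixels
        "img": None,                 # cropped region image (an ndarray in practice)
        "res": [                     # OCR lines that fall inside the region
            {
                "text": "Hello",
                "confidence": 0.98,
                "text_region": [[22, 32], [90, 32], [90, 50], [22, 50]],
            },
        ],
        "img_idx": 0,                # page index
    },
    {
        "type": "table",
        "bbox": [20, 150, 400, 500],
        "img": None,
        "res": {"html": "<html><body><table>...</table></body></html>"},
        "img_idx": 0,
    },
]


def region_types(results):
    """Return the region labels in the order they appear."""
    return [region["type"] for region in results]


print(region_types(result_list))  # ['text', 'table']
```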
Sources: ppstructure/predict_system.py98-202 ppstructure/predict_system.py204-271
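The overlap check named under Key Optimizations can be sketched as follows. This is a minimal illustration, assuming axis-aligned region boxes in [x_min, y_min, x_max, y_max] form and OCR lines that carry a four-point text_region; the function names echo _has_intersection and _filter_text_res, but the bodies are written for this page and are not copied from the repository.

```python
def has_intersection(rect1, rect2):
    """Return True if two [x_min, y_min, x_max, y_max] boxes overlap or touch."""
    x_min1, y_min1, x_max1, y_max1 = rect1
    x_min2, y_min2, x_max2, y_max2 = rect2
    if x_min1 > x_max2 or x_max1 < x_min2:
        return False  # separated horizontally
    if y_min1 > y_max2 or y_max1 < y_min2:
        return False  # separated vertically
    return True


def filter_text_res(text_res, region_bbox):
    """Keep the OCR lines whose bounding box intersects the layout region."""
    kept = []
    for line in text_res:
        xs = [point[0] for point in line["text_region"]]
        ys = [point[1] for point in line["text_region"]]
        line_box = [min(xs), min(ys), max(xs), max(ys)]
        if has_intersection(line_box, region_bbox):
            kept.append(line)
    return kept


# Two OCR lines; only the first one overlaps the region [5, 5, 50, 50].
ocr_lines = [
    {"text": "a", "text_region": [[0, 0], [10, 0], [10, 10], [0, 10]]},
    {"text": "b", "text_region": [[100, 100], [110, 100], [110, 110], [100, 110]]},
]
inside = filter_text_res(ocr_lines, [5, 5, 50, 50])
print([line["text"] for line in inside])  # ['a']
```

Running OCR once on the full page and then assigning lines to regions this way avoids a separate OCR pass per region.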
PP-StructureV2's table recognition uses a three-model pipeline that combines text detection, text recognition, and structure prediction.
Model Responsibilities:
| Model | Input | Output | Purpose |
|---|---|---|---|
| DB (Detection) | Table image | Text box coordinates | Locate individual text lines |
| CRNN (Recognition) | Cropped text boxes | Text string + confidence | Read text content |
| SLANet (Structure) | Table image | HTML structure + cell coords | Predict table layout, cell spans |
Key Implementation:
Sources: ppstructure/table/README.md15-28 ppstructure/table/README_ch.md15-31
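To make the division of labor between the three models concrete, here is a simplified sketch of the final merge step: text boxes (from detection), strings (from recognition), and cell boxes plus structure tokens (from the structure model) are combined into HTML. The helper names iou, match_text_to_cells, and fill_html are hypothetical, the IoU-based matching is a stand-in for the repository's matching logic, and the token list is simplified (real structure tokens also encode spans).

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_text_to_cells(cell_boxes, text_boxes, texts):
    """Assign each recognized text line to the cell it overlaps most."""
    cell_text = {i: [] for i in range(len(cell_boxes))}
    if not cell_boxes:
        return cell_text
    for box, text in zip(text_boxes, texts):
        scores = [iou(box, cell) for cell in cell_boxes]
        best = max(range(len(cell_boxes)), key=lambda i: scores[i])
        if scores[best] > 0:  # ignore text that touches no cell
            cell_text[best].append(text)
    return cell_text


def fill_html(structure_tokens, cell_text):
    """Insert the matched text after each "<td>" token, in cell order."""
    html, cell_idx = [], 0
    for token in structure_tokens:
        html.append(token)
        if token == "<td>":
            html.append(" ".join(cell_text.get(cell_idx, [])))
            cell_idx += 1
    return "".join(html)


cells = [[0, 0, 50, 20], [50, 0, 100, 20]]         # structure model output
text_boxes = [[2, 2, 40, 18], [55, 3, 95, 17]]     # detection output
texts = ["Name", "Score"]                          # recognition output
tokens = ["<table>", "<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>", "</table>"]

table_html = fill_html(tokens, match_text_to_cells(cells, text_boxes, texts))
print(table_html)  # <table><tr><td>Name</td><td>Score</td></tr></table>
```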
Performance on PubTabNet evaluation dataset:
| Method | Acc | TEDS | Speed (CPU) |
|---|---|---|---|
| EDD | - | 88.30% | - |
| TableRec-RARE | 71.73% | 93.88% | 779ms |
| SLANet | 76.31% | 95.89% | 766ms |
Metrics:
Sources: ppstructure/table/README.md30-42 ppstructure/table/README_ch.md34-48
TEDS evaluation compares predicted HTML tables against ground truth:
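For reference, TEDS (Tree-Edit-Distance-based Similarity), as commonly defined for the PubTabNet benchmark, treats each HTML table as a tree and normalizes the tree edit distance by the size of the larger tree. The symbols T_a (prediction) and T_b (ground truth) are notation chosen here, not taken from the source:

```latex
\mathrm{TEDS}(T_a, T_b) = 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\left(|T_a|, |T_b|\right)}
```

A score of 1 (reported as 100%) means the predicted and ground-truth trees are identical in both structure and cell content.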
Ground Truth Format ppstructure/table/README.md101-104:
PMC5755158_010_01.png <html><body><table><thead><tr><td></td>...
Evaluation Command ppstructure/table/README.md113-123:
Output:
teds: 95.89
Sources: ppstructure/table/README.md98-155 ppstructure/table/README_ch.md101-159
Layout analysis identifies and classifies document regions before applying specialized recognition.
PP-StructureV2 supports three layout analysis models based on PicoDet architecture:
| Model | Dataset | Supported Classes | Use Case |
|---|---|---|---|
| English Layout | PubLayNet | Text, Title, List, Table, Figure | English documents |
| Chinese Layout | CDLA | Text, Title, Figure, Figure caption, Table, Table caption, Header, Footer, Reference, Equation | Chinese academic papers |
| Table Layout | TableBank | Table | Table-only detection |
LayoutPredictor Interface ppstructure/layout/predict_layout.py:
- Each detected region is returned as {bbox, label, score}

Configuration ppstructure/utility.py53-64:
Sources: ppstructure/layout/README.md24-26 ppstructure/utility.py53-64
Model Details ppstructure/layout/README.md160-174:
Sources: ppstructure/layout/README.md24-33 ppstructure/layout/README_ch.md23-29
Formula recognition extracts mathematical equations and converts them to LaTeX format.
Integration ppstructure/predict_system.py83-89:
Processing ppstructure/predict_system.py171-174:
- Example result: {"latex": "\\frac{1}{2}"}

Configuration ppstructure/utility.py44-51:
Sources: ppstructure/predict_system.py83-89 ppstructure/predict_system.py171-174 ppstructure/utility.py44-51
⚠️ DEPRECATED: KIE functionality in PP-StructureV2 is deprecated. Use PP-ChatOCRv4 instead (2.4).
When mode="kie", StructureSystem uses SerRePredictor for Semantic Entity Recognition (SER) and Relation Extraction (RE):
Configuration ppstructure/utility.py66-72:
Inference ppstructure/predict_system.py196-200:
Sources: ppstructure/predict_system.py91-94 ppstructure/predict_system.py196-200 ppstructure/utility.py66-72
Layout recovery converts document analysis results into editable formats (DOCX, Markdown) while preserving layout.
PP-StructureV2 provides two layout recovery approaches:
| Method | Input | Advantages | Disadvantages |
|---|---|---|---|
| Standard PDF Parse | Standard PDF only | Better for non-paper docs, preserves pagination | Some Chinese/English garbled, table formatting issues |
| Image Format PDF Parse | PDF or images | Better OCR for papers, handles 111 languages | Layout depends on analysis accuracy, spacing/fonts need improvement |
Method Selection ppstructure/predict_system.py320-330:
Sources: ppstructure/recovery/README.md22-33 ppstructure/recovery/README_ch.md27-31
Key Functions:
- sorted_layout_boxes ppstructure/recovery/recovery_to_doc.py87-155
  - sets the layout field on each region
- convert_info_docx ppstructure/recovery/recovery_to_doc.py32-84
  - builds a python-docx Document
- HtmlToDocx ppstructure/recovery/table_process.py142-325
- convert_info_markdown ppstructure/recovery/recovery_to_markdown.py129-187

Sources: ppstructure/recovery/recovery_to_doc.py32-155 ppstructure/recovery/table_process.py142-325 ppstructure/recovery/recovery_to_markdown.py129-187
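The reading-order step performed by sorted_layout_boxes can be approximated as below. This is a simplified sketch, not the repository's implementation: the function name sort_layout_boxes is hypothetical, the column thresholds (one quarter, two fifths, and three fifths of the page width) are illustrative assumptions, and regions are assumed to be dicts with a bbox of [x1, y1, x2, y2].

```python
def sort_layout_boxes(regions, page_width):
    """Order regions for reading and tag each as single- or double-column."""
    ordered = sorted(regions, key=lambda r: (r["bbox"][1], r["bbox"][0]))
    result, left, right = [], [], []
    for region in ordered:
        x1, _, x2, _ = region["bbox"]
        if x1 < page_width / 4 and x2 < 3 * page_width / 5:
            region["layout"] = "double"  # left column
            left.append(region)
        elif x1 > 2 * page_width / 5:
            region["layout"] = "double"  # right column
            right.append(region)
        else:
            # A full-width region closes the current two-column block:
            # emit the left column, then the right column, then the region.
            result += left + right
            left, right = [], []
            region["layout"] = "single"
            result.append(region)
    return result + left + right


page = [
    {"name": "right1", "bbox": [550, 100, 950, 300]},
    {"name": "left2", "bbox": [50, 320, 450, 500]},
    {"name": "title", "bbox": [100, 10, 900, 50]},
    {"name": "left1", "bbox": [50, 100, 450, 300]},
]
sorted_page = sort_layout_boxes(page, page_width=1000)
print([r["name"] for r in sorted_page])  # ['title', 'left1', 'left2', 'right1']
```

Reading the whole left column before the right column is what keeps two-column papers from being interleaved line by line in the recovered document.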
Enable Recovery ppstructure/utility.py114-130:
Dependencies ppstructure/recovery/requirements.txt1-5:
- python-docx: DOCX file creation
- beautifulsoup4: HTML parsing
- fonttools>=4.43.0: Font handling
- fire>=0.3.0: Command-line interface

Execution Flow ppstructure/predict_system.py371-395:
Sources: ppstructure/predict_system.py371-395 ppstructure/utility.py114-130 ppstructure/recovery/requirements.txt1-5
Core Pipeline Arguments ppstructure/utility.py28-156:
Sources: ppstructure/utility.py28-156
Basic Structure Analysis:
Table Recognition Only ppstructure/table/README.md73-81:
Layout Recovery to DOCX ppstructure/recovery/README_ch.md183-195:
Sources: ppstructure/table/README.md73-81 ppstructure/recovery/README_ch.md183-195
Result Saving ppstructure/predict_system.py274-302:
Directory Structure:
output/
└── structure/
└── image_name/
├── res_0.txt # JSON results for page 0
├── show_0.jpg # Visualization with bounding boxes
├── [20,30,400,500]_0.xlsx # Table at bbox [20,30,400,500]
├── [50,600,350,800]_0.jpg # Figure at bbox [50,600,350,800]
└── image_name_ocr.docx # Layout recovery (if enabled)
Sources: ppstructure/predict_system.py274-302 ppstructure/predict_system.py344-370
draw_structure_result ppstructure/utility.py159-240 generates annotated images:
Color Assignment:
- Per-label colors are kept in the catid2color dict

Word-Level Boxes ppstructure/utility.py243-298:
- cal_ocr_word_box(rec_str, box, rec_word_info)

Sources: ppstructure/utility.py159-298
Performance Tracking ppstructure/predict_system.py99-109:
Each module accumulates timing information, logged at the end ppstructure/predict_system.py396:
Sources: ppstructure/predict_system.py99-109 ppstructure/predict_system.py396
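The per-module timing pattern described above can be sketched as follows. This is an illustration under assumptions: run_stages and the stage names are hypothetical, and the real time_dict in the source has its own fixed set of keys.

```python
import time


def run_stages(stages, data):
    """Run named stages in order and accumulate per-stage elapsed seconds."""
    time_dict = {name: 0.0 for name, _ in stages}
    time_dict["all"] = 0.0
    start_all = time.perf_counter()
    for name, stage in stages:
        start = time.perf_counter()
        data = stage(data)
        time_dict[name] += time.perf_counter() - start
    time_dict["all"] = time.perf_counter() - start_all
    return data, time_dict


# Stand-in stages that only record that they ran.
stages = [
    ("layout", lambda d: d + ["layout"]),
    ("table", lambda d: d + ["table"]),
]
output, time_dict = run_stages(stages, [])
print(output)             # ['layout', 'table']
print(sorted(time_dict))  # ['all', 'layout', 'table']
```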
⚠️ PP-StructureV2 is unmaintained. All production deployments must migrate to PP-StructureV3 to receive:
Deprecation Timeline ppstructure/README.md4:
"This directory will be removed at an appropriate time in the future, and maintenance for the second generation will be discontinued."
The ppstructure/ directory containing V2 code will be completely deleted in a future PaddleOCR release (expected Q2 2025).
Sources: ppstructure/README.md1-16 README.md28-76
Old Code (PP-StructureV2):
New Code (PP-StructureV3):
Key Differences:
- Results are saved through built-in methods (save_to_markdown, save_to_json)

Sources: ppstructure/predict_system.py44-96 README.md156-173
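The two call patterns can be sketched side by side as below. This is a hedged outline rather than verified migration code: run_v2 and run_v3 are hypothetical wrapper names, the imports are deferred so the snippet loads without either package installed, and the (result_list, time_dict) return shape and the predict(input=...) call are assumptions based on this page and the public PaddleOCR 3.x examples.

```python
def run_v2(image, args):
    """PP-StructureV2 style: build StructureSystem from parsed CLI args."""
    # Deferred import: needs the PaddleOCR source tree that contains ppstructure/.
    from ppstructure.predict_system import StructureSystem

    system = StructureSystem(args)
    result_list, time_dict = system(image)  # assumed return shape
    return result_list, time_dict


def run_v3(input_path, save_path="output"):
    """PP-StructureV3 style: a pipeline object plus result save methods."""
    # Deferred import: needs paddleocr 3.x to be installed.
    from paddleocr import PPStructureV3

    pipeline = PPStructureV3()
    for res in pipeline.predict(input=input_path):
        res.save_to_json(save_path=save_path)
        res.save_to_markdown(save_path=save_path)
```

In the V2 pattern the caller owns result serialization; in the V3 pattern each result object knows how to save itself.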
Old Code (PP-StructureV2 KIE):
New Code (PP-ChatOCRv4):
Why PP-ChatOCRv4 is Better:
Sources: ppstructure/README.md8-9 README.md36-38
Old Code:
New Code:
V3 Table Improvements:
Sources: ppstructure/table/README.md15-48 README.md171-172
Complete mapping of V2 arguments to V3 parameters:
| PP-StructureV2 Argument | Type | PP-StructureV3 Parameter | Notes |
|---|---|---|---|
| --mode='structure' | str | Default behavior | No parameter needed |
| --mode='kie' | str | Use PPChatOCRv4Doc instead | Separate class |
| --layout=True | bool | use_region_detection=True | Default True in V3 |
| --table=True | bool | use_table_recognition=True | Default True in V3 |
| --ocr=True | bool | Always enabled | Cannot disable in V3 |
| --formula=True | bool | use_formula_recognition=True | Uses UniMERNet in V3 |
| --image_orientation=True | bool | use_doc_orientation_classify=True | Better accuracy in V3 |
| --recovery=True | bool | Auto-enabled | Use save_to_markdown() |
| --det_model_dir | path | text_detection_model_name="PP-OCRv5_server_det" | Model name not path |
| --rec_model_dir | path | text_recognition_model_name="PP-OCRv5_server_rec" | Model name not path |
| --table_model_dir | path | table_model_name="SLANeXt" | Model name not path |
| --layout_model_dir | path | layout_model_name="PP-DocLayoutV3" | Model name not path |
| --layout_dict_path | path | Auto-configured | No manual dict needed |
| --use_pdf2docx_api | bool | Removed | Use OCR-based markdown |
| --recovery_to_markdown | bool | res.save_to_markdown() | Built-in method |
Model Name Examples in V3:
- Text detection: "PP-OCRv5_server_det", "PP-OCRv5_mobile_det"
- Text recognition: "PP-OCRv5_server_rec", "PP-OCRv5_mobile_rec"
- Layout: "PP-DocLayoutV3", "PP-DocLayoutV2"
- Table: "SLANeXt", "SLANet"
- Formula: "UniMERNet", "PP-FormulaNet"

Sources: ppstructure/utility.py28-156
Pre-Migration Steps:
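- Audit the codebase for ppstructure imports
- Install PaddleOCR 3.x: pip install "paddleocr[all]>=3.0"

Migration Steps:
- Replace from ppstructure.predict_system import StructureSystem with from paddleocr import PPStructureV3
- Update model configuration (model names instead of _model_dir args)
- Replace args.mode="kie" code with the PPChatOCRv4Doc class
- Update result handling (.print(), .save_to_*() methods)
- Enable new features as needed: use_chart_recognition, use_doc_unwarping

Post-Migration Validation:
- Confirm that no ppstructure/ directory references remain

Sources: ppstructure/README.md1-17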
API Breaking Changes:
- V2 in ppstructure/ directory → V3 in paddleocr wheel package
- StructureSystem → PPStructureV3
- (result_list, time_dict) → V3 returns PipelineResult objects
- mode="kie" → V3 requires separate PPChatOCRv4Doc class

Known Migration Issues:
| Issue | V2 Behavior | V3 Solution |
|---|---|---|
| Custom trained models | Load via --xxx_model_dir | Not directly supported; use model conversion |
| PDF parsing with pdf2docx | --use_pdf2docx_api=True | Removed; use OCR-based markdown export |
| Time profiling dict | Returns detailed time_dict | Enable show_log=True for timing |
| Region filtering | Manual _filter_text_res | Auto-handled by pipeline |
| Custom preprocessing | Add before StructureSystem | Use use_doc_preprocessor=True |
Compatibility Notes:
- V2 model files (.pdparams, .pdmodel) are not compatible with the V3 pipeline
- The save_structure_res JSON format differs from V3's save_to_json format

References:
Sources: ppstructure/README.md1-17 ppstructure/predict_system.py44-96
PP-StructureV2 supports multi-process inference for batch processing:
Multi-Process Launch ppstructure/predict_system.py401-415:
Image Distribution ppstructure/predict_system.py307:
Each process handles every Nth image, where N is total_process_num.
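The every-Nth-image distribution amounts to a strided slice. A minimal sketch, assuming each worker knows its own index (called process_id here) and the total worker count; shard_images is a hypothetical helper name:

```python
def shard_images(image_files, process_id, total_process_num):
    """Return the files for one worker: every Nth file starting at process_id."""
    return image_files[process_id::total_process_num]


files = [f"img_{i}.jpg" for i in range(7)]
for pid in range(3):
    print(pid, shard_images(files, pid, 3))
# 0 ['img_0.jpg', 'img_3.jpg', 'img_6.jpg']
# 1 ['img_1.jpg', 'img_4.jpg']
# 2 ['img_2.jpg', 'img_5.jpg']
```

Striding keeps the shards disjoint and covers every file exactly once, without any coordination between processes.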
Usage:
Sources: ppstructure/predict_system.py307 ppstructure/predict_system.py401-415
PP-StructureV2 provides a comprehensive document understanding system with:
- StructureSystem orchestrates layout, OCR, table, and formula recognition

⚠️ Migration Recommended: Users should transition to PP-StructureV3 (2.3) for improved accuracy and features, and PP-ChatOCRv4 (2.4) for KIE tasks.
Key Code Entry Points:
Sources: ppstructure/predict_system.py44-415 ppstructure/README.md1-17