Community and Contributing

This page covers how to engage with the PaddleOCR community, how to report issues and contribute code or documentation, and where to find official resources. It describes the governance structure, communication channels, and the contribution workflow as defined in the repository. For information on installation and environment setup, see page 1.2. For version migration notes, see page 1.3.

Project Governance

PaddleOCR is maintained by the PaddleOCR PMC (Project Management Committee) and released under the Apache 2.0 license. The repository lives at PaddlePaddle/PaddleOCR on GitHub.

The documentation site is configured in mkdocs.yml and built using MkDocs Material. Documentation sources reside in the docs/ directory, and edits target the main branch through the setting edit_uri: edit/main/docs/ (mkdocs.yml:12).

Community contribution guidelines are documented in two files referenced in the navigation (mkdocs.yml:390-392):

File	Purpose
`community/community_contribution.md`	Community contribution guide (PR process, code standards)
`community/code_and_doc.md`	Code and documentation appendix (style rules, checklist)

Sources: mkdocs.yml:1-12, mkdocs.yml:390-392

Community Channels

The following channels are listed in the official documentation and README:

Channel	Purpose	Reference
GitHub Issues	Bug reports, feature requests, usage questions	`github.com/PaddlePaddle/PaddleOCR/issues`
GitHub Discussions	General Q&A, RFC proposals	`github.com/PaddlePaddle/PaddleOCR/discussions`
PaddlePaddle Community	Developer network, broader ecosystem	`github.com/PaddlePaddle/community`
AI Studio	OCR courses, workshops, competitions, hackathons	`aistudio.baidu.com`
PaddleOCR Official Website	Online demos, free API, MCP service	`www.paddleocr.com`
Twitter / X	Announcements, news	`@PaddlePaddle`
WeChat Public Account	Chinese-language announcements	Official WeChat account

Sources: docs/index.en.md:85-91, README.md:39-49

The best practice project showcase (open to submissions) is hosted on AI Studio and is referenced from the index page (docs/index.en.md:86-87).

Repository Structure Relevant to Contributors

Diagram: Repository Layout for Contributors

Sources: README.md:6, mkdocs.yml:390-392, mkdocs.yml:292-393

Localization

The README is available in 9 languages. The English version is README.md at the repository root; each translation is a separate file under readme/:

Language	File
English	`README.md` (root)
Simplified Chinese	`readme/README_cn.md`
Traditional Chinese	`readme/README_tcn.md`
Japanese	`readme/README_ja.md`
Korean	`readme/README_ko.md`
French	`readme/README_fr.md`
Russian	`readme/README_ru.md`
Spanish	`readme/README_es.md`
Arabic	`readme/README_ar.md`

The documentation site supports Chinese and English via the mkdocs-material i18n plugin, configured in mkdocs.yml (mkdocs.yml:85-218). The zh locale (Simplified Chinese) is the default; the en locale serves the English site at /en/.

Sources: README.md:6, mkdocs.yml:85-191

Contribution Workflow

Diagram: Contribution Flow from Contributor to Merged Code

Sources: mkdocs.yml:12, README.md:1-50

Key Directories for Code Contributions

Area	Directory	Notes
Python package	`paddleocr/`	Core API, pipeline classes
Training configs	`configs/`	YAML model configuration files
C++ inference	`deploy/cpp_infer/`	CMake-based C++ code
Deployment	`deploy/`	Serving, mobile, ONNX
Tests / TIPC	`test_tipc/`	Training-inference-prediction consistency tests
Documentation	`docs/`	MkDocs markdown source

Sources: mkdocs.yml:292-393

Documentation Contribution

Documentation source files follow the naming convention <name>.md (Chinese, default) and <name>.en.md (English). The MkDocs i18n plugin picks them up automatically based on the suffix structure (mkdocs.yml:86-87).

Navigation is defined entirely in mkdocs.yml (mkdocs.yml:292-393). Adding a new page requires both a new .md file under docs/ and a corresponding entry in the nav: block.
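
The suffix convention above lends itself to a quick local check before opening a documentation PR. The following is a minimal sketch, not part of the repository: the helper names `split_locale` and `missing_english_pages` are hypothetical, and it assumes that every `<name>.md` page is expected to have a `<name>.en.md` sibling in the same directory. It does not inspect the nav: block.

```python
from pathlib import Path


def split_locale(path: Path) -> tuple[str, str]:
    """Return (base name, locale), e.g. 'faq.en.md' -> ('faq', 'en')."""
    stem = path.name[: -len(".md")]
    if stem.endswith(".en"):
        return stem[: -len(".en")], "en"
    # No locale suffix: treated as the default (Chinese) page.
    return stem, "zh"


def missing_english_pages(docs_dir: str) -> list[str]:
    """List pages under docs_dir that have no <name>.en.md next to them."""
    root = Path(docs_dir)
    locales_by_page: dict[str, set[str]] = {}
    for path in root.rglob("*.md"):
        base, locale = split_locale(path)
        key = (path.parent / base).relative_to(root).as_posix()
        locales_by_page.setdefault(key, set()).add(locale)
    return sorted(
        key + ".md"
        for key, locales in locales_by_page.items()
        if "en" not in locales
    )
```

Calling `missing_english_pages("docs")` from the repository root returns the relative paths of pages that still lack an English counterpart.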

Issue Reporting

When opening a GitHub Issue, it is useful to include:

- PaddleOCR version (`pip show paddleocr`)
- PaddlePaddle version (`python -c "import paddle; print(paddle.__version__)"`)
- Operating system and hardware (CPU/GPU model)
- Minimal reproduction script and error traceback
- Which pipeline is affected (e.g., PP-OCRv5, PP-StructureV3, PP-ChatOCRv4)
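
The version and platform details in the list above can be gathered in one step and pasted into an issue. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes PaddlePaddle was installed from PyPI under the distribution name `paddlepaddle` or `paddlepaddle-gpu`. It does not detect the GPU model, which still needs to be added by hand.

```python
import platform
import sys
from importlib import metadata


def installed_version(*names: str) -> str:
    """Return the version of the first installed distribution among names."""
    for name in names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return "not installed"


def environment_report() -> dict[str, str]:
    """Collect the version and platform details useful in a bug report."""
    return {
        "paddleocr": installed_version("paddleocr"),
        # Assumption: CPU and GPU builds use these two distribution names.
        "paddlepaddle": installed_version("paddlepaddle", "paddlepaddle-gpu"),
        "python": sys.version.split()[0],
        "os": f"{platform.system()} {platform.release()}",
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Reading versions from package metadata avoids importing `paddle` itself, so the report can still be produced when the import is the thing that fails.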

The Python package exposes the following top-level pipeline classes, which are the typical entry points referenced in bug reports:

Class	Import	Pipeline
`PaddleOCR`	`from paddleocr import PaddleOCR`	General OCR (PP-OCRv5)
`PPStructureV3`	`from paddleocr import PPStructureV3`	Document parsing
`TextDetection`	`from paddleocr import TextDetection`	Text detection module
`TextRecognition`	`from paddleocr import TextRecognition`	Text recognition module
`DocPreprocessor`	`from paddleocr import DocPreprocessor`	Document preprocessing
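
A reproduction script for an issue can be very small. The sketch below shows one possible shape for the general OCR pipeline, assuming a PaddleOCR 3.x install in which `PaddleOCR()` can be constructed with default arguments and `predict()` yields result objects with a `print()` method. The wrapper name `run_minimal_repro` and the image path are placeholders, not part of the library.

```python
def run_minimal_repro(image_path: str) -> None:
    """Run the general OCR pipeline on one image and print each result."""
    # Imported inside the function so that an import failure shows up
    # in the traceback of the call being reported.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR()  # default models; list any non-default arguments in the issue
    for res in ocr.predict(image_path):
        res.print()


# Example call (the path is a placeholder):
# run_minimal_repro("sample.png")
```

Attaching the input image, or a public link to it, alongside the script and the full traceback makes the report easier to reproduce.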

Sources: docs/quick_start.en.md:66-172

Third-Party Integrations

PaddleOCR is embedded in or used by several major open-source projects (README.md:49):

Project	How it uses PaddleOCR
MinerU	PDF and document text extraction
RAGFlow	Document ingestion for RAG pipelines
Umi-OCR	Desktop OCR application
OmniParser	UI screenshot parsing
cherry-studio	AI assistant document handling
pathway	Real-time data processing pipelines

If you are integrating PaddleOCR into a downstream project, the recommended entry point is the paddleocr PyPI package. The MCP server (see page 3.4) exposes pipelines to LLM agent applications.

Sources: README.md:49, docs/index.en.md:16

Model and Demo Platforms

Beyond GitHub, PaddleOCR models and demos are hosted on:

Platform	Content
HuggingFace	`PaddlePaddle/PaddleOCR-VL` model weights and online demo spaces
ModelScope	PaddleOCR-VL demo application
AI Studio	PP-OCRv5, PP-StructureV3, PP-ChatOCRv4 web demo apps
PyPI	`paddleocr` package (install via `pip install paddleocr`)

Since version 3.0.2, model downloads default to HuggingFace. To download from Baidu Object Storage (BOS) instead, set the environment variable PADDLE_PDX_MODEL_SOURCE=BOS (README.md:193).
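
The variable can also be set from within a Python script rather than in the shell. The sketch below assumes that the variable has to be in the environment before `paddleocr` is imported, which the source does not state; the helper `current_model_source` is hypothetical and only echoes the value for logging.

```python
import os

# Assumption: set the variable before paddleocr is imported so that it is
# visible when the library decides where to download models from.
os.environ["PADDLE_PDX_MODEL_SOURCE"] = "BOS"


def current_model_source() -> str:
    """Return the configured download source, or a marker for the default."""
    return os.environ.get("PADDLE_PDX_MODEL_SOURCE", "(library default)")


# from paddleocr import PaddleOCR  # import only after the variable is set
```

Setting it in the script keeps the choice of download source next to the code that depends on it, at the cost of overriding whatever the shell already exported.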

Sources: README.md:54-59, README.md:191-195

