This page covers the paddleocr-mcp package, which exposes PaddleOCR pipeline capabilities as Model Context Protocol (MCP) tools for integration with LLM agent applications such as Claude for Desktop and VS Code Copilot. It is a separate, independently installable package located in mcp_server/.
For information about the underlying PaddleOCR Python API used in local mode, see 3.3 Python API Usage. For service deployment that the self-hosted mode connects to, see 5.4 Service Deployment.
The paddleocr-mcp package (mcp_server/) is a lightweight server built on FastMCP v2. It bridges PaddleOCR inference capabilities and LLM agent frameworks by registering PaddleOCR pipelines as callable MCP tools.
The package's entry point is paddleocr_mcp.__main__:main (see mcp_server/pyproject.toml28-29) and the CLI command is paddleocr_mcp.
Exposed MCP tools by pipeline:
| Pipeline Name | MCP Tool | Output |
|---|---|---|
| OCR | ocr | Plain text or JSON with bounding boxes |
| PP-StructureV3 | pp_structure_v3 | Markdown document |
| PaddleOCR-VL | paddleocr_vl | Markdown document |
| PaddleOCR-VL-1.5 | paddleocr_vl (v1.5) | Markdown document |
Sources: mcp_server/pyproject.toml mcp_server/paddleocr_mcp/__main__.py mcp_server/paddleocr_mcp/pipelines.py docs/version3.x/deployment/mcp_server.en.md
Diagram: MCP Server Component Architecture
Sources: mcp_server/paddleocr_mcp/__main__.py mcp_server/paddleocr_mcp/pipelines.py mcp_server/pyproject.toml
Diagram: Pipeline Handler Class Hierarchy
Sources: mcp_server/paddleocr_mcp/pipelines.py130-340 mcp_server/paddleocr_mcp/pipelines.py524-663 mcp_server/paddleocr_mcp/pipelines.py665-900
Diagram: Request Processing Flow
Sources: mcp_server/paddleocr_mcp/pipelines.py344-398 mcp_server/paddleocr_mcp/pipelines.py89-128
The server supports four ppocr_source values, each routing inference differently:
| Mode | ppocr_source value | Description | Auth Required |
|---|---|---|---|
| Local Python Library | local | Runs PaddleOCR in-process | None |
| PaddleOCR Official Service | aistudio | Calls Baidu AI Studio hosted API | PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN |
| Qianfan Platform | qianfan | Calls Baidu Qianfan cloud API | PADDLEOCR_MCP_QIANFAN_API_KEY |
| Self-hosted Service | self_hosted | Calls user-deployed PaddleOCR HTTP server | None (URL required) |
In local mode, PipelineHandler.__init__ calls _create_local_engine() at construction time, which instantiates either PaddleOCR, PPStructureV3, or PaddleOCRVL from the paddleocr package. The engine is then wrapped in _EngineWrapper, which runs inference in a dedicated background thread to avoid blocking the asyncio event loop.
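The dedicated-thread pattern described above can be sketched with the standard library alone. This is a minimal illustration of the idea, not the actual `_EngineWrapper` implementation:

```python
# Sketch of running a blocking engine on one dedicated thread so that
# awaiting callers never block the asyncio event loop. Illustrative only.
import asyncio
import queue
import threading

class EngineWrapper:
    def __init__(self, engine):
        self._engine = engine
        self._jobs: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def _worker(self):
        # All inference runs on this one thread, serially.
        while True:
            args, fut, loop = self._jobs.get()
            try:
                result = self._engine.predict(*args)
                loop.call_soon_threadsafe(fut.set_result, result)
            except Exception as exc:
                loop.call_soon_threadsafe(fut.set_exception, exc)

    async def infer(self, *args):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self._jobs.put((args, fut, loop))
        return await fut
```

Serializing inference on one thread also avoids concurrent calls into an engine that may not be thread-safe.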
In service modes, _call_service() sends HTTP POST requests via httpx.AsyncClient to a base URL plus an endpoint path (e.g., /ocr or /layout-parsing).
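The shape of such a request can be sketched as follows, assuming the standard PaddleOCR serving API (a JSON body carrying the file as Base64 plus a fileType flag); the field names and endpoint mapping below are assumptions for illustration:

```python
# Sketch of building a service-mode request body. Field names ("file",
# "fileType") follow the PaddleOCR serving convention but are assumptions here.
import base64
import json

ENDPOINTS = {"OCR": "/ocr", "PP-StructureV3": "/layout-parsing"}

def build_request(base_url: str, pipeline: str, file_bytes: bytes,
                  file_type: int = 1) -> tuple[str, str]:
    """Return (url, json_body) for a POST to a serving endpoint.
    fileType 1 = image, 0 = PDF in the serving convention assumed here."""
    url = base_url.rstrip("/") + ENDPOINTS[pipeline]
    body = json.dumps({
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "fileType": file_type,
    })
    return url, body
```

The actual `_call_service()` sends this kind of body with `httpx.AsyncClient.post()` and parses the JSON response.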
Qianfan platform support is limited to the PP-StructureV3 and PaddleOCR-VL pipelines; this is enforced by _validate_args() in mcp_server/paddleocr_mcp/__main__.py140-145.
Sources: mcp_server/paddleocr_mcp/pipelines.py156-180 mcp_server/paddleocr_mcp/pipelines.py467-505 mcp_server/paddleocr_mcp/__main__.py104-145
The server supports two MCP transport protocols:
stdio (the default) and Streamable HTTP, enabled via --http (default host 127.0.0.1, default port 8000). The FastMCP instance is configured in async_main() and started with either mcp.run_async() (stdio) or mcp.run_async(transport="streamable-http", ...) (HTTP).
Sources: mcp_server/paddleocr_mcp/__main__.py184-194
Optional extras are defined in mcp_server/pyproject.toml19-26. The local extra pulls in paddleocr[doc-parser]>=3.4; local-cpu additionally adds paddlepaddle>=3.2.1.
Verify installation:
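A typical install-and-check sequence, using the extras named above (the extra you choose depends on your mode):

```shell
# Install from PyPI; the "local" extra bundles the PaddleOCR Python library
pip install "paddleocr-mcp[local]"

# Verify that the CLI entry point is on PATH
paddleocr_mcp --help
```

For service modes (aistudio, qianfan, self_hosted), `pip install paddleocr-mcp` without extras is sufficient.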
Sources: mcp_server/pyproject.toml docs/version3.x/deployment/mcp_server.en.md90-123
Locate claude_desktop_config.json:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |
Local mode:
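A minimal claude_desktop_config.json entry for local mode might look like the following; the server name ("paddleocr-ocr") is illustrative, and the official docs carry the authoritative snippet:

```json
{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "local"
      }
    }
  }
}
```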
PaddleOCR official service (aistudio) mode:
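For aistudio mode, the entry additionally needs the service base URL and an access token; placeholder values below must be replaced with your own, and the server name is illustrative:

```json
{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "aistudio",
        "PADDLEOCR_MCP_SERVER_URL": "<aistudio-service-base-url>",
        "PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN": "<your-access-token>"
      }
    }
  }
}
```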
Self-hosted service mode:
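For self_hosted mode, only the URL of your deployed PaddleOCR server is needed (no token); the URL below is an example:

```json
{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "self_hosted",
        "PADDLEOCR_MCP_SERVER_URL": "http://127.0.0.1:8080"
      }
    }
  }
}
```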
uvx (no manual install): uvx can launch the server without a prior pip install. Use --from paddleocr-mcp paddleocr_mcp or --from paddleocr-mcp[local-cpu] paddleocr_mcp as the command/args pair.
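Translated into a claude_desktop_config.json entry, the uvx invocation might look like this (server name and URL are illustrative):

```json
{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "uvx",
      "args": ["--from", "paddleocr-mcp", "paddleocr_mcp"],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "self_hosted",
        "PADDLEOCR_MCP_SERVER_URL": "http://127.0.0.1:8080"
      }
    }
  }
}
```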
Sources: docs/version3.x/deployment/mcp_server.en.md139-380
All parameters can be set as environment variables or CLI arguments. Environment variables take the form PADDLEOCR_MCP_<NAME>.
| Environment Variable | CLI Argument | Type | Default | Description |
|---|---|---|---|---|
| PADDLEOCR_MCP_PIPELINE | --pipeline | str | "OCR" | Pipeline to run. Values: "OCR", "PP-StructureV3", "PaddleOCR-VL", "PaddleOCR-VL-1.5" |
| PADDLEOCR_MCP_PPOCR_SOURCE | --ppocr_source | str | "local" | Source backend. Values: "local", "aistudio", "qianfan", "self_hosted" |
| PADDLEOCR_MCP_SERVER_URL | --server_url | str | None | Base URL (required for non-local modes) |
| PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN | --aistudio_access_token | str | None | AI Studio token (required for aistudio mode) |
| PADDLEOCR_MCP_QIANFAN_API_KEY | --qianfan_api_key | str | None | Qianfan API key (required for qianfan mode) |
| PADDLEOCR_MCP_TIMEOUT | --timeout | int | 60 | HTTP read timeout (seconds) |
| PADDLEOCR_MCP_DEVICE | --device | str | None | Inference device (local mode only) |
| PADDLEOCR_MCP_PIPELINE_CONFIG | --pipeline_config | str | None | Path to PaddleOCR pipeline YAML config (local mode only) |
| — | --http | bool | False | Use Streamable HTTP transport instead of stdio |
| — | --host | str | "127.0.0.1" | HTTP host address |
| — | --port | int | 8000 | HTTP port |
| — | --verbose | bool | False | Enable verbose logging |
Argument parsing and validation are implemented in _parse_args() and _validate_args() in mcp_server/paddleocr_mcp/__main__.py27-145.
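One plausible precedence for resolving these settings (CLI argument, then environment variable, then default) can be sketched as follows; the helper name and the exact precedence are assumptions for illustration, not taken from the real parser:

```python
# Illustrative resolution of PADDLEOCR_MCP_<NAME> settings; resolve_option
# is a hypothetical helper, and the CLI-over-env precedence is an assumption.
import os

ENV_PREFIX = "PADDLEOCR_MCP_"

def resolve_option(name: str, cli_value, default=None):
    """Return the CLI value if given, else the environment variable
    PADDLEOCR_MCP_<NAME>, else the default."""
    if cli_value is not None:
        return cli_value
    env_value = os.environ.get(ENV_PREFIX + name.upper())
    if env_value is not None:
        return env_value
    return default
```

Note that values read from the environment arrive as strings and would still need type conversion (e.g., int for --timeout).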
Sources: mcp_server/paddleocr_mcp/__main__.py27-101 docs/version3.x/deployment/mcp_server.en.md408-423
The ocr tool (registered in OCRHandler.register_tools) accepts an output_mode parameter:
| output_mode | Format | Use case |
|---|---|---|
| "simple" (default) | Plain text string + confidence summary | General text extraction |
| "detailed" | JSON with text, confidence, bbox per line | When coordinates are needed |
_format_output() in OCRHandler (mcp_server/paddleocr_mcp/pipelines.py638-663) produces the final string or JSON. Bounding boxes come from the rec_boxes field in local mode, and from rec_boxes inside the service response's prunedResult.
Layout parsing tools (PP-StructureV3, PaddleOCR-VL) produce Markdown output (text) or a mixed list of TextContent and ImageContent objects when return_images=True. The Markdown-to-image interleaving is handled by _parse_markdown_with_images() in _LayoutParsingHandler (mcp_server/paddleocr_mcp/pipelines.py808-840).
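The interleaving idea can be sketched as splitting the Markdown at image references and substituting the image payloads in order. This is a simplified stand-in for _parse_markdown_with_images(): it returns plain tuples rather than TextContent/ImageContent objects, and assumes images are keyed by the URI inside each Markdown image reference:

```python
# Sketch of Markdown/image interleaving; the real helper emits MCP
# TextContent/ImageContent objects instead of plain tuples.
import re

IMG_RE = re.compile(r"!\[[^\]]*\]\(([^)]+)\)")

def parse_markdown_with_images(markdown: str,
                               images: dict[str, str]) -> list[tuple[str, str]]:
    """Yield ("text", chunk) and ("image", base64_data) items in order."""
    parts: list[tuple[str, str]] = []
    pos = 0
    for m in IMG_RE.finditer(markdown):
        if m.start() > pos:
            parts.append(("text", markdown[pos:m.start()]))
        uri = m.group(1)
        if uri in images:
            parts.append(("image", images[uri]))
        pos = m.end()
    if pos < len(markdown):
        parts.append(("text", markdown[pos:]))
    return parts
```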
Sources: mcp_server/paddleocr_mcp/pipelines.py524-663 mcp_server/paddleocr_mcp/pipelines.py767-806
For the PP-StructureV3 pipeline in local mode, disabling unused sub-modules substantially reduces memory and latency. The following shows the approach using PPStructureV3 with lightweight models, then exporting the config:
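A sketch of that workflow, following the pattern in the official docs; the specific model names and flags below are examples and may need adjusting for your PaddleOCR version:

```python
# Requires the paddleocr package (local mode). Model names and the set of
# use_* flags are illustrative; check your PaddleOCR version's docs.
from paddleocr import PPStructureV3

pipeline = PPStructureV3(
    # Swap in lightweight detection/recognition models
    text_detection_model_name="PP-OCRv5_mobile_det",
    text_recognition_model_name="PP-OCRv5_mobile_rec",
    # Disable sub-modules this deployment does not need
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
    use_formula_recognition=False,
    use_seal_recognition=False,
    use_table_recognition=False,
    use_chart_recognition=False,
)

# Export the effective configuration to a YAML file
pipeline.export_paddlex_config_to_yaml("PP-StructureV3.yaml")
```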
Set PADDLEOCR_MCP_PIPELINE_CONFIG to the exported YAML path to apply this configuration.
Note: PaddleOCR-VL series pipelines are not recommended for CPU inference due to large VLM model sizes.
Sources: docs/version3.x/deployment/mcp_server.en.md173-203
Sources: docs/version3.x/deployment/mcp_server.en.md382-405
In local mode, the tools cannot process Base64-encoded PDF inputs; only Base64-encoded images are supported. This limitation is noted in _process_input_for_local() (mcp_server/paddleocr_mcp/pipelines.py399-421). In local mode, the file type is also not inferred from the model's file_type hint, so some complex URLs may fail. For the PP-StructureV3 and PaddleOCR-VL pipelines, results containing embedded images significantly increase LLM token consumption; use prompt instructions to exclude image content when it is not needed.
Sources: docs/version3.x/deployment/mcp_server.en.md425-429 mcp_server/paddleocr_mcp/pipelines.py399-421