This page documents support for non-NVIDIA GPU accelerators in PaddleOCR, including Chinese domestic AI chips and other hardware platforms. It covers device-specific requirements, installation procedures, inference backend compatibility, and deployment patterns for Kunlunxin XPU, Huawei Ascend NPU, Cambricon MLU, Hygon DCU, MetaX GPU, Iluvatar GPU, and Apple Silicon.
For NVIDIA GPU support and TensorRT optimization, see 7.1. For CPU-specific optimizations, see 7.3.
PaddleOCR supports inference and deployment on the following alternative accelerator platforms:
| Accelerator Type | Device Identifier | PaddlePaddle Plugin | Docker Support | Production Status |
|---|---|---|---|---|
| Kunlunxin XPU | xpu | paddle-kunlunxin-xpu | ✅ | ✅ Verified on XPU R200 |
| Huawei Ascend NPU | npu | paddle-custom-npu | ✅ | ✅ Verified on 910B |
| Cambricon MLU | mlu | paddle-custom-mlu | ✅ | ✅ Verified on MLU370 |
| Hygon DCU | dcu | paddle-custom-dcu | ✅ | ✅ Verified on Z100 |
| MetaX GPU | metax_gpu | paddle-metax-gpu | ✅ | ✅ Verified on C550 |
| Iluvatar GPU | iluvatar_gpu | paddle-iluvatar-gpu | ✅ | ✅ Verified on BI-V150 |
| Apple Silicon | cpu | Native PaddlePaddle | ❌ | ✅ Verified on M4 |
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md41-155 docs/version3.x/pipeline_usage/PaddleOCR-VL.md41-155
Different accelerators support different inference acceleration frameworks. The following table shows compatibility for PaddleOCR-VL (similar patterns apply to other pipelines):
Inference Backend Compatibility for PaddleOCR-VL
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md45-127 docs/version3.x/pipeline_usage/PaddleOCR-VL.md45-127
Hardware: Kunlunxin XPU R200 and compatible devices
Installation:
Device Specification:
Docker Images:
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl:latest-kunlunxin-xpu
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-genai-fastdeploy-server:latest-kunlunxin-xpu
Supported Backends: PaddlePaddle, FastDeploy (experimental vLLM support in progress)
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md148-149 Installation commands from XPU tutorial pattern
Hardware: Huawei Ascend 910B and compatible NPUs
Installation:
Docker Run Requirements:
Important Note: Native PaddlePaddle inference is limited on NPU. Use vLLM backend for production:
Supported Backends: vLLM (recommended), limited PaddlePaddle support
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL-Huawei-Ascend-NPU.en.md1-96 docs/version3.x/pipeline_usage/PaddleOCR-VL-Huawei-Ascend-NPU.md1-96
Hardware: Hygon DCU Z100 and compatible devices
Installation:
Device Specification:
Docker Images:
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl:latest-hygon-dcu
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-genai-vllm-server:latest-hygon-dcu
Supported Backends: PaddlePaddle, vLLM
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md150
Hardware: MetaX C550 and compatible GPUs
Installation:
Docker Run Requirements:
Device Specification:
Supported Backends: PaddlePaddle, FastDeploy
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL-MetaX-GPU.en.md1-92 docs/version3.x/pipeline_usage/PaddleOCR-VL-MetaX-GPU.md1-92
Hardware: Iluvatar BI-V150 (Tiangai 150) and compatible GPUs
Installation:
Docker Run Requirements:
Device Specification:
Supported Backends: PaddlePaddle, FastDeploy
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL-Iluvatar-GPU.en.md1-96 docs/version3.x/pipeline_usage/PaddleOCR-VL-Iluvatar-GPU.md1-96
Hardware: Apple M1, M2, M3, M4 chips
Installation:
Device Specification:
MLX-VLM Acceleration (Recommended):
Supported Backends: PaddlePaddle (CPU mode), MLX-VLM (recommended for VLM models)
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL-Apple-Silicon.en.md1-109 docs/version3.x/pipeline_usage/PaddleOCR-VL-Apple-Silicon.md1-109
Device Initialization and Backend Selection Flow
The device parameter flows through the system as follows:
CLI Specification: --device parameter in command line
Python API Specification: device parameter in pipeline constructor
Internal Processing: Device string is parsed and validated
Device String Format: {device_type} or {device_type}:{device_id} (e.g., xpu, xpu:0, dcu:1, iluvatar_gpu:0)
PaddlePaddle Device Setting: Internally calls paddle.set_device()
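The parsing step above can be sketched as follows. This is an illustrative re-implementation of the `{device_type}` / `{device_type}:{device_id}` grammar, not PaddleOCR's actual internal parser:

```python
# Illustrative parser for the device string grammar described above;
# PaddleOCR's internal implementation may differ.
VALID_DEVICE_TYPES = {"cpu", "gpu", "xpu", "npu", "mlu", "dcu",
                      "metax_gpu", "iluvatar_gpu"}

def parse_device(device: str) -> tuple[str, int]:
    """Split 'xpu:0' into ('xpu', 0); a bare type defaults to device id 0."""
    dev_type, _, dev_id = device.partition(":")
    if dev_type not in VALID_DEVICE_TYPES:
        raise ValueError(f"unsupported device type: {dev_type!r}")
    return dev_type, (int(dev_id) if dev_id else 0)

# e.g. parse_device("iluvatar_gpu:1") -> ("iluvatar_gpu", 1)
```

The validated `(type, id)` pair is what ultimately reaches `paddle.set_device()`.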
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md580-595 docs/version3.x/pipeline_usage/PaddleOCR-VL.md552-569
Alternative accelerators support parallel inference across multiple devices:
Parallel Execution Pattern:
Multi-Device Parallel Processing Architecture
When multiple devices are specified:
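As a sketch of that pattern, inputs can be partitioned round-robin across the device list before launching one worker per device. This is a hypothetical helper for illustration; PaddleOCR's parallel-inference utility handles the distribution internally:

```python
# Hypothetical round-robin sharding helper; PaddleOCR's parallel
# inference utility performs the equivalent distribution internally.
def shard_inputs(inputs: list[str], devices: list[str]) -> dict[str, list[str]]:
    """Assign input files to devices round-robin, e.g. across ['xpu:0', 'xpu:1']."""
    shards = {dev: [] for dev in devices}
    for i, item in enumerate(inputs):
        shards[devices[i % len(devices)]].append(item)
    return shards
```

For example, `shard_inputs(["a.png", "b.png", "c.png"], ["xpu:0", "xpu:1"])` sends two images to `xpu:0` and one to `xpu:1`, with each shard processed by its own pipeline instance.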
Sources: docs/version3.x/pipeline_usage/instructions/parallel_inference.en.md1-32
For production deployment on alternative accelerators, PaddleOCR provides Docker Compose configurations that combine:
Docker Compose Deployment Architecture for Alternative Accelerators
Compose File Location: deploy/paddleocr_vl_docker/accelerators/kunlunxin-xpu/
Key Environment Variables (.env):
Starting the Service:
Service Endpoints:
http://localhost:8080/predict (pipeline service)
http://localhost:8118/v1/chat/completions (VLM server)
Compose File Location: deploy/paddleocr_vl_docker/accelerators/hygon-dcu/
Key Difference: Uses vLLM backend instead of FastDeploy
Device Selection in compose.yaml:
Change Service Port:
Adjust Device Assignment:
Mount Custom Configuration:
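As an illustrative sketch of those three customizations, a `compose.yaml` override might look like the following (service names, device paths, and mount targets are assumptions for illustration; the actual files under `deploy/paddleocr_vl_docker/` may differ):

```yaml
# Illustrative compose.yaml override; actual service and device names
# in the shipped compose files may differ.
services:
  paddleocr-vl:
    ports:
      - "9090:8080"          # change the host-side service port
    devices:
      - /dev/xpu1:/dev/xpu1  # adjust which accelerator device is passed through
    volumes:
      - ./my_config.yaml:/app/config.yaml:ro  # mount a custom configuration
```

After editing, restart the stack with `docker compose up -d` so the overrides take effect.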
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL-NVIDIA-Blackwell.en.md193-292 docs/version3.x/pipeline_usage/PaddleOCR-VL-Iluvatar-GPU.en.md135-233
Docker images for alternative accelerators follow a consistent naming pattern:
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/{image-type}:{version}-{accelerator}-{offline}
Components:
{image-type}: paddleocr-vl, paddleocr-genai-vllm-server, paddleocr-genai-fastdeploy-server
{version}: latest or paddleocr{major}.{minor} (e.g., paddleocr3.3)
{accelerator}: kunlunxin-xpu, hygon-dcu, metax-gpu, iluvatar-gpu, huawei-npu
{offline}: Optional -offline suffix for images with bundled models
Examples:
paddleocr-vl:latest-kunlunxin-xpu (online, requires internet for model download)
paddleocr-vl:latest-kunlunxin-xpu-offline (offline, models included)
paddleocr-vl:paddleocr3.3-hygon-dcu (specific version)
paddleocr-genai-vllm-server:latest-hygon-dcu-offline (vLLM server, offline)
Image Size Reference:
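The naming pattern can be expressed as a small formatting helper, shown here only to make the tag grammar concrete (a hypothetical function, not a PaddleOCR utility):

```python
# Hypothetical helper that builds image references following the
# naming pattern documented above.
REGISTRY = "ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle"

def image_ref(image_type: str, accelerator: str,
              version: str = "latest", offline: bool = False) -> str:
    """Build a full image reference: {registry}/{image-type}:{version}-{accelerator}[-offline]."""
    tag = f"{version}-{accelerator}" + ("-offline" if offline else "")
    return f"{REGISTRY}/{image_type}:{tag}"
```

For example, `image_ref("paddleocr-genai-vllm-server", "hygon-dcu", offline=True)` reproduces the last example tag above.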
| Accelerator | Base Image Size | Offline Image Size | VLM Server (Offline) |
|---|---|---|---|
| Kunlunxin XPU | ~12 GB | ~14 GB | ~15 GB |
| Hygon DCU | ~10 GB | ~12 GB | ~15 GB |
| MetaX GPU | ~32 GB | ~34 GB | ~39 GB |
| Iluvatar GPU | ~37 GB | ~39 GB | ~40 GB |
| Huawei NPU | ~28 GB | ~30 GB | ~20 GB |
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL-NVIDIA-Blackwell.en.md46-51 Various accelerator tutorial files
For Training:
For Production Inference:
For Development/Testing:
Backend Selection Impact:
| Accelerator | Native PaddlePaddle | Acceleration Backend | Speedup Factor |
|---|---|---|---|
| Kunlunxin XPU | Baseline | FastDeploy | ~2-3x |
| Hygon DCU | Baseline | vLLM | ~3-5x |
| Huawei NPU | Limited | vLLM | Required |
| MetaX GPU | Baseline | FastDeploy | ~2-4x |
| Iluvatar GPU | Baseline | FastDeploy | ~2-4x |
| Apple Silicon | Baseline | MLX-VLM | ~5-10x |
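The table above reduces to a simple selection rule; this sketch encodes it as a lookup (illustrative only, summarizing the rows rather than any PaddleOCR API):

```python
# Summary of the backend table above as a lookup (illustrative only).
RECOMMENDED_BACKEND = {
    "kunlunxin-xpu": "fastdeploy",
    "hygon-dcu": "vllm",
    "huawei-npu": "vllm",      # required: native PaddlePaddle is limited on NPU
    "metax-gpu": "fastdeploy",
    "iluvatar-gpu": "fastdeploy",
    "apple-silicon": "mlx-vlm",
}

def pick_backend(accelerator: str, prefer_native: bool = False) -> str:
    """Prefer native PaddlePaddle only where it is a viable baseline."""
    if prefer_native and accelerator != "huawei-npu":
        return "paddle"
    return RECOMMENDED_BACKEND[accelerator]
```

Note the NPU special case: even with `prefer_native=True`, the vLLM backend is returned because native inference is not a usable baseline there.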
Memory Optimization:
Set --shm-size appropriately (64g is recommended for VLM models)

Issue: Device Plugin Not Found
Issue: Docker Container Cannot Access Device
Issue: vLLM/FastDeploy Version Incompatibility
Issue: Out of Memory on Accelerator
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL.en.md139-141 docs/version3.x/pipeline_usage/PaddleOCR-VL-NVIDIA-Blackwell.en.md109-120
PaddlePaddle Version Requirements:
Checking Compatibility:
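Checking compatibility amounts to comparing the installed PaddlePaddle version against the minimum the plugin requires. This sketch assumes a plain `major.minor.patch` version string; the actual minimum versions are listed in each accelerator's tutorial:

```python
# Illustrative version check; assumes plain "major.minor.patch" strings.
def version_tuple(v: str) -> tuple[int, ...]:
    """Parse '3.0.0' into (3, 0, 0) for tuple comparison."""
    return tuple(int(part) for part in v.split("."))

def is_compatible(installed: str, minimum: str) -> bool:
    return version_tuple(installed) >= version_tuple(minimum)

# In practice: import paddle; is_compatible(paddle.__version__, minimum)
```

A pre-release or dev-suffixed version string (e.g. `3.0.0rc1`) would need extra handling; the `packaging.version` module covers those cases robustly.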
Image Tag Versioning:
latest-*: Tracks the latest PaddleOCR release
paddleocr{major}.{minor}-*: Pinned to a specific PaddleOCR version
Sources: docs/version3.x/pipeline_usage/PaddleOCR-VL-Apple-Silicon.en.md36 docs/version3.x/pipeline_usage/PaddleOCR-VL-Iluvatar-GPU.en.md69