The mineru command is the primary command-line interface for parsing PDF and image files into structured Markdown and JSON formats. This page covers the main CLI tool implementation, parameter options, backend routing, and execution flow.
The main CLI entry point is implemented in mineru/cli/client.py21-223 using the Click framework. The command accepts arguments, configures the environment, loads files, and delegates to the core parsing orchestration.
Sources: mineru/cli/client.py21-223 mineru/cli/common.py414-484
| Parameter | Short | Type | Description |
|---|---|---|---|
| --path | -p | Path | Input PDF/image file or directory (required) |
| --output | -o | Path | Output directory (required) |
| Parameter | Short | Type | Default | Valid Options |
|---|---|---|---|---|
| --backend | -b | Choice | hybrid-auto-engine | pipeline, vlm-http-client, hybrid-http-client, vlm-auto-engine, hybrid-auto-engine |
| Parameter | Short | Type | Default | Description | Applicable Backends |
|---|---|---|---|---|---|
| --method | -m | Choice | auto | Parsing method: auto, txt, ocr | pipeline, hybrid-* |
| --lang | -l | Choice | ch | OCR language code | pipeline, hybrid-* |
| --start | -s | int | 0 | Starting page number (0-indexed) | All |
| --end | -e | int | None | Ending page number (0-indexed) | All |
| --formula | -f | bool | True | Enable formula parsing | All |
| --table | -t | bool | True | Enable table parsing | All |
| --url | -u | str | None | Server URL for http-client backends | *-http-client |
| Parameter | Short | Type | Default | Description | Applicable Backends |
|---|---|---|---|---|---|
| --device | -d | str | auto-detect | Device mode: cpu, cuda, cuda:0, npu, npu:0, mps | pipeline |
| --vram | | int | auto-detect | GPU VRAM limit per process (GB) | pipeline |
| --source | | Choice | huggingface | Model source: huggingface, modelscope, local | All |
| Parameter | Short | Description |
|---|---|---|
| --version | -v | Display version and exit |
| --help | | Show help message and exit |
Sources: mineru/cli/client.py23-151 docs/en/usage/cli_tools.md7-28
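As a usage illustration for the options above, the following sketch assembles an argument list for the mineru command from Python. The helper name build_mineru_argv is hypothetical and not part of MinerU; only the flags listed in the tables are used.

```python
from typing import List, Optional


def build_mineru_argv(
    path: str,
    output: str,
    backend: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[str]:
    """Assemble a mineru argument list (hypothetical helper, not MinerU code)."""
    # --path and --output are the two required options.
    argv = ["mineru", "-p", path, "-o", output]
    # Optional flags are appended only when the caller supplies a value,
    # so the CLI's own defaults apply otherwise.
    if backend is not None:
        argv += ["-b", backend]
    if start is not None:
        argv += ["-s", str(start)]
    if end is not None:
        argv += ["-e", str(end)]
    return argv


argv = build_mineru_argv("paper.pdf", "out", backend="pipeline", start=0, end=4)
print(" ".join(argv))  # mineru -p paper.pdf -o out -b pipeline -s 0 -e 4
```

The resulting list could be handed to subprocess.run(argv, check=True) on a machine where mineru is installed.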
Sources: mineru/cli/client.py154-221 mineru/cli/common.py414-556
The CLI routes requests to different processing backends based on the --backend parameter. The routing logic includes prefix stripping and engine resolution.
Sources: mineru/cli/common.py414-556
The implementation strips backend prefixes to normalize engine names:
- vlm- prefix
- hybrid- prefix

After stripping, if the backend is auto-engine, the system calls get_vlm_engine() with inference_engine='auto' to automatically select the best available engine based on hardware and installed dependencies.
Sources: mineru/cli/common.py447-474 mineru/cli/common.py520-545
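The prefix stripping described above might look roughly like the sketch below. The function split_backend and its return shape are assumptions made for illustration; the real logic lives inline in do_parse() and may differ. Resolving auto-engine through get_vlm_engine() is not reproduced here.

```python
from typing import Tuple


def split_backend(backend: str) -> Tuple[str, str]:
    """Split a --backend value into (family, engine).

    Hypothetical helper: the pipeline backend has no prefix, so it is
    returned with an empty engine name (an assumption of this sketch).
    """
    for prefix, family in (("vlm-", "vlm"), ("hybrid-", "hybrid")):
        if backend.startswith(prefix):
            # Drop the family prefix, leaving e.g. "auto-engine" or "http-client".
            return family, backend[len(prefix):]
    return "pipeline", ""


for name in ("pipeline", "vlm-auto-engine", "hybrid-http-client"):
    print(name, "->", split_backend(name))
```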
The CLI processes PDF and image files. Supported formats are defined in mineru/cli/common.py27-28:
Sources: mineru/cli/client.py213-220 mineru/cli/common.py32-43
When --start and/or --end parameters are provided, the CLI extracts a subset of pages using pypdfium2:
The function _prepare_pdf_bytes() at mineru/cli/common.py85-91 calls convert_pdf_bytes_to_bytes_by_pypdfium2() which:
- pdfium.PdfDocument(pdf_bytes)

Page import failures are logged but don't stop processing.
Sources: mineru/cli/common.py54-92
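The page selection step can be pictured with a small pure-Python sketch. It assumes that --end is inclusive, that a missing or out-of-range end falls back to the last page, and that the helper name select_page_indices is hypothetical; none of these details are confirmed by this page, and the actual extraction is done with pypdfium2.

```python
from typing import List, Optional


def select_page_indices(
    page_count: int, start: int = 0, end: Optional[int] = None
) -> List[int]:
    """Return the 0-indexed pages to keep (sketch, inclusive end assumed)."""
    if page_count <= 0:
        return []
    last = page_count - 1
    # Assumption: a missing, negative, or too-large end means "to the last page".
    if end is None or end < 0 or end > last:
        end = last
    start = max(start, 0)
    # An empty list results when start lies beyond end.
    return list(range(start, end + 1))


print(select_page_indices(10, start=2, end=4))  # [2, 3, 4]
```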
For non-http-client backends, the CLI automatically configures hardware-related environment variables:
Sources: mineru/cli/client.py163-182
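One way to picture this configuration step is a helper that fills in environment variables only when they are not already set. Both the "do not overwrite" behaviour and the variable names used in the example (MINERU_DEVICE_MODE, MINERU_VIRTUAL_VRAM_SIZE, MINERU_MODEL_SOURCE) are assumptions of this sketch rather than facts stated on this page.

```python
from typing import Mapping, MutableMapping, Optional


def apply_env_defaults(
    defaults: Mapping[str, Optional[str]], env: MutableMapping[str, str]
) -> None:
    """Copy defaults into env without overriding existing values (sketch)."""
    for name, value in defaults.items():
        # Skip options the user did not provide and variables already exported.
        if value is not None and name not in env:
            env[name] = value


# A plain dict stands in for os.environ so the example has no side effects.
env = {"MINERU_DEVICE_MODE": "cpu"}
apply_env_defaults(
    {
        "MINERU_DEVICE_MODE": "cuda",      # ignored: already set by the user
        "MINERU_VIRTUAL_VRAM_SIZE": "8",   # applied
        "MINERU_MODEL_SOURCE": None,       # ignored: no value supplied
    },
    env,
)
print(env)
```

Passing os.environ as the second argument would apply the same logic to the real process environment.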
Different backends require specific environment variable configurations set in do_parse():
VLM Backend (vlm-*):
Hybrid Backend (hybrid-*):
Sources: mineru/cli/common.py456-457 mineru/cli/common.py475-476 mineru/cli/common.py529-530 mineru/cli/common.py547-548
The CLI also sets some global environment variables:
- TOKENIZERS_PARALLELISM = "false" - Disables tokenizer parallelism to avoid multiprocessing issues mineru/cli/common.py30
- MINERU_LMDEPLOY_DEVICE=maca mineru/cli/common.py22-24

Sources: mineru/cli/common.py22-30
The prepare_env() function creates the output directory structure:
For hybrid backend, the parse_method is modified to hybrid_{parse_method}.
Sources: mineru/cli/common.py46-51
output_dir/
└── {pdf_file_name}/
    └── {parse_method}/    # e.g., "auto", "ocr", "vlm", "hybrid_auto"
        ├── {pdf_file_name}.md
        ├── {pdf_file_name}_content_list.json
        ├── {pdf_file_name}_content_list_v2.json    (VLM/Hybrid only)
        ├── {pdf_file_name}_middle.json
        ├── {pdf_file_name}_model.json
        ├── {pdf_file_name}_layout.pdf
        ├── {pdf_file_name}_span.pdf    (Pipeline only)
        ├── {pdf_file_name}_origin.pdf
        └── images/
            └── *.jpg
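A minimal sketch of how such a layout could be created is shown below. The helper prepare_output_dirs is hypothetical and only mirrors the directory tree above; the signature and return value of the real prepare_env() are not shown on this page.

```python
import os
import tempfile
from typing import Tuple


def prepare_output_dirs(
    output_dir: str, pdf_file_name: str, parse_method: str
) -> Tuple[str, str]:
    """Create output_dir/{pdf_file_name}/{parse_method}/images (sketch)."""
    md_dir = os.path.join(output_dir, pdf_file_name, parse_method)
    image_dir = os.path.join(md_dir, "images")
    # Creating the deepest directory also creates every parent.
    os.makedirs(image_dir, exist_ok=True)
    return image_dir, md_dir


with tempfile.TemporaryDirectory() as tmp:
    image_dir, md_dir = prepare_output_dirs(tmp, "report", "hybrid_auto")
    created = os.path.isdir(image_dir)
    relative = os.path.relpath(image_dir, tmp)

print(relative)
```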
The _process_output() function at mineru/cli/common.py94-168 handles all output file generation based on flags:
Sources: mineru/cli/common.py94-168
By default in do_parse() and aio_do_parse():
- f_draw_layout_bbox = True
- f_draw_span_bbox = True
- f_dump_md = True
- f_dump_middle_json = True
- f_dump_model_output = True
- f_dump_orig_pdf = True
- f_dump_content_list = True
- f_make_md_mode = MakeMode.MM_MD

Sources: mineru/cli/common.py424-431
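To make the relationship between these flags and the output files concrete, here is a small sketch. The flag-to-filename mapping is inferred from the flag names and the directory listing, not taken from _process_output(), and OutputFlags and expected_outputs are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OutputFlags:
    """Boolean defaults as listed for do_parse() and aio_do_parse()."""
    f_draw_layout_bbox: bool = True
    f_draw_span_bbox: bool = True
    f_dump_md: bool = True
    f_dump_middle_json: bool = True
    f_dump_model_output: bool = True
    f_dump_orig_pdf: bool = True
    f_dump_content_list: bool = True


# Assumed mapping from each flag to the file suffix it controls.
_SUFFIXES = [
    ("f_dump_md", ".md"),
    ("f_dump_content_list", "_content_list.json"),
    ("f_dump_middle_json", "_middle.json"),
    ("f_dump_model_output", "_model.json"),
    ("f_draw_layout_bbox", "_layout.pdf"),
    ("f_draw_span_bbox", "_span.pdf"),
    ("f_dump_orig_pdf", "_origin.pdf"),
]


def expected_outputs(name: str, flags: OutputFlags) -> List[str]:
    """List the file names that the enabled flags would produce (sketch)."""
    return [name + suffix for attr, suffix in _SUFFIXES if getattr(flags, attr)]


print(expected_outputs("doc", OutputFlags(f_draw_span_bbox=False)))
```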
The CLI supports passing additional arguments not defined as Click options through the context:
The arg_parse() function in mineru/utils/cli_parser.py4-38 parses --param value style arguments from ctx.args, converting values to appropriate types (bool, int, float, str). This mechanism allows passing vLLM/LMDeploy-specific parameters directly through the CLI.
Sources: mineru/cli/client.py161 mineru/utils/cli_parser.py4-38
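The idea behind this pass-through parsing can be sketched as follows. This is not the arg_parse() implementation: the function names, the hyphen-to-underscore key normalisation, and the treatment of a bare flag as True are assumptions made for illustration.

```python
from typing import Dict, List, Union

Value = Union[bool, int, float, str]


def convert(raw: str) -> Value:
    """Convert a raw token to bool, int, float, or leave it as str."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def parse_extra_args(args: List[str]) -> Dict[str, Value]:
    """Parse ['--param', 'value', ...] tokens into a dict (sketch)."""
    parsed: Dict[str, Value] = {}
    i = 0
    while i < len(args):
        token = args[i]
        if not token.startswith("--"):
            i += 1  # stray value without a preceding option: skip it
            continue
        key = token[2:].replace("-", "_")
        if i + 1 < len(args) and not args[i + 1].startswith("--"):
            parsed[key] = convert(args[i + 1])
            i += 2
        else:
            parsed[key] = True  # assumption: a bare flag means True
            i += 1
    return parsed


extra = parse_extra_args(
    ["--gpu-memory-utilization", "0.5", "--max-model-len", "4096",
     "--enforce-eager", "--dtype", "half"]
)
print(extra)
```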
The CLI provides both synchronous and asynchronous entry points:
- do_parse() - Synchronous, returns when complete mineru/cli/common.py414-484
- aio_do_parse() - Asynchronous, can be awaited mineru/cli/common.py486-556

Both functions have identical signatures and similar logic, differing mainly in whether execution is synchronous or awaited.
The CLI itself uses do_parse() synchronously, while other interfaces like Gradio and FastAPI may use aio_do_parse().
Sources: mineru/cli/common.py414-556
The implementation enforces strict mode compatibility:
In do_parse (sync mode):
- Raises an exception if vlm-vllm-async-engine is used mineru/cli/common.py450-451
- Raises an exception if hybrid-vllm-async-engine is used mineru/cli/common.py468-470

In aio_do_parse (async mode):
- Raises an exception if vlm-vllm-engine is used mineru/cli/common.py523-524
- Raises an exception if hybrid-vllm-engine is used mineru/cli/common.py541-542

Sources: mineru/cli/common.py450-451 mineru/cli/common.py468-470 mineru/cli/common.py523-524 mineru/cli/common.py541-542
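The compatibility rule can be expressed as a small guard. This sketch assumes that enforcement means raising an error and that the check runs on the engine name after the vlm- or hybrid- prefix has been stripped; check_engine_mode is a hypothetical name, and the exception type and messages are illustrative.

```python
def check_engine_mode(engine: str, is_async: bool) -> None:
    """Reject engine/mode combinations that do not match (sketch)."""
    if not is_async and engine == "vllm-async-engine":
        # Sync entry point paired with the async-only engine.
        raise ValueError("vllm-async-engine is not supported in sync mode")
    if is_async and engine == "vllm-engine":
        # Async entry point paired with the sync-only engine.
        raise ValueError("vllm-engine is not supported in async mode")


check_engine_mode("vllm-engine", is_async=False)        # accepted
check_engine_mode("vllm-async-engine", is_async=True)   # accepted
```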
All CLI tools follow a consistent logging setup pattern using loguru:
This allows users to control verbosity through the MINERU_LOG_LEVEL environment variable (e.g., DEBUG, INFO, WARNING, ERROR).
Sources: mineru/cli/client.py9-11 mineru/cli/gradio_app.py16-18 mineru/cli/fast_api.py19-21
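The pattern of reading the level from MINERU_LOG_LEVEL can be sketched with the standard library. Note that this example deliberately uses Python's built-in logging module as a stand-in for loguru so that it runs without extra dependencies; the default of INFO and the helper names are assumptions.

```python
import logging
import os
import sys
from typing import Mapping, Optional


def resolve_log_level(env: Optional[Mapping[str, str]] = None,
                      default: str = "INFO") -> str:
    """Read MINERU_LOG_LEVEL, falling back to an assumed default."""
    env = os.environ if env is None else env
    return str(env.get("MINERU_LOG_LEVEL", default)).upper()


def configure_logging(env: Optional[Mapping[str, str]] = None) -> logging.Logger:
    """Attach a single stderr handler at the requested level (stdlib stand-in)."""
    logger = logging.getLogger("mineru_sketch")
    logger.handlers.clear()  # replace any handler installed earlier
    logger.addHandler(logging.StreamHandler(sys.stderr))
    logger.setLevel(resolve_log_level(env))  # accepts names such as "DEBUG"
    return logger


log = configure_logging({"MINERU_LOG_LEVEL": "debug"})
log.debug("visible because the level resolved to DEBUG")
```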
Sources: docs/en/usage/cli_tools.md1-61 docs/zh/usage/cli_tools.md1-56 docs/en/usage/quick_usage.md10-54
The CLI serves as the primary user-facing entry point that orchestrates the complete parsing pipeline:
The CLI implementation in client.py is a thin wrapper that:
- Delegates to do_parse() in common.py

The common.py module contains the core orchestration logic shared by all interfaces (CLI, API, Gradio), ensuring consistent behavior across different entry points.
Sources: mineru/cli/client.py1-224 mineru/cli/common.py1-570