The Gradio Web UI provides an interactive, browser-based interface for PDF document parsing using MinerU. It offers a user-friendly alternative to command-line usage, with real-time preview, configuration controls, and multilingual support.
For information about the FastAPI service interface, see 4.2. For the command-line interface, see 4.1. For the core parsing orchestration that the Gradio UI invokes, see 3.3.
The Gradio web application is implemented in mineru/cli/gradio_app.py1-507 and launched via the mineru-gradio command. The application provides a visual interface for configuring parsing options, uploading documents, and viewing results in real-time.
Command-line invocation uses the mineru-gradio command. The main entry point is the main() function at mineru/cli/gradio_app.py252-502, which creates the Gradio interface and launches the server.
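As a rough illustration, the launch command can be assembled programmatically. The sketch below uses only the --server-name, --server-port, and --max-convert-pages options documented on this page; the helper name build_gradio_command is hypothetical and not part of MinerU.

```python
import shlex


def build_gradio_command(server_name=None, server_port=None, max_convert_pages=None):
    """Assemble a mineru-gradio argv list (hypothetical helper, not part of MinerU)."""
    argv = ["mineru-gradio"]
    if server_name is not None:
        argv += ["--server-name", server_name]
    if server_port is not None:
        argv += ["--server-port", str(server_port)]
    if max_convert_pages is not None:
        argv += ["--max-convert-pages", str(max_convert_pages)]
    return argv


# Render the argv list as a shell-safe command string.
command = build_gradio_command(server_name="0.0.0.0", server_port=7860)
print(shlex.join(command))
```

The resulting list can be passed directly to subprocess.run() if the UI needs to be started from another Python process.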
Sources: mineru/cli/gradio_app.py1-507 docs/en/usage/cli_tools.md40-61 docs/zh/usage/cli_tools.md40-56
Architecture: Gradio Web UI Data Flow
The architecture follows a three-layer pattern: the browser-based Gradio interface, application orchestration layer, and core processing engine. The parse_pdf() function serves as the main orchestrator, coordinating file handling, parsing, and output generation.
Sources: mineru/cli/gradio_app.py26-62 mineru/cli/gradio_app.py112-133 mineru/cli/gradio_app.py252-502
The Gradio application accepts several command-line options via Click decorators:
| Option | Type | Default | Description |
|---|---|---|---|
| --enable-example | bool | True | Enable example files from ./examples directory |
| --enable-http-client | bool | False | Enable http-client backend options in UI |
| --enable-api | bool | True | Enable Gradio API endpoints |
| --max-convert-pages | int | 1000 | Maximum pages to convert |
| --server-name | str | None | Server host (e.g., 0.0.0.0) |
| --server-port | int | None | Server port (e.g., 7860) |
| --latex-delimiters-type | str | 'all' | LaTeX delimiter type: a (`$`), b (`\(\)`, `\[\]`), or all |
These options are defined at mineru/cli/gradio_app.py199-257
VLM Engine Initialization:
The application automatically initializes the VLM engine based on the detected inference engine type.
At mineru/cli/gradio_app.py383-400 the system determines whether to pre-load models or enable http-client mode: the transformers and mlx-engine types trigger http-client mode, while vllm-async-engine and lmdeploy-engine pre-load models.
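The decision described above can be sketched as a small lookup. This is an illustration of the stated rule only; the function name startup_mode and the returned labels are assumptions, not identifiers from gradio_app.py.

```python
# Engine types as named in the text above; the grouping mirrors the described behaviour.
HTTP_CLIENT_ENGINES = {"transformers", "mlx-engine"}
PRELOAD_ENGINES = {"vllm-async-engine", "lmdeploy-engine"}


def startup_mode(engine_type: str) -> str:
    """Return 'preload' or 'http-client' for a detected engine type (hypothetical helper)."""
    if engine_type in PRELOAD_ENGINES:
        return "preload"
    if engine_type in HTTP_CLIENT_ENGINES:
        return "http-client"
    # Engine types not mentioned in the text are rejected rather than guessed at.
    raise ValueError(f"unrecognized engine type: {engine_type!r}")


for engine in ("transformers", "vllm-async-engine"):
    print(engine, "->", startup_mode(engine))
```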
Sources: mineru/cli/gradio_app.py199-257 mineru/cli/gradio_app.py383-400 docs/en/usage/cli_tools.md40-61
UI Component Layout
The interface is organized into two main columns using Gradio's layout system:
Left Panel (Input): mineru/cli/gradio_app.py405-438
- File upload (restricted to the supported file suffixes)
- Backend selection: pipeline, vlm-auto-engine, hybrid-auto-engine, and optionally vlm-http-client, hybrid-http-client (mineru/cli/gradio_app.py411-415)

Right Panel (Output): mineru/cli/gradio_app.py440-458
Sources: mineru/cli/gradio_app.py402-458 mineru/cli/gradio_app.py357-369
The update_interface() function at mineru/cli/gradio_app.py357-369 dynamically adjusts UI visibility based on backend selection.
The backend dropdown is connected to this function via event handlers at mineru/cli/gradio_app.py461-475
Sources: mineru/cli/gradio_app.py326-369 mineru/cli/gradio_app.py461-475
Async Parsing Workflow
The core parsing is handled by two async functions:
to_markdown() mineru/cli/gradio_app.py112-133:
- Extracts the language code from the dropdown label (e.g., "ch (Chinese, English)" → "ch")
- Calls parse_pdf() to perform actual parsing

parse_pdf() mineru/cli/gradio_app.py26-62:
- Derives a safe file name using safe_stem() mineru/cli/gradio_app.py173-176
- Reads the input using read_fn() from common.py
- Determines parse_method based on backend type
- Calls aio_do_parse() (async version from common.py) for asynchronous processing

Sources: mineru/cli/gradio_app.py26-62 mineru/cli/gradio_app.py112-133 mineru/cli/common.py486-556
The application implements bilingual support for English and Chinese using Gradio's I18n class.
Translation dictionaries are defined at mineru/cli/gradio_app.py260-323. The i18n() function is used throughout the UI to retrieve localized strings.
Dynamic Text Updates:
Three helper functions provide context-sensitive translations:
- get_formula_label(backend_choice) mineru/cli/gradio_app.py326-334: formula checkbox label varies by backend
- get_formula_info(backend_choice) mineru/cli/gradio_app.py336-344: formula info text varies by backend
- get_backend_info(backend_choice) mineru/cli/gradio_app.py346-354: backend description varies by type

Sources: mineru/cli/gradio_app.py260-323 mineru/cli/gradio_app.py326-354
Two utility functions handle output file processing:
compress_directory_to_zip() mineru/cli/gradio_app.py64-86:
Recursively compresses all files from the output directory into a downloadable ZIP archive.
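A minimal sketch of recursive directory compression with the standard library follows. It illustrates the technique only and is not the actual body of compress_directory_to_zip(); the name zip_directory and its return value are assumptions.

```python
import os
import zipfile


def zip_directory(directory_path: str, output_zip_path: str) -> int:
    """Recursively add every file under directory_path to a ZIP archive.

    Returns the number of files written (illustrative helper, not MinerU's code).
    """
    count = 0
    with zipfile.ZipFile(output_zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(directory_path):
            for name in files:
                file_path = os.path.join(root, name)
                # Store entries relative to the directory root so the archive
                # unpacks cleanly regardless of where the output lived on disk.
                arcname = os.path.relpath(file_path, directory_path)
                archive.write(file_path, arcname)
                count += 1
    return count
```

The output archive should be written outside directory_path; otherwise the walk may pick up the partially written ZIP file itself.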
replace_image_with_base64() mineru/cli/gradio_app.py93-110:
Converts image references in Markdown to inline base64-encoded data URIs for self-contained display in the Gradio Markdown component.
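The inlining step can be sketched as a regular-expression substitution over Markdown image links. This is an assumption-laden illustration rather than the code of replace_image_with_base64(): the name inline_images, the regex, and the MIME-type guessing are all choices made for this example.

```python
import base64
import mimetypes
import os
import re

# Matches Markdown image syntax: ![alt](path)
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)]+)\)")


def inline_images(markdown_text: str, base_dir: str) -> str:
    """Replace local image links with base64 data URIs (illustrative helper)."""

    def to_data_uri(match: re.Match) -> str:
        alt, rel_path = match.group(1), match.group(2)
        full_path = os.path.join(base_dir, rel_path)
        if not os.path.isfile(full_path):
            # Leave remote URLs and missing files untouched.
            return match.group(0)
        mime = mimetypes.guess_type(full_path)[0] or "application/octet-stream"
        with open(full_path, "rb") as fh:
            encoded = base64.b64encode(fh.read()).decode("ascii")
        return f"![{alt}](data:{mime};base64,{encoded})"

    return IMAGE_PATTERN.sub(to_data_uri, markdown_text)
```

Inlining keeps the preview self-contained, at the cost of a larger Markdown string for image-heavy documents.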
Both functions are called in to_markdown() at mineru/cli/gradio_app.py119-128 to prepare output for the user.
Sources: mineru/cli/gradio_app.py64-110 mineru/cli/gradio_app.py119-128
When --enable-api is set to True (default), Gradio exposes HTTP API endpoints for programmatic access.
At mineru/cli/gradio_app.py482-491 functions are registered with API names.
When enabled, users can access:
- /api/to_pdf: converts uploaded files to PDF format
- /api/to_markdown: full parsing pipeline endpoint

The API documentation is automatically available at the / path when the server is running. Footer links are configured at mineru/cli/gradio_app.py493-502.
Sources: mineru/cli/gradio_app.py478-502 docs/en/usage/cli_tools.md48-51
The application optionally loads example files from an examples directory:
At mineru/cli/gradio_app.py430-438 the system:
- Checks whether the --enable-example flag is True
- Looks for the ./examples directory in current working directory
- Creates an Examples component that users can click to load pre-selected files

Valid file suffixes are defined at mineru/cli/common.py27-28.
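The example-discovery steps above can be sketched as a suffix-filtered directory listing. The helper name find_example_files is hypothetical, and the suffix list is passed in as a parameter because the actual values live in common.py and are not reproduced here.

```python
from pathlib import Path


def find_example_files(examples_dir, suffixes):
    """Return sorted paths of files in examples_dir whose suffix is allowed.

    Hypothetical helper: the caller supplies the suffix list.
    """
    root = Path(examples_dir)
    if not root.is_dir():
        # No examples directory means no examples to offer.
        return []
    allowed = {s.lower().lstrip(".") for s in suffixes}
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower().lstrip(".") in allowed
    )


# Placeholder suffixes for illustration only; not the list from common.py.
print(find_example_files("./examples", ["pdf", "png"]))
```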
Sources: mineru/cli/gradio_app.py430-438 mineru/cli/common.py27-28 docs/en/usage/cli_tools.md44-47
The application loads a custom HTML header from mineru/resources/header.html1-142.
The header is injected into the Gradio interface at mineru/cli/gradio_app.py145-147 and rendered at mineru/cli/gradio_app.py403.
Sources: mineru/cli/gradio_app.py145-147 mineru/cli/gradio_app.py403 mineru/resources/header.html1-142
The application supports three LaTeX delimiter types for rendering mathematical formulas in Markdown:
| Type | Display | Inline | Description |
|---|---|---|---|
| 'a' | `$$..$$` | `$...$` | Standard dollar sign delimiters |
| 'b' | `\[..\]` | `\(..\)` | Escaped bracket delimiters |
| 'all' | Both | Both | Both types supported (default) |
Configuration is handled at mineru/cli/gradio_app.py135-144 and mineru/cli/gradio_app.py374-381.
The selected delimiter type is passed to the Markdown component at mineru/cli/gradio_app.py444-450.
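One way to picture the mapping in the table is as a function from the delimiter type to a list of delimiter dictionaries with "left", "right", and "display" keys, the shape that Gradio's Markdown component accepts for its latex_delimiters parameter. The sketch below is an assumption about structure, not a copy of the code in gradio_app.py; the name latex_delimiters_for is hypothetical.

```python
# Dollar-sign delimiters (type 'a'): display math first, then inline.
DOLLAR_DELIMITERS = [
    {"left": "$$", "right": "$$", "display": True},
    {"left": "$", "right": "$", "display": False},
]

# Escaped-bracket delimiters (type 'b').
BRACKET_DELIMITERS = [
    {"left": "\\[", "right": "\\]", "display": True},
    {"left": "\\(", "right": "\\)", "display": False},
]


def latex_delimiters_for(delimiter_type: str) -> list:
    """Map 'a', 'b', or 'all' to a list of delimiter dicts (hypothetical helper)."""
    if delimiter_type == "a":
        return list(DOLLAR_DELIMITERS)
    if delimiter_type == "b":
        return list(BRACKET_DELIMITERS)
    if delimiter_type == "all":
        return DOLLAR_DELIMITERS + BRACKET_DELIMITERS
    raise ValueError(f"unknown delimiter type: {delimiter_type!r}")


print(latex_delimiters_for("all"))
```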
Sources: mineru/cli/gradio_app.py135-144 mineru/cli/gradio_app.py374-381 mineru/cli/gradio_app.py444-450 docs/en/usage/cli_tools.md56-59
The UI provides extensive language support for OCR processing, organized into two categories:
Primary languages are defined at mineru/cli/gradio_app.py149-162 and extended languages at mineru/cli/gradio_app.py163-169.
Language selection is used in the to_markdown() function at mineru/cli/gradio_app.py113-116, where the language prefix is extracted from the dropdown label.
This enables users to select from 16+ language groups with detailed script support information displayed in the dropdown.
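The prefix extraction can be sketched as follows, using the "ch (Chinese, English)" → "ch" example from this page. The function name language_code is hypothetical, and the exact string handling in to_markdown() may differ.

```python
def language_code(label: str) -> str:
    """Strip the parenthesised description from a dropdown label (illustrative)."""
    if "(" in label and ")" in label:
        # Keep only the text before the first opening parenthesis.
        return label.split("(", 1)[0].strip()
    return label.strip()


print(language_code("ch (Chinese, English)"))
```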
Sources: mineru/cli/gradio_app.py149-171 mineru/cli/gradio_app.py113-116 docs/en/usage/cli_tools.md16-17
The Gradio server is launched at mineru/cli/gradio_app.py496-502.
Configuration Parameters:
- server_name: Defaults to None (localhost only); set to "0.0.0.0" for external access
- server_port: Defaults to None (Gradio then searches for a free port starting at 7860); typically set to 7860
- show_api: Controls API documentation visibility
- i18n: Internationalization configuration object

Docker Deployment:
The Gradio UI is included in Docker Compose configurations at docker/docker-compose.yaml as the mineru-gradio service, typically running on port 7860.
Multi-Service Architecture:
When deployed via Docker Compose, the Gradio UI can connect to a separate mineru-openai-server service for distributed inference, as documented in deployment configurations.
Sources: mineru/cli/gradio_app.py496-502 docs/en/usage/quick_usage.md36-43 docs/zh/usage/quick_usage.md36-43