This page explains how to configure document processing pipelines programmatically when using Docling as a Python library. It covers the configuration classes, their relationships, and how to customize pipeline behavior for different input formats.
For information about using the DocumentConverter class itself, see DocumentConverter API. For conceptual information about pipeline architecture, see Processing Pipelines.
Docling's configuration system is hierarchical, with PipelineOptions as the base class and format-specific options that map input formats to pipelines and their configurations.
Sources: docling/datamodel/pipeline_options.py70-74 docling/datamodel/base_models.py37-43 docling/document_converter.py75-85
The configuration system connects input formats to processing pipelines through the FormatOption class, which specifies the pipeline class to use (e.g., StandardPdfPipeline, SimplePipeline), the options passed to that pipeline, and the backend that parses the input.

Sources: docling/document_converter.py75-85 docling/document_converter.py249-257
Format options map input formats to their processing configuration. The DocumentConverter accepts a format_options dictionary in its constructor.
The base FormatOption class contains the following key attributes:

- pipeline_cls: Pipeline class reference (not an instance)
- pipeline_options: Configuration instance for the pipeline
- backend: Backend class reference
- backend_options: Configuration for backend behavior

Sources: docling/document_converter.py75-85
Docling provides pre-configured format option classes for common formats:
| Format Option Class | Input Format | Default Pipeline | Default Backend |
|---|---|---|---|
| PdfFormatOption | PDF | StandardPdfPipeline | DoclingParseDocumentBackend |
| ImageFormatOption | IMAGE | StandardPdfPipeline | ImageDocumentBackend |
| WordFormatOption | DOCX | SimplePipeline | MsWordDocumentBackend |
| ExcelFormatOption | XLSX | SimplePipeline | MsExcelDocumentBackend |
| PowerpointFormatOption | PPTX | SimplePipeline | MsPowerpointDocumentBackend |
| HTMLFormatOption | HTML | SimplePipeline | HTMLDocumentBackend |
| MarkdownFormatOption | MD | SimplePipeline | MarkdownDocumentBackend |
| AudioFormatOption | AUDIO | AsrPipeline | NoOpBackend |
Sources: docling/document_converter.py87-156
If you don't provide custom format options, DocumentConverter uses defaults from the _get_default_option() function:
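The fallback behaviour can be illustrated with a small standalone sketch. It does not import Docling: DemoFormat, DemoFormatOption, get_default_option, and resolve_format_options are hypothetical stand-ins, and only the pipeline and backend names are taken from the table above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class DemoFormat(str, Enum):
    """Hypothetical stand-in for a few InputFormat members."""
    PDF = "pdf"
    DOCX = "docx"
    HTML = "html"


@dataclass(frozen=True)
class DemoFormatOption:
    """Hypothetical stand-in for FormatOption, holding names only."""
    pipeline_name: str
    backend_name: str


def get_default_option(fmt: DemoFormat) -> DemoFormatOption:
    """Return the default pipeline/backend pairing for a format."""
    defaults = {
        DemoFormat.PDF: DemoFormatOption("StandardPdfPipeline", "DoclingParseDocumentBackend"),
        DemoFormat.DOCX: DemoFormatOption("SimplePipeline", "MsWordDocumentBackend"),
        DemoFormat.HTML: DemoFormatOption("SimplePipeline", "HTMLDocumentBackend"),
    }
    return defaults[fmt]


def resolve_format_options(
    custom: Optional[Dict[DemoFormat, DemoFormatOption]] = None,
) -> Dict[DemoFormat, DemoFormatOption]:
    """Use the custom option where one is given, otherwise the default."""
    custom = custom or {}
    return {fmt: custom.get(fmt) or get_default_option(fmt) for fmt in DemoFormat}


# Override only PDF; DOCX and HTML fall back to their defaults.
resolved = resolve_format_options(
    {DemoFormat.PDF: DemoFormatOption("StandardPdfPipeline", "PyPdfiumDocumentBackend")}
)
```

The point of the design is that a caller only states what differs from the defaults, one format at a time.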
Sources: docling/document_converter.py158-186
The PdfPipelineOptions class configures the StandardPdfPipeline, which performs multi-stage processing including OCR, layout detection, and table structure extraction.
Sources: docling/datamodel/pipeline_options.py1010-1243
OCR Configuration (ocr_options):

- Supported option classes: OcrAutoOptions, TesseractOcrOptions, EasyOcrOptions, RapidOcrOptions, OcrMacOptions
- force_full_page_ocr: Forces OCR on all pages regardless of embedded text

Layout Detection (layout_options):

- Layout model selection (e.g., DOCLING_LAYOUT_EGRET_LARGE)

Table Structure (table_structure_options):

- mode: TableFormerMode.ACCURATE or TableFormerMode.FAST
- do_cell_matching: Aligns detected cells with content

Acceleration (accelerator_options):

- device: CPU, CUDA, MPS, XPU for hardware acceleration
- num_threads: Controls parallelization

Timing and Resource Control:

- document_timeout: Maximum processing time in seconds (default: 180)

Sources: docling/datamodel/pipeline_options.py1010-1243 docling/datamodel/pipeline_options.py121-461 docling/datamodel/pipeline_options.py76-118
Sources: docling/document_converter.py209-260 docling/datamodel/pipeline_options.py1010-1243
Vision-Language Model (VLM) pipelines process documents using multimodal AI models. The configuration system supports both inline models (running locally) and API-based models (remote inference).
Sources: docling/datamodel/pipeline_options.py1360-1449 docling/datamodel/pipeline_options_vlm_model.py1-320
The VlmConvertOptions.from_preset() method provides pre-configured VLM setups:
Available presets can be listed with:
Sources: docling/datamodel/pipeline_options.py1360-1449 docling/datamodel/stage_model_specs.py50-89
For local model inference using HuggingFace models:
Key parameters:
- repo_id: HuggingFace model identifier
- inference_framework: MLX (Apple Silicon), TRANSFORMERS (general), VLLM (high-throughput)
- response_format: Expected output format (DOCTAGS, MARKDOWN, HTML, OTSL)
- load_in_8bit: Quantization to reduce memory usage
- temperature: 0.0 for deterministic output

Sources: docling/datamodel/pipeline_options_vlm_model.py120-320
For remote inference through OpenAI-compatible APIs:
Key parameters:
- url: API endpoint (OpenAI-compatible)
- model: Model identifier for the API
- concurrency: Maximum concurrent API requests
- timeout: Request timeout in seconds
- headers: Custom HTTP headers (e.g., authentication)

Sources: docling/datamodel/pipeline_options_vlm_model.py322-462
For formats that don't require complex multi-stage processing (DOCX, XLSX, HTML, etc.):
Sources: docling/datamodel/pipeline_options.py1005-1008
For audio and video transcription:
Sources: docling/datamodel/pipeline_options.py1246-1282 docling/datamodel/pipeline_options_asr_model.py1-34
Backend options control the behavior of document parsers independently from pipeline processing.
Sources: docling/datamodel/backend_options.py11-28
Sources: docling/datamodel/backend_options.py31-43
Sources: docling/datamodel/backend_options.py46-58
Enrichment models run extra processing steps after initial document parsing (e.g., picture classification, picture description, chart extraction).
Sources: docling/datamodel/pipeline_options.py464-655
Sources: docling/datamodel/pipeline_options.py679-745 docling/datamodel/picture_classification_options.py1-31
Sources: docling/datamodel/pipeline_options.py1010-1243
Here's a comprehensive example showing multiple configuration aspects:
Sources: docling/datamodel/pipeline_options.py1010-1243 docling/document_converter.py209-260
When pipeline_options is not provided in a FormatOption, the pipeline's default options are automatically set:
Sources: docling/document_converter.py79-84
The FormatOption class uses Pydantic's model_validator to ensure pipeline options are initialized:
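The validator pattern can be reproduced in isolation with Pydantic v2. The classes below are simplified stand-ins (all Demo* names are hypothetical), meant only to show how an "after" model validator fills in a missing options object from the pipeline class.

```python
from typing import Optional, Type

from pydantic import BaseModel, ConfigDict, model_validator


class DemoPipelineOptions(BaseModel):
    """Hypothetical stand-in for a pipeline options model."""
    do_ocr: bool = True


class DemoPipeline:
    """Hypothetical stand-in for a pipeline class."""

    @classmethod
    def get_default_options(cls) -> DemoPipelineOptions:
        return DemoPipelineOptions()


class DemoFormatOption(BaseModel):
    """Simplified stand-in for FormatOption."""
    model_config = ConfigDict(arbitrary_types_allowed=True)

    pipeline_cls: Type[DemoPipeline]
    pipeline_options: Optional[DemoPipelineOptions] = None

    @model_validator(mode="after")
    def set_default_options(self) -> "DemoFormatOption":
        # Runs after field validation: fill in defaults only when missing.
        if self.pipeline_options is None:
            self.pipeline_options = self.pipeline_cls.get_default_options()
        return self


implicit = DemoFormatOption(pipeline_cls=DemoPipeline)
explicit = DemoFormatOption(
    pipeline_cls=DemoPipeline,
    pipeline_options=DemoPipelineOptions(do_ocr=False),
)
```

Because the defaults come from the pipeline class itself, each pipeline type can supply its own options type without the format option needing to know about it.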
Sources: docling/document_converter.py79-84
The DocumentConverter caches pipeline instances by hashing their options to avoid redundant initialization:
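The caching idea can be sketched without Docling. The names below (DemoOptions, DemoPipeline, DemoConverter, options_hash) are hypothetical, and the hashing scheme is an assumption chosen for illustration rather than a copy of the library's implementation.

```python
import hashlib
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass
class DemoOptions:
    """Hypothetical pipeline options."""
    do_ocr: bool = True
    table_mode: str = "accurate"


class DemoPipeline:
    """Hypothetical pipeline whose construction is assumed to be expensive."""

    def __init__(self, options: DemoOptions) -> None:
        self.options = options


def options_hash(options: DemoOptions) -> str:
    """Hash a stable text form of the option values."""
    payload = str(sorted(asdict(options).items()))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class DemoConverter:
    """Caches pipelines keyed by (pipeline class, options hash)."""

    def __init__(self) -> None:
        self._pipelines: Dict[Tuple[type, str], DemoPipeline] = {}

    def get_pipeline(self, pipeline_cls: type, options: DemoOptions) -> DemoPipeline:
        key = (pipeline_cls, options_hash(options))
        if key not in self._pipelines:
            # First request for this configuration: build and remember it.
            self._pipelines[key] = pipeline_cls(options)
        return self._pipelines[key]
```

Two requests with equal option values share one pipeline instance, while any changed value produces a different hash and therefore a separate instance.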
Sources: docling/document_converter.py267-272
Key principles:
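- Pass the pipeline class, not an instance: pipeline_cls=StandardPdfPipeline, not StandardPdfPipeline()
- Pass the backend class, not an instance: backend=DoclingParseDocumentBackend, not DoclingParseDocumentBackend()
- The format_options dictionary maps InputFormat enums to FormatOption instances

Sources: docling/document_converter.py209-293 docling/document_converter.py75-85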