Plugin System

Purpose and Scope

This document describes Docling's plugin architecture, which enables extensibility through custom implementations of core components. The plugin system allows developers to register and use custom models for OCR, layout detection, table structure recognition, vision-language processing, and enrichment tasks without modifying Docling's core codebase.

For information about configuring specific model types, see Configuration and Pipeline Options. For details on the built-in model implementations, see AI/ML Models.

Sources: pyproject.toml:70, pyproject.toml:85-86, CHANGELOG.md:107

Architecture Overview

Docling's plugin system is built on the pluggy library and Python's entry point mechanism. It uses a factory pattern where plugins register implementations that factories can instantiate based on configuration.

Diagram 1: Plugin System Architecture

The plugin system operates through these layers:

Entry Point System: Plugins register themselves via Python package entry points in their pyproject.toml files
Plugin Manager: The pluggy-based manager discovers and loads all registered plugins at runtime
Factory Layer: Factories use the plugin registry to instantiate models based on user configuration
Model Specifications: Configuration objects specify which plugin implementation to use
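
The layers above can be sketched in plain Python. This is a minimal illustration of the registry-plus-factory idea only; every name in it (`EchoOcrOptions`, `EchoOcrModel`, `echo_plugin`, `OcrFactory`) is hypothetical and does not correspond to Docling's actual classes, and it uses a plain function where Docling uses pluggy hooks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Type


@dataclass
class EchoOcrOptions:
    """Hypothetical options object; `kind` names the implementation to use."""
    kind: str = "echo"
    lang: str = "en"


class EchoOcrModel:
    """Hypothetical OCR implementation that just echoes its input."""
    kind = "echo"

    def __init__(self, options: EchoOcrOptions) -> None:
        self.options = options

    def __call__(self, text: str) -> str:
        return f"[{self.options.lang}] {text}"


def echo_plugin() -> List[Type]:
    """Stand-in for a plugin hook: returns the classes the plugin contributes."""
    return [EchoOcrModel]


class OcrFactory:
    """Collects classes from plugin hooks and instantiates them on demand."""

    def __init__(self) -> None:
        self._registry: Dict[str, Type] = {}

    def load(self, hooks: List[Callable[[], List[Type]]]) -> None:
        # Plugin manager role: ask every plugin what it provides.
        for hook in hooks:
            for cls in hook():
                self._registry[cls.kind] = cls

    def create(self, options):
        # Factory role: the configuration object decides which class is built.
        try:
            cls = self._registry[options.kind]
        except KeyError:
            raise ValueError(f"no OCR implementation registered for {options.kind!r}") from None
        return cls(options)


factory = OcrFactory()
factory.load([echo_plugin])
model = factory.create(EchoOcrOptions(lang="de"))
print(model("hello"))  # [de] hello
```

The pipeline code only ever talks to the factory, which is why a newly registered implementation needs no change on the caller's side.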

Sources: pyproject.toml:70, pyproject.toml:85-86, CHANGELOG.md:107

Entry Point Registration

Plugins register themselves using Python's entry point mechanism. The entry point group name is docling.

Default Plugin Registration

Docling's built-in models are registered through a default plugin. Its entry point, declared in Docling's own pyproject.toml, references the module docling.models.plugins.defaults, which registers all built-in model implementations.
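
To make the shape of that declaration concrete, the sketch below builds the equivalent entry point object with the standard library. The group and module path come from the text above; the entry point name `docling_defaults` is an assumption and should be checked against Docling's pyproject.toml.

```python
from importlib.metadata import EntryPoint

default_ep = EntryPoint(
    name="docling_defaults",  # assumed name, not confirmed by the text above
    value="docling.models.plugins.defaults",
    group="docling",
)

print(default_ep.group)   # the group Docling scans for plugins
print(default_ep.module)  # the module imported when the entry point is loaded
```

An entry point is just this triple of group, name, and target; the packaging metadata stores it and the plugin manager reads it back at runtime.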

Sources: pyproject.toml:85-86

External Plugin Registration

External packages can register plugins by declaring an entry point in the same docling group in their own pyproject.toml.

The plugin module must implement the required hook specifications that pluggy expects. When installed, the plugin is automatically discovered by Docling's plugin manager.
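
One way to see which packages have declared such entry points in the current environment is to query the packaging metadata directly. This sketch uses only the standard library; the helper name `docling_entry_points` is made up for illustration, and the result is simply empty if nothing is installed under the group.

```python
from importlib.metadata import entry_points


def docling_entry_points():
    """Return the entry points installed under the `docling` group."""
    eps = entry_points()
    if hasattr(eps, "select"):           # Python 3.10 and later
        return list(eps.select(group="docling"))
    return list(eps.get("docling", []))  # Python 3.9 fallback


for ep in docling_entry_points():
    print(f"{ep.name} -> {ep.value}")
```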

Sources: pyproject.toml:85-86, CHANGELOG.md:832

Plugin Types and Extension Points

Docling provides extension points for five primary model categories. Each category follows a factory pattern where plugins register implementations that factories can instantiate.

Diagram 2: Plugin Extension Points and Built-in Implementations

Sources: CHANGELOG.md:107, CHANGELOG.md:272, CHANGELOG.md:543, CHANGELOG.md:62

OCR Engine Plugins

OCR plugins provide text extraction capabilities. Multiple OCR engine implementations can be registered side by side; the AutoOCR selector automatically chooses among the engines that are available.

Built-in implementations:

RapidOCR (default): Fast GPU-accelerated OCR
EasyOCR: Alternative OCR engine
Tesseract: Open-source OCR engine with PSM options support
OCRMac: macOS-native OCR (macOS only)
AutoOCR: Automatic engine selection based on availability

OCR is configured through OcrOptions, which specifies the engine type and its parameters.
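
As an illustration of availability-based selection, the sketch below picks the first engine whose backing Python module can be imported. The preference order, the module names, and the `pick_engine` helper are assumptions made for this example; they are not Docling's actual AutoOCR logic.

```python
from importlib.util import find_spec
from typing import Callable, Optional, Sequence, Tuple

# (engine name, importable module) pairs in preference order.
# Both the ordering and the module names are illustrative assumptions.
CANDIDATES: Sequence[Tuple[str, str]] = (
    ("rapidocr", "rapidocr"),
    ("easyocr", "easyocr"),
    ("tesseract", "tesserocr"),
    ("ocrmac", "ocrmac"),
)


def is_installed(module: str) -> bool:
    """True if `module` can be imported in the current environment."""
    return find_spec(module) is not None


def pick_engine(
    candidates: Sequence[Tuple[str, str]] = CANDIDATES,
    available: Callable[[str], bool] = is_installed,
) -> Optional[str]:
    """Return the first engine whose backing module is available, else None."""
    for engine, module in candidates:
        if available(module):
            return engine
    return None


print(pick_engine())  # result depends on what is installed locally
```

Passing the availability check in as a parameter keeps the selection rule testable without installing any OCR engine.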

Sources: CHANGELOG.md:272, CHANGELOG.md:273, CHANGELOG.md:186

Layout Model Plugins

Layout model plugins detect document structure (paragraphs, headings, tables, figures, etc.). Multiple implementations can be registered and selected via LayoutOptions.

Built-in implementations:

Heron (default): Layout detection model
RT-DETR: Alternative layout detection approach
Egret: Additional layout model option

The factory pattern allows runtime selection: LayoutOptions(model="heron") or LayoutOptions(model="rt-detr").

Sources: CHANGELOG.md:386, CHANGELOG.md:543

Table Structure Plugins

Table structure plugins parse detected table regions into cell structures. Plugins must handle both cell detection and structure recognition (rows, columns, spans).

Built-in implementations:

TableFormer: Default implementation using OTSL (Optimized Table-Structure Language) tokenization
TableCropsLayoutModel: Experimental alternative approach

Configuration uses TableStructureOptions with parameters like do_cell_matching and mode.

Sources: CHANGELOG.md:106, CHANGELOG.md:53

VLM (Vision-Language Model) Plugins

VLM plugins provide vision-language model capabilities for document understanding. The system supports multiple backends (Transformers, MLX, vLLM, API-based).

Built-in implementations:

GraniteDocling: IBM's granite-docling model
SmolDocling: Smaller VLM option
Phi-4, Pixtral, DeepSeek-OCR: Additional VLM options
Backend support: Transformers, MLX (Apple Silicon), vLLM, Ollama, LM Studio, watsonx.ai

VLM plugins can be inline (locally executed) or API-based (remote inference).

Sources: CHANGELOG.md:343, CHANGELOG.md:29, CHANGELOG.md:62, CHANGELOG.md:545

Enrichment Model Plugins

Enrichment plugins add metadata and classifications to document elements (code blocks, formulas, picture descriptions).

Built-in implementations:

Picture Classifier v2.0: Classifies detected pictures
Code/Formula enrichment: Identifies and marks code blocks and mathematical formulas

Sources: CHANGELOG.md:11, CHANGELOG.md:446

Plugin Discovery and Loading

The plugin discovery process occurs during Docling initialization. The following diagram shows the loading sequence:

Diagram 3: Plugin Discovery and Loading Sequence

Discovery Process

Entry Point Scanning: At startup, pluggy scans all installed packages for entry points in the docling group
Module Loading: Each discovered entry point's module is imported
Hook Registration: Plugins register their implementations by calling registration functions
Factory Integration: Factories query the plugin registry when creating models
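
The four steps can be walked through with the standard library alone. In this sketch a throwaway in-memory module stands in for an installed plugin package, and a plain dictionary stands in for the plugin registry; the module name `demo_docling_plugin`, the class `DemoOcrModel`, and the hook name `ocr_engines` with its return shape are all assumptions for illustration.

```python
import sys
import types
from importlib.metadata import EntryPoint

# Build a throwaway module standing in for an installed plugin package.
plugin = types.ModuleType("demo_docling_plugin")


class DemoOcrModel:
    """Hypothetical implementation contributed by the plugin."""


def ocr_engines():
    """Hypothetical hook: report the classes this plugin contributes."""
    return {"ocr_engines": [DemoOcrModel]}


plugin.ocr_engines = ocr_engines
sys.modules[plugin.__name__] = plugin

# 1. Entry point scanning would yield an object like this one.
ep = EntryPoint(name="demo", value="demo_docling_plugin", group="docling")

# 2. Module loading: import the module the entry point names.
loaded = ep.load()

# 3. Hook registration: collect what the module reports into a registry.
registry = {}
for group, classes in loaded.ocr_engines().items():
    registry.setdefault(group, []).extend(classes)

# 4. Factory integration: a factory would now look classes up in `registry`.
print(registry["ocr_engines"])
```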

CLI Plugin Inspection

The CLI provides a command to inspect registered plugins. This command displays:

All registered plugins
Available implementations for each plugin type
Layout and Table model options

Sources: CHANGELOG.md:56

Creating Custom Plugins

To create a custom plugin, follow this pattern:

Step 1: Project Structure

Create a Python package with the following structure:

my_docling_plugin/
├── pyproject.toml
├── src/
│   └── my_plugin/
│       ├── __init__.py
│       └── docling_plugin.py

Step 2: Define Entry Point

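In pyproject.toml, declare an entry point in the docling group that points at the plugin module (in the layout above, my_plugin.docling_plugin).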

Step 3: Implement Plugin Module

The plugin module must implement hook functions that register model implementations. The exact hook specifications depend on the plugin type.

For example, an OCR plugin would register an OCR engine class, a layout plugin would register a layout model class, and so on. Registration typically happens when the plugin manager calls the hook functions that the plugin module defines.
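
A minimal sketch of what docling_plugin.py might contain for an OCR plugin follows. It is self-contained for illustration: the options and model classes are hypothetical stand-ins (a real plugin would build on Docling's own base classes), and the hook name `ocr_engines` with its dictionary return shape is an assumption to verify against the Docling version in use.

```python
from dataclasses import dataclass
from typing import ClassVar, Dict, List, Type


@dataclass
class MyOcrOptions:
    """Hypothetical options; `kind` is the key users select the engine by."""
    kind: ClassVar[str] = "my_ocr"
    lang: str = "en"


class MyOcrModel:
    """Hypothetical engine; a real one would subclass Docling's OCR base class."""

    def __init__(self, options: MyOcrOptions) -> None:
        self.options = options

    @classmethod
    def get_options_type(cls) -> Type[MyOcrOptions]:
        # Lets a factory map an options object back to this class.
        return MyOcrOptions


def ocr_engines() -> Dict[str, List[type]]:
    """Hook function (assumed name): list the OCR classes this plugin provides."""
    return {"ocr_engines": [MyOcrModel]}
```

Keeping the hook function free of heavy imports is a reasonable design choice, since it runs during discovery for every installed plugin.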

Step 4: Install and Verify

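Install the plugin package into the same Python environment as Docling, then verify registration with the CLI plugin inspection command described above. Your plugin should appear in the output.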

Sources: pyproject.toml:85-86, CHANGELOG.md:56

Plugin Configuration

Once plugins are registered, they can be selected through configuration objects. Each plugin type uses specific option classes:

| Plugin Type | Configuration Class | Selection Method |
|---|---|---|
| OCR | `OcrOptions` | Specify `engine` parameter |
| Layout | `LayoutOptions` | Specify `model` parameter |
| Table | `TableStructureOptions` | Specify `mode` parameter |
| VLM | `VlmOptions` | Specify model spec and backend |
| Enrichment | `ConvertPipelineOptions` | Enable via flags |

The factory layer uses these options to instantiate the selected plugin implementations.

Sources: CHANGELOG.md:543, CHANGELOG.md:53

Factory Pattern Integration

The plugin system integrates with Docling's factory pattern. Factories are responsible for:

Querying the plugin registry for available implementations
Selecting the implementation based on configuration
Instantiating the model with appropriate parameters
Managing hardware acceleration settings

Diagram 4: Factory Pattern with Plugin Registry

Factories abstract the instantiation logic, allowing pipelines to work with any registered implementation without modification.

Sources: CHANGELOG.md:107

External Plugin Examples

The Docling ecosystem includes community-developed plugins:

docling-surya

A plugin providing SuryaOCR integration as an alternative OCR engine. It is installed as a separate Python package and, once installed, can be selected through the OCR configuration.

Sources: CHANGELOG.md:137

Hardware Acceleration

Plugins can leverage hardware acceleration through AcceleratorOptions. The factory layer passes the acceleration settings to plugin implementations.

Plugin implementations must respect these settings when initializing models. For CUDA-based models, this typically means moving the model to the specified device. On Apple Silicon, the MPS device uses Metal Performance Shaders; on Intel GPUs, the XPU device uses Intel Extension for PyTorch.
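
A plugin usually has to turn the requested device into one that actually exists on the machine. The sketch below shows one possible resolution rule in plain Python; the `resolve_device` helper, the "auto" preference order, and the fall-back-to-CPU behaviour are assumptions for illustration, not Docling's implementation.

```python
from typing import Iterable

# Preference order used when the caller asks for "auto" (an assumption).
AUTO_ORDER = ("cuda", "mps", "xpu", "cpu")


def resolve_device(requested: str, available: Iterable[str]) -> str:
    """Map a requested device to one that is actually available.

    `requested` is "auto" or a device name; `available` lists the devices
    the runtime reports. CPU is always treated as available.
    """
    usable = set(available) | {"cpu"}
    if requested == "auto":
        return next(d for d in AUTO_ORDER if d in usable)
    # A device index such as "cuda:1" is matched on its base name.
    base = requested.split(":", 1)[0]
    return requested if base in usable else "cpu"


print(resolve_device("auto", ["mps"]))     # mps
print(resolve_device("cuda:1", ["cuda"]))  # cuda:1
print(resolve_device("cuda", []))          # cpu
```

In a real plugin the `available` list would come from the ML framework's own availability checks.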

Sources: CHANGELOG.md:47, CHANGELOG.md:16

Dependency Management

Plugins are installed as Python packages with their own dependencies. Docling itself declares pluggy as the core dependency for the plugin system.

Plugins declare their dependencies in their own pyproject.toml, keeping them isolated from Docling's core dependencies. This allows plugins to use different versions of ML frameworks or introduce new dependencies without affecting Docling's core.

Optional dependencies for built-in model implementations are declared as extras in Docling's pyproject.toml.

Sources: pyproject.toml:70, pyproject.toml:93-112

Thread Safety

Plugin implementations must be thread-safe when used in concurrent contexts, particularly in the ThreadedStandardPdfPipeline. The plugin system itself is thread-safe, but individual plugin implementations must handle concurrent access appropriately.

For non-thread-safe backends (e.g., pypdfium2), Docling provides synchronization primitives. Plugins should document their thread-safety characteristics and use appropriate locking mechanisms where needed.
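
One common locking approach is to wrap a non-thread-safe backend so that only one thread enters it at a time. The sketch below is a generic illustration with hypothetical `CountingBackend` and `LockedModel` classes; it does not use Docling's own synchronization primitives.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingBackend:
    """Hypothetical backend whose state is not safe to mutate concurrently."""

    def __init__(self) -> None:
        self.calls = 0

    def run(self, page: int) -> int:
        self.calls += 1  # read-modify-write: racy without a lock
        return page * 2


class LockedModel:
    """Serializes access to the backend so threads cannot interleave calls."""

    def __init__(self, backend: CountingBackend) -> None:
        self._backend = backend
        self._lock = threading.Lock()

    def __call__(self, page: int) -> int:
        with self._lock:
            return self._backend.run(page)


backend = CountingBackend()
model = LockedModel(backend)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(model, range(100)))

print(backend.calls, results[:3])  # 100 [0, 2, 4]
```

A single coarse lock trades throughput for safety; a plugin whose backend is already thread-safe can skip it and say so in its documentation.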

Sources: CHANGELOG.md:202

Summary

The Docling plugin system provides a flexible, standardized way to extend functionality:

Entry Point Based: Uses Python's package entry points for automatic discovery
Factory Integration: Seamlessly integrates with Docling's factory pattern
Type Safety: Configuration objects provide type-safe selection of implementations
Hardware Aware: Supports GPU acceleration through AcceleratorOptions
Isolated Dependencies: Each plugin manages its own dependency requirements
CLI Support: Built-in commands for inspecting registered plugins

The architecture enables both built-in model variety and community-developed extensions without modifying Docling's core codebase.

Sources: pyproject.toml:70, pyproject.toml:85-86, CHANGELOG.md:107, CHANGELOG.md:56
