This document explains the plugin architecture in MarkItDown, which allows third-party packages to extend the system by registering custom DocumentConverter implementations without modifying core code. The plugin system uses Python's entry points mechanism for discovery and a standardized interface for registration.
For implementation details of the DocumentConverter interface that plugins must implement, see DocumentConverter Interface. For a complete walkthrough of the sample RTF converter plugin, see Sample RTF Converter Plugin.
Plugins must implement two required components that MarkItDown uses to discover and initialize converters.
Every plugin package must export a module-level variable declaring its interface version:
This variable enables future interface evolution while maintaining backward compatibility. The current and only supported version is 1. MarkItDown validates this version during plugin loading and will skip plugins with unsupported versions.
packages/markitdown-sample-plugin/src/markitdown_sample_plugin/_plugin.py:13-15
Plugins must implement and export a register_converters() function with the following signature:
This function receives a MarkItDown instance and must call markitdown.register_converter() to attach any converters the plugin provides. The **kwargs parameter allows for future extensibility without breaking existing plugins.
packages/markitdown-sample-plugin/src/markitdown_sample_plugin/_plugin.py:25-31
Plugins use Python's entry points mechanism to advertise themselves to MarkItDown. This approach allows automatic discovery without requiring users to manually import or configure plugins.
Plugins declare themselves in their pyproject.toml file using the markitdown.plugin entry point group:
The entry point format is:
| Component | Description | Example |
|---|---|---|
| Group name | "markitdown.plugin" (required literal) | "markitdown.plugin" |
| Entry point key | Arbitrary identifier for the plugin | sample_plugin |
| Module path | Fully qualified Python module path | "markitdown_sample_plugin" |
The module path must point to a package or module that exports both __plugin_interface_version__ and register_converters().
packages/markitdown-sample-plugin/pyproject.toml:39-41
A single package can register multiple entry points if it provides logically distinct sets of converters. Each entry point should reference a module with its own register_converters() implementation.
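One way to see how the three components fit together is to build the sample plugin's entry point by hand with the standard library's importlib.metadata.EntryPoint. This only illustrates the data model; in practice the entry point comes from the installed package's metadata and is not constructed in code.

```python
from importlib.metadata import EntryPoint

# The three components from the table, using the sample plugin's values.
ep = EntryPoint(
    name="sample_plugin",              # entry point key
    value="markitdown_sample_plugin",  # module path
    group="markitdown.plugin",         # required group name
)

# The module path is what gets imported when the entry point is loaded.
module_path = ep.module
```

A package that registers several entry points would simply have several such records in the same group, each with its own key and module path.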
The following diagram illustrates how MarkItDown discovers and loads plugins during initialization:
Plugin loading occurs during MarkItDown instance construction when enable_plugins=True:
1. Entry points in the markitdown.plugin group are discovered.
2. Each plugin's __plugin_interface_version__ variable is validated.
3. The plugin's register_converters() is called with the MarkItDown instance.
4. The plugin calls markitdown.register_converter() to add converters to the priority-sorted registry.

Plugins are disabled by default and must be explicitly enabled:
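CLI Usage: pass the --use-plugins flag, for example `markitdown --use-plugins path-to-file.rtf`.
Python API Usage: construct the instance with enable_plugins=True, for example `MarkItDown(enable_plugins=True)`.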
This opt-in model ensures predictable behavior when plugins are not needed and avoids potential conflicts between converters.
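The loading sequence and the opt-in flag described in this section can be sketched roughly as follows. This is not MarkItDown's actual loader: load_plugins, FakeHost, and the fake entry points are hypothetical, and a real loader would obtain its entry points from importlib.metadata and might warn rather than silently skip.

```python
import types

SUPPORTED_VERSION = 1  # the only interface version this document describes


def load_plugins(markitdown, entry_points, enable_plugins=False):
    """Hypothetical loader: discover, validate, then register."""
    loaded = []
    if not enable_plugins:  # plugins are opt-in
        return loaded
    for ep in entry_points:  # discovered entry points
        module = ep.load()   # import the plugin module
        version = getattr(module, "__plugin_interface_version__", None)
        if version != SUPPORTED_VERSION:  # skip unsupported or missing versions
            continue
        module.register_converters(markitdown)  # plugin registers its converters
        loaded.append(ep.name)
    return loaded


class FakeHost:
    """Hypothetical stand-in for a MarkItDown instance."""

    def __init__(self):
        self.converters = []

    def register_converter(self, converter):
        self.converters.append(converter)


def make_fake_entry_point(version):
    """Build an object with the .name and .load() shape of an entry point."""
    module = types.SimpleNamespace(
        __plugin_interface_version__=version,
        register_converters=lambda md, **kwargs: md.register_converter(object()),
    )
    return types.SimpleNamespace(name=f"plugin_v{version}", load=lambda: module)


plugins = [make_fake_entry_point(1), make_fake_entry_point(2)]
host = FakeHost()
loaded = load_plugins(host, plugins, enable_plugins=True)
skipped = load_plugins(FakeHost(), plugins)  # default: plugins stay disabled
```

Only the version 1 plugin is loaded in this sketch, and nothing is loaded at all when enable_plugins is left at its default.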
The following diagram maps the plugin registration flow to specific code entities:
Plugins register converters by calling the MarkItDown.register_converter() method:
packages/markitdown-sample-plugin/src/markitdown_sample_plugin/_plugin.py:25-31
The register_converter() method adds the converter to the internal _converters list and maintains priority ordering. Converters with lower priority values are tried first during file type matching, so a lower value means higher precedence.
The plugin interface is at version 1, which is the initial and only version currently supported. This version defines:
- __plugin_interface_version__ variable requirement
- register_converters(markitdown, **kwargs) function signature
- markitdown.plugin entry point group
- DocumentConverter interface for converter implementations

packages/markitdown-sample-plugin/src/markitdown_sample_plugin/_plugin.py:13-15
The versioning system allows MarkItDown to evolve the plugin interface while maintaining backward compatibility:
| Scenario | Behavior |
|---|---|
| Plugin version matches MarkItDown support | Plugin loaded and converters registered |
| Plugin version not supported | Plugin skipped with warning |
| Missing __plugin_interface_version__ | Plugin skipped as invalid |
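The three scenarios in the table can be expressed as a small classification function. This is a sketch under stated assumptions: check_plugin and SUPPORTED_VERSIONS are hypothetical names, and the warning text is made up for illustration.

```python
import types
import warnings

SUPPORTED_VERSIONS = {1}  # assumed set of supported interface versions


def check_plugin(module):
    """Return 'loaded', 'unsupported', or 'invalid' for a plugin module."""
    version = getattr(module, "__plugin_interface_version__", None)
    if version is None:
        return "invalid"  # variable missing: plugin skipped as invalid
    if version not in SUPPORTED_VERSIONS:
        warnings.warn(f"Skipping plugin with unsupported interface version {version}")
        return "unsupported"
    return "loaded"


# Hypothetical plugin modules, one per table row.
supported = types.SimpleNamespace(__plugin_interface_version__=1)
too_new = types.SimpleNamespace(__plugin_interface_version__=2)
missing = types.SimpleNamespace()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = [check_plugin(m) for m in (supported, too_new, missing)]
```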
Future interface versions might introduce:
- Additional parameters passed to register_converters()
- Additional capabilities on the MarkItDown instance for plugin use

Plugins should always use **kwargs in register_converters() to accept future parameters gracefully.
All plugins must declare a dependency on a compatible version of the markitdown package:
packages/markitdown-sample-plugin/pyproject.toml:26-29
Plugins can declare additional dependencies needed for their converters. For the RTF sample plugin, the striprtf library is required:
packages/markitdown-sample-plugin/pyproject.toml:26-29
Users who install the plugin automatically get all required dependencies through pip's dependency resolution.
MarkItDown provides a CLI option, `markitdown --list-plugins`, to list all discovered plugins.
This command queries the markitdown.plugin entry points and displays the installed plugins it finds.
Users should run this command after installing a plugin to verify it was registered correctly.
packages/markitdown-sample-plugin/README.md:84-87
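The same entry points can also be inspected directly from Python with the standard library, which can help when debugging an installation. The sketch below is not the CLI's implementation; list_plugins is a hypothetical helper, and the output format is arbitrary.

```python
from importlib.metadata import entry_points

GROUP = "markitdown.plugin"


def list_plugins():
    """Return sorted (name, module path) pairs for installed plugin entry points."""
    eps = entry_points()
    # Python 3.10+ exposes select(); older versions return a dict keyed by group.
    found = eps.select(group=GROUP) if hasattr(eps, "select") else eps.get(GROUP, [])
    return sorted((ep.name, ep.value) for ep in found)


for name, value in list_plugins():
    print(f"{name} -> {value}")
```

If no plugin is installed, the function returns an empty list and nothing is printed.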
Plugins must provide classes that inherit from DocumentConverter and implement the required interface:
The RtfConverter implementation in the sample plugin demonstrates this pattern:
packages/markitdown-sample-plugin/src/markitdown_sample_plugin/_plugin.py:34-71
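To give a feel for the general shape of a converter, here is a minimal sketch that runs without markitdown installed. It assumes a stream-based accepts()/convert() pair; StreamInfo, DocumentConverterResult, and DocumentConverter below are simplified stand-ins, and ShoutConverter with its .shout extension is invented for illustration. The authoritative signatures are the ones described in DocumentConverter Interface.

```python
import io
from dataclasses import dataclass
from typing import BinaryIO, Optional


@dataclass
class StreamInfo:  # stand-in: only the field used in this sketch
    extension: Optional[str] = None


@dataclass
class DocumentConverterResult:  # stand-in result type
    markdown: str


class DocumentConverter:  # stand-in base class with an assumed interface
    def accepts(self, file_stream: BinaryIO, stream_info: StreamInfo, **kwargs) -> bool:
        raise NotImplementedError

    def convert(
        self, file_stream: BinaryIO, stream_info: StreamInfo, **kwargs
    ) -> DocumentConverterResult:
        raise NotImplementedError


class ShoutConverter(DocumentConverter):
    """Hypothetical converter for a made-up .shout extension."""

    def accepts(self, file_stream, stream_info, **kwargs):
        # Decide from metadata alone whether this converter applies.
        return (stream_info.extension or "").lower() == ".shout"

    def convert(self, file_stream, stream_info, **kwargs):
        # Read the bytes, decode them, and emit upper-cased text as Markdown.
        text = file_stream.read().decode("utf-8")
        return DocumentConverterResult(markdown=text.upper())


converter = ShoutConverter()
info = StreamInfo(extension=".shout")
accepted = converter.accepts(io.BytesIO(b""), info)
rejected = converter.accepts(io.BytesIO(b""), StreamInfo(extension=".rtf"))
result = converter.convert(io.BytesIO(b"hello"), info)
```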
Converters can specify priority during construction. The default priority for format-specific converters is DocumentConverter.PRIORITY_SPECIFIC_FILE_FORMAT:
Converters with higher precedence (lower priority values) are checked first during file type matching. See DocumentConverter Interface for details on priority values.
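The ordering behavior can be sketched with a toy registry. Several details here are assumptions and not confirmed by this page: the numeric constants (0.0 for specific formats, 10.0 for generic ones), lower values being tried first, and newer registrations being placed ahead of older ones at the same priority. How the priority reaches the registry (constructor or registration call) is abstracted away, and Registry itself is hypothetical.

```python
PRIORITY_SPECIFIC_FILE_FORMAT = 0.0  # assumed value
PRIORITY_GENERIC_FILE_FORMAT = 10.0  # assumed value


class Registry:
    """Hypothetical priority-ordered converter registry."""

    def __init__(self):
        self._converters = []  # list of (priority, name) pairs

    def register_converter(self, name, priority=PRIORITY_SPECIFIC_FILE_FORMAT):
        # Newest registrations go to the front of the list.
        self._converters.insert(0, (priority, name))

    def ordered(self):
        # sorted() is stable, so entries with equal priority keep list order.
        ranked = sorted(self._converters, key=lambda item: item[0])
        return [name for _, name in ranked]


registry = Registry()
registry.register_converter("plain_text", priority=PRIORITY_GENERIC_FILE_FORMAT)
registry.register_converter("docx")
registry.register_converter("rtf_plugin")  # registered last, same priority as docx
order = registry.ordered()
```

Under these assumptions the plugin converter is tried before the built-in converter with the same priority, and the generic fallback is tried last.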
The following table shows the minimal file structure for a plugin package:
| File | Purpose | Required Exports |
|---|---|---|
| pyproject.toml | Package metadata and entry point declaration | Entry point in [project.entry-points."markitdown.plugin"] |
| __init__.py or module file | Plugin implementation | __plugin_interface_version__, register_converters() |
| Converter modules | DocumentConverter implementations | Converter classes |
The sample plugin demonstrates this structure:
markitdown-sample-plugin/
├── pyproject.toml # Entry point: sample_plugin = "markitdown_sample_plugin"
├── src/
│   └── markitdown_sample_plugin/
│       ├── __init__.py
│       ├── __about__.py    # Version: 0.1.0a1
│       └── _plugin.py      # __plugin_interface_version__ = 1
│                           # register_converters()
│                           # RtfConverter class
packages/markitdown-sample-plugin/pyproject.toml:40-41
packages/markitdown-sample-plugin/src/markitdown_sample_plugin/__about__.py:4
packages/markitdown-sample-plugin/src/markitdown_sample_plugin/_plugin.py:13-14
For local development and testing, install the plugin in editable mode, for example by running `pip install -e .` from the plugin's directory.
This allows modifications to the plugin code without reinstalling.
packages/markitdown-sample-plugin/README.md:79-81
For production use, install the plugin from PyPI or a wheel:
After installation, verify the plugin is discoverable by running `markitdown --list-plugins`.
Then test conversion with the plugin enabled, for example `markitdown --use-plugins path-to-file.rtf`.
packages/markitdown-sample-plugin/README.md:89-93