This document covers MarkItDown's integration with external command-line tools and third-party libraries that enhance document conversion capabilities. These integrations are optional and require separate installation of system tools or Python packages via feature groups.
For LLM-based image captioning, see LLM Integration for Image Captioning. For Azure Document Intelligence integration, see Azure Document Intelligence Integration.
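MarkItDown integrates with three categories of external tools to extend its conversion capabilities beyond what standard Python libraries can achieve: ExifTool for image and audio metadata, SpeechRecognition with pydub for audio transcription, and youtube-transcript-api for YouTube transcripts.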
Each integration is optional and requires explicit installation of dependencies or system tools.
Sources: packages/markitdown/pyproject.toml36-61 packages/markitdown/tests/test_module_misc.py385-413
ExifTool is a command-line application for reading and writing metadata in various file formats. MarkItDown uses it to extract rich metadata from images and audio files that is not accessible through Python libraries alone.
ExifTool integration provides:
ExifTool location is configured via two methods, in priority order: the exiftool_path parameter passed to the MarkItDown class, then the EXIFTOOL_PATH environment variable, which is used only if no exiftool_path parameter is provided.
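The priority order described above can be sketched as a small resolver. This is a minimal illustration, not MarkItDown's own code: `resolve_exiftool_path` is a hypothetical helper name, and it models only the two documented sources (the explicit parameter and the EXIFTOOL_PATH variable).

```python
import os
from typing import Optional


def resolve_exiftool_path(exiftool_path: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper mirroring the documented priority order."""
    # 1. An explicit exiftool_path argument always wins.
    if exiftool_path is not None:
        return exiftool_path
    # 2. Otherwise fall back to the EXIFTOOL_PATH environment variable.
    #    Returns None when neither source is set.
    return os.environ.get("EXIFTOOL_PATH")
```

With MarkItDown itself, the equivalent choice is made by either passing exiftool_path to the MarkItDown constructor or exporting EXIFTOOL_PATH before the instance is created.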
Sources: packages/markitdown/tests/test_module_misc.py389-407
Sources: packages/markitdown/tests/test_module_misc.py389-402
ExifTool integration is used by:
- ImageConverter: Extracts metadata from JPEG, PNG, TIFF, and other image formats
- AudioConverter: Extracts metadata from MP3, M4A, and other audio formats

The converters receive the exiftool_path from the MarkItDown instance and invoke the binary via subprocess when processing files.
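As a rough sketch of what invoking the binary via subprocess can look like, the example below pipes file bytes to ExifTool and parses its JSON output. The function names `read_metadata` and `parse_exiftool_json` are assumptions made for illustration; the exact arguments the converters pass may differ. The `-json` option and `-` (read from stdin) are standard ExifTool command-line features.

```python
import json
import subprocess
from typing import Any, Dict


def parse_exiftool_json(raw: bytes) -> Dict[str, Any]:
    """Parse `exiftool -json` output: a JSON array with one object per file."""
    records = json.loads(raw.decode("utf-8"))
    return records[0] if records else {}


def read_metadata(exiftool_path: str, data: bytes) -> Dict[str, Any]:
    """Pipe file bytes to exiftool on stdin ("-") and return its metadata."""
    result = subprocess.run(
        [exiftool_path, "-json", "-"],
        input=data,
        capture_output=True,
        check=True,  # raise CalledProcessError if exiftool exits non-zero
    )
    return parse_exiftool_json(result.stdout)
```

Keeping the parsing step separate from the subprocess call makes it possible to exercise the parser without the binary installed.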
When ExifTool is available, image conversion includes structured metadata:
Author: AutoGen Authors
Title: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Description: AutoGen enables diverse LLM-based applications
ImageSize: 1615x1967
DateTimeOriginal: 2024:03:14 22:10:00
For audio files:
Title: Song Title
Artist: Artist Name
Album: Album Name
SampleRate: 48000
Sources: packages/markitdown/tests/test_module_misc.py39-52
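The `Key: value` lines shown above could be produced from a metadata dictionary along these lines. This is a sketch under assumptions: `format_metadata` is a hypothetical helper, and the field selection is passed in by the caller rather than taken from MarkItDown's actual field list.

```python
from typing import Any, Dict, Iterable


def format_metadata(metadata: Dict[str, Any], fields: Iterable[str]) -> str:
    """Render selected metadata fields as 'Key: value' lines, skipping absent ones."""
    lines = [f"{name}: {metadata[name]}" for name in fields if name in metadata]
    return "\n".join(lines)
```

Fields missing from the ExifTool output are simply omitted, so the same function works for both image and audio files.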
ExifTool is a system-level binary that must be installed separately from the Python package.
After installation, ensure exiftool is on the system PATH or configure its location explicitly.
Sources: packages/markitdown/tests/test_module_misc.py35
MarkItDown supports speech-to-text transcription for audio files using the SpeechRecognition library combined with pydub for audio format handling.
Install the audio transcription feature group with pip install 'markitdown[audio-transcription]'.
This installs:
- SpeechRecognition: Speech recognition library with multiple engine backends
- pydub: Audio manipulation library for format conversion

Sources: packages/markitdown/pyproject.toml59
The AudioConverter handles:
Sources: packages/markitdown/tests/test_module_misc.py357-367
Sources: packages/markitdown/tests/test_module_misc.py354-368
SpeechRecognition supports multiple speech recognition backends:
The default configuration uses Google Speech Recognition, which sends audio to Google's servers for processing.
Output includes both transcription and metadata (if ExifTool is available).
Sources: packages/markitdown/tests/test_module_misc.py354-368
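The combination of pydub for format handling and SpeechRecognition for transcription can be sketched as follows. This is an illustration rather than the AudioConverter implementation: `needs_conversion` and `transcribe` are hypothetical names, and the set of formats treated as natively readable reflects what SpeechRecognition's AudioFile documents (WAV, AIFF, FLAC). Both libraries are imported lazily because they are optional; pydub also needs ffmpeg available for formats such as MP3.

```python
import io
from pathlib import Path

# Formats that speech_recognition.AudioFile can read without re-encoding.
NATIVE_FORMATS = {".wav", ".aiff", ".aif", ".flac"}


def needs_conversion(path: str) -> bool:
    """Return True when the file must be re-encoded to WAV before transcription."""
    return Path(path).suffix.lower() not in NATIVE_FORMATS


def transcribe(path: str) -> str:
    """Transcribe an audio file with the Google Web Speech backend."""
    # Imported lazily: SpeechRecognition is an optional dependency.
    import speech_recognition as sr

    if needs_conversion(path):
        # Imported lazily: pydub is only needed for non-native formats.
        from pydub import AudioSegment

        # Re-encode (for example MP3 or M4A) to WAV in memory.
        buffer = io.BytesIO()
        AudioSegment.from_file(path).export(buffer, format="wav")
        buffer.seek(0)
        source_file = buffer
    else:
        source_file = path

    recognizer = sr.Recognizer()
    with sr.AudioFile(source_file) as source:
        audio = recognizer.record(source)
    # Sends the audio to Google's servers; requires network access.
    return recognizer.recognize_google(audio).strip()
```

Because the default backend uploads audio to a remote service, a different recognizer method may be preferable for sensitive recordings.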
The YouTubeConverter fetches transcripts (captions) from YouTube videos using the youtube-transcript-api library.
Install the YouTube transcription feature group with pip install 'markitdown[youtube-transcription]'.
This installs:
- youtube-transcript-api~=1.0.0: Library for fetching YouTube video transcripts

Sources: packages/markitdown/pyproject.toml60
The YouTube integration:
Given a YouTube URL like https://www.youtube.com/watch?v=V2qZ_lgxTzg, the converter produces:
Sources: packages/markitdown/tests/test_module_misc.py59-65
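A minimal sketch of the two steps involved, extracting the video ID from a watch URL and fetching its transcript, might look like this. The helper names `extract_video_id` and `fetch_transcript_text` are assumptions for illustration, and the language preference shown is arbitrary. The `fetch` call follows the youtube-transcript-api 1.0 interface and is imported lazily because the package is optional.

```python
from typing import Optional, Sequence
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the value of the `v` query parameter, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get("v")
    return values[0] if values else None


def fetch_transcript_text(video_id: str, languages: Sequence[str] = ("en",)) -> str:
    """Fetch a transcript and join its snippets into one string (needs network)."""
    # Imported lazily: youtube-transcript-api is an optional dependency.
    from youtube_transcript_api import YouTubeTranscriptApi

    transcript = YouTubeTranscriptApi().fetch(video_id, languages=list(languages))
    return " ".join(snippet.text for snippet in transcript)
```

For the example URL above, `extract_video_id` returns the ID `V2qZ_lgxTzg`, which is then passed to the transcript fetch.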
The converter automatically handles:
External tool integrations are organized into optional dependency groups in pyproject.toml:
| Feature Group | Dependencies | Purpose |
|---|---|---|
| [audio-transcription] | pydub, SpeechRecognition | Audio file transcription |
| [youtube-transcription] | youtube-transcript-api | YouTube video transcript fetching |
| [all] | All optional dependencies | Install everything |
Sources: packages/markitdown/pyproject.toml36-61
When optional dependencies are not installed, MarkItDown handles missing integrations gracefully:
If a converter requires a Python library that is not installed, it will:
- Fail the accepts() check (the converter won't be selected), or
- Raise MissingDependencyException during conversion, with installation instructions

If ExifTool is not available:
Test files skip tests when external tools are unavailable:
Sources: packages/markitdown/tests/test_module_misc.py34-388
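The deferred-failure behavior for missing Python libraries can be sketched with a guarded import that records the error and raises only when conversion is attempted. Everything here is illustrative: the exception class is a local stand-in for MarkItDown's own MissingDependencyException, and `try_import` and `OptionalFeatureConverter` are hypothetical names.

```python
import importlib
import sys


class MissingDependencyException(Exception):
    """Local stand-in for illustration; MarkItDown defines its own class."""


def try_import(module_name: str):
    """Return (module, None) on success or (None, exc_info) on ImportError."""
    try:
        return importlib.import_module(module_name), None
    except ImportError:
        return None, sys.exc_info()


class OptionalFeatureConverter:
    """Hypothetical converter that depends on one optional module."""

    def __init__(self, module_name: str, feature_group: str):
        self._module, self._exc_info = try_import(module_name)
        self._feature_group = feature_group

    def convert(self, data: bytes) -> str:
        # Fail at conversion time, not import time, with an install hint.
        if self._exc_info is not None:
            raise MissingDependencyException(
                "Missing optional dependency. Install it with: "
                f"pip install 'markitdown[{self._feature_group}]'"
            ) from self._exc_info[1]
        return f"converted {len(data)} bytes"
```

Deferring the error keeps the package importable when a feature group is absent, while still telling the user exactly which extra to install.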
The test suite includes comprehensive tests for external tool integration:
Located in packages/markitdown/tests/test_module_misc.py385-413:
Located in packages/markitdown/tests/test_module_misc.py350-368:
Test files with known metadata values are used to verify extraction:
Sources: packages/markitdown/tests/test_module_misc.py39-52
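Skipping tests when an external tool is unavailable, as described above, can be sketched as follows. The example uses the standard library's unittest so that it is self-contained; the project's own test framework, fixtures, and assertions may differ, and `ExifToolAvailabilityTest` is a hypothetical test case.

```python
import shutil
import subprocess
import unittest

# True when no exiftool binary can be found on PATH.
skip_exiftool = shutil.which("exiftool") is None


class ExifToolAvailabilityTest(unittest.TestCase):
    @unittest.skipIf(skip_exiftool, "exiftool is not installed")
    def test_reports_version(self):
        # `exiftool -ver` prints the installed version number.
        result = subprocess.run(
            ["exiftool", "-ver"], capture_output=True, text=True, check=True
        )
        self.assertTrue(result.stdout.strip())
```

On machines without ExifTool the test is reported as skipped rather than failed, so the rest of the suite still passes.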
| Variable | Purpose | Example |
|---|---|---|
| EXIFTOOL_PATH | Location of exiftool binary | /usr/local/bin/exiftool |
Sources: packages/markitdown/tests/test_module_misc.py400-407