Get started with Docling in minutes. This page shows the simplest ways to convert documents using the CLI and Python API.
Docling converts documents from various formats to a unified DoclingDocument representation, which can be exported to multiple output formats.
Sources: docling/document_converter.py189-326 docling/datamodel/document.py28-36
The docling CLI provides the fastest way to convert documents from the command line.
| Format | Option | Description |
|---|---|---|
| Markdown | --to md | Human-readable text (default) |
| JSON | --to json | Full document structure |
| HTML | --to html | Styled HTML output |
| DocTags | --to doctags | XML-like annotations |
For all CLI options, see the CLI reference or run docling --help.
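A CLI call can also be scripted. The sketch below only assembles and runs a `docling <source> --to <format>` command using the format values from the table above; the helper names `build_docling_command` and `run_docling` are hypothetical and not part of Docling, and `report.pdf` in the usage note is a placeholder file name.

```python
import shutil
import subprocess
from typing import List

# Output format values taken from the table above.
OUTPUT_FORMATS = ("md", "json", "html", "doctags")


def build_docling_command(source: str, to: str = "md") -> List[str]:
    """Return the argv list for `docling <source> --to <format>` (hypothetical helper)."""
    if to not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {to!r}")
    return ["docling", source, "--to", to]


def run_docling(source: str, to: str = "md") -> int:
    """Run the CLI and return its exit code; needs the docling executable on PATH."""
    if shutil.which("docling") is None:
        raise RuntimeError("the docling executable was not found on PATH")
    return subprocess.run(build_docling_command(source, to)).returncode
```

Calling `run_docling("report.pdf", to="json")` would then be equivalent to typing `docling report.pdf --to json` in a shell.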
Sources: docling/cli/main.py373-474 README.md88-101
The DocumentConverter class provides programmatic access for integrating Docling into your applications.
Sources: README.md73-82 docling/document_converter.py209-261
The convert() method accepts file paths (string or Path), URLs, or DocumentStream objects and returns a ConversionResult containing the converted DoclingDocument.
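A minimal sketch of a single conversion, assuming Docling is installed. The wrapper `convert_to_markdown` and its optional `converter` argument are illustrative additions (they allow a stand-in object to be passed in tests); only `DocumentConverter`, `convert()`, `document`, and `export_to_markdown()` come from the text above.

```python
from typing import Any, Optional


def convert_to_markdown(source: Any, converter: Optional[Any] = None) -> str:
    """Convert one document and return it as a Markdown string.

    `source` may be a file path (str or Path), a URL, or a DocumentStream.
    """
    if converter is None:
        # Imported lazily so the helper can also be exercised with a stand-in converter.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    result = converter.convert(source)  # a ConversionResult
    return result.document.export_to_markdown()
```

For example, `print(convert_to_markdown("report.pdf"))` would print the Markdown rendering of a local file, where `report.pdf` is a placeholder name.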
Sources: docling/document_converter.py283-326
The convert_all() method processes documents concurrently and returns an iterator of ConversionResult objects.
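Batch conversion might look like the sketch below, which groups the yielded results by status name. The helper `convert_batch` is hypothetical, and passing `raises_on_error=False` is an assumption about the installed Docling version: it is meant to make failures show up in `result.status` rather than stop the loop with an exception.

```python
from typing import Any, Dict, Iterable, List, Optional


def convert_batch(
    sources: Iterable[Any], converter: Optional[Any] = None
) -> Dict[str, List[Any]]:
    """Convert several documents and group the results by status name."""
    if converter is None:
        # Imported lazily so the helper can also be exercised with a stand-in converter.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    grouped: Dict[str, List[Any]] = {}
    # Assumption: raises_on_error=False reports failures via result.status
    # instead of raising on the first document that cannot be converted.
    for result in converter.convert_all(sources, raises_on_error=False):
        grouped.setdefault(result.status.name, []).append(result)
    return grouped
```

Because `convert_all()` returns an iterator, results are consumed one at a time inside the loop; the grouping dictionary is only a convenience for reporting afterwards.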
Sources: docling/document_converter.py327-383
The DoclingDocument class (from docling-core) provides multiple export methods for different output formats.
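As a sketch of the export step, the helper below writes one document to Markdown, JSON, and HTML with the `save_as_*` methods. The function name `save_all_formats` and the output layout (`<out_dir>/<stem>.<ext>`) are illustrative choices, not something Docling prescribes.

```python
from pathlib import Path
from typing import Any, Dict


def save_all_formats(document: Any, out_dir: Path, stem: str) -> Dict[str, Path]:
    """Save a DoclingDocument as Markdown, JSON, and HTML; return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {
        "md": out_dir / f"{stem}.md",
        "json": out_dir / f"{stem}.json",
        "html": out_dir / f"{stem}.html",
    }
    document.save_as_markdown(paths["md"])
    document.save_as_json(paths["json"])
    document.save_as_html(paths["html"])
    return paths
```

With a `ConversionResult` named `result`, a call such as `save_all_formats(result.document, Path("out"), "report")` would produce `out/report.md`, `out/report.json`, and `out/report.html`.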
Sources: docling/datamodel/document.py28-36
Docling automatically detects and processes various document formats:
| Format | Extensions | Use Case |
|---|---|---|
| PDF | .pdf | Technical documents, reports, papers |
| Word | .docx, .dotx, .docm | Office documents |
| PowerPoint | .pptx, .ppsx, .pptm | Presentations |
| Excel | .xlsx, .xlsm | Spreadsheets |
| HTML | .html, .htm, .xhtml | Web pages |
| Markdown | .md | Documentation |
| Images | .png, .jpg, .tiff | Scanned documents, photos |
| Audio/Video | .mp3, .wav, .mp4 | Transcription (ASR) |
| LaTeX | .tex | Scientific documents |
See Supported Formats for the complete list.
Format detection is automatic based on file extension and MIME type.
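Docling performs this detection itself, so no extra code is needed for conversion. The sketch below is only an illustration of the extension table above, useful for pre-filtering a list of files before handing them to the converter; it is not Docling's detection logic, it ignores MIME types, and the names `EXTENSION_TO_FORMAT`, `guess_format`, and `supported_only` are hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Optional

# Extension-to-format lookup copied from the table above (not the complete list).
EXTENSION_TO_FORMAT: Dict[str, str] = {
    ".pdf": "PDF",
    ".docx": "Word", ".dotx": "Word", ".docm": "Word",
    ".pptx": "PowerPoint", ".ppsx": "PowerPoint", ".pptm": "PowerPoint",
    ".xlsx": "Excel", ".xlsm": "Excel",
    ".html": "HTML", ".htm": "HTML", ".xhtml": "HTML",
    ".md": "Markdown",
    ".png": "Images", ".jpg": "Images", ".tiff": "Images",
    ".mp3": "Audio/Video", ".wav": "Audio/Video", ".mp4": "Audio/Video",
    ".tex": "LaTeX",
}


def guess_format(path: str) -> Optional[str]:
    """Return the table's format label for a path, or None if the extension is unlisted."""
    return EXTENSION_TO_FORMAT.get(Path(path).suffix.lower())


def supported_only(paths: Iterable[str]) -> List[str]:
    """Keep only the paths whose extension appears in the table."""
    return [p for p in paths if guess_format(p) is not None]
```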
Sources: docling/datamodel/base_models.py55-103 README.md33
DocumentConverter methods:
- convert(source) - Convert a single document
- convert_all(sources) - Batch convert multiple documents
- convert_string(content, name, format) - Convert string content

ConversionResult attributes:
- document - The converted DoclingDocument
- status - Conversion status (SUCCESS, FAILURE, PARTIAL_SUCCESS)
- errors - List of errors if conversion failed
- pages - Page-level metadata

DoclingDocument export methods:
- export_to_markdown() - Export as Markdown string
- save_as_markdown(filename) - Save as Markdown file
- save_as_json(filename) - Save as JSON
- save_as_html(filename) - Save as HTML

Sources: docling/document_converter.py189-432 docling/datamodel/document.py407-524
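The `status`, `errors`, and `pages` attributes can be combined into a short report after a conversion. The helper `summarize_result` below is a hypothetical sketch; it assumes each error item exposes an `error_message` attribute and falls back to `str()` if it does not.

```python
from typing import Any, Dict


def summarize_result(result: Any) -> Dict[str, Any]:
    """Collect the status name, page count, and error messages of a ConversionResult."""
    return {
        "status": result.status.name,
        "page_count": len(result.pages),
        # Assumption: error items carry an `error_message` attribute.
        "errors": [getattr(e, "error_message", str(e)) for e in result.errors],
    }
```

A caller could, for instance, log the summary and skip exporting any result whose `status` is `FAILURE`.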
Sources: docling/document_converter.py283-383
After getting started with basic conversion:
Sources: docling/cli/main.py368-609 docling/document_converter.py178-609