This page describes the architecture of rustdoc, the Rust documentation tool located in src/librustdoc/. It covers how rustdoc transforms compiler-internal data structures into documentation output—the clean IR transformation, HTML and JSON rendering backends, search index construction, intra-doc link resolution, and the rustbook/mdBook wrapper for prose documentation.
For information on how rustdoc tests are executed (UI tests, doctest modes), see Testing Infrastructure. For how rustdoc is invoked as part of the build system, see Bootstrap Build System.
Rustdoc reuses the full rustc compilation pipeline up through type-checking, then takes the resulting HIR and type information and transforms it into documentation output.
Rustdoc Pipeline
Sources: src/librustdoc/clean/utils.rs:39-88, src/librustdoc/visit_ast.rs:1-30, src/librustdoc/html/render/mod.rs:1-30, src/librustdoc/json/mod.rs:1-20
The clean module (src/librustdoc/clean/) defines rustdoc's primary intermediate representation (IR), commonly called the cleaned AST. This IR is closer to source code than either HIR or rustc_middle::ty, which simplifies rendering. Both the HTML and JSON backends consume the same clean IR.
Clean IR Core Types
Sources: src/librustdoc/clean/types.rs:54-363
Item wraps Box<ItemInner>, so each element of a Vec<Item> occupies a single pointer (8 bytes on 64-bit targets). The ItemId enum covers three cases:
| Variant | Description |
|---|---|
| ItemId::DefId(DefId) | Normal items with a compiler DefId |
| ItemId::Auto { trait_, for_ } | Synthesized auto-trait impls (e.g., Send, Sync) |
| ItemId::Blanket { impl_id, for_ } | Synthesized blanket impls (e.g., impl<T: Foo> Bar for T) |
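To make the three ItemId cases and the boxing trick concrete, here is a minimal sketch using simplified stand-in types. The `DefId` struct below is a placeholder (the real one comes from rustc), and `is_synthesized` is a hypothetical helper added only for illustration.

```rust
// Placeholder for rustc's DefId; the real type is defined by the compiler.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct DefId {
    krate: u32,
    index: u32,
}

// Mirrors the three variants listed in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ItemId {
    DefId(DefId),
    Auto { trait_: DefId, for_: DefId },
    Blanket { impl_id: DefId, for_: DefId },
}

impl ItemId {
    // Hypothetical helper: true for impls that rustdoc synthesizes itself.
    fn is_synthesized(&self) -> bool {
        !matches!(self, ItemId::DefId(_))
    }
}

// Stand-in payload; the real ItemInner carries many more fields.
struct ItemInner {
    name: Option<String>,
    item_id: ItemId,
}

// Boxing the payload keeps Item itself one pointer wide.
struct Item {
    inner: Box<ItemInner>,
}

fn main() {
    let send = DefId { krate: 1, index: 10 };
    let my_type = DefId { krate: 0, index: 3 };
    let auto = ItemId::Auto { trait_: send, for_: my_type };
    let item = Item {
        inner: Box::new(ItemInner { name: None, item_id: auto }),
    };

    assert!(item.inner.item_id.is_synthesized());
    assert!(!ItemId::DefId(my_type).is_synthesized());
    // A Box of a sized type is a thin pointer.
    assert_eq!(std::mem::size_of::<Item>(), std::mem::size_of::<usize>());
    println!("name = {:?}, id = {:?}", item.inner.name, item.inner.item_id);
}
```

Moving or reordering items in a Vec then copies only pointers, regardless of how large the payload grows.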
The clean module defines two families of transformation functions, distinguished by prefix:
| Family | Input | Used for |
|---|---|---|
| clean_* (HIR) | hir::* types | User-written code, inlined local re-exports |
| clean_middle_* | rustc_middle::ty::* types | Cross-crate re-exports, auto/blanket impl synthesis |
Key entry points:
- clean_doc_module(): converts a visit_ast::Module into a clean::Item (src/librustdoc/clean/mod.rs:65-167)
- clean_ty(), clean_generics(), clean_where_predicate(): HIR cleaning
- clean_middle_ty(), clean_middle_const(), clean_predicate(): ty::* cleaning

RustdocVisitor (src/librustdoc/visit_ast.rs) performs a first pass over the HIR to build a visit_ast::Module tree before any cleaning occurs. The Module struct (src/librustdoc/visit_ast.rs:28-52) holds:
- items: FxIndexMap (non-glob items, preserving insertion order)
- mods: Vec<Module> (child modules)
- foreigns (extern "C" items)
- inlined_foreigns (cross-crate inlined re-exports)

The visitor handles #[doc(hidden)], #[doc(inline)], and glob re-exports (pub use foo::*), deciding which items to include before the cleaning step runs.
krate() — the top-level entry point (src/librustdoc/clean/utils.rs39-88):
RustdocVisitor::new(cx).visit() → visit_ast::Module
clean_doc_module(&module, cx) → clean::Item (ModuleItem)
Crate { module, external_traits }
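The three steps above can be pictured with a toy model. This is only a sketch under simplifying assumptions: the `Module`, `Item`, and `Crate` types below are drastically reduced stand-ins, the functions take no context argument, and `count_items` is a hypothetical helper used to inspect the result.

```rust
// Toy stand-in for visit_ast::Module: names only, no HIR.
struct Module {
    name: String,
    items: Vec<String>,
    mods: Vec<Module>,
}

#[derive(Debug)]
enum ItemKind {
    Module(Vec<Item>),
    Other,
}

// Toy stand-in for clean::Item.
#[derive(Debug)]
struct Item {
    name: String,
    kind: ItemKind,
}

struct Crate {
    module: Item,
}

// Recursively converts the visitor's module tree into a cleaned module item.
fn clean_doc_module(module: &Module) -> Item {
    let mut children: Vec<Item> = module
        .items
        .iter()
        .map(|name| Item { name: name.clone(), kind: ItemKind::Other })
        .collect();
    children.extend(module.mods.iter().map(clean_doc_module));
    Item { name: module.name.clone(), kind: ItemKind::Module(children) }
}

// Top-level entry point: wrap the cleaned root module in a Crate.
fn krate(root: &Module) -> Crate {
    Crate { module: clean_doc_module(root) }
}

// Hypothetical helper: counts an item plus all of its descendants.
fn count_items(item: &Item) -> usize {
    match &item.kind {
        ItemKind::Module(children) => 1 + children.iter().map(count_items).sum::<usize>(),
        ItemKind::Other => 1,
    }
}

fn main() {
    let root = Module {
        name: "demo".into(),
        items: vec!["Foo".into()],
        mods: vec![Module { name: "util".into(), items: vec!["helper".into()], mods: vec![] }],
    };
    let cleaned = krate(&root);
    println!("root = {}, total items = {}", cleaned.module.name, count_items(&cleaned.module));
}
```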
After cleaning, rustdoc runs a series of passes over the clean::Crate. Passes are defined in src/librustdoc/passes/ and implement a fold or visitor pattern.
| Pass | Purpose |
|---|---|
| collect-intra-doc-links | Resolve [path] links in doc comments |
| propagate-stability | Fill in stability fields for re-exported items |
| strip-hidden | Remove #[doc(hidden)] items |
| strip-private | Remove private items from public docs |
| check-doc-test-visibility | Warn about private items with doc tests |
| collect-trait-impls | Gather blanket and auto-trait implementations |

Sources: src/librustdoc/passes/collect_intra_doc_links.rs:41-55
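As an illustration of the fold pattern that passes follow, here is a minimal sketch of a strip-hidden style pass over a toy item tree. The `Item` struct, the `DocFolder` trait shape, and the `removed` counter are simplified assumptions for this example and are not the definitions used in src/librustdoc.

```rust
// Toy item: a name, a hidden flag standing in for #[doc(hidden)], and children.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    name: String,
    hidden: bool,
    children: Vec<Item>,
}

// A fold consumes an item and returns either a rewritten item or None to drop it.
trait DocFolder {
    fn fold_item(&mut self, item: Item) -> Option<Item>;

    // Default recursion: fold every child and keep the survivors.
    fn fold_children(&mut self, mut item: Item) -> Item {
        let children = std::mem::take(&mut item.children);
        item.children = children.into_iter().filter_map(|c| self.fold_item(c)).collect();
        item
    }
}

// Drops hidden items (and with them, their whole subtree).
struct StripHidden {
    removed: usize,
}

impl DocFolder for StripHidden {
    fn fold_item(&mut self, item: Item) -> Option<Item> {
        if item.hidden {
            self.removed += 1;
            return None;
        }
        Some(self.fold_children(item))
    }
}

// Small constructor to keep the example readable.
fn item(name: &str, hidden: bool, children: Vec<Item>) -> Item {
    Item { name: name.to_string(), hidden, children }
}

fn main() {
    let root = item(
        "root",
        false,
        vec![
            item("a", true, vec![]),
            item("b", false, vec![item("c", true, vec![]), item("d", false, vec![])]),
        ],
    );
    let mut pass = StripHidden { removed: 0 };
    let stripped = pass.fold_item(root).expect("root is not hidden");
    println!("removed {} items, kept tree: {:?}", pass.removed, stripped);
}
```

Because each pass takes ownership of the tree and hands back a new one, passes can be chained in sequence without sharing mutable state.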
Cache (src/librustdoc/formats/cache.rs) is populated after all passes complete and is shared across the rendering step. It stores:
| Field | Type | Purpose |
|---|---|---|
| paths | FxHashMap<DefId, (Vec<Symbol>, ItemType)> | Maps item IDs to fully-qualified paths |
| impls | FxHashMap<DefId, Vec<Impl>> | All implementations for a type |
| intra_doc_links | FxHashMap<ItemId, Vec<ItemLink>> | Resolved intra-doc links |
| traits | FxHashMap<DefId, clean::Trait> | Known trait definitions |
| masked_crates | FxHashSet<CrateNum> | Crates marked #[doc(masked)] |
| exact_paths | FxHashMap<DefId, Vec<Symbol>> | Canonical paths for cross-crate linking |

Sources: src/librustdoc/formats/cache.rs:1-30
Both backends implement the FormatRenderer trait (src/librustdoc/formats/mod.rs). The trait has two key methods: init() (setup before any items) and item() (called once per clean::Item).
FormatRenderer Implementations
Sources: src/librustdoc/html/render/context.rs:49-98, src/librustdoc/json/mod.rs:1-30
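The shape of a shared renderer trait with two backends can be sketched as follows. This is a simplified illustration, not the real FormatRenderer signature: the `Item` struct, the string-based output, the `finish` method, and the `run_format` driver are all assumptions made for the example.

```rust
// Toy cleaned item.
struct Item {
    name: String,
    kind: &'static str,
}

// Simplified renderer interface: set up once, then visit each item.
trait FormatRenderer: Sized {
    fn init(crate_name: &str) -> Self;
    fn item(&mut self, item: &Item);
    // Hypothetical finalizer that returns the rendered output as a string.
    fn finish(self) -> String;
}

struct HtmlRenderer {
    out: String,
}

impl FormatRenderer for HtmlRenderer {
    fn init(crate_name: &str) -> Self {
        HtmlRenderer { out: format!("<h1>{crate_name}</h1>") }
    }
    fn item(&mut self, item: &Item) {
        self.out.push_str(&format!("<p>{} {}</p>", item.kind, item.name));
    }
    fn finish(self) -> String {
        self.out
    }
}

struct JsonRenderer {
    crate_name: String,
    entries: Vec<String>,
}

impl FormatRenderer for JsonRenderer {
    fn init(crate_name: &str) -> Self {
        JsonRenderer { crate_name: crate_name.to_string(), entries: Vec::new() }
    }
    fn item(&mut self, item: &Item) {
        self.entries.push(format!("\"{}\"", item.name));
    }
    fn finish(self) -> String {
        format!("{{\"crate\":\"{}\",\"items\":[{}]}}", self.crate_name, self.entries.join(","))
    }
}

// Generic driver: the same loop serves every backend.
fn run_format<R: FormatRenderer>(crate_name: &str, items: &[Item]) -> String {
    let mut renderer = R::init(crate_name);
    for item in items {
        renderer.item(item);
    }
    renderer.finish()
}

fn main() {
    let items = vec![
        Item { name: "Foo".to_string(), kind: "struct" },
        Item { name: "bar".to_string(), kind: "fn" },
    ];
    println!("{}", run_format::<HtmlRenderer>("demo", &items));
    println!("{}", run_format::<JsonRenderer>("demo", &items));
}
```

Keeping the traversal in one generic driver means a new output format only has to supply the per-item behavior.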
The HTML backend is implemented across several files under src/librustdoc/html/render/.
Context (src/librustdoc/html/render/context.rs49-72) carries:
- current: Vec<Symbol> (current module path, e.g., ["std", "vec"])
- dst: PathBuf (output directory for the current module)
- shared: SharedContext<'tcx> (immutable shared state: cache, options, source map)
- info: ContextInfo (per-item flags: redirect pages, source inclusion)

print_item() (src/librustdoc/html/render/print_item.rs:64-197) is the main dispatch function. It renders a single clean::Item by:
- rendering the ItemVars template (breadcrumbs, source link, stability)
- dispatching to a kind-specific function (item_module, item_function, item_struct, item_trait, etc.)

write_shared() (src/librustdoc/html/render/write_shared.rs) writes two categories of files:
| Category | Filename pattern | Cache behavior |
|---|---|---|
| Static (toolchain-specific) | static.files/{name}-{hash}.{ext} | Cache-Control: immutable |
| Invocation-specific | {name}{resource-suffix}.js | Not immutable |
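The two filename patterns in the table can be sketched as follows. The choice of `DefaultHasher` and the 16-digit hex width are assumptions for illustration only; the function names are hypothetical and the hash rustdoc actually uses may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Builds a content-addressed path of the form static.files/{name}-{hash}.{ext}.
// Any change to the contents changes the name, which is what makes
// immutable caching safe.
fn static_file_path(name: &str, ext: &str, contents: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    contents.hash(&mut hasher);
    format!("static.files/{}-{:016x}.{}", name, hasher.finish(), ext)
}

// Builds an invocation-specific name of the form {name}{resource-suffix}.js.
// The name does not depend on the contents, so it must not be cached forever.
fn invocation_file_name(name: &str, resource_suffix: &str) -> String {
    format!("{name}{resource_suffix}.js")
}

fn main() {
    let css = b"body { margin: 0; }";
    println!("{}", static_file_path("rustdoc", "css", css));
    println!("{}", invocation_file_name("search-index", "-v1"));
}
```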
HTML templates use the askama crate for type-safe templating. The main page template is at src/librustdoc/html/templates/page.html.
The front-end is served from src/librustdoc/html/static/:
| File | Role |
|---|---|
| js/main.js | Page interaction, keyboard shortcuts, tab switching |
| js/search.js | Client-side search engine |
| js/storage.js | Theme/settings persistence via localStorage |
| js/settings.js | Settings UI |
| css/rustdoc.css | Main stylesheet with CSS custom property theming |

Sources: src/librustdoc/html/static/js/main.js:1-10, src/librustdoc/html/static/css/rustdoc.css:1-66
The JSON backend (src/librustdoc/json/) emits a single crate-name.json file. The public schema is defined in src/rustdoc-json-types/lib.rs, which is a separately versioned library.
Conversion flow:
convert_item() (src/librustdoc/json/conversions.rs:26-84) maps each clean::Item to a rustdoc_json_types::Item. The conversion uses two traits:
- FromClean<T>: converts a clean type into a rustdoc_json_types type
- IntoJson<U>: blanket impl over FromClean, callable as .into_json(renderer)

Key differences from HTML output:
- is_stripped: true)
- links: HashMap<String, Id>
- Id based on its DefId

Sources: src/librustdoc/json/conversions.rs:105-120, src/rustdoc-json-types/lib.rs:1-15
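The FromClean / IntoJson pairing described above follows the same idea as From / Into in the standard library, with an extra renderer argument. Below is a minimal sketch of that pattern. The trait signatures, the `JsonRenderer` field, and the `CleanFunction` / `JsonFunction` types are simplified assumptions and not the definitions in conversions.rs.

```rust
// Stand-in for the JSON backend state that conversions may consult.
struct JsonRenderer {
    crate_name: String,
}

// Target-side trait: "I can be built from a clean value of type T".
trait FromClean<T> {
    fn from_clean(value: &T, renderer: &JsonRenderer) -> Self;
}

// Source-side trait: "I can be turned into a JSON value of type U".
trait IntoJson<U> {
    fn into_json(&self, renderer: &JsonRenderer) -> U;
}

// Blanket impl: every FromClean impl automatically provides .into_json().
impl<T, U> IntoJson<U> for T
where
    U: FromClean<T>,
{
    fn into_json(&self, renderer: &JsonRenderer) -> U {
        U::from_clean(self, renderer)
    }
}

// Hypothetical clean-side and JSON-side function types.
struct CleanFunction {
    name: String,
    arg_count: usize,
}

#[derive(Debug, PartialEq)]
struct JsonFunction {
    path: String,
    arity: usize,
}

impl FromClean<CleanFunction> for JsonFunction {
    fn from_clean(value: &CleanFunction, renderer: &JsonRenderer) -> Self {
        JsonFunction {
            path: format!("{}::{}", renderer.crate_name, value.name),
            arity: value.arg_count,
        }
    }
}

fn main() {
    let renderer = JsonRenderer { crate_name: "demo".to_string() };
    let clean_fn = CleanFunction { name: "parse".to_string(), arg_count: 2 };
    // The target type is chosen by the annotation, as with Into.
    let json_fn: JsonFunction = clean_fn.into_json(&renderer);
    println!("{json_fn:?}");
}
```

Only the FromClean direction needs to be implemented by hand; call sites get the method-call form for free.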
The search index is built in src/librustdoc/html/render/search_index.rs. The SerializedSearchIndex struct stores data in parallel column arrays:
| Column | Type | Content |
|---|---|---|
| names | Vec<String> | Item names |
| path_data | Vec<Option<PathData>> | Module paths |
| entry_data | Vec<Option<EntryData>> | Item type and parent |
| descs | Vec<String> | Short doc summaries |
| function_data | Vec<Option<IndexItemFunctionType>> | Function type signatures |
| type_data | Vec<Option<TypeData>> | Concrete type inverted index |
| generic_inverted_index | Vec<Vec<Vec<u32>>> | Generic type inverted index |
| alias_pointers | Vec<Option<usize>> | Alias resolution |

Sources: src/librustdoc/html/render/search_index.rs:34-59
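A parallel-column layout means that row i of every column describes the same item. The sketch below shows the idea with three of the columns. The struct name, the use of plain strings for function signatures, and the `push` / `row` helpers are assumptions for illustration.

```rust
// Reduced column store: index i in each Vec refers to the same item.
#[derive(Default)]
struct ColumnarIndex {
    names: Vec<String>,
    descs: Vec<String>,
    // None for items that are not functions.
    function_data: Vec<Option<String>>,
}

impl ColumnarIndex {
    // Appends one value to every column and returns the new row id.
    fn push(&mut self, name: &str, desc: &str, signature: Option<&str>) -> usize {
        self.names.push(name.to_string());
        self.descs.push(desc.to_string());
        self.function_data.push(signature.map(str::to_string));
        self.names.len() - 1
    }

    // Reassembles a logical row by reading the same index from each column.
    fn row(&self, id: usize) -> (&str, &str, Option<&str>) {
        (
            self.names[id].as_str(),
            self.descs[id].as_str(),
            self.function_data[id].as_deref(),
        )
    }
}

fn main() {
    let mut index = ColumnarIndex::default();
    let vec_id = index.push("Vec", "A growable array.", None);
    let len_id = index.push("len", "Returns the length.", Some("Vec<T> -> usize"));
    println!("{:?}", index.row(vec_id));
    println!("{:?}", index.row(len_id));
}
```

Storing each field contiguously lets a consumer load only the columns a query needs, for example names without descriptions.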
Each IndexItem (src/librustdoc/html/render/mod.rs130-146) records:
- ty: ItemType (discriminant matching rustdoc::formats::item_type::ItemType)
- name: Symbol, module_path: Vec<Symbol>
- search_type: Option<IndexItemFunctionType> (for type-signature search)
- aliases: Box<[Symbol]>, is_deprecated, is_unstable

Function type signatures are stored as compact self-terminating VLQ hex strings via RenderType::write_to_string() (src/librustdoc/html/render/mod.rs:174-210).
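To show what "self-terminating VLQ hex" means in general, here is a generic sketch of a variable-length quantity encoding over hex digits. It is not rustdoc's actual wire format: the choice of three payload bits per digit with the high bit as a continuation flag is an assumption made purely for illustration.

```rust
// Encodes n as hex digits. Each digit holds 3 payload bits; the 0x8 bit
// marks "more digits follow", so a value ends at the first digit below 8.
fn encode_vlq_hex(mut n: u32, out: &mut String) {
    // Collect 3-bit groups, least significant first.
    let mut groups = vec![(n & 7) as u8];
    n >>= 3;
    while n > 0 {
        groups.push((n & 7) as u8);
        n >>= 3;
    }
    // Emit most significant group first; only the final digit lacks the flag.
    let last = groups.len() - 1;
    for (i, group) in groups.iter().rev().enumerate() {
        let digit = if i == last { *group } else { *group | 8 };
        out.push(char::from_digit(digit as u32, 16).unwrap());
    }
}

// Decodes one value from the stream, consuming exactly its digits.
// Returns None on a non-hex character or a truncated value.
fn decode_vlq_hex(chars: &mut impl Iterator<Item = char>) -> Option<u32> {
    let mut value: u32 = 0;
    while let Some(c) = chars.next() {
        let digit = c.to_digit(16)?;
        value = (value << 3) | (digit & 7);
        if digit & 8 == 0 {
            return Some(value);
        }
    }
    None
}

fn main() {
    let mut encoded = String::new();
    for n in [7, 8, 300] {
        encode_vlq_hex(n, &mut encoded);
    }
    // No separators are needed: each value announces its own end.
    let mut stream = encoded.chars();
    let mut decoded = Vec::new();
    while let Some(n) = decode_vlq_hex(&mut stream) {
        decoded.push(n);
    }
    assert_eq!(decoded, vec![7, 8, 300]);
    println!("{encoded} -> {decoded:?}");
}
```

The point of such a scheme is that small numbers, which dominate in an index, cost a single character, and values can be concatenated without delimiters.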
Type-signature search uses a two-level inverted index:
- type_data: maps concrete types → item list
- generic_inverted_index: maps alpha-normalized generics → item list, sorted by signature size (smaller signatures ranked higher)

search.js (src/librustdoc/html/static/js/search.js) is the client-side search engine. Key constants and components:
| Symbol | Value/Role |
|---|---|
| UNBOXING_LIMIT | 5 (max generic nesting depth to search) |
| MAX_RESULTS | 200 (maximum search results returned) |
| itemTypes | Object mapping type names to discriminant integers |
| itemParents | Map of subtype → parent type (e.g., method → fn) |
| editDistanceState | Damerau-Levenshtein edit distance for fuzzy name matching |
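For readers unfamiliar with the fuzzy-matching metric, the sketch below computes the restricted Damerau-Levenshtein distance (also called optimal string alignment), which counts insertions, deletions, substitutions, and swaps of adjacent characters. It is written in Rust for illustration and is not a port of editDistanceState; the function name is hypothetical, and search.js may differ in variant and in how it bounds the computation.

```rust
// Restricted Damerau-Levenshtein (optimal string alignment) distance.
// d[i][j] is the distance between the first i chars of a and the first j of b.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());

    let mut d = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        d[i][0] = i; // delete all i chars
    }
    for j in 0..=m {
        d[0][j] = j; // insert all j chars
    }

    for i in 1..=n {
        for j in 1..=m {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            d[i][j] = (d[i - 1][j] + 1) // deletion
                .min(d[i][j - 1] + 1) // insertion
                .min(d[i - 1][j - 1] + cost); // substitution or match
            // Transposition of two adjacent characters.
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                d[i][j] = d[i][j].min(d[i - 2][j - 2] + 1);
            }
        }
    }
    d[n][m]
}

fn main() {
    // A swapped pair counts as one edit rather than two substitutions.
    assert_eq!(edit_distance("vce", "vec"), 1);
    assert_eq!(edit_distance("kitten", "sitting"), 3);
    println!("distance(hashmpa, hashmap) = {}", edit_distance("hashmpa", "hashmap"));
}
```

Treating a transposition as a single edit matters for search because swapped letters are among the most common typing mistakes.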
Query parsing handles:
- Vec
- std::vec::Vec
- Vec<T> -> Option<T>
- Iterator<Item=u8>
- struct:Foo, fn:bar
- "Foo"

Search Flow
Sources: src/librustdoc/html/static/js/search.js:93-160
Intra-doc links allow doc comments to reference Rust items by path (e.g., [`Vec`], [Iterator::next]). These are resolved by the collect-intra-doc-links pass (src/librustdoc/passes/collect_intra_doc_links.rs).
Key types:
| Type | Role |
|---|---|
| LinkCollector | Visitor that processes each item's doc comment |
| Res | Resolution result: Def(DefKind, DefId) or Primitive(PrimitiveType) |
| UrlFragment | Item(DefId) or UserWritten(String) for #section anchors |
| Disambiguator | Explicit kind prefix: struct@, fn@, mod@, etc. |
| ResolutionInfo | Caches a resolved link to avoid redundant lookups |
Resolution process (src/librustdoc/passes/collect_intra_doc_links.rs:44-55):
1. collect_intra_doc_links() creates a LinkCollector and calls visit_crate()
2. Links are extracted from each doc comment with markdown_links()
3. Each link path is resolved via rustc_resolve in the module's scope
4. Links that fail to resolve are reported (BROKEN_INTRA_DOC_LINKS)
5. Resolved links are stored in Cache::intra_doc_links

Disambiguators supported: struct@, enum@, trait@, fn@, type@, mod@, const@, static@, macro@, derive@, prim@, value@.

Sources: src/librustdoc/passes/collect_intra_doc_links.rs:68-165
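As a small illustration of the kind@path prefix form used by the Disambiguator type, the sketch below splits a link target into an optional kind and the remaining path. The `split_disambiguator` function and the `KINDS` list are hypothetical and cover only the prefix form; the real pass performs considerably more validation.

```rust
// Prefixes accepted by this sketch (an illustrative subset, not an exhaustive list).
const KINDS: &[&str] = &[
    "struct", "enum", "trait", "fn", "type", "mod", "const", "static", "macro", "derive",
    "prim", "value",
];

// Splits "kind@path" into (Some(kind), path). Links without a recognized
// prefix are returned unchanged with no kind.
fn split_disambiguator(link: &str) -> (Option<&str>, &str) {
    if let Some((prefix, path)) = link.split_once('@') {
        if KINDS.contains(&prefix) {
            return (Some(prefix), path);
        }
    }
    (None, link)
}

fn main() {
    for link in ["struct@Foo", "fn@std::mem::swap", "Vec", "user@example"] {
        println!("{link:>20} -> {:?}", split_disambiguator(link));
    }
}
```

A disambiguator is needed when one name exists in several namespaces, for example a function and a module that are both called `foo`.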
src/tools/rustbook/ is a thin wrapper around mdBook. It is used to build the Rust project's prose documentation books:
| Book | Source path (git submodule or in-tree) |
|---|---|
| The Rust Book | src/doc/book |
| The Rustonomicon | src/doc/nomicon |
| The Unstable Book | src/doc/unstable-book |
| The Reference | src/doc/reference |
| The Rustdoc Book | src/doc/rustdoc |
The submodule configuration is in .gitmodules.
rustbook adds Rust-specific capabilities on top of stock mdBook: doc test execution via rustdoc, integration with the bootstrap build system, and custom preprocessors. It is invoked by ./x.py doc when building books.
Sources: src/tools/rustbook/Cargo.lock:1-10, .gitmodules:1-20
Doc comments and #[doc(...)] attributes are processed by the Attributes type in src/librustdoc/clean/types.rs. Key doc attribute flags:
| Attribute | Effect |
|---|---|
| #[doc(hidden)] | Excludes item from documentation |
| #[doc(inline)] | Inlines re-exported item's documentation |
| #[doc(no_inline)] | Prevents inlining |
| #[doc(masked)] | Hides an extern crate entirely |
| #[doc(cfg(...))] | Marks item as platform/feature conditional |
| #[doc(alias = "...")] | Adds a search alias for the item |
| #[doc(html_root_url = "...")] | Sets the URL for cross-crate linking |
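From the user's side, several of these attributes look like the following. This is a minimal sketch with hypothetical item names. It shows only attributes that compile on stable Rust: #[doc(cfg(...))] and #[doc(masked)] are omitted because they require nightly features, and the crate-level html_root_url attribute is omitted for brevity.

```rust
mod inner {
    /// Adds one to `x`.
    // Searching the docs for "increment" would also surface this function.
    #[doc(alias = "increment")]
    pub fn add_one(x: i32) -> i32 {
        x + 1
    }

    /// Implementation detail; still callable, but left out of the docs.
    #[doc(hidden)]
    pub fn internal_seed() -> i32 {
        41
    }
}

// Document add_one at the re-export site as if it were defined here.
#[doc(inline)]
pub use inner::add_one;

// Show this re-export only as a `pub use` line rather than inlining its docs.
#[doc(no_inline)]
pub use inner::internal_seed;

fn main() {
    // The attributes affect documentation only, not runtime behavior.
    assert_eq!(add_one(internal_seed()), 42);
    println!("add_one(41) = {}", add_one(41));
}
```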
Cfg representations (src/librustdoc/clean/cfg.rs) track platform and feature conditions and are rendered as "available on ..." banners in HTML output.
Sources: src/librustdoc/clean/types.rs:265-285, src/librustdoc/clean/cfg.rs:1-30