This page covers how TypeScript turns source text into tokens (the scanner) and how it assembles those tokens into an Abstract Syntax Tree (the parser). The output, a SourceFile tree of Node objects, is the input to the binder and type checker. For how the binder processes the AST to produce Symbol objects, see 2.2. For how the Program object drives the overall compilation lifecycle, including source file management, see 2.5.
Diagram: Source Text to AST
Sources: src/compiler/scanner.ts1-37 src/compiler/parser.ts1-50 src/compiler/types.ts40-492
The scanner lives in src/compiler/scanner.ts. Its sole job is to consume characters from a source text string and emit a stream of SyntaxKind tokens, one at a time, on demand.
createScanner
The scanner is created via createScanner:
```typescript
function createScanner(
    languageVersion: ScriptTarget,
    skipTrivia: boolean,
    languageVariant?: LanguageVariant,
    textInitial?: string,
    onError?: ErrorCallback,
    start?: number,
    length?: number,
    jsDocParsingMode?: JSDocParsingMode
): Scanner
```
Sources: src/compiler/scanner.ts1-37
| Parameter | Purpose |
|---|---|
| languageVersion | Determines which syntax features are valid (e.g. ES2022) |
| skipTrivia | Whether scan() auto-skips whitespace/comments |
| languageVariant | Standard or JSX; changes how < is lexed |
| jsDocParsingMode | Controls JSDoc comment parsing depth |
Scanner Interface
The Scanner interface (defined in src/compiler/types.ts) exposes a cursor-style API:
| Method | Description |
|---|---|
| scan() | Advance to the next token; returns its SyntaxKind |
| getToken() | Return the kind of the current token without advancing |
| getTokenStart() | Start position of the token, after leading trivia |
| getTokenFullStart() | Start position, including leading trivia |
| getTokenEnd() | End position of the current token |
| getTokenText() | Raw text of the current token |
| getTokenValue() | Decoded value (e.g. string literal content with escapes resolved) |
| getTokenFlags() | Bitmask of TokenFlags for the current token |
| hasPrecedingLineBreak() | Whether a newline appeared before this token |
| resetTokenState(pos) | Rewind the scanner to a specific position |
| lookAhead(cb) | Run cb speculatively; always restores scanner state |
| tryScan(cb) | Run cb speculatively; restores only on false/undefined return |
| scanJsxToken() | Scan in JSX child context |
| scanJsDocToken() | Scan in JSDoc comment context |
Sources: src/compiler/types.ts905-1100
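The cursor-style API in the table above can be exercised directly. The following is a minimal sketch, assuming the typescript npm package is installed and imported as ts; the input string and the variable names are illustrative only.

```typescript
import * as ts from "typescript";

// Create a scanner over a small snippet. skipTrivia = true means whitespace
// and comments are skipped and never returned from scan().
const tokenScanner = ts.createScanner(
  ts.ScriptTarget.Latest,
  /* skipTrivia */ true,
  ts.LanguageVariant.Standard,
  "let total = 1_000;",
);

const scannedTokens: { kind: ts.SyntaxKind; text: string }[] = [];
const numericValues: string[] = [];

// Pull tokens one at a time until the scanner reports end of file.
for (
  let kind = tokenScanner.scan();
  kind !== ts.SyntaxKind.EndOfFileToken;
  kind = tokenScanner.scan()
) {
  scannedTokens.push({ kind, text: tokenScanner.getTokenText() });
  if (kind === ts.SyntaxKind.NumericLiteral) {
    // getTokenValue() returns the decoded value rather than the raw text.
    numericValues.push(tokenScanner.getTokenValue());
  }
}

for (const token of scannedTokens) {
  console.log(token.kind, JSON.stringify(token.text));
}
```

Comparing kinds numerically, as done here, is safer than printing ts.SyntaxKind[kind], because several enum members share a value and the reverse lookup may return an alias name.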
Every token stores its fullStart (before any preceding whitespace or comments) and start (after trivia). Trivia SyntaxKind values are:
| Kind | Description |
|---|---|
| SingleLineCommentTrivia | //... |
| MultiLineCommentTrivia | /* ... */ |
| NewLineTrivia | Line feed or carriage return |
| WhitespaceTrivia | Spaces and tabs |
| ShebangTrivia | #! on the first line |
| ConflictMarkerTrivia | Git merge conflict markers |
The scanner always attaches trivia to the following token as "leading trivia". Trailing trivia is computed on demand by services code.
Sources: src/compiler/types.ts494-500
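The difference between the full start and the start of a token, and the effect of skipTrivia, can be observed with two scanners over the same text. This is a small sketch, assuming the typescript npm package imported as ts; the sample text and variable names are made up for illustration.

```typescript
import * as ts from "typescript";

const triviaSource = "  /* note */ answer";

// With skipTrivia = true the scanner jumps straight to the identifier,
// but still records where the leading trivia began.
const skipping = ts.createScanner(
  ts.ScriptTarget.Latest, true, ts.LanguageVariant.Standard, triviaSource,
);
skipping.scan();
const fullStart = skipping.getTokenFullStart(); // before the whitespace and comment
const tokenStart = skipping.getTokenStart();    // at the first character of "answer"
const leadingTrivia = triviaSource.slice(fullStart, tokenStart);

// With skipTrivia = false every piece of trivia is returned as its own token.
const verbose = ts.createScanner(
  ts.ScriptTarget.Latest, false, ts.LanguageVariant.Standard, triviaSource,
);
const triviaKinds: ts.SyntaxKind[] = [];
for (
  let kind = verbose.scan();
  kind !== ts.SyntaxKind.EndOfFileToken;
  kind = verbose.scan()
) {
  triviaKinds.push(kind);
}

console.log({ fullStart, tokenStart, leadingTrivia });
```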
TokenFlags
TokenFlags is a bitmask enum (defined in src/compiler/types.ts) providing additional metadata about the current token:
| Flag | Meaning |
|---|---|
| PrecedingLineBreak | A newline preceded this token |
| Unterminated | String/template literal was not closed |
| Scientific | Numeric literal uses scientific notation (1e5) |
| Octal | Legacy octal literal (0777) |
| HexSpecifier | Hex literal (0x...) |
| BinarySpecifier | Binary literal (0b...) |
| OctalSpecifier | Octal literal (0o...) |
| ContainsSeparator | Numeric separator present (1_000) |
| UnicodeEscape | \uXXXX escape used |
| ExtendedUnicodeEscape | \u{XXXXX} escape used |
| ContainsInvalidEscape | Invalid escape sequence present |
Sources: src/compiler/types.ts1-50 (TokenFlags section)
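Two of these flags can be observed through boolean accessors on the public Scanner typings (hasPrecedingLineBreak() and isUnterminated()), which avoids reading the raw bitmask. The sketch below assumes the typescript npm package imported as ts; the input text and variable names are illustrative.

```typescript
import * as ts from "typescript";

// Collect scanner errors through the onError callback instead of ignoring them.
const scanErrors: string[] = [];
const flagScanner = ts.createScanner(
  ts.ScriptTarget.Latest,
  /* skipTrivia */ true,
  ts.LanguageVariant.Standard,
  'first\n"never closed',
  (message) => {
    scanErrors.push(message.message);
  },
);

// First token: the identifier `first`, with no newline before it.
flagScanner.scan();
const firstHasLineBreak = flagScanner.hasPrecedingLineBreak();

// Second token: a string literal that starts on a new line and is never closed.
const secondKind = flagScanner.scan();
const secondHasLineBreak = flagScanner.hasPrecedingLineBreak();
const secondIsUnterminated = flagScanner.isUnterminated();

console.log({ firstHasLineBreak, secondHasLineBreak, secondIsUnterminated, scanErrors });
```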
The scanner has three distinct scanning modes, each producing a different token set:
Diagram: Scanner Modes and Token Sets
The JSX mode is activated when languageVariant is LanguageVariant.JSX and the parser encounters a < in an expression position. The parser then calls reScanJsxToken() or scanJsxToken() to obtain JSX-specific token kinds such as JsxText and JsxTextAllWhiteSpaces.
Sources: src/compiler/scanner.ts1-37
SyntaxKind
SyntaxKind (defined in src/compiler/types.ts40-492) is a const enum with about 400 members. It covers every kind of token, trivia, and AST node.
Diagram: SyntaxKind Taxonomy
The enum also defines range markers for fast membership checks:
| Marker | Value |
|---|---|
| FirstKeyword | First reserved word |
| LastKeyword | Last contextual keyword |
| FirstTypeNode | TypePredicate |
| LastTypeNode | ImportType |
| FirstJSDocNode | JSDocTypeExpression |
| LastJSDocNode | JSDocImportTag |
| FirstNode | QualifiedName (first non-token node) |
Sources: src/compiler/types.ts40-492
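Because related kinds are laid out contiguously, a membership check reduces to two integer comparisons. A minimal sketch, assuming the typescript npm package imported as ts; the helper names isKeywordKind, isTypeNodeKind and isNodeKind are hypothetical and not compiler functions.

```typescript
import * as ts from "typescript";

// Hypothetical helpers built on the range markers; each is a pair of comparisons.
const isKeywordKind = (kind: ts.SyntaxKind): boolean =>
  kind >= ts.SyntaxKind.FirstKeyword && kind <= ts.SyntaxKind.LastKeyword;

const isTypeNodeKind = (kind: ts.SyntaxKind): boolean =>
  kind >= ts.SyntaxKind.FirstTypeNode && kind <= ts.SyntaxKind.LastTypeNode;

// Everything at or after FirstNode is a non-token AST node.
const isNodeKind = (kind: ts.SyntaxKind): boolean =>
  kind >= ts.SyntaxKind.FirstNode;

console.log(isKeywordKind(ts.SyntaxKind.ClassKeyword));
console.log(isTypeNodeKind(ts.SyntaxKind.UnionType));
console.log(isNodeKind(ts.SyntaxKind.SemicolonToken));
```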
The parser lives in src/compiler/parser.ts. It drives the scanner and constructs the AST.
Diagram: Parser Entry Points and Key Internal Functions
| Function | Description |
|---|---|
| createSourceFile | Public API: parse fresh source from text |
| updateSourceFile | Public API: incremental reparse given a TextChangeRange |
| parseJsonText | Parse a JSON file as a JsonSourceFile |
| parseIsolatedEntityName | Parse a standalone entity name (used by the checker) |
| isFileProbablyExternalModule | Heuristic scan for top-level import/export |
Sources: src/compiler/parser.ts420-502
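As a rough illustration of the main entry point, the sketch below parses a one-line file with createSourceFile and reads the declared names off the resulting tree. It assumes the typescript npm package imported as ts; the file name example.ts and the source text are illustrative.

```typescript
import * as ts from "typescript";

// Parse a fresh SourceFile from text. The last argument asks the parser to
// set parent pointers so that helpers like getText() can walk upwards.
const parsed = ts.createSourceFile(
  "example.ts",
  "export const answer: number = 42;",
  ts.ScriptTarget.Latest,
  /* setParentNodes */ true,
);

// The root node is a SourceFile whose statements hold the top-level nodes.
const firstStatement = parsed.statements[0];
const declaredNames = ts.isVariableStatement(firstStatement)
  ? firstStatement.declarationList.declarations.map((d) => d.name.getText(parsed))
  : [];

console.log(parsed.statements.length, declaredNames);
```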
The parser does not directly call new Node(...). It uses two cooperating objects:
- parseBaseNodeFactory: creates raw low-level node objects via objectAllocator constructors (NodeConstructor, TokenConstructor, IdentifierConstructor, SourceFileConstructor, PrivateIdentifierConstructor). Nodes are allocated at position -1, -1 initially.
- parseNodeFactory: wraps parseBaseNodeFactory with NodeFactoryFlags.NoParenthesizerRules, providing factory.* helpers for each node kind.
Sources: src/compiler/parser.ts420-441
NodeFlags as Context Bits
The parser tracks its active context by accumulating NodeFlags into a contextFlags variable. These flags are stamped onto each created node and propagate to the AST:
| Flag | Context Meaning |
|---|---|
| YieldContext | Inside a generator function body |
| AwaitContext | Inside an async function body |
| DisallowInContext | Inside the initializer of a for...in header |
| DecoratorContext | Inside a decorator expression |
| JavaScriptFile | File is .js / .jsx |
| Ambient | Inside an ambient declaration (declare ...) |
| InWithStatement | Ancestor is a with statement |
| DisallowConditionalTypesContext | Inside an infer constraint |
These flags are stored on individual nodes via Node.flags so that later passes (binder, checker) can query them without re-walking the tree.
Sources: src/compiler/types.ts782-848
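Since the context bits end up in Node.flags, they can be inspected after parsing. The sketch below, which assumes the typescript npm package imported as ts and uses made-up source text, checks whether each call expression was created while the parser was in an await context.

```typescript
import * as ts from "typescript";

// One call inside an async function body, one inside a plain function body.
const ctxFile = ts.createSourceFile(
  "ctx.ts",
  "async function f() { await g(); }\nfunction h() { g(); }",
  ts.ScriptTarget.Latest,
  /* setParentNodes */ true,
);

// For every call expression, record whether AwaitContext was stamped on it.
const callsInAwaitContext: boolean[] = [];
const visitCtx = (node: ts.Node): void => {
  if (ts.isCallExpression(node)) {
    callsInAwaitContext.push((node.flags & ts.NodeFlags.AwaitContext) !== 0);
  }
  ts.forEachChild(node, visitCtx);
};
visitCtx(ctxFile);

console.log(callsInAwaitContext);
```

The first entry corresponds to the call in the async function and the second to the call in the plain function.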
The parser frequently needs to look ahead without committing. This is managed by the scanner's lookAhead and tryScan methods, plus a SpeculationKind enum internal to the parser:
| Kind | Behavior |
|---|---|
| TryParse | Attempt a parse; roll back scanner and discard nodes on failure |
| Lookahead | Always roll back; used for boolean "can we parse X here?" checks |
| Reparse | Re-enter a region already parsed; used for incremental recovery |
Sources: src/compiler/parser.ts414-418
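The scanner half of this mechanism can be tried in isolation. The following sketch assumes the typescript npm package imported as ts; it shows lookAhead always rewinding and tryScan rewinding only when its callback returns a falsy value. The input text and variable names are illustrative.

```typescript
import * as ts from "typescript";

const spec = ts.createScanner(
  ts.ScriptTarget.Latest, true, ts.LanguageVariant.Standard, "alpha beta gamma",
);
spec.scan(); // current token: alpha

// lookAhead: peek at the next token, then always rewind.
const peekedText = spec.lookAhead(() => {
  spec.scan();
  return spec.getTokenText();
});
const afterLookAhead = spec.getTokenText();

// tryScan with a falsy result: the scanner state is restored.
const rejected = spec.tryScan(() => spec.scan() === ts.SyntaxKind.NumericLiteral);
const afterRejected = spec.getTokenText();

// tryScan with a truthy result: the scanner stays advanced.
const accepted = spec.tryScan(() => spec.scan() === ts.SyntaxKind.Identifier);
const afterAccepted = spec.getTokenText();

console.log({ peekedText, afterLookAhead, rejected, afterRejected, accepted, afterAccepted });
```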
When the parser encounters an unexpected token, it:
1. Records a diagnostic via createDetachedDiagnostic / parseErrorAtPosition.
2. Sets ThisNodeHasError and ThisNodeOrAnySubNodesHasError in NodeFlags on the offending and ancestor nodes. This is tested via containsParseError(node).
3. Creates a missing node in place of the expected construct (nodeIsMissing(node) returns true for these).
Missing nodes are zero-width nodes with pos === end and serve as placeholders to allow the rest of parsing to proceed.
Sources: src/compiler/parser.ts270-340
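Error recovery can be observed by parsing deliberately broken input. This is a sketch under two assumptions: the typescript npm package is imported as ts, and parseDiagnostics, which is not part of the public typings, is readable at runtime through a narrow cast. The file name and source text are illustrative.

```typescript
import * as ts from "typescript";

// "let x = ;" is missing the initializer expression after "=".
const broken = ts.createSourceFile(
  "broken.ts", "let x = ;", ts.ScriptTarget.Latest, /* setParentNodes */ true,
);

// Assumption: parseDiagnostics exists on the object at runtime even though the
// public typings do not declare it, hence the cast through unknown.
const parseErrors =
  (broken as unknown as { parseDiagnostics?: readonly ts.Diagnostic[] })
    .parseDiagnostics ?? [];

// The parser still produces a variable declaration; its initializer is a
// placeholder node that occupies no source text.
const brokenStatement = broken.statements[0];
const missingInitializer = ts.isVariableStatement(brokenStatement)
  ? brokenStatement.declarationList.declarations[0].initializer
  : undefined;
const isZeroWidth =
  missingInitializer !== undefined &&
  missingInitializer.pos === missingInitializer.end;

console.log(parseErrors.length, isZeroWidth);
```

Code that needs syntactic diagnostics through supported API would normally go through a Program instead of reading this property.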
forEachChild and the Dispatch Table
forEachChild is the canonical way to visit a node's children. The parser defines forEachChildTable, a lookup table indexed by SyntaxKind that maps each node kind to a function visiting that node's child fields in source order:
```typescript
declare const forEachChildTable: {
    [TNode in ForEachChildNodes as TNode["kind"]]: ForEachChildFunction<TNode>
};
```
This avoids a giant switch statement and is the implementation used by both the public forEachChild and the internal forEachChildRecursively.
Sources: src/compiler/parser.ts503-510
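From the outside, the dispatch table is reached through the public ts.forEachChild function. A minimal sketch of a recursive walk, assuming the typescript npm package imported as ts; the source text and the collect helper are illustrative.

```typescript
import * as ts from "typescript";

const walkFile = ts.createSourceFile(
  "walk.ts", "const sum = left + right;", ts.ScriptTarget.Latest, true,
);

// Collect identifier names in the order forEachChild yields them.
const identifiersInOrder: string[] = [];
const collect = (node: ts.Node): void => {
  if (ts.isIdentifier(node)) {
    identifiersInOrder.push(node.text);
  }
  // The callback returns undefined, so forEachChild keeps visiting siblings;
  // a truthy return value would stop the traversal early.
  ts.forEachChild(node, collect);
};
collect(walkFile);

console.log(identifiersInOrder);
```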
Diagram: Core AST Interfaces and Their Relationships
Sources: src/compiler/types.ts782-1200 src/services/services.ts374-495
Node
The base interface for every AST node:
| Property | Type | Description |
|---|---|---|
| kind | SyntaxKind | What kind of node this is |
| flags | NodeFlags | Parse context and error flags |
| transformFlags | TransformFlags | Which transformers care about this node |
| parent | Node | Parent node (set by setParentRecursive after parsing) |
| pos | number | Full start (including leading trivia) |
| end | number | End of the node text |
getStart(sourceFile) skips leading trivia; getFullStart() returns pos.
NodeArray<T>
A typed, readonly array of nodes with three extra fields:
| Field | Description |
|---|---|
| pos | Start of the list (may include trivia before the first element) |
| end | End of the list |
| hasTrailingComma | Whether the list ended with a trailing comma |
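Node positions and the NodeArray fields can be checked on a small parsed file. The sketch assumes the typescript npm package imported as ts; the source text and variable names are made up for illustration.

```typescript
import * as ts from "typescript";

const posSource = "// note\ncall(a, b,);";
const posFile = ts.createSourceFile(
  "pos.ts", posSource, ts.ScriptTarget.Latest, /* setParentNodes */ true,
);

const stmt = posFile.statements[0];
const fullStartOfStmt = stmt.getFullStart(); // same as stmt.pos; includes the comment
const startOfStmt = stmt.getStart(posFile);  // first character of "call"

// The argument list is a NodeArray, so it records the trailing comma.
let argumentCount = 0;
let trailingComma = false;
if (ts.isExpressionStatement(stmt) && ts.isCallExpression(stmt.expression)) {
  argumentCount = stmt.expression.arguments.length;
  trailingComma = stmt.expression.arguments.hasTrailingComma;
}

console.log({ fullStartOfStmt, startOfStmt, argumentCount, trailingComma });
```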
NodeFlags
Selected flags and their origin:
| Flag | Set By | Purpose |
|---|---|---|
| Let, Const, Using | Parser | Variable declaration kind |
| Synthesized | Transformers | Node was not from source text |
| OptionalChain | Parser | ?. chaining root |
| JavaScriptFile | Parser | Source is .js/.jsx |
| JsonFile | Parser | Source is .json |
| JSDoc | Parser | Node is inside a JSDoc comment |
| ThisNodeHasError | Parser | Parse error on this specific node |
| ThisNodeOrAnySubNodesHasError | Parser | Aggregate error flag |
| HasImplicitReturn, HasExplicitReturn | Binder | Reachability flags |
| Ambient | Binder/Parser | Inside declare context |
| Deprecated | Binder | Has @deprecated JSDoc tag |
| Unreachable | Binder | Unreachable code |
| PossiblyContainsDynamicImport | Parser | Optimization flag; once set, never cleared |
Sources: src/compiler/types.ts782-848
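As one example of reading these bits, the declaration kind of a variable statement is stored in the flags of its VariableDeclarationList. A small sketch, assuming the typescript npm package imported as ts; the helper declarationKind is hypothetical.

```typescript
import * as ts from "typescript";

const declFile = ts.createSourceFile(
  "decl.ts", "const a = 1; let b = 2; var c = 3;", ts.ScriptTarget.Latest, true,
);

// Hypothetical helper: mask with BlockScoped first, then compare exactly,
// so that combined flag values are not mistaken for plain const or let.
const declarationKind = (list: ts.VariableDeclarationList): string => {
  const scoped = list.flags & ts.NodeFlags.BlockScoped;
  if (scoped === ts.NodeFlags.Const) return "const";
  if (scoped === ts.NodeFlags.Let) return "let";
  return scoped === 0 ? "var" : "other";
};

const declKinds = declFile.statements
  .filter(ts.isVariableStatement)
  .map((statement) => declarationKind(statement.declarationList));

console.log(declKinds);
```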
SourceFile
SourceFile (kind SyntaxKind.SourceFile) is the root of every parsed AST. Key properties:
| Property | Description |
|---|---|
| statements | Top-level statement nodes |
| text | The full source text string |
| fileName | File path as provided to the parser |
| path | Canonical Path (normalized, lowercase on case-insensitive systems) |
| languageVersion | ScriptTarget, the ECMAScript target |
| languageVariant | Standard or JSX |
| scriptKind | TS, TSX, JS, JSX, JSON, External, Deferred |
| isDeclarationFile | Whether this is a .d.ts file |
| parseDiagnostics | Syntactic errors found during parsing |
| referencedFiles | /// <reference path="..." /> references |
| typeReferenceDirectives | /// <reference types="..." /> directives |
| libReferenceDirectives | /// <reference lib="..." /> directives |
| pragmas | Processed comment pragmas |
| identifiers | Map of interned identifier strings (avoids duplicate strings) |
| nameTable | Map of identifier text to node position (for find-references) |
Sources: src/compiler/types.ts2000-2200 (SourceFile interface region)
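A few of these properties can be read straight off a freshly parsed file. The sketch assumes the typescript npm package imported as ts; the file name types/index.d.ts and the referenced path are invented for the example.

```typescript
import * as ts from "typescript";

const dtsText =
  '/// <reference path="./globals.d.ts" />\ndeclare const version: string;';
const dts = ts.createSourceFile(
  "types/index.d.ts", dtsText, ts.ScriptTarget.Latest, /* setParentNodes */ true,
);

// Gather a handful of the SourceFile properties listed above.
const summary = {
  fileName: dts.fileName,
  isDeclarationFile: dts.isDeclarationFile,
  languageVariant: dts.languageVariant,
  statementCount: dts.statements.length,
  referencedFiles: dts.referencedFiles.map((ref) => ref.fileName),
};

console.log(summary);
```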
When a file is edited, TypeScript avoids re-parsing the entire file. The public API for this is:
```typescript
function updateSourceFile(
    sourceFile: SourceFile,
    newText: string,
    textChangeRange: TextChangeRange,
    aggressiveChecks?: boolean
): SourceFile
```
TextChangeRange describes the span that changed: { span: TextSpan, newLength: number }.
Internally, the parser uses a syntaxCursor, an iterator over the original tree, to opportunistically reuse old nodes. Only nodes that lie entirely outside the changed span are candidates for reuse; nodes that intersect the changed span are always reparsed.
Note: PossiblyContainsDynamicImport and PossiblyContainsImportMeta in NodeFlags are "permanently set" flags: once set, they are never cleared during incremental parsing. This is a deliberate trade-off for simplicity. A stale flag can cause the program to re-scan a file for dynamic imports unnecessarily, but this scenario is rare.
Sources: src/compiler/types.ts810-843 src/compiler/parser.ts360-405
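An incremental reparse can be sketched as follows, assuming the typescript npm package imported as ts. The edit replaces a single character, and the change range is built with the public helpers ts.createTextSpan and ts.createTextChangeRange; the texts and variable names are illustrative.

```typescript
import * as ts from "typescript";

const oldText = "const a = 1;\nconst b = 2;";
const newText = "const a = 1;\nconst b = 42;";
const original = ts.createSourceFile(
  "inc.ts", oldText, ts.ScriptTarget.Latest, /* setParentNodes */ true,
);

// Describe the edit: one character ("2") is replaced by two characters ("42").
const editStart = oldText.indexOf("2");
const change = ts.createTextChangeRange(ts.createTextSpan(editStart, 1), 2);

// Reparse incrementally. The original tree should be treated as consumed
// after this call and not used again.
const updated = ts.updateSourceFile(original, newText, change);

const lastStmt = updated.statements[1];
const newInitializer = ts.isVariableStatement(lastStmt)
  ? lastStmt.declarationList.declarations[0].initializer?.getText(updated)
  : undefined;

console.log(updated.statements.length, newInitializer);
```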
JSX support is controlled by LanguageVariant.JSX (set when the file extension is .tsx or .jsx, or when scriptKind is JSX/TSX).
The scanner and parser cooperate on JSX parsing:
Diagram: JSX Parsing Collaboration
JSX-specific SyntaxKind values include:
| Kind | Description |
|---|---|
| JsxElement | <Tag>...</Tag> |
| JsxSelfClosingElement | <Tag /> |
| JsxOpeningElement | The <Tag ...> opening part |
| JsxClosingElement | The </Tag> closing part |
| JsxFragment | <>...</> |
| JsxAttribute | name={value} or name |
| JsxSpreadAttribute | {...expr} |
| JsxExpression | {expr} inside JSX |
| JsxText | Text content between tags |
| JsxNamespacedName | ns:name attribute names |
Sources: src/compiler/types.ts361-372 src/compiler/parser.ts224-238
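The node kinds in the table can be seen by parsing a small TSX snippet. A minimal sketch, assuming the typescript npm package imported as ts; the file name view.tsx and the markup are illustrative.

```typescript
import * as ts from "typescript";

// ScriptKind.TSX selects the JSX language variant for this file.
const jsxFile = ts.createSourceFile(
  "view.tsx",
  'const view = <div id="main">hello</div>;',
  ts.ScriptTarget.Latest,
  /* setParentNodes */ true,
  ts.ScriptKind.TSX,
);

const jsxStmt = jsxFile.statements[0];
const jsxInit = ts.isVariableStatement(jsxStmt)
  ? jsxStmt.declarationList.declarations[0].initializer
  : undefined;

let tagName = "";
const attributeNames: string[] = [];
const childKinds: ts.SyntaxKind[] = [];
if (jsxInit !== undefined && ts.isJsxElement(jsxInit)) {
  tagName = jsxInit.openingElement.tagName.getText(jsxFile);
  for (const attr of jsxInit.openingElement.attributes.properties) {
    if (ts.isJsxAttribute(attr)) {
      attributeNames.push(attr.name.getText(jsxFile));
    }
  }
  for (const child of jsxInit.children) {
    childKinds.push(child.kind); // the text between the tags is a JsxText node
  }
}

console.log({ tagName, attributeNames, childKinds });
```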
JSDoc comments (/** ... */) are parsed on demand, controlled by JSDocParsingMode:
| Mode | Description |
|---|---|
| ParseAll | Parse all JSDoc comments fully |
| ParseForTypeErrors | Parse only JSDoc needed for error reporting |
| ParseForTypeInfo | Parse only JSDoc needed for type information |
| ParseNone | Skip JSDoc parsing entirely |
JSDoc nodes are attached to their host node via the jsDoc property (type JSDoc[]). JSDoc has its own scanner mode (scanJsDocToken()) that produces JSDocSyntaxKind tokens, a restricted subset of SyntaxKind.
Key JSDoc node types:
| Node | Description |
|---|---|
| JSDoc | The /** ... */ container |
| JSDocParameterTag | @param |
| JSDocReturnTag | @returns |
| JSDocTypeTag | @type |
| JSDocTypedefTag | @typedef |
| JSDocCallbackTag | @callback |
| JSDocTemplateTag | @template |
| JSDocSignature | Overload signature in JSDoc |
| JSDocTypeLiteral | Inline object type from @param {object} |
| JSDocImportTag | @import (TS 5.5+) |
Sources: src/compiler/types.ts396-444 src/compiler/parser.ts170-219
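Parsed JSDoc tags can be retrieved from a host node with the public helper ts.getJSDocTags. The sketch below assumes the typescript npm package imported as ts and relies on createSourceFile parsing JSDoc fully when no parsing mode is specified; the file name doc.js and the function are illustrative.

```typescript
import * as ts from "typescript";

const docFile = ts.createSourceFile(
  "doc.js",
  "/** @param {string} name The user to greet. */\nfunction greet(name) {}",
  ts.ScriptTarget.Latest,
  /* setParentNodes */ true,
  ts.ScriptKind.JS,
);

// The JSDoc comment is attached to the function declaration that follows it.
const greetDeclaration = docFile.statements[0];
const paramTags = ts.getJSDocTags(greetDeclaration).filter(ts.isJSDocParameterTag);

const documentedParams = paramTags.map((tag) => ({
  name: ts.isIdentifier(tag.name) ? tag.name.text : tag.name.getText(docFile),
  // The {string} annotation is parsed into an ordinary type node.
  isString: tag.typeExpression?.type.kind === ts.SyntaxKind.StringKeyword,
}));

console.log(documentedParams);
```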
| Concept | Code Entity | File |
|---|---|---|
| Token producer | createScanner, Scanner | src/compiler/scanner.ts |
| Token kinds | SyntaxKind | src/compiler/types.ts |
| Token metadata | TokenFlags | src/compiler/types.ts |
| AST root | SourceFile | src/compiler/types.ts |
| AST node base | Node, NodeFlags | src/compiler/types.ts |
| Parser entry point | createSourceFile, updateSourceFile | src/compiler/parser.ts |
| Internal parser | parseSourceFile, parseSourceFileWorker | src/compiler/parser.ts |
| Node allocation | parseBaseNodeFactory, parseNodeFactory | src/compiler/parser.ts |
| Child traversal | forEachChild, forEachChildTable | src/compiler/parser.ts |
| Parse speculation | Scanner.lookAhead, Scanner.tryScan, SpeculationKind | src/compiler/parser.ts, src/compiler/scanner.ts |
| Parse error detection | containsParseError, NodeFlags.ThisNodeOrAnySubNodesHasError | src/compiler/parser.ts |
| JSX token scan | Scanner.scanJsxToken, Scanner.reScanJsxToken | src/compiler/scanner.ts |
| JSDoc token scan | Scanner.scanJsDocToken | src/compiler/scanner.ts |
| JSDoc parse mode | JSDocParsingMode | src/compiler/types.ts |
| Incremental reparse | updateSourceFile, TextChangeRange | src/compiler/parser.ts |
Sources: src/compiler/scanner.ts1-37 src/compiler/parser.ts1-441 src/compiler/types.ts40-848