This document describes the organizational structure of the LLM Course repository, detailing the three main learning paths and how they interconnect. It explains the progression of topics, prerequisites, and the relationship between theoretical content, practical notebooks, and external resources.
For information about specific learning resources (notebooks, articles, tools), see Learning Resources. For detailed coverage of individual topics, refer to the respective path sections: LLM Fundamentals, The LLM Scientist, and The LLM Engineer.
Sources: README.md 12-16
The course is organized into three distinct learning tracks with different entry points and objectives. Each track serves a specific purpose in the LLM learning journey.
### Three Main Tracks
Sources: README.md 12-16, 74-157, 159-304, 305-402
The Fundamentals track provides prerequisite knowledge for learners without prior machine learning experience. This track is optional and can be skipped by those already familiar with the topics.
| Section | Topic | Key Concepts | Lines |
|---|---|---|---|
| 1 | Mathematics for Machine Learning | Linear algebra, calculus, probability/statistics | 83-100 |
| 2 | Python for Machine Learning | Python basics, NumPy, Pandas, Scikit-learn | 103-119 |
| 3 | Neural Networks | Network fundamentals, backpropagation, PyTorch | 122-137 |
| 4 | Natural Language Processing | Text preprocessing, embeddings, RNNs | 140-157 |
### Fundamentals Track Progression
Sources: README.md 74-157, 83-100, 103-119, 122-137, 140-157
The Scientist track focuses on building and training LLMs, covering the complete pipeline from architecture through deployment preparation. This track emphasizes model development, optimization, and evaluation.
### Scientist Track: Linear Progression

### Key Frameworks and Tools
| Stage | Frameworks | Lines |
|---|---|---|
| Fine-Tuning | TRL, Unsloth, Axolotl | 223 |
| Alignment | TRL, verl, OpenRLHF | 241 |
| Evaluation | LM Evaluation Harness, Lighteval | 265-266 |
| Quantization | llama.cpp, AutoGPTQ, ExLlamaV2 | 275-277 |
| Merging | mergekit | 292 |
Sources: README.md 159-304, 165-180, 219-233, 235-251, 270-286, 288-304
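To make the quantization stage concrete, the core idea behind tools like llama.cpp and AutoGPTQ — mapping float weights onto a small integer range with a shared scale — fits in a few lines. This is a pure-Python sketch of plain symmetric quantization for intuition only; the real tools apply far more sophisticated variants (group-wise scales, error correction, mixed precision).

```python
def quantize_symmetric(weights, bits=4):
    """Map float weights to signed integers in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax     # one shared scale per group
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the shared scale."""
    return [q * scale for q in quantized]

weights = [0.12, -0.7, 0.33, 0.05]
quantized, scale = quantize_symmetric(weights, bits=4)
restored = dequantize(quantized, scale)
# Each restored weight is within half a quantization step (scale / 2) of the original.
```

The trade-off the Quantization section explores is exactly this rounding error versus memory saved: 4-bit storage is 8x smaller than float32, at the cost of bounded per-weight error.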
The Engineer track focuses on building applications with LLMs and deploying them to production. This track emphasizes practical implementation, system integration, and operational concerns.
### Engineer Track: Application Pipeline

### LLM APIs and Platforms
| Category | Providers | Lines |
|---|---|---|
| Private APIs | OpenAI, Google, Anthropic | 315 |
| Open-Source APIs | OpenRouter, Hugging Face, Together AI | 315 |
| Local Execution | Ollama, llama.cpp, vLLM | 318-320 |
| Vector Databases | Chroma, Pinecone, Milvus | 329-330 |
| RAG Frameworks | LangChain, LlamaIndex | 337-338 |
Sources: README.md 305-402, 311-324, 326-333, 335-345, 357-369
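The retrieval step that vector databases and RAG frameworks automate can be sketched in a few lines: embed documents as vectors, then return the ones nearest the query as context for the LLM. This toy version substitutes bag-of-words counts for the learned embeddings that Chroma or LangChain would supply; the function names (`embed`, `retrieve`) are illustrative, not any framework's API.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': bag-of-words term counts (real systems use learned vectors)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=1):
    """Rank documents by similarity to the query; the top-k become the RAG context."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "quantization shrinks model weights to fewer bits",
    "vector databases store embeddings for retrieval",
    "fine-tuning adapts a base model to a task",
]
top = retrieve("how do embeddings and retrieval work", docs)
```

A production pipeline differs mainly in scale: a vector database indexes millions of embeddings for approximate nearest-neighbor search, and a framework stitches the retrieved passages into the prompt.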
The course integrates 23 hands-on Colab notebooks that implement concepts from the theoretical tracks. These notebooks are organized into four categories.
### Notebook-to-Track Mapping

### Notebook Categories
| Category | Count | Examples | Lines |
|---|---|---|---|
| Tools | 8 | LLM AutoEval, LazyMergekit, AutoQuant | 32-42 |
| Fine-tuning | 6 | Llama 3.1 + Unsloth, Mistral + QLoRA | 45-52 |
| Quantization | 4 | GPTQ 4-bit, GGUF + llama.cpp | 55-61 |
| Advanced | 5 | Merge with MergeKit, Create MoEs | 64-71 |
Sources: README.md 23-72, 32-42, 45-52, 55-61, 64-71
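The core operation behind the "Merge with MergeKit" notebook — linear merging — reduces to a weighted average of corresponding parameters across models. A toy sketch of that single method, assuming flattened parameter lists; mergekit itself supports many more strategies (SLERP, TIES, passthrough) and operates on real checkpoint tensors.

```python
def linear_merge(models, weights):
    """Weighted average of corresponding parameters from several models (toy linear merge)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "merge weights should sum to 1"
    n_params = len(models[0])
    assert all(len(m) == n_params for m in models), "models must share a shape"
    return [sum(w * m[i] for w, m in zip(weights, models)) for i in range(n_params)]

model_a = [0.2, -0.5, 1.0]   # flattened parameters of model A (illustrative values)
model_b = [0.4, 0.1, 0.0]    # flattened parameters of model B
merged = linear_merge([model_a, model_b], weights=[0.7, 0.3])
# merged[0] is 0.7 * 0.2 + 0.3 * 0.4
```

The appeal of merging, which the notebook demonstrates at full scale, is that it combines specialized models without any gradient updates or training data.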
The three tracks have specific dependency relationships that determine the optimal learning path.
### Inter-Track Dependencies

### Critical Gateway Concepts
The Transformer architecture (Section 3.1) is the primary gateway concept for both the Scientist and Engineer tracks: understanding attention mechanisms and tokenization is essential before proceeding down either path.
Sources: README.md 165-180, 311-324
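As a concrete anchor for that gateway concept, scaled dot-product attention — the heart of the Transformer covered in Section 3.1 — fits in a few lines. This is a dependency-free, single-head, unbatched sketch for intuition; real implementations are batched tensor operations in PyTorch.

```python
import math

def softmax(xs):
    """Numerically stable softmax: subtract the max before exponentiating."""
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V, one head, no batch."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]                      # one query vector
K = [[1.0, 0.0], [0.0, 1.0]]          # two keys
V = [[1.0, 2.0], [3.0, 4.0]]          # two values
# The query aligns with the first key, so the output is pulled toward V[0].
```

Every stage listed in both downstream tracks — fine-tuning, quantization, serving — manipulates models built from exactly this operation, which is why Section 3.1 sits on the critical path.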
Each topic in the course follows a three-component delivery model:
### Content Type Distribution
| Component | Count | Purpose |
|---|---|---|
| Theoretical Topics | 24 | Core concepts and principles (4 Fundamentals + 8 Scientist + 8 Engineer + 4 additional sections) |
| Practical Notebooks | 23 | Hands-on implementations |
| External References | ~50 | Deep-dive articles, video courses, documentation |
Sources: README.md 12-402
The course structure supports multiple entry points depending on prior knowledge:
### Learning Path Entry Points
| Background | Recommended Entry | Skip |
|---|---|---|
| No ML experience | Start at Fundamentals (Section 2) | None |
| ML background, new to LLMs | Start at LLM Scientist (Section 3.1) | Fundamentals |
| Want to build applications | Start at LLM Engineer (Section 4.1) | Fundamentals, review 3.1 as needed |
| Experienced practitioners | Jump to specific topics | Review only gaps |
Standalone Topics: Sections 3.7 (Quantization), 3.8 (New Trends), and 4.8 (Security) can be studied independently after understanding the core architecture concepts in Section 3.1.
Sources: README.md 12-21, 74-76
The course content is primarily structured through the README.md file, which serves as the central navigation hub:
### File Structure
Key Files:
- README.md: Complete course structure and navigation
- img/roadmap_fundamentals.png: Visual guide for optional prerequisites
- img/roadmap_scientist.png: Visual guide for model building track
- img/roadmap_engineer.png: Visual guide for application development track
- img/colab.svg: Badge icon for notebook links

Sources: README.md 1-9, img/roadmap_scientist.png, img/roadmap_engineer.png
The course uses a hierarchical numbering system that maps directly to the table of contents structure. Each numbered section in the README pairs its theoretical topics with the corresponding notebooks and external references described above.
Sources: README.md 74-402