This document describes the fundamental architectural design of AnythingLLM, covering the three-layer system structure (frontend, backend, services), the configuration management pipeline, and the provider abstraction patterns that enable support for 30+ LLM providers and 10+ vector databases.
For information about specific LLM provider implementations, see Provider Architecture. For vector database configurations, see Vector Database Providers. For the document ingestion workflow, see Document Ingestion.
AnythingLLM follows a three-layer architecture separating concerns between presentation, business logic, and data processing.
Sources: server/index.js, frontend/src/main.jsx, collector/index.js, server/utils/prisma.js, server/models/systemSettings.js:1-50
| Layer | Primary Purpose | Key Components | Port |
|---|---|---|---|
| Frontend | User interface and client-side state | React components, frontend models, routing | 3000 (dev) |
| Backend | Business logic, API endpoints, authentication | Express server, Prisma ORM, models | 3001 |
| Collector Service | Document processing and parsing | Puppeteer, file parsers, OCR | 8888 |
| Storage | Persistent data storage | SQLite, LanceDB, file system | N/A |
Sources: README.md:152-161, server/.env.example:1-10
The configuration system is the most central architectural component: it orchestrates all provider selections, API keys, and system settings through a single pipeline.
Sources: server/utils/helpers/updateENV.js:1-338, server/models/systemSettings.js:209-321
The `KEY_MAPPING` object in server/utils/helpers/updateENV.js:7-830 defines validation rules and lifecycle hooks for every configurable setting. Each entry pairs an ENV key with validation checks and optional post-update hooks.
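A minimal sketch of this entry shape, with simplified stand-in validators (the real checks live in updateENV.js and the provider whitelist is far longer than shown here):

```javascript
// Hypothetical sketch of a KEY_MAPPING entry. Validator bodies are
// simplified stand-ins, not the real updateENV.js implementations.
const isNotEmpty = (value) =>
  !value || String(value).length === 0 ? "Value cannot be empty" : null;

const supportedLLM = (value) =>
  ["openai", "anthropic", "ollama"].includes(value) // abbreviated whitelist
    ? null
    : "Unsupported LLM provider";

const KEY_MAPPING = {
  LLMProvider: {
    envKey: "LLM_PROVIDER",             // process.env key this setting maps to
    checks: [isNotEmpty, supportedLLM], // each returns an error string or null
    postUpdate: [],                     // lifecycle hooks run after a write
  },
};

// Run every check for a key; the first non-null result is the error.
function validateUpdate(key, value) {
  const entry = KEY_MAPPING[key];
  for (const check of entry.checks) {
    const error = check(value);
    if (error) return { error };
  }
  return { envKey: entry.envKey, newValue: value };
}
```

Because each check returns an error string rather than throwing, the pipeline can report the first failing rule back to the UI without aborting the request.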
Key validation functions include:
| Validator | Purpose | Location |
|---|---|---|
| `isNotEmpty` | Ensures value is not empty | updateENV.js:832 |
| `supportedLLM` | Validates against whitelist of 46 LLM providers | updateENV.js:909 |
| `validDockerizedUrl` | Checks port availability in Docker | updateENV.js:1024 |
| `validatePGVectorConnectionString` | Tests PostgreSQL connection | updateENV.js:1116 |
| `supportedVectorDB` | Validates vector DB selection | updateENV.js:990 |
Sources: server/utils/helpers/updateENV.js:7-830, server/utils/helpers/updateENV.js:832-1159
Configuration changes are persisted to two locations simultaneously:
1. The `.env` file, via `dumpENV()` (updateENV.js:1244-1332)
2. The `system_settings` table, via a Prisma upsert (systemSettings.js:387-398)

This dual approach ensures:

- Settings survive container restarts (`.env` file)
- Runtime reads need no database query (`process.env`)

Sources: server/utils/helpers/updateENV.js:1244-1332, server/models/systemSettings.js:374-407
When certain settings change, the system executes side effects to maintain data integrity:
| Hook | Trigger | Action | Location |
|---|---|---|---|
| `handleVectorStoreReset` | Vector DB or embedding engine change | Purges all namespace data to prevent embedding mismatch | updateENV.js:1062 |
| `downloadEmbeddingModelIfRequired` | Native embedder model change | Downloads new model in background | updateENV.js:1085 |
| Provider-specific actions | Model selection changes | Cache context windows, unload models | updateENV.js:699-752 |
Sources: server/utils/helpers/updateENV.js:1062-1094, server/utils/helpers/updateENV.js:699-715
AnythingLLM uses a factory pattern with polymorphic interfaces to abstract 30+ LLM providers, 10+ vector databases, and 13+ embedding engines.
Sources: server/utils/helpers/index.js:131-248, server/utils/helpers/index.js:254-303
All LLM providers implement a common interface defined in server/utils/helpers/index.js:35-46.
Key methods:
- `promptWindowLimit()`: Returns the model's token limit for message compression
- `compressMessages()`: Intelligently truncates history to fit the context window
- `streamGetChatCompletion()`: Streams responses via Server-Sent Events (SSE)
- `embedTextInput()`: Delegates to the paired embedding engine

Sources: server/utils/helpers/index.js:35-52, server/utils/chats/stream.js:53-56
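The provider contract can be sketched as a class. Method names match the interface above; the class name, constructor options, and method bodies are illustrative stand-ins, not the real AnythingLLM implementations (the real `compressMessages` is token-aware rather than message-count based):

```javascript
// Illustrative sketch of the common LLM provider contract.
class ExampleLLMProvider {
  constructor({ model = "example-model", contextWindow = 4096 } = {}) {
    this.model = model;
    this.contextWindow = contextWindow;
  }

  // Returns the model's token limit, used when compressing chat history.
  promptWindowLimit() {
    return this.contextWindow;
  }

  // Keeps system messages and the most recent turns that fit the window.
  // approxTokensPerMessage is a crude stand-in for real token counting.
  compressMessages(messages, approxTokensPerMessage = 512) {
    const limit = Math.floor(this.promptWindowLimit() / approxTokensPerMessage);
    const system = messages.filter((m) => m.role === "system");
    const rest = messages.filter((m) => m.role !== "system");
    const keep = Math.max(limit - system.length, 0);
    return [...system, ...(keep === 0 ? [] : rest.slice(-keep))];
  }
}
```

Because every provider exposes the same surface, the factory can hand any of them to the chat pipeline without provider-specific branching.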
Vector databases follow the same pattern via `getVectorDbClass()` (server/utils/helpers/index.js:84-124).
Common vector DB interface (server/utils/helpers/index.js:54-68):
- `connect()`: Establishes the client connection
- `namespace()`: Retrieves a collection/namespace
- `hasNamespace()`: Checks whether a namespace exists
- `similarityResponse()`: Performs the vector search
- `addDocumentToNamespace()`: Inserts vectors with caching
- `deleteDocumentFromNamespace()`: Removes vectors by docId
- `performSimilaritySearch()`: High-level search with reranking support

Sources: server/utils/helpers/index.js:84-124, server/utils/vectorDbProviders/lance/index.js:17-50
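The factory itself reduces to a lookup keyed by the configured selection. The sketch below is illustrative: the stub implements only a slice of the interface above, and the real map covers 10+ providers:

```javascript
// Sketch of the vector DB factory pattern: a lookup keyed by the
// VECTOR_DB selection returns the provider object.
const LanceDb = {
  name: "lance",
  collections: new Map(), // stand-in for on-disk LanceDB tables
  hasNamespace(slug) {
    return this.collections.has(slug);
  },
  namespace(slug) {
    // Create-on-first-access, mirroring namespace-per-workspace isolation.
    if (!this.collections.has(slug)) this.collections.set(slug, []);
    return this.collections.get(slug);
  },
};

function getVectorDbClass(selection) {
  const providers = { lancedb: LanceDb }; // abbreviated provider map
  const provider = providers[selection];
  if (!provider) throw new Error(`No vector db found for ${selection}`);
  return provider;
}
```

Adding a provider means adding one entry to the map plus a module implementing the interface; callers never change, which is the pay-off of the factory pattern noted in the design decisions below.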
The document-to-vector pipeline processes uploaded files through multiple stages before storage.
Sources: server/utils/files/index.js, server/utils/TextSplitter/index.js:1-100, server/utils/vectorDbProviders/lance/index.js:301-400, collector/index.js
Text splitting is controlled by system settings stored in the system_settings table:
| Setting | Default | Description | Validation |
|---|---|---|---|
| `text_splitter_chunk_size` | 1000 | Characters per chunk | Must be > 0 |
| `text_splitter_chunk_overlap` | 20 | Overlapping characters between chunks | Must be ≥ 0 |
When these settings change, `SystemSettings.validations` (systemSettings.js:80-108) triggers `purgeEntireVectorCache()` to invalidate all cached embeddings.
Sources: server/models/systemSettings.js:80-108, server/utils/TextSplitter/index.js:19-50
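How the two settings interact can be shown with a minimal character-based splitter. This is a sketch only: the real TextSplitter is token-aware and more sophisticated, but the size/overlap arithmetic is the same idea:

```javascript
// Minimal sketch of size/overlap chunking. Each chunk starts
// (chunkSize - chunkOverlap) characters after the previous one, so
// consecutive chunks share chunkOverlap characters of context.
function splitText(text, chunkSize = 1000, chunkOverlap = 20) {
  if (chunkSize <= 0) throw new Error("chunk size must be > 0");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize)
    throw new Error("overlap must be >= 0 and smaller than chunk size");
  const chunks = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

For example, `splitText("abcdefghij", 4, 1)` yields `"abcd"`, `"defg"`, `"ghij"`: each chunk repeats the last character of its predecessor, which is why overlap must stay strictly below chunk size.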
The system implements aggressive caching to avoid recomputing embeddings:
- `cachedVectorInformation(fullFilePath)` (files/index.js) generates a UUID-based cache key from the file path
- Cached vectors are written to `storage/vector-cache/{hash}/` as JSON
- `purgeEntireVectorCache()` invalidates the entire cache when splitter or embedder settings change

Benefits:

- Re-embedding a document (for example, when adding it to a second workspace) reuses cached vectors instead of issuing redundant embedding API calls
Sources: server/utils/files/index.js, server/utils/vectorDbProviders/lance/index.js:313-320
Workspaces provide tenant isolation at both the database and vector storage levels.
Sources: server/prisma/schema.prisma:26-100, server/models/workspace.js:1-50
Workspaces can override system-level settings for per-tenant customization:
| Setting | System Default | Workspace Override | Location |
|---|---|---|---|
| LLM Provider | process.env.LLM_PROVIDER | workspace.chatProvider | workspace.js:48 |
| LLM Model | System model | workspace.chatModel | workspace.js:49 |
| Agent Provider | System agent | workspace.agentProvider | workspace.js:53 |
| Similarity Threshold | 0.25 | workspace.similarityThreshold | workspace.js:47 |
| Top N Results | 4 | workspace.topN | workspace.js:50 |
| Chat Mode | "chat" | workspace.chatMode ("chat" or "query") | workspace.js:51 |
Workspace-level settings are defined in server/models/workspace.js:40-58.

Sources: server/models/workspace.js:35-58, server/utils/chats/stream.js:53-56
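The override semantics reduce to a fallback chain: use the workspace's own value when set, otherwise the system-wide default. A sketch with the defaults from the table above (function names are illustrative; the final `"openai"` fallback is an assumption, not the documented default):

```javascript
// Sketch of per-workspace setting resolution: workspace value first,
// then the system-wide default.
function resolveChatProvider(workspace, env) {
  return workspace.chatProvider ?? env.LLM_PROVIDER ?? "openai";
}

function resolveTopN(workspace) {
  return workspace.topN ?? 4; // system default Top N from the table above
}

function resolveSimilarityThreshold(workspace) {
  return workspace.similarityThreshold ?? 0.25; // system default threshold
}
```

Using nullish coalescing (`??`) rather than `||` matters for numeric settings: a workspace threshold of `0` would be a legitimate override, and `||` would incorrectly discard it.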
Each workspace has an isolated vector namespace identified by `workspace.slug`. Namespace operations follow the common vector DB interface described above (`namespace()`, `hasNamespace()`, `similarityResponse()`, and so on), scoped to that slug.

Sources: server/utils/vectorDbProviders/lance/index.js:250-267, server/utils/chats/stream.js:57-165
A complete chat request flows through multiple architectural layers:
Key code locations:
- `compressMessages()` method for fitting chat history into the provider's context window

Sources: server/utils/chats/stream.js:1-282, server/endpoints/api/workspace/index.js
anythingllm/
├── frontend/ # React application (Layer 1)
│ ├── src/
│ │ ├── components/ # UI components
│ │ ├── models/ # API client abstractions
│ │ ├── pages/ # Route-level components
│ │ └── utils/ # Frontend utilities
│ └── package.json
│
├── server/ # Express backend (Layer 2)
│ ├── endpoints/ # API route handlers
│ │ ├── api/ # Main REST API
│ │ ├── embed/ # Embed widget API
│ │ └── admin/ # Admin endpoints
│ ├── models/ # Prisma model wrappers
│ │ ├── systemSettings.js # Configuration model
│ │ ├── workspace.js # Workspace model
│ │ └── workspaceChats.js # Chat history
│ ├── utils/
│ │ ├── helpers/
│ │ │ ├── updateENV.js # Configuration pipeline
│ │ │ └── index.js # Provider factories
│ │ ├── AiProviders/ # LLM implementations
│ │ ├── EmbeddingEngines/ # Embedder implementations
│   │   ├── vectorDbProviders/ # Vector DB implementations
│ │ ├── chats/ # Chat orchestration
│ │ ├── TextSplitter/ # Chunking logic
│ │ └── middleware/ # Express middleware
│ ├── prisma/
│ │ └── schema.prisma # Database schema
│ ├── storage/ # Data directory
│ │ ├── anythingllm.db # SQLite database
│ │ ├── lancedb/ # LanceDB files
│ │ ├── documents/ # Processed documents
│ │ └── vector-cache/ # Embedding cache
│ └── .env # Configuration file
│
└── collector/ # Document processor (Layer 3)
├── index.js # Collector API server
├── utils/
│ └── extensions/ # File format parsers
└── hotdir/ # Temporary upload directory
Sources: README.md:152-161, repository structure
| Decision | Rationale | Trade-offs |
|---|---|---|
| Dual persistence (ENV + DB) | Survives container restarts while remaining queryable | Potential inconsistency if manually edited |
| Factory pattern for providers | Easy to add new providers without modifying core logic | Additional abstraction layer |
| Vector namespace per workspace | Complete tenant isolation, no data leakage | Higher storage usage, can't share embeddings |
| Aggressive vector caching | Avoid redundant API calls and computation | Storage overhead, cache invalidation complexity |
| Separate collector service | Isolate heavy dependencies (Puppeteer, Tesseract) | Additional service to manage |
| SQLite default database | Zero-configuration, embedded, portable | Not ideal for distributed deployments |
| Server-Sent Events (SSE) for streaming | Native browser support, simpler than WebSocket for one-way streams | Less flexible than full WebSocket |
Sources: server/utils/helpers/updateENV.js:1164-1220, server/utils/files/index.js, server/utils/vectorDbProviders/lance/index.js:17-50
System settings flow from storage to runtime through multiple access patterns:
Access patterns:
- Direct ENV reads: `process.env.LLM_PROVIDER` in factory functions
- `SystemSettings.currentSettings()` returns the merged ENV + DB state
- Workspace overrides: `workspace.chatProvider || process.env.LLM_PROVIDER`

Sources: server/models/systemSettings.js:209-321, server/utils/helpers/index.js:131-248, server/utils/chats/stream.js:53-56