This document describes the fundamental architectural design of AnythingLLM, covering the three-layer system structure (frontend, backend, services), the configuration management pipeline, and the provider abstraction patterns that enable support for 30+ LLM providers and 10+ vector databases.
For information about specific LLM provider implementations, see Provider Architecture. For vector database configurations, see Vector Database Providers. For the document ingestion workflow, see Document Ingestion.
AnythingLLM follows a three-layer architecture separating concerns between presentation, business logic, and data processing.
Sources: server/index.js, frontend/src/main.jsx, collector/index.js, server/utils/prisma.js, server/models/systemSettings.js:1-50
| Layer | Primary Purpose | Key Components | Port |
|---|---|---|---|
| Frontend | User interface and client-side state | React components, frontend models, routing | 3000 (dev) |
| Backend | Business logic, API endpoints, authentication | Express server, Prisma ORM, models | 3001 |
| Collector Service | Document processing and parsing | Puppeteer, file parsers, OCR | 8888 |
| Storage | Persistent data storage | SQLite, LanceDB, file system | N/A |
Sources: README.md:152-161, server/.env.example:1-10
The configuration system is the most central architectural component: it orchestrates all provider selections, API keys, and system settings through a single pipeline.
Sources: server/utils/helpers/updateENV.js:1-338, server/models/systemSettings.js:209-321
The `KEY_MAPPING` object in server/utils/helpers/updateENV.js:7-830 defines validation rules and lifecycle hooks for every configurable setting. Each entry pairs an ENV key with validation checks and optional post-update hooks.
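A minimal sketch of this entry shape, with simplified stand-in validators (the real checks live in updateENV.js and the provider whitelist is far longer than shown here):

```javascript
// Hypothetical sketch of a KEY_MAPPING entry. Validator bodies are
// simplified stand-ins, not the real updateENV.js implementations.
const isNotEmpty = (value) =>
  !value || String(value).length === 0 ? "Value cannot be empty" : null;

const supportedLLM = (value) =>
  ["openai", "anthropic", "ollama"].includes(value) // abbreviated whitelist
    ? null
    : "Unsupported LLM provider";

const KEY_MAPPING = {
  LLMProvider: {
    envKey: "LLM_PROVIDER",             // process.env key this setting maps to
    checks: [isNotEmpty, supportedLLM], // each returns an error string or null
    postUpdate: [],                     // lifecycle hooks run after a write
  },
};

// Run every check for a key; the first non-null result is the error.
function validateUpdate(key, value) {
  const entry = KEY_MAPPING[key];
  for (const check of entry.checks) {
    const error = check(value);
    if (error) return { error };
  }
  return { envKey: entry.envKey, newValue: value };
}
```

Because each check returns an error string rather than throwing, the pipeline can report the first failing rule back to the UI without aborting the request.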
Key validation functions include:
| Validator | Purpose | Location |
|---|---|---|
| `isNotEmpty` | Ensures value is not empty | updateENV.js:832 |
| `supportedLLM` | Validates against whitelist of 46 LLM providers | updateENV.js:909 |
| `validDockerizedUrl` | Checks port availability in Docker | updateENV.js:1024 |
| `validatePGVectorConnectionString` | Tests PostgreSQL connection | updateENV.js:1116 |
| `supportedVectorDB` | Validates vector DB selection | updateENV.js:990 |
Sources: server/utils/helpers/updateENV.js:7-830, server/utils/helpers/updateENV.js:832-1159
Configuration changes are persisted to two locations simultaneously:
1. The `.env` file, via `dumpENV()` (updateENV.js:1244-1332)
2. The `system_settings` table, via a Prisma upsert (systemSettings.js:387-398)

This dual approach ensures:

- Settings survive container restarts (`.env` file)
- Runtime reads need no database query (`process.env`)

Sources: server/utils/helpers/updateENV.js:1244-1332, server/models/systemSettings.js:374-407
When certain settings change, the system executes side effects to maintain data integrity:
| Hook | Trigger | Action | Location |
|---|---|---|---|
| `handleVectorStoreReset` | Vector DB or embedding engine change | Purges all namespace data to prevent embedding mismatch | updateENV.js:1062 |
| `downloadEmbeddingModelIfRequired` | Native embedder model change | Downloads new model in background | updateENV.js:1085 |
| Provider-specific actions | Model selection changes | Cache context windows, unload models | updateENV.js:699-752 |
Sources: server/utils/helpers/updateENV.js:1062-1094, server/utils/helpers/updateENV.js:699-715
AnythingLLM uses a factory pattern with polymorphic interfaces to abstract 30+ LLM providers, 10+ vector databases, and 13+ embedding engines.
Sources: server/utils/helpers/index.js:131-248, server/utils/helpers/index.js:254-303
All LLM providers implement a common interface defined in server/utils/helpers/index.js:35-46.
Key methods:
- `promptWindowLimit()`: Returns the model's token limit for message compression
- `compressMessages()`: Intelligently truncates history to fit the context window
- `streamGetChatCompletion()`: Streams responses via Server-Sent Events (SSE)
- `embedTextInput()`: Delegates to the paired embedding engine

Sources: server/utils/helpers/index.js:35-52, server/utils/chats/stream.js:53-56
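The provider contract can be sketched as a class. Method names match the interface above; the class name, constructor options, and method bodies are illustrative stand-ins, not the real AnythingLLM implementations (the real `compressMessages` is token-aware rather than message-count based):

```javascript
// Illustrative sketch of the common LLM provider contract.
class ExampleLLMProvider {
  constructor({ model = "example-model", contextWindow = 4096 } = {}) {
    this.model = model;
    this.contextWindow = contextWindow;
  }

  // Returns the model's token limit, used when compressing chat history.
  promptWindowLimit() {
    return this.contextWindow;
  }

  // Keeps system messages and the most recent turns that fit the window.
  // approxTokensPerMessage is a crude stand-in for real token counting.
  compressMessages(messages, approxTokensPerMessage = 512) {
    const limit = Math.floor(this.promptWindowLimit() / approxTokensPerMessage);
    const system = messages.filter((m) => m.role === "system");
    const rest = messages.filter((m) => m.role !== "system");
    const keep = Math.max(limit - system.length, 0);
    return [...system, ...(keep === 0 ? [] : rest.slice(-keep))];
  }
}
```

Because every provider exposes the same surface, the factory can hand any of them to the chat pipeline without provider-specific branching.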
Vector databases follow the same pattern via `getVectorDbClass()` (server/utils/helpers/index.js:84-124).
Common vector DB interface (server/utils/helpers/index.js:54-68):
- `connect()`: Establishes the client connection
- `namespace()`: Retrieves a collection/namespace
- `hasNamespace()`: Checks whether a namespace exists
- `similarityResponse()`: Performs the vector search
- `addDocumentToNamespace()`: Inserts vectors with caching
- `deleteDocumentFromNamespace()`: Removes vectors by docId
- `performSimilaritySearch()`: High-level search with reranking support

Sources: server/utils/helpers/index.js:84-124, server/utils/vectorDbProviders/lance/index.js:17-50
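The factory itself reduces to a lookup keyed by the configured selection. The sketch below is illustrative: the stub implements only a slice of the interface above, and the real map covers 10+ providers:

```javascript
// Sketch of the vector DB factory pattern: a lookup keyed by the
// VECTOR_DB selection returns the provider object.
const LanceDb = {
  name: "lance",
  collections: new Map(), // stand-in for on-disk LanceDB tables
  hasNamespace(slug) {
    return this.collections.has(slug);
  },
  namespace(slug) {
    // Create-on-first-access, mirroring namespace-per-workspace isolation.
    if (!this.collections.has(slug)) this.collections.set(slug, []);
    return this.collections.get(slug);
  },
};

function getVectorDbClass(selection) {
  const providers = { lancedb: LanceDb }; // abbreviated provider map
  const provider = providers[selection];
  if (!provider) throw new Error(`No vector db found for ${selection}`);
  return provider;
}
```

Adding a provider means adding one entry to the map plus a module implementing the interface; callers never change, which is the pay-off of the factory pattern noted in the design decisions below.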
The document-to-vector pipeline processes uploaded files through multiple stages before storage.
Sources: server/utils/files/index.js, server/utils/TextSplitter/index.js:1-100, server/utils/vectorDbProviders/lance/index.js:301-400, collector/index.js
Text splitting is controlled by system settings stored in the system_settings table:
| Setting | Default | Description | Validation |
|---|---|---|---|
| `text_splitter_chunk_size` | 1000 | Characters per chunk | Must be > 0 |
| `text_splitter_chunk_overlap` | 20 | Overlapping characters between chunks | Must be ≥ 0 |
When these settings change, `SystemSettings.validations` (systemSettings.js:80-108) triggers `purgeEntireVectorCache()` to invalidate all cached embeddings.
Sources: server/models/systemSettings.js:80-108, server/utils/TextSplitter/index.js:19-50
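How the two settings interact can be shown with a minimal character-based splitter. This is a sketch only: the real TextSplitter is token-aware and more sophisticated, but the size/overlap arithmetic is the same idea:

```javascript
// Minimal sketch of size/overlap chunking. Each chunk starts
// (chunkSize - chunkOverlap) characters after the previous one, so
// consecutive chunks share chunkOverlap characters of context.
function splitText(text, chunkSize = 1000, chunkOverlap = 20) {
  if (chunkSize <= 0) throw new Error("chunk size must be > 0");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize)
    throw new Error("overlap must be >= 0 and smaller than chunk size");
  const chunks = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

For example, `splitText("abcdefghij", 4, 1)` yields `"abcd"`, `"defg"`, `"ghij"`: each chunk repeats the last character of its predecessor, which is why overlap must stay strictly below chunk size.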
The system implements aggressive caching to avoid recomputing embeddings:
- `cachedVectorInformation(fullFilePath)` (files/index.js) generates a UUID-based cache key from the file path
- Cached vectors are written to `storage/vector-cache/{hash}/` as JSON
- `purgeEntireVectorCache()` invalidates the entire cache when splitter or embedder settings change

Benefits:

- Re-embedding a document (for example, when adding it to a second workspace) reuses cached vectors instead of issuing redundant embedding API calls
Sources: server/utils/files/index.js, server/utils/vectorDbProviders/lance/index.js:313-320
Workspaces provide tenant isolation at both the database and vector storage levels.
Sources: server/prisma/schema.prisma:26-100, server/models/workspace.js:1-50
Workspaces can override system-level settings for per-tenant customization:
| Setting | System Default | Workspace Override | Location |
|---|---|---|---|
| LLM Provider | process.env.LLM_PROVIDER | workspace.chatProvider | workspace.js:48 |
| LLM Model | System model | workspace.chatModel | workspace.js:49 |
| Agent Provider | System agent | workspace.agentProvider | workspace.js:53 |
| Similarity Threshold | 0.25 | workspace.similarityThreshold | workspace.js:47 |
| Top N Results | 4 | workspace.topN | workspace.js:50 |
| Chat Mode | "chat" | workspace.chatMode ("chat" or "query") | workspace.js:51 |
Workspace-level settings are defined in server/models/workspace.js:40-58.

Sources: server/models/workspace.js:35-58, server/utils/chats/stream.js:53-56
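The override semantics reduce to a fallback chain: use the workspace's own value when set, otherwise the system-wide default. A sketch with the defaults from the table above (function names are illustrative; the final `"openai"` fallback is an assumption, not the documented default):

```javascript
// Sketch of per-workspace setting resolution: workspace value first,
// then the system-wide default.
function resolveChatProvider(workspace, env) {
  return workspace.chatProvider ?? env.LLM_PROVIDER ?? "openai";
}

function resolveTopN(workspace) {
  return workspace.topN ?? 4; // system default Top N from the table above
}

function resolveSimilarityThreshold(workspace) {
  return workspace.similarityThreshold ?? 0.25; // system default threshold
}
```

Using nullish coalescing (`??`) rather than `||` matters for numeric settings: a workspace threshold of `0` would be a legitimate override, and `||` would incorrectly discard it.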
Each workspace has an isolated vector namespace identified by `workspace.slug`. Namespace operations follow the common vector DB interface described above (`namespace()`, `hasNamespace()`, `similarityResponse()`, and so on), scoped to that slug.

Sources: server/utils/vectorDbProviders/lance/index.js:250-267, server/utils/chats/stream.js:57-165
A complete chat request flows through multiple architectural layers:
Key code locations:
- `compressMessages()` method for fitting chat history into the provider's context window

Sources: server/utils/chats/stream.js:1-282, server/endpoints/api/workspace/index.js
anythingllm/
├── frontend/ # React application (Layer 1)
│ ├── src/
│ │ ├── components/ # UI components
│ │ ├── models/ # API client abstractions
│ │ ├── pages/ # Route-level components
│ │ └── utils/ # Frontend utilities
│ └── package.json
│
├── server/ # Express backend (Layer 2)
│ ├── endpoints/ # API route handlers
│ │ ├── api/ # Main REST API
│ │ ├── embed/ # Embed widget API
│ │ └── admin/ # Admin endpoints
│ ├── models/ # Prisma model wrappers
│ │ ├── systemSettings.js # Configuration model
│ │ ├── workspace.js # Workspace model
│ │ └── workspaceChats.js # Chat history
│ ├── utils/
│ │ ├── helpers/
│ │ │ ├── updateENV.js # Configuration pipeline
│ │ │ └── index.js # Provider factories
│ │ ├── AiProviders/ # LLM implementations
│ │ ├── EmbeddingEngines/ # Embedder implementations
│   │   ├── vectorDbProviders/ # Vector DB implementations
│ │ ├── chats/ # Chat orchestration
│ │ ├── TextSplitter/ # Chunking logic
│ │ └── middleware/ # Express middleware
│ ├── prisma/
│ │ └── schema.prisma # Database schema
│ ├── storage/ # Data directory
│ │ ├── anythingllm.db # SQLite database
│ │ ├── lancedb/ # LanceDB files
│ │ ├── documents/ # Processed documents
│ │ └── vector-cache/ # Embedding cache
│ └── .env # Configuration file
│
└── collector/ # Document processor (Layer 3)
├── index.js # Collector API server
├── utils/
│ └── extensions/ # File format parsers
└── hotdir/ # Temporary upload directory
Sources: README.md:152-161, repository structure
| Decision | Rationale | Trade-offs |
|---|---|---|
| Dual persistence (ENV + DB) | Survives container restarts while remaining queryable | Potential inconsistency if manually edited |
| Factory pattern for providers | Easy to add new providers without modifying core logic | Additional abstraction layer |
| Vector namespace per workspace | Complete tenant isolation, no data leakage | Higher storage usage, can't share embeddings |
| Aggressive vector caching | Avoid redundant API calls and computation | Storage overhead, cache invalidation complexity |
| Separate collector service | Isolate heavy dependencies (Puppeteer, Tesseract) | Additional service to manage |
| SQLite default database | Zero-configuration, embedded, portable | Not ideal for distributed deployments |
| Server-Sent Events (SSE) for streaming | Native browser support, simpler than WebSocket for one-way streams | Less flexible than full WebSocket |
Sources: server/utils/helpers/updateENV.js:1164-1220, server/utils/files/index.js, server/utils/vectorDbProviders/lance/index.js:17-50
System settings flow from storage to runtime through multiple access patterns:
Access patterns:
- Direct ENV reads: `process.env.LLM_PROVIDER` in factory functions
- `SystemSettings.currentSettings()` returns the merged ENV + DB state
- Workspace overrides: `workspace.chatProvider || process.env.LLM_PROVIDER`

Sources: server/models/systemSettings.js:209-321, server/utils/helpers/index.js:131-248, server/utils/chats/stream.js:53-56