This document provides technical guidance for deploying AnythingLLM using Docker containers. It covers the multi-stage Dockerfile architecture, multi-architecture builds (ARM64/AMD64), volume configuration, environment setup, and container orchestration patterns.
For environment variable configuration details, see Environment Configuration. For CI/CD pipeline and automated build processes, see CI/CD and Build Process.
AnythingLLM provides official Docker images published to Docker Hub at mintplexlabs/anythingllm. The Docker deployment strategy uses a multi-stage build process to optimize image size and support both AMD64 and ARM64 architectures. The container runs the Express API server (port 3001) and the document collector service, with the Express server also serving the compiled React frontend.
Key Characteristics:
- Runs as the non-root `anythingllm` user (UID/GID 1000 by default)
- Persistent storage rooted at `/app/server/storage` (configurable)
- Multi-architecture images for `linux/amd64` and `linux/arm64`

Sources: docker/Dockerfile1-183 .github/workflows/dev-build.yaml1-120
Build Stage Breakdown:
| Stage | Purpose | Key Operations |
|---|---|---|
| `base` | Foundation | Defines build arguments for UID/GID |
| `build-arm64` | ARM setup | Installs ARM-compatible Chromium for Puppeteer |
| `build-amd64` | AMD setup | Standard system dependencies |
| `build` | Common setup | User creation, helper scripts, system packages |
| `frontend-build` | UI compilation | Vite build on native architecture (avoids QEMU) |
| `backend-build` | Server setup | Installs production dependencies for server and collector |
| `production-build` | Final image | Combines all components, sets production environment |
Sources: docker/Dockerfile1-183 docker/Dockerfile131-183
The Dockerfile implements architecture-specific build paths to handle platform differences, particularly for Puppeteer/Chromium compatibility.
ARM64 Chromium Patching:
ARM64 builds require a custom Chromium binary because Puppeteer's default distribution does not include ARM-compatible builds. The Dockerfile handles this at docker/Dockerfile63-70:
```dockerfile
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
ENV CHROME_PATH=/app/chrome-linux/chrome
ENV PUPPETEER_EXECUTABLE_PATH=/app/chrome-linux/chrome
```
The ARM-specific Chromium binary is downloaded from https://webassets.anythingllm.com/chromium-1088-linux-arm64.zip and extracted to /app/chrome-linux/.
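A minimal sketch of the download-and-extract step is shown below; the exact `RUN` command in the real Dockerfile may differ in flags and cleanup, so treat this as illustrative:

```dockerfile
# Illustrative: fetch and unpack the ARM64 Chromium build referenced above.
RUN curl -sSL https://webassets.anythingllm.com/chromium-1088-linux-arm64.zip \
      -o /tmp/chromium-arm64.zip \
    && unzip /tmp/chromium-arm64.zip -d /app \
    && rm /tmp/chromium-arm64.zip
```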
Frontend Build Optimization:
The frontend build stage uses --platform=$BUILDPLATFORM to compile on the native host architecture rather than the target architecture. This avoids esbuild crashes under QEMU emulation during cross-compilation. Since the output is static HTML/CSS/JS, it's platform-independent.
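The stage declaration can be sketched as follows (the base image tag and copy paths are illustrative, not taken from the repository):

```dockerfile
# Pinning the stage to $BUILDPLATFORM makes Vite/esbuild run natively on the
# build host instead of under QEMU emulation of the target architecture.
FROM --platform=$BUILDPLATFORM node:18-slim AS frontend-build
WORKDIR /app/frontend
COPY frontend/package.json frontend/yarn.lock ./
RUN yarn install --network-timeout 100000
COPY frontend/ .
RUN yarn build   # emits platform-independent static HTML/CSS/JS
```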
Sources: docker/Dockerfile8-73 docker/Dockerfile77-125 docker/Dockerfile141-148 .github/workflows/dev-build.yaml69-81
AnythingLLM requires persistent storage for documents, vector databases, SQLite database, and cache files. All persistent data is stored under a single configurable directory.
Default Storage Structure:
| Directory | Purpose | Size Considerations |
|---|---|---|
| `documents/` | Uploaded files organized by workspace/folder | Grows with document uploads |
| `vector-cache/` | Cached embedding vectors (UUID-based filenames) | Grows with unique documents |
| `lancedb/` | LanceDB vector database files (if using the default vector DB) | Grows with vectorized chunks |
| `anythingllm.db` | SQLite database (users, workspaces, chat history) | Typically <100 MB |
| `hotdir/` | Temporary processing directory for the collector | Self-cleaning |
Volume Mount Examples:
Named volume:
Bind mount:
Custom storage directory (requires matching STORAGE_DIR environment variable):
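The three mount styles can be illustrated with the following `docker run` invocations (host paths and omitted flags are placeholders; only the volume-related options are shown):

```shell
# Named volume: Docker manages the storage location
docker run -d -p 3001:3001 \
  -v anythingllm-storage:/app/server/storage \
  mintplexlabs/anythingllm

# Bind mount: the host directory must be writable by UID/GID 1000
docker run -d -p 3001:3001 \
  -v /opt/anythingllm/storage:/app/server/storage \
  mintplexlabs/anythingllm

# Custom in-container storage directory, with STORAGE_DIR matching the mount
docker run -d -p 3001:3001 \
  -e STORAGE_DIR=/data/storage \
  -v anythingllm-storage:/data/storage \
  mintplexlabs/anythingllm
```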
Kubernetes PersistentVolume Configuration:
The Helm chart creates or mounts a PVC based on persistentVolume.* values. The mount path must match the config.STORAGE_DIR value.
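An illustrative values fragment following the `persistentVolume.*` convention described above (verify exact field names against the chart's values.yaml):

```yaml
persistentVolume:
  size: 8Gi
  mountPath: /app/server/storage
  existingClaim: ""            # set to reuse a pre-provisioned PVC
config:
  STORAGE_DIR: /app/server/storage   # must match mountPath
```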
Sources: docker/Dockerfile173-174 cloud-deployments/helm/charts/anythingllm/values.yaml17-70
The container runs as a non-root user for security. The UID/GID are configurable at build time to match host filesystem permissions.
Build-Time User Setup:
docker/Dockerfile42-46 (for ARM64) and docker/Dockerfile111-115 (for AMD64):
- Creates the `anythingllm` group with the specified GID
- Creates the `anythingllm` user with the specified UID
- Assigns ownership of `/app` to that user

Custom UID/GID Build:
To build with custom UID/GID:
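A sketch of the build command, assuming the build arguments are named `ARG_UID` and `ARG_GID` (confirm the exact argument names in docker/Dockerfile):

```shell
# Build an image whose internal user matches the invoking host user
docker build \
  --build-arg ARG_UID=$(id -u) \
  --build-arg ARG_GID=$(id -g) \
  -f docker/Dockerfile \
  -t anythingllm:custom .
```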
Runtime Ownership Issues:
If the mounted volume has incorrect ownership, the container cannot write to storage. Fixes include changing ownership of the host directory, adding a Kubernetes init container, or rebuilding the image with a matching UID/GID; each is covered in the troubleshooting section below.
Sources: docker/Dockerfile42-46 docker/Dockerfile111-115 cloud-deployments/helm/charts/anythingllm/values.yaml1-6
The Dockerfile installs a comprehensive set of system packages to support document processing, AI operations, and web scraping.
Core Dependencies:
| Package Category | Packages | Purpose |
|---|---|---|
| Build Tools | build-essential, git | Compiling native Node modules |
| Node.js | nodejs (18.x), yarn (1.22.19) | JavaScript runtime and package manager |
| Python Tools | uvx (0.6.10) | MCP (Model Context Protocol) support |
| Media Processing | ffmpeg | Audio/video transcription |
| Web Scraping | Chromium dependencies | Puppeteer browser automation |
| System Utilities | curl, gnupg, netcat-openbsd, tzdata | Networking and time zone support |
Chromium/Puppeteer Dependencies:
The Dockerfile installs an extensive list of libraries required for headless Chromium operation (docker/Dockerfile17-22 docker/Dockerfile86-91):
- `libgbm1`, `libgtk-3-0`, `libnspr4`, `libnss3` - Graphics and rendering
- `libatk1.0-0`, `libcups2`, `libdbus-1-3` - UI accessibility and printing
- `libx11-6`, `libxcomposite1`, `libxrandr2` - X11 display server libraries
- `fonts-liberation`, `ca-certificates` - Font rendering and TLS certificates

MCP (Model Context Protocol) Support:
MCP servers are Python-based tools that extend agent capabilities. The Dockerfile installs uvx version 0.6.10 to execute MCP servers:
docker/Dockerfile32-36 (ARM64):
```shell
curl -LsSf https://astral.sh/uv/0.6.10/install.sh | sh
mv /root/.local/bin/uv /usr/local/bin/uv
mv /root/.local/bin/uvx /usr/local/bin/uvx
```
This enables the agent system to dynamically install and run MCP servers as plugins.
Sources: docker/Dockerfile14-38 docker/Dockerfile84-107
Key Build Optimizations:
Network Timeout: yarn install uses --network-timeout 100000 to handle slow or unreliable connections during dependency downloads
Production Dependencies: Uses yarn install --production to exclude dev dependencies, reducing image size
Layer Caching: Package files are copied before source code to leverage Docker layer caching (dependencies change less frequently than source)
Frontend Platform Independence: Frontend build runs on BUILDPLATFORM (native host) to avoid esbuild QEMU issues. The compiled output is platform-agnostic static files
Puppeteer Download URL: Sets PUPPETEER_DOWNLOAD_BASE_URL=https://storage.googleapis.com/chrome-for-testing-public to use the official Chrome for Testing distribution
Yarn Cache Cleaning:
Both frontend and backend builds run yarn cache clean after installation to remove temporary files and reduce final image size.
Sources: docker/Dockerfile141-162 docker/Dockerfile167-183
The container uses docker-entrypoint.sh as the entrypoint, which handles startup initialization before handing control to the application processes. It is installed and registered at docker/Dockerfile49 docker/Dockerfile118 docker/Dockerfile182.
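A hypothetical sketch of what such an entrypoint typically does for this image (boot the server and collector, then exit if either dies); the actual script in the repository may differ in detail:

```shell
#!/bin/bash
# Illustrative entrypoint: start both Node services, then propagate the
# exit status of whichever process terminates first.
{ cd /app/server && node index.js; } &
{ cd /app/collector && node index.js; } &
wait -n
exit $?
```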
The container includes a built-in health check that probes the API health endpoint:
Configuration:
- Endpoint: `/v1/api/health` (port 8888 internally mapped)
- Script: `docker-healthcheck.sh`

The health check is defined at docker/Dockerfile177-178:
```dockerfile
HEALTHCHECK --interval=1m --timeout=10s --start-period=1m \
  CMD /bin/bash /usr/local/bin/docker-healthcheck.sh || exit 1
```
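The probe script itself is not shown here; a hypothetical one-line equivalent, using the endpoint and port listed above, would look like:

```shell
#!/bin/bash
# Illustrative health probe: fail the check on any non-2xx response.
curl -sf http://localhost:8888/v1/api/health >/dev/null || exit 1
```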
Kubernetes health probes in the Helm chart use similar configuration (cloud-deployments/helm/charts/anythingllm/values.yaml173-186):
| Probe Type | Path | Port | Initial Delay | Period | Other Settings |
|---|---|---|---|---|---|
| Readiness | /v1/api/health | 8888 | 15s | 5s | successThreshold: 2 |
| Liveness | /v1/api/health | 8888 | 15s | 5s | failureThreshold: 3 |
The container accepts configuration through environment variables. Core variables set at build time:
```shell
NODE_ENV=production
ANYTHING_LLM_RUNTIME=docker
DEPLOYMENT_VERSION=1.11.0
```
The ANYTHING_LLM_RUNTIME=docker flag informs the application it's running in a containerized environment, which may alter certain behaviors (e.g., path resolution, collector communication).
For complete environment variable reference, see Environment Configuration.
Sources: docker/Dockerfile49-55 docker/Dockerfile117-124 docker/Dockerfile172-182
While not included in the repository, a typical Docker Compose configuration follows this pattern:
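An illustrative docker-compose.yml matching the notes below (image tag and `.env` path are placeholders to adapt):

```yaml
services:
  anythingllm:
    image: mintplexlabs/anythingllm:latest
    ports:
      - "3001:3001"
    env_file:
      - .env                      # API keys and other sensitive settings
    volumes:
      - anythingllm-storage:/app/server/storage
    restart: unless-stopped       # restart automatically after crashes

volumes:
  anythingllm-storage:
```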
Configuration Notes:
- Uses the `anythingllm-storage` named volume for persistence
- Loads an `.env` file for sensitive configuration (API keys)
- `restart: unless-stopped` ensures the container restarts after crashes

Bind Mount Alternative:
Requires host directory ownership: sudo chown -R 1000:1000 ./storage
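The bind-mount variant replaces only the volumes stanza of the compose file sketched above, for example:

```yaml
# Illustrative: mount a host directory instead of a named volume.
# The host path must be owned by UID/GID 1000 (sudo chown -R 1000:1000 ./storage).
    volumes:
      - ./storage:/app/server/storage
```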
Sources: docker/.env.example docker/Dockerfile177-178
The official Helm chart provides production-ready Kubernetes deployment with configurable persistence, secrets management, and ingress.
Helm Chart Configuration Values:
| Value Path | Default | Purpose |
|---|---|---|
| `image.repository` | `mintplexlabs/anythingllm` | Docker image repository |
| `image.tag` | `1.11.0` | Image version |
| `service.type` | `ClusterIP` | Service type (ClusterIP/NodePort/LoadBalancer) |
| `service.port` | `3001` | Service port |
| `persistentVolume.size` | `8Gi` | PVC size |
| `persistentVolume.mountPath` | `/app/server/storage` | Container mount path |
| `persistentVolume.existingClaim` | `""` | Use a pre-existing PVC |
| `config.STORAGE_DIR` | `/app/server/storage` | Must match `mountPath` |
| `config.UID` / `config.GID` | `1000` | User/group IDs |
| `strategy.type` | `Recreate` | Deployment strategy |
Secrets Management:
The chart recommends using Kubernetes Secrets for API keys, not ConfigMaps:
Method 1: envFrom (recommended for multiple keys):
Method 2: env with secretKeyRef (explicit mapping):
Create the secret:
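Both methods can be sketched as pod-spec fragments (the secret name `anythingllm-secrets` and the `OPEN_AI_KEY` variable are illustrative):

```yaml
# Method 1: envFrom mounts every key in the secret as an environment variable
envFrom:
  - secretRef:
      name: anythingllm-secrets

# Method 2: explicit per-variable mapping
env:
  - name: OPEN_AI_KEY
    valueFrom:
      secretKeyRef:
        name: anythingllm-secrets
        key: OPEN_AI_KEY

# Create the secret beforehand, e.g.:
#   kubectl create secret generic anythingllm-secrets \
#     --from-literal=OPEN_AI_KEY=<your-key>
```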
Deployment Strategy:
The default strategy is Recreate (cloud-deployments/helm/charts/anythingllm/values.yaml88-94) because the SQLite database on a ReadWriteOnce PVC cannot safely be accessed by two pods at once; a rolling update would briefly run old and new pods against the same storage.
For multi-replica deployments, you would need storage and a database that support concurrent access, rather than the default SQLite-on-PVC setup.
Init Containers for Permissions:
If the PVC has incorrect ownership, add an init container:
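An illustrative init container (the volume name and image tag are placeholders; adapt them to your deployment spec):

```yaml
initContainers:
  - name: fix-storage-permissions
    image: busybox:1.36
    # Recursively hand the PVC contents to the container's UID/GID 1000
    command: ["sh", "-c", "chown -R 1000:1000 /app/server/storage"]
    volumeMounts:
      - name: anythingllm-storage
        mountPath: /app/server/storage
```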
Sources: cloud-deployments/helm/charts/anythingllm/values.yaml1-232 cloud-deployments/helm/charts/anythingllm/README.md1-149
The GitHub Actions workflow builds and publishes multi-architecture Docker images automatically.
Workflow Configuration:
.github/workflows/dev-build.yaml1-120
Key Features:
Concurrency Control: .github/workflows/dev-build.yaml3-5
Path Filtering: Excludes paths that don't affect Docker image:
- Markdown files (`**.md`)

Multi-Architecture Build: .github/workflows/dev-build.yaml77
GitHub Actions Cache: .github/workflows/dev-build.yaml80-81
- `mode=max` caches all intermediate layers

Supply Chain Security:

VEX Attestations:
The workflow attaches VEX attestations for known non-exploitable CVEs (.github/workflows/dev-build.yaml86-119):
- Reads `docker/vex/*.vex.json` for VEX documents

Tag Strategy:
Development builds tag as dev, production builds (master) tag as latest and version number.
Sources: .github/workflows/dev-build.yaml1-120
Basic deployment:
With environment file:
With PostgreSQL:
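These deployment variants can be sketched with `helm install` invocations (the release name, values-file name, and the PostgreSQL value key are illustrative; confirm supported keys in the chart's values.yaml):

```shell
# Basic deployment from the chart in this repository
helm install anythingllm cloud-deployments/helm/charts/anythingllm

# With a custom values file
helm install anythingllm cloud-deployments/helm/charts/anythingllm \
  -f my-values.yaml

# With a PostgreSQL connection supplied as a config value (key name assumed)
helm install anythingllm cloud-deployments/helm/charts/anythingllm \
  --set-string config.DATABASE_CONNECTION_STRING="postgresql://user:pass@db-host:5432/anythingllm"
```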
Using ExternalSecrets for GitOps:
This pattern integrates with HashiCorp Vault or AWS Secrets Manager to avoid storing secrets in Git.
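An illustrative ExternalSecret (External Secrets Operator) that materializes the Kubernetes Secret consumed by the chart; the store name, remote key path, and variable name are assumptions:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: anythingllm-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # assumed SecretStore/ClusterSecretStore name
    kind: ClusterSecretStore
  target:
    name: anythingllm-secrets    # the Secret the chart references
  data:
    - secretKey: OPEN_AI_KEY
      remoteRef:
        key: anythingllm/api-keys
        property: OPEN_AI_KEY
```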
Sources: cloud-deployments/helm/charts/anythingllm/values.yaml209-231
Symptom: Container logs show "EACCES: permission denied" when accessing /app/server/storage
Cause: Volume mount has incorrect ownership (not UID/GID 1000)
Solution 1 (Docker): Fix host directory ownership:
Solution 2 (Kubernetes): Add init container to fix permissions:
Solution 3: Rebuild image with custom UID/GID matching your host:
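Solutions 1 and 3 can be sketched as follows (the storage path is a placeholder, and the `ARG_UID`/`ARG_GID` build-argument names should be verified against docker/Dockerfile); Solution 2 mirrors the init container shown in the Helm section above:

```shell
# Solution 1: hand the host directory to the container user (UID/GID 1000)
sudo chown -R 1000:1000 /path/to/storage

# Solution 3: rebuild so the container user matches the invoking host user
docker build \
  --build-arg ARG_UID=$(id -u) \
  --build-arg ARG_GID=$(id -g) \
  -f docker/Dockerfile -t anythingllm:custom .
```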
Symptom: Document collector fails with "Failed to launch browser" errors
Cause 1: Missing system dependencies for Chromium
Solution: Verify container includes all Chromium dependencies. If using custom base image, ensure all packages from docker/Dockerfile17-22 are installed.
Cause 2: ARM64 Chromium not found
Solution: Verify ARM64 builds include custom Chromium:
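For example (container name is a placeholder):

```shell
docker exec <container-name> ls -l /app/chrome-linux/chrome
```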
Should show the ARM-compatible binary. If missing, rebuild ensuring the ARM64 build path was used.
Symptom: Container marked unhealthy, restarts frequently
Cause: Health check endpoint not responding
Diagnosis:
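A few illustrative checks (container name is a placeholder; the probe port follows the health-check configuration above):

```shell
# Run the health check script manually and show its exit code
docker exec <container-name> /bin/bash /usr/local/bin/docker-healthcheck.sh; echo $?

# Probe the endpoint directly from inside the container
docker exec <container-name> curl -sv http://localhost:8888/v1/api/health

# Inspect Docker's recorded health-check history
docker inspect --format '{{json .State.Health}}' <container-name>
```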
Common Issues:
- Server not fully started within the startup grace period (`start-period`)

Symptom: Container killed with exit code 137
Cause: Memory limit too low for vector operations or LLM inference
Solution: Increase container memory:
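For example (the 4g limit is an illustrative starting point, not a recommendation from the project):

```shell
docker run -d -p 3001:3001 \
  --memory=4g \
  -v anythingllm-storage:/app/server/storage \
  mintplexlabs/anythingllm
```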
For Kubernetes:
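An illustrative resources stanza for the pod spec or Helm values (the figures are placeholders to tune for your workload):

```yaml
resources:
  requests:
    memory: "2Gi"
  limits:
    memory: "4Gi"
```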
Symptom: "database is locked" errors in logs
Cause: Multiple container instances accessing the same SQLite database
Solution: Ensure only one container accesses the PVC:
- Use the `Recreate` deployment strategy in Kubernetes
- Use a `ReadWriteOnce` PVC access mode

Sources: docker/Dockerfile177-178 cloud-deployments/helm/charts/anythingllm/values.yaml88-94