DB-GPT supports three primary deployment strategies: Source Code Installation, Docker Single Container, and Docker Compose Multi-Service. Each method targets different operational requirements, from development iteration to production deployment.
Configuration is centralized in TOML files under the configs/ directory, loaded at startup via the --config flag to dbgpt start webserver. Runtime parameters are sourced from environment variables using ${env:VARIABLE_NAME} syntax in TOML files.
This page provides a high-level comparison of deployment methods, hardware requirements, and configuration approaches. Detailed implementation instructions are provided in the child pages:
- Source Code Installation: installing uv and running from source

Sources: docker/base/Dockerfile:1-130, docker-compose.yml:1-54, docs/docs/quickstart.md:1-456
| Deployment Method | Runtime Environment | Configuration Mechanism | Package Installation | Primary Use Case |
|---|---|---|---|---|
| Source Code | Host Python 3.10+ with .venv/ | TOML files in configs/ | uv sync --all-packages --extra <extras> | Development, debugging, custom modifications |
| Docker Single | Container with /opt/.uv.venv | TOML files + Dockerfile ARG | Build-time uv sync --frozen | Testing, cloud deployment, isolated environments |
| Docker Compose | Multi-container with dbgptnet | TOML + docker-compose.yml | Pre-built image eosphorosai/dbgpt-openai | Production, multi-service orchestration |
All methods use the same dbgpt start webserver --config <path.toml> command to launch the application server. The entry point is defined in packages/dbgpt-core/src/dbgpt/cli/cli_scripts.py:28-116, which routes to start_webserver() in dbgpt_app._cli.
Sources: docker/base/Dockerfile:1-130, docker-compose.yml:1-54, packages/dbgpt-core/src/dbgpt/cli/cli_scripts.py:28-116
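Concretely, launching with one of the shipped proxy configurations looks like this (the config path is an example; any TOML file under configs/ works):

```bash
# Start the DB-GPT webserver with an OpenAI proxy configuration.
dbgpt start webserver --config configs/dbgpt-proxy-openai.toml
```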
Deployment Flow and Code Entry Points
All deployment methods execute the same entry point: cli() in packages/dbgpt-core/src/dbgpt/cli/cli_scripts.py:25 routes the start webserver command to start_webserver() in packages/dbgpt-app/src/dbgpt_app/_cli.py. The webserver then calls initialize_app() to create the FastAPI application and load the TOML configuration.
Sources: packages/dbgpt-core/src/dbgpt/cli/cli_scripts.py:16-116, docker/base/Dockerfile:1-130, docker-compose.yml:1-54
Source code installation uses uv (Astral's Python package manager) to resolve dependencies from the monorepo's pyproject.toml and uv.lock files. A virtual environment is created at .venv/ in the repository root.
Key Components:
| Component | Path/Description | Function |
|---|---|---|
| Package Manager | uv CLI tool | Resolves dependencies, manages virtual environments |
| Dependency Definition | pyproject.toml at workspace root | Defines packages via [tool.uv.workspace] and optional dependencies via [project.optional-dependencies] |
| Lock File | uv.lock at workspace root | Pins exact dependency versions for reproducibility |
| CLI Entry Point | .venv/bin/dbgpt | Installed script defined in [project.scripts] section, routes to cli() function |
| Extras System | --extra flags to uv sync | Installs optional dependency groups from [project.optional-dependencies] |
Installation Extras:
The monorepo defines optional dependency groups in pyproject.toml:
- Proxy providers: proxy_openai, proxy_ollama, proxy_deepseek, proxy_zhipuai, proxy_anthropic
- Local inference: hf, vllm, llama_cpp
- RAG and storage: rag, graph_rag, storage_chromadb, storage_milvus, storage_obvector, storage_elasticsearch
- Datasources: datasource_postgres, datasource_clickhouse, datasource_duckdb, datasource_mssql, datasource_oracle, datasource_hive
- CUDA and quantization: cuda121, cuda124, quant_bnb, quant_awq, quant_gptq, flash_attn
- Miscellaneous: dbgpts (pre-built workflows), cpu (CPU-only torch), sandbox (code execution)

For detailed installation instructions, prerequisite setup, and step-by-step examples, see Source Code Installation.
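For example, a development install targeting OpenAI proxy mode with Chroma vector storage might combine extras like this (extras names from the list above; see Source Code Installation for the authoritative steps):

```bash
# Create .venv/ and install the workspace with two optional dependency groups.
uv sync --all-packages \
  --extra proxy_openai \
  --extra storage_chromadb
```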
Sources: docs/docs/quickstart.md:29-74, pyproject.toml
Docker deployment uses a multi-stage build process defined in docker/base/Dockerfile:1-130. The builder stage installs dependencies with uv sync --frozen, and the final stage copies the virtual environment to /opt/.uv.venv.
Build System:
The docker/base/build_image.sh:1-360 script provides five predefined installation modes:
| Mode | Base Image | GPU Support | Target Use Case |
|---|---|---|---|
| default | nvidia/cuda:12.4.0-devel-ubuntu22.04 | Yes | General GPU deployment with proxy and local model support |
| openai | ubuntu:22.04 | No | Proxy-only deployment (OpenAI, Anthropic, etc.) |
| vllm | nvidia/cuda:12.4.0-devel-ubuntu22.04 | Yes | High-performance inference with vLLM backend |
| llama-cpp | nvidia/cuda:12.4.0-devel-ubuntu22.04 | Yes | CPU/GPU inference with llama.cpp (GGUF models) |
| full | nvidia/cuda:12.4.0-devel-ubuntu22.04 | Yes | All features (vLLM + llama.cpp + quantization) |
Each mode sets a different value for the EXTRAS build argument (line 12 in the Dockerfile), which controls which packages are installed during uv sync (docker/base/Dockerfile:73-81).
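As an illustration, building the proxy-only image might look like the following; the --install-mode flag name is an assumption based on the script's predefined modes, so consult docker/base/build_image.sh for the exact options:

```bash
# Build the CPU-only proxy image from the repository root.
bash docker/base/build_image.sh --install-mode openai
```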
Pre-built Images:
Official images are published to Docker Hub as eosphorosai/dbgpt-openai:latest (proxy models) and eosphorosai/dbgpt:latest (full features). Images are built via the GitHub Actions workflow at .github/workflows/docker-image-publish.yml.
For detailed Docker build instructions, Dockerfile structure, and custom image builds, see Docker Base Image and Build System.
Sources: docker/base/Dockerfile:1-130, docker/base/build_image.sh:11-41
Docker Compose orchestrates a multi-container stack defined in docker-compose.yml:1-54. The deployment consists of two services:
- db service (lines 4-19): MySQL 8.0.32 database with persistent storage
- webserver service (lines 20-44): DB-GPT application container

Key Architecture Features:

- The dbgptnet bridge network (lines 51-53) enables service-to-service communication; the webserver resolves the db hostname to the MySQL container
- Named volumes (dbgpt-myql-db, dbgpt-data, dbgpt-message) persist data across container restarts; bind mounts (./configs, /data/models) provide configuration and model access
- MySQL executes /assets/schema/dbgpt.sql on first start to create the application tables
- depends_on: [db] at line 37 ensures MySQL starts before the webserver

Configuration:
The webserver command at line 22 passes the --config flag selecting the TOML file to load (dbgpt-proxy-siliconflow-mysql.toml, per the configuration table, is the variant intended for Docker Compose). This file must include MySQL connection parameters in its [service.web.database] section.
Environment variables (lines 23-29) inject runtime configuration into the container.
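A hedged sketch of the required database section follows; the key names mirror the [service.web.database] fields described under Configuration Management, while the values are placeholders for a typical Compose setup:

```toml
[service.web.database]
type = "mysql"
host = "db"                        # MySQL service name on the dbgptnet network
port = 3306
user = "root"
password = "${env:MYSQL_PASSWORD}" # injected from the compose environment
database = "dbgpt"
```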
For detailed service configuration, volume management, and deployment instructions, see Docker Compose Deployment.
Sources: docker-compose.yml:1-54
Configuration in DB-GPT is managed through TOML files in the configs/ directory. The application loads configuration at startup via the --config flag passed to dbgpt start webserver.
Configuration Loading Mechanism
Configuration Structure:
TOML files are organized into sections with nested tables:
- [system] - Language, API keys, encryption settings
- [service.web] - HTTP server configuration (host, port, workers)
- [service.web.database] - Database connection (type, host, port, user, password, database)
- [models] - Model definitions with array-of-tables syntax
  - [[models.llms]] - LLM model configuration (name, provider, api_key, path)
  - [[models.embeddings]] - Embedding model configuration
- [rag.storage.vector] - Vector store configuration (type, uri, port, username, password)
- [rag.storage.graph] - Graph store configuration (type, host, port, username, password)

Environment Variable Substitution:
TOML files support ${env:VARIABLE_NAME} syntax for runtime injection. The substitution engine processes:
- ${env:VAR} - Required variable (fails if missing)
- ${env:VAR:-default} - Optional variable with default value
- ${env:VAR:} - Optional variable (empty string if missing)

Available Configuration Files:
| Configuration File | Provider Class | Database | Model Type | Primary Use Case |
|---|---|---|---|---|
| dbgpt-proxy-openai.toml | OpenAILLMClient | SQLite | Proxy | OpenAI API proxy |
| dbgpt-proxy-siliconflow.toml | SiliconFlowLLMClient | SQLite | Proxy | SiliconFlow API |
| dbgpt-proxy-aimlapi.toml | AimlapiLLMClient | SQLite | Proxy | AI/ML API (300+ models) |
| dbgpt-proxy-burncloud.toml | BurnCloudLLMClient | SQLite | Proxy | BurnCloud API |
| dbgpt-proxy-deepseek.toml | DeepseekLLMClient | SQLite | Proxy | DeepSeek API |
| dbgpt-proxy-ollama.toml | OllamaLLMClient | SQLite | Proxy | Local Ollama server |
| dbgpt-local-glm.toml | HFLLMDeployModelParameters | SQLite | Local | Hugging Face Transformers |
| dbgpt-local-vllm.toml | VLLMDeployModelParameters | SQLite | Local | vLLM inference engine |
| dbgpt-local-llama-cpp.toml | LlamaCppModelParameters | SQLite | Local | llama.cpp GGUF models |
| dbgpt-proxy-siliconflow-mysql.toml | SiliconFlowLLMClient | MySQL | Proxy | Docker Compose deployment |
| dbgpt-graphrag.toml | Any | SQLite | Any | Graph RAG with TuGraph |
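To make the substitution grammar concrete, here is a minimal regex-based sketch of the three ${env:...} forms listed above (illustrative only, not DB-GPT's actual engine):

```python
import os
import re

# Matches ${env:VAR}, ${env:VAR:-default}, and ${env:VAR:}.
_ENV_VAR = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)(:(?:-[^}]*)?)?\}")


def substitute_env(text: str) -> str:
    """Expand ${env:...} placeholders per the rules documented above."""

    def replace(match: re.Match) -> str:
        name, suffix = match.group(1), match.group(2)
        if name in os.environ:
            return os.environ[name]      # set variable always wins
        if suffix is None:
            # ${env:VAR} with no fallback is required
            raise KeyError(f"required environment variable {name!r} is not set")
        if suffix == ":":
            return ""                    # ${env:VAR:} -> empty string
        return suffix[2:]                # ${env:VAR:-default} -> default

    return _ENV_VAR.sub(replace, text)
```

For example, with DBGPT_LANG unset, substitute_env('language = "${env:DBGPT_LANG:-en}"') falls back to the default and yields 'language = "en"'.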
For detailed TOML file structure, model provider configuration, and database setup, see Configuration Management.
Sources: configs/dbgpt-proxy-openai.toml, configs/dbgpt-proxy-aimlapi.toml:1-30, configs/dbgpt-proxy-burncloud.toml:1-60, packages/dbgpt-core/src/dbgpt/cli/cli_scripts.py:25-116
DB-GPT supports multiple database backends and external service integrations:
Application Databases:
- SQLite (default): pilot/meta_data/dbgpt.db, configured via the [service.web.database] section
- MySQL: initialized from /assets/schema/dbgpt.sql

Datasource Integrations:
DB-GPT can connect to external databases as data sources for SQL generation and querying:
- Enabled via datasource_* extras (e.g., datasource_postgres, datasource_clickhouse)
- Available under /datasources after installation

RAG Storage Backends:
Multiple vector and graph store implementations for knowledge retrieval:
- Configured via the [rag.storage.vector] and [rag.storage.graph] sections in TOML files

Installation Examples:
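For instance, enabling the Postgres datasource together with Milvus vector storage from a source install could look like this (extras names from the pyproject.toml groups listed earlier):

```bash
# Add Postgres datasource and Milvus vector-store support.
uv sync --all-packages \
  --extra datasource_postgres \
  --extra storage_milvus
```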
For detailed integration setup instructions, configuration examples, and troubleshooting, see Database and External Integrations.
Sources: docs/docs/installation/integrations/postgres_install.md:1-41, docs/docs/installation/integrations/milvus_rag_install.md:1-47, docs/docs/installation/integrations/graph_rag_install.md:1-72
Resource requirements vary by deployment method and model provider:
| Deployment Strategy | CPU | Memory | GPU | Disk | Use Case |
|---|---|---|---|---|---|
| Source (Proxy) | 4 cores | 8GB | None | 10GB | Development with OpenAI/Anthropic/DeepSeek APIs |
| Source (Local - HF) | 8 cores | 32GB | 24GB VRAM | 50GB+ | Local 7B model (GLM-4, Qwen2.5) with transformers |
| Docker (openai mode) | 4 cores | 8GB | None | 10GB | Production proxy deployment, CPU-only |
| Docker (default mode) | 8 cores | 32GB | 24GB VRAM | 50GB+ | GPU deployment with CUDA 12.1, quantization support |
| Docker (vllm mode) | 16 cores | 64GB | 48GB VRAM | 100GB+ | High-throughput vLLM inference, batch optimization |
| Docker Compose | 8 cores | 16GB | Optional | 50GB+ | Multi-service with MySQL persistence |
GPU and CUDA Requirements:
- Local GPU inference requires the cuda121 or cuda124 extras
- CPU-only inference is available via the llama_cpp extra (slower, no CUDA)

Inference Backend Requirements:
Sources: docs/docs/installation/sourcecode.md:3-9, docker/base/build_image.sh:11-41
Choose Source Code Installation when:
Choose Docker Single Container when:
Choose Docker Compose when:
Sources: docs/docs/quickstart.md:1-456, docs/docs/installation/docker.md:1-227, docs/docs/installation/docker_compose.md:1-39
DB-GPT integrates with multiple LLM providers through a unified adapter architecture. Proxy clients inherit from OpenAILLMClient for OpenAI API compatibility. Local model workers use provider-specific parameter classes (e.g., VLLMDeployModelParameters, HFLLMDeployModelParameters).
Model Provider Architecture
Model Provider Categories:
| Category | Implementation | Configuration Syntax | Example Models |
|---|---|---|---|
| Proxy - Commercial API | OpenAILLMClient subclasses | provider = "proxy/openai" | GPT-4o, GPT-3.5-turbo |
| Proxy - Third-party Aggregator | AimlapiLLMClient, BurnCloudLLMClient | provider = "proxy/aimlapi" | 300+ models via AI/ML API |
| Proxy - Local Server | OllamaLLMClient | provider = "proxy/ollama" | Llama 3, Mistral, Qwen via Ollama |
| Local - vLLM | VLLMDeployModelParameters | provider = "vllm" | Any HF model with vLLM support |
| Local - Hugging Face | HFLLMDeployModelParameters | provider = "hf" | Any transformers model |
| Local - llama.cpp | LlamaCppModelParameters | provider = "llama.cpp" | GGUF quantized models |
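As a sketch, [[models.llms]] entries for a proxy and a local provider might look like the following; the model names, api_base, and path values are illustrative assumptions, while the provider strings come from the table above:

```toml
[[models.llms]]
name = "llama3"
provider = "proxy/ollama"
api_base = "http://localhost:11434"

[[models.llms]]
name = "THUDM/glm-4-9b-chat"
provider = "hf"
path = "/data/models/glm-4-9b-chat"
```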
Embedding Provider Integration:
Embedding implementations in packages/dbgpt-ext/src/dbgpt_ext/rag/embeddings/__init__.py (lines 1-18):

- AimlapiEmbeddings - AI/ML API (text-embedding-3-large, BGE-large-en-v1.5)
- JinaEmbeddings - Jina AI embeddings
- OllamaEmbeddings - Local Ollama embeddings
- QianFanEmbeddings - Baidu QianFan embeddings
- SiliconFlowEmbeddings - SiliconFlow embeddings
- TongYiEmbeddings - Alibaba Tongyi embeddings
- HuggingFaceEmbeddings - Local Hugging Face models

All embedding classes implement embed_documents(texts: List[str]) and embed_query(text: str) methods.
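The shared interface can be sketched as a Protocol; the two method signatures come from the text above, while the Protocol framing and the toy backend are illustrative assumptions, not DB-GPT code:

```python
from typing import List, Protocol


class EmbeddingsLike(Protocol):
    """Shape shared by all embedding backends described above."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...
    def embed_query(self, text: str) -> List[float]: ...


class ToyEmbeddings:
    """Toy backend: embeds text as [character count, vowel count]."""

    def embed_query(self, text: str) -> List[float]:
        return [float(len(text)),
                float(sum(c in "aeiou" for c in text.lower()))]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Batch embedding is the query embedding applied per document.
        return [self.embed_query(t) for t in texts]
```

Any real backend (e.g. HuggingFaceEmbeddings) satisfies the same shape, returning one vector per input document.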
Sources: packages/dbgpt-core/src/dbgpt/model/proxy/__init__.py:1-73, packages/dbgpt-ext/src/dbgpt_ext/rag/embeddings/__init__.py:1-18, packages/dbgpt-core/src/dbgpt/model/proxy/llms/aimlapi.py:70, packages/dbgpt-core/src/dbgpt/model/proxy/llms/burncloud.py:70, packages/dbgpt-core/src/dbgpt/model/adapter/vllm_adapter.py, packages/dbgpt-core/src/dbgpt/model/adapter/hf_adapter.py
For detailed step-by-step instructions on each deployment method, refer to the following pages:
- Source Code Installation: installing uv, managing extras, and running locally
- Docker Base Image and Build System
- Docker Compose Deployment
- Configuration Management
- Database and External Integrations