This document describes the database layer architecture of LobeChat, including PostgreSQL with PGVector extension, Drizzle ORM implementation, database migration system, and dual persistence strategy for server and desktop deployments.
For detailed information about individual database schemas and table structures, see Core Data Models. For file storage integration with S3, see File Storage and S3 Integration. For caching implementation, see Caching with Redis.
LobeChat uses PostgreSQL as its primary database system with the following key components:
The database layer is built on PostgreSQL 17 with the PGVector extension for vector similarity search capabilities, which powers the RAG (Retrieval-Augmented Generation) system.
Container Setup
The PostgreSQL instance is deployed using the official pgvector/pgvector:pg17 Docker image:
The database container configuration includes health checks and persistent volume mounting to ensure data durability across container restarts.
Sources: docker-compose/local/docker-compose.yml19-36
PGVector for RAG Capabilities
PGVector extends PostgreSQL with vector data types and similarity search operations (<->, <#>, <=> operators for L2 distance, inner product, and cosine distance). This enables:
The extension is pre-installed in the pgvector/pgvector:pg17 image and requires no additional setup.
Sources: docs/self-hosting/server-database/docker-compose.mdx34-52 docs/development/database-schema.dbml867-984
LobeChat uses Drizzle ORM as its database abstraction layer, providing type-safe database operations and schema management.
Schema Definition Architecture
Drizzle ORM provides several key benefits:
$inferInsert and $inferSelectdrizzle-zod for runtime schema validationSources: package.json255-257 packages/database/src/schemas/agent.ts1-87 packages/database/src/schemas/message.ts1-50
Schema Definition Example
The agent schema demonstrates Drizzle's type-safe approach:
The schema uses pgTable to define table structure, with type inference providing compile-time safety for all database operations.
Sources: packages/database/src/schemas/agent.ts26-87 packages/database/src/schemas/user.ts1-80
LobeChat implements an automated database migration system using Drizzle Kit:
Sources: package.json45-47 docker-compose/local/docker-compose.yml166-205
The migration system maintains a complete history in the migration journal:
| Migration | Date | Purpose |
|---|---|---|
| 0000_init | 2024-05-29 | Initial schema setup |
| 0004_add_next_auth | 2024-07-23 | NextAuth integration (deprecated) |
| 0005_pgvector | 2024-08-06 | PGVector extension for RAG |
| 0006_add_knowledge_base | 2024-08-19 | Knowledge base system |
| 0013_add_ai_infra | 2025-01-02 | AI provider and model management |
| 0030_add_group_chat | 2025-01-16 | Multi-agent chat groups |
| 0037_add_user_memory | 2025-01-20 | User memory system |
| 0049_better_auth | 2025-01-23 | Better Auth migration |
| 0070_add_user_memory_activities | 2025-01-21 | Memory activity tracking |
| 0075_add_user_memory_persona | 2025-01-22 | User memory persona |
The journal tracks 75 migrations spanning from May 2024 to January 2025, demonstrating continuous schema evolution.
Sources: packages/database/migrations/meta/_journal.json1-538
The Docker container automatically runs database migrations on startup:
This ensures that:
Migration Safety: The official Docker image guarantees stability for "empty database → complete tables" migration path, eliminating manual schema maintenance.
Sources: docker-compose/local/docker-compose.yml166-205 docs/self-hosting/server-database/docker-compose.zh-CN.mdx101-111
LobeChat implements a dual persistence strategy to support both server-based and local-first deployment models:
Production Architecture
Server deployments use a networked PostgreSQL instance with full ACID guarantees:
user_id foreign keys on all tablesConnection string format:
DATABASE_URL=postgresql://username:password@host:port/dbname
Sources: docker-compose/local/docker-compose.yml19-36 docker-compose/local/.env.example25-26
Local-First Architecture
The Electron desktop application uses PGlite, a WASM-compiled PostgreSQL that runs entirely in the browser/Electron environment:
PGlite provides the full PostgreSQL feature set including:
Implementation
The desktop app initializes PGlite on startup and uses the same database models and schema definitions as the server version, ensuring API compatibility.
Sources: High-level architecture diagrams, package.json32
Both persistence modes share identical schema definitions from packages/database/src/schemas, enabling:
This architecture allows users to switch between server-hosted and local-first deployment without application code changes.
Sources: packages/database/src/schemas/relations.ts1-150
The database schema is organized into domain-specific files within packages/database/src/schemas/:
User & Authentication (52 tables total)
The user domain includes identity, authentication, and authorization tables:
users: Core user account informationaccounts, auth_sessions, passkey, two_factor: Better Auth authentication systemuser_settings: User preferences and configurationSources: packages/database/src/schemas/user.ts1-80 docs/development/database-schema.dbml195-273
Agent System (6 tables)
Agent-related tables store agent configurations and relationships:
agents: Agent definitions with system roles, models, and parametersagents_knowledge_bases: Many-to-many relationship between agents and knowledge basesagents_files: File attachments for agentsagent_cron_jobs: Scheduled agent executionchat_groups, chat_groups_agents: Multi-agent collaboration groupsSources: packages/database/src/schemas/agent.ts26-150 packages/database/src/schemas/chatGroup.ts1-80 docs/development/database-schema.dbml1-104
Conversational Data (12 tables)
Message and conversation management:
sessions, session_groups: Chat session organizationtopics, threads: Conversation threads with branching supportmessages: Core message storage with role, content, tools, reasoningmessage_groups: Multi-model parallel conversationsmessage_plugins, message_tts, message_translates, message_queriesSources: packages/database/src/schemas/message.ts26-200 packages/database/src/schemas/topic.ts24-150 docs/development/database-schema.dbml275-713
Knowledge & RAG (11 tables)
File storage and vector search infrastructure:
files, global_files: File metadata and storage referencesdocuments: Processed document contentknowledge_bases, knowledge_base_files: Knowledge base organizationchunks: Text segments for RAG (document, file, message, unstructured types)embeddings: Vector embeddings (1024-dimensional) with PGVectorSources: packages/database/src/schemas/file.ts1-200 docs/development/database-schema.dbml317-495 docs/development/database-schema.dbml867-984
User Memory System (8 tables)
Six-dimensional memory system with vector embeddings:
user_memories: Base memory layeruser_memories_activities: Temporal events with time rangesuser_memories_contexts: Goals and projectsuser_memories_experiences: Learnings and interactionsuser_memories_identities: People and relationshipsuser_memories_preferences: Likes and directivesuser_memories_personas: User personality profilesEach memory type includes vector embeddings for semantic retrieval.
Sources: docs/development/database-schema.dbml985-1184
The relations.ts file defines cascade delete behavior and foreign key constraints:
All tables include user_id foreign keys with cascade delete rules, ensuring clean data removal when users are deleted.
Sources: packages/database/src/schemas/relations.ts1-150 packages/database/src/schemas/agent.ts89-120
Schema files define indexes for query optimization:
Common Index Patterns:
user_id: All user-scoped queries(client_id, user_id): Unique constraint for client-generated IDscreated_at: Temporal queriesExample from agents table:
Sources: packages/database/src/schemas/agent.ts72-78 packages/database/src/schemas/message.ts180-220
The database connection is configured via the DATABASE_URL environment variable:
The connection string supports standard PostgreSQL parameters including SSL configuration, connection pooling, and timeouts.
Sources: docker-compose/local/.env.example25-26 docker-compose/local/docker-compose.yml151
PostgreSQL data is persisted using Docker volumes:
The ./data directory contains the complete PostgreSQL data directory, including:
Important: The data directory should be backed up regularly for disaster recovery.
Sources: docker-compose/local/docker-compose.yml24-28
The database container includes health checks to ensure availability:
Application containers depend on the database health check passing before starting, preventing connection errors during startup.
Sources: docker-compose/local/docker-compose.yml29-33
The database includes autovacuum tuning for optimal performance with high write volumes:
These settings trigger vacuum and analyze operations more frequently on RAG-related tables to maintain query performance with large vector datasets.
Sources: packages/database/migrations/meta/_journal.json189-192
The schema employs strategic indexing:
user_id indexes on all tables for multi-tenant isolationcreated_at, updated_at, accessed_at indexesSources: packages/database/src/schemas/agent.ts72-78 packages/database/src/schemas/message.ts180-220
This database architecture provides a robust foundation for LobeChat's multi-user, AI-powered conversational system with support for vector search, user memory, and flexible deployment options.
Refresh this wiki