This document describes the application layer and service components in DB-GPT's architecture. The application layer sits between the user-facing interfaces (Web UI, REST API, CLI) and the core framework components (AWEL, Multi-Agents, Storage). It provides business logic for knowledge management, document processing, retrieval-augmented generation (RAG), and storage orchestration.
The application layer is organized into two main package structures:
- packages/dbgpt-app - High-level application logic and business services
- packages/dbgpt-serve - Service-layer implementations for RAG, evaluation, and other capabilities

For details on the underlying storage implementations, see Storage Architecture and Databases. For information on RAG retrieval strategies, see Retrieval Strategies and Reranking. For Multi-Agent orchestration, see Multi-Agents and AWEL Workflows. For GBI text-to-SQL capabilities, see Generative Business Intelligence (GBI).
The following diagram illustrates the overall structure of the application layer, showing how services interact with the core framework and storage backends.
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py60-90 packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py63-80 packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py18-28
The application layer consists of several key service components, each with distinct responsibilities.
| Service Component | Location | Primary Responsibilities |
|---|---|---|
| KnowledgeService | packages/dbgpt-app/src/dbgpt_app/knowledge/service.py | High-level knowledge space and document management, recall testing, document summarization |
| Service (RAG Service) | packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py | Document synchronization, chunk management, retrieval operations, storage lifecycle |
| StorageManager | packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py | Factory for creating storage connectors, configuration management, storage type selection |
| VectorStoreConnector | packages/dbgpt-serve/src/dbgpt_serve/rag/connector.py | Abstraction layer for vector stores, unified CRUD operations, connector lifecycle |
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py60-70 packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py63-80 packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py18-33
The StorageManager implements a factory pattern to abstract storage backend selection and instantiation. It supports multiple storage types including vector stores, knowledge graphs, and full-text search engines.
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py18-200 packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py196-296 packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py98-400
The StorageManager reads configuration from the system application config and selects appropriate storage backends based on the requested storage_type. The configuration is typically provided through TOML files or environment variables.
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py34-87
The create_vector_store() method instantiates vector store instances with proper configuration and embedding functions.
Key Steps:
1. Read the storage configuration from app_config.rag.storage
2. Obtain the embedding function from EmbeddingFactory
3. Build a VectorStoreConfig instance with connection parameters
4. Call config.create_store() to instantiate the concrete vector store
5. Wrap the store in a VectorStoreConnector for a unified interface

Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py89-117
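The steps listed for create_vector_store() can be sketched roughly as follows. This is a simplified, self-contained illustration and not DB-GPT's actual code: `ToyEmbeddingFactory`, `InMemoryStore`, the dataclass fields, and the dictionary-shaped `app_config` are hypothetical stand-ins. Only the names `VectorStoreConfig`, `create_store()`, and `VectorStoreConnector` are borrowed from the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

EmbeddingFn = Callable[[str], List[float]]


class ToyEmbeddingFactory:
    """Hypothetical stand-in for an embedding factory."""

    def create(self) -> EmbeddingFn:
        # Toy embedding: [character count, space count].
        return lambda text: [float(len(text)), float(text.count(" "))]


class InMemoryStore:
    """Toy concrete store; a real backend would talk to an external service."""

    def __init__(self, name: str, embedding_fn: EmbeddingFn) -> None:
        self.name = name
        self.embedding_fn = embedding_fn
        self.vectors: Dict[str, List[float]] = {}


@dataclass
class VectorStoreConfig:
    # Field names are assumptions chosen for this sketch.
    store_type: str
    uri: str
    port: int

    def create_store(self, name: str, embedding_fn: EmbeddingFn) -> InMemoryStore:
        return InMemoryStore(name, embedding_fn)


class VectorStoreConnector:
    """Thin wrapper that gives callers one interface over any store."""

    def __init__(self, store: InMemoryStore) -> None:
        self._store = store

    def load_document(self, chunks: List[str]) -> List[str]:
        ids = []
        for index, text in enumerate(chunks):
            chunk_id = f"{self._store.name}-{index}"
            self._store.vectors[chunk_id] = self._store.embedding_fn(text)
            ids.append(chunk_id)
        return ids


def create_vector_store(app_config: dict, index_name: str) -> VectorStoreConnector:
    storage_cfg = app_config["rag"]["storage"]["vector"]      # step 1
    embedding_fn = ToyEmbeddingFactory().create()             # step 2
    config = VectorStoreConfig(                               # step 3
        store_type=storage_cfg["type"],
        uri=storage_cfg["uri"],
        port=storage_cfg["port"],
    )
    store = config.create_store(index_name, embedding_fn)     # step 4
    return VectorStoreConnector(store)                        # step 5
```

Keeping the connector as the only object returned to callers means the concrete store type can change without touching service code.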
The RAG service (Service class in service/service.py) manages the complete document processing pipeline from ingestion to retrieval.
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py204-350 packages/dbgpt-ext/src/dbgpt_ext/rag/assembler/embedding.py1-100
The RAG service tracks document synchronization status through the SyncStatus enum:
| Status | Description | Transitions |
|---|---|---|
| TODO | Document queued for processing | Initial state → RUNNING when processing starts |
| RUNNING | Document currently being processed | TODO → RUNNING → FINISHED or FAILED |
| FINISHED | Document successfully synchronized | Terminal state (success) |
| FAILED | Document synchronization failed | Terminal state (error) |
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py56-61
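The status table can be read as a small state machine. The sketch below encodes it with Python's `enum` module. The string values, the `ALLOWED_TRANSITIONS` mapping, and the `advance()` helper are assumptions made for illustration; the real SyncStatus enum may differ, and DB-GPT may allow additional transitions (for example, re-syncing a failed document) that the table does not list.

```python
from enum import Enum


class SyncStatus(Enum):
    TODO = "TODO"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


# Transitions taken from the table; terminal states allow none.
ALLOWED_TRANSITIONS = {
    SyncStatus.TODO: {SyncStatus.RUNNING},
    SyncStatus.RUNNING: {SyncStatus.FINISHED, SyncStatus.FAILED},
    SyncStatus.FINISHED: set(),
    SyncStatus.FAILED: set(),
}


def advance(current: SyncStatus, target: SyncStatus) -> SyncStatus:
    """Return the new status, or raise if the table does not allow the move."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.name} -> {target.name}")
    return target
```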
The service uses concurrent loading with configurable batch sizes and thread pools to optimize large document ingestion:
Sources: packages/dbgpt-core/src/dbgpt/storage/base.py159-212 packages/dbgpt-ext/src/dbgpt_ext/rag/assembler/embedding.py50-100
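As a rough illustration of batched, thread-pooled loading, the sketch below splits chunks into fixed-size batches and submits each batch to a `ThreadPoolExecutor`. The function names, the `load_batch` callback, and the default values are hypothetical; the actual implementation in DB-GPT may batch and schedule differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_batches(chunks: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split chunks into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(chunks[i:i + batch_size]) for i in range(0, len(chunks), batch_size)]


def load_concurrently(
    chunks: Sequence[str],
    load_batch: Callable[[List[str]], List[str]],
    batch_size: int = 10,
    max_workers: int = 4,
) -> List[str]:
    """Load batches in parallel and return chunk IDs in the original order."""
    batches = split_into_batches(chunks, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in submission order, so IDs stay aligned
        # with the input even when batches finish out of order.
        results = pool.map(load_batch, batches)
        return [chunk_id for batch_ids in results for chunk_id in batch_ids]
```

Threads suit this workload when each batch spends most of its time waiting on network I/O to the storage backend; CPU-bound embedding work would need a different strategy.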
The KnowledgeService class provides high-level business logic for knowledge space and document management. It coordinates between the RAG service, storage backends, and user-facing APIs.
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py60-700
Knowledge spaces are the primary organizational unit for documents and embeddings. The lifecycle includes:
Creation:
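- Accesses space metadata through KnowledgeSpaceDao
- Assigns a domain type (Normal, FinancialReport)
- Persists a KnowledgeSpaceEntity to the metadata database

Usage:

Deletion:
- Obtains the storage connector via storage_manager.get_storage_connector()
- Drops the space's collection with delete_vector_name()

Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py91-185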
The document_recall_test() method evaluates retrieval quality by querying the knowledge space and measuring relevance:
Process:
1. Create a KnowledgeSpaceRetriever with the space configuration
2. Call _retrieve_with_score() to fetch relevant chunks
3. Apply RerankEmbeddingsRanker to the retrieved chunks
4. Return a DocumentQueryResponse with ranked results

Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py531-566
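The retrieve, rerank, and return flow of a recall test can be sketched with a toy retriever. Everything here is hypothetical: `ScoredChunk`, the word-overlap scoring, the `score_threshold` and `top_k` parameters, and the choice to make the reranker optional are assumptions for illustration and do not reflect the real KnowledgeSpaceRetriever or RerankEmbeddingsRanker.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class ScoredChunk:
    content: str
    score: float


Reranker = Callable[[str, List[ScoredChunk]], List[ScoredChunk]]


def retrieve_with_score(
    query: str, corpus: Sequence[str], score_threshold: float
) -> List[ScoredChunk]:
    """Toy retriever: score is the fraction of query words found in the chunk."""
    query_words = set(query.lower().split())
    results = []
    for text in corpus:
        overlap = len(query_words & set(text.lower().split()))
        score = overlap / len(query_words) if query_words else 0.0
        if score >= score_threshold:
            results.append(ScoredChunk(text, score))
    return results


def recall_test(
    query: str,
    corpus: Sequence[str],
    top_k: int = 3,
    score_threshold: float = 0.0,
    reranker: Optional[Reranker] = None,
) -> List[ScoredChunk]:
    candidates = retrieve_with_score(query, corpus, score_threshold)
    if reranker is not None:
        # A reranker may reorder candidates using a stronger relevance model.
        candidates = reranker(query, candidates)
    else:
        candidates = sorted(candidates, key=lambda c: c.score, reverse=True)
    return candidates[:top_k]
```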
The VectorStoreConnector class in connector.py provides a unified interface for interacting with diverse storage backends. It implements the factory pattern and connector lifecycle management.
| Method | Purpose | Returns |
|---|---|---|
| create_collection() | Initialize storage collection/index | Collection reference |
| load_document() | Insert document chunks | List of chunk IDs |
| similar_search() | Vector similarity search | List of Chunk objects |
| similar_search_with_scores() | Search with relevance scores | List of scored Chunk objects |
| delete_by_ids() | Remove chunks by ID | List of deleted IDs |
| delete_vector_name() | Drop entire collection | None |
| truncate() | Clear all data from collection | List of deleted IDs |
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/connector.py1-300
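To make the method table concrete, here is a minimal in-memory class that exposes a subset of those method names. It is a sketch under stated assumptions, not the real VectorStoreConnector: the constructor arguments, the `topk` parameter, plain strings in place of Chunk objects, and `toy_embed` are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def toy_embed(text: str) -> Vector:
    # Hypothetical embedding: counts of the letters a, b, and c.
    return [float(text.count(ch)) for ch in "abc"]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryConnector:
    """Toy connector using method names from the table above."""

    def __init__(self, name: str, embed: Callable[[str], Vector]) -> None:
        self.name = name
        self._embed = embed
        self._rows: Dict[str, Tuple[str, Vector]] = {}
        self._next_id = 0

    def load_document(self, chunks: List[str]) -> List[str]:
        ids = []
        for text in chunks:
            chunk_id = f"{self.name}-{self._next_id}"
            self._next_id += 1
            self._rows[chunk_id] = (text, self._embed(text))
            ids.append(chunk_id)
        return ids

    def similar_search_with_scores(self, query: str, topk: int) -> List[Tuple[str, float]]:
        q = self._embed(query)
        scored = [(text, cosine(q, vec)) for text, vec in self._rows.values()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topk]

    def similar_search(self, query: str, topk: int) -> List[str]:
        return [text for text, _ in self.similar_search_with_scores(query, topk)]

    def delete_by_ids(self, ids: List[str]) -> List[str]:
        # Only IDs that actually existed are reported as deleted.
        return [i for i in ids if self._rows.pop(i, None) is not None]

    def truncate(self) -> List[str]:
        deleted = list(self._rows)
        self._rows.clear()
        return deleted
```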
The connector system uses a registry pattern to map storage types to concrete implementations:
Factory Instantiation:
- Call config.create_store() to get the concrete store instance
- Wrap it in a VectorStoreConnector for lifecycle management

Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/connector.py50-150
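A registry that maps storage type names to implementation classes is commonly built with a decorator. The sketch below shows one way to do it. The registry dictionary, the `register_store` decorator, the module-level `create_store` function, and both toy store classes are hypothetical and are not taken from DB-GPT's connector module.

```python
from typing import Callable, Dict, Type

# Hypothetical registry: storage type name -> implementation class.
_STORE_REGISTRY: Dict[str, Type] = {}


def register_store(storage_type: str) -> Callable[[Type], Type]:
    """Class decorator that records a store class under a type name."""

    def decorator(cls: Type) -> Type:
        _STORE_REGISTRY[storage_type.lower()] = cls
        return cls

    return decorator


@register_store("toy_vector")
class ToyVectorStore:
    def __init__(self, uri: str) -> None:
        self.uri = uri


@register_store("toy_graph")
class ToyGraphStore:
    def __init__(self, uri: str) -> None:
        self.uri = uri


def create_store(storage_type: str, **params):
    """Look up the class for storage_type and instantiate it."""
    try:
        store_cls = _STORE_REGISTRY[storage_type.lower()]
    except KeyError:
        supported = ", ".join(sorted(_STORE_REGISTRY))
        raise ValueError(
            f"Unsupported storage type {storage_type!r}; supported: {supported}"
        ) from None
    return store_cls(**params)
```

With this layout, adding a backend only requires defining a new decorated class; the lookup code stays unchanged.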
Services in the application layer communicate through well-defined interfaces and dependency injection:
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py68-90 packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py66-95 packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py23-33
Services access shared components via the SystemApp component registry:
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py72-85
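The component-registry idea can be illustrated with a small stand-in. `MiniSystemApp` below is a hypothetical class written for this example and should not be read as the real SystemApp API; the method names, the string key, and the simplified `StorageManager` and `KnowledgeService` classes are assumptions.

```python
from typing import Dict, Type, TypeVar

T = TypeVar("T")


class MiniSystemApp:
    """Hypothetical component registry keyed by name."""

    def __init__(self) -> None:
        self._components: Dict[str, object] = {}

    def register_instance(self, name: str, component: object) -> None:
        if name in self._components:
            raise ValueError(f"Component {name!r} is already registered")
        self._components[name] = component

    def get_component(self, name: str, expected_type: Type[T]) -> T:
        if name not in self._components:
            raise KeyError(f"Component {name!r} is not registered")
        component = self._components[name]
        if not isinstance(component, expected_type):
            raise TypeError(f"Component {name!r} is not a {expected_type.__name__}")
        return component


class StorageManager:
    def __init__(self, default_type: str) -> None:
        self.default_type = default_type


class KnowledgeService:
    def __init__(self, system_app: MiniSystemApp) -> None:
        self._system_app = system_app

    @property
    def storage_manager(self) -> StorageManager:
        # Resolved lazily, so the manager may be registered after this
        # service is constructed.
        return self._system_app.get_component("storage_manager", StorageManager)
```

Looking components up on access rather than in the constructor avoids ordering constraints between services at startup.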
Service behavior is controlled through hierarchical configuration loaded from TOML files and environment variables:
| Configuration Section | Keys | Purpose |
|---|---|---|
| [rag.storage.vector] | type, uri, port, user, password | Vector store connection parameters |
| [rag.storage.graph] | type, host, port, username, password | Knowledge graph connection |
| [rag.storage.full_text] | type, host, port | Full-text search engine connection |
| [rag.embedding] | model_name, model_path | Embedding model configuration |
| [rag.knowledge] | chunk_size, chunk_overlap | Document chunking parameters |
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py34-37 packages/dbgpt-app/src/dbgpt_app/knowledge/service.py78-81
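One way to picture hierarchical configuration with environment overrides is shown below. The nested dictionary stands in for a parsed TOML file using section and key names from the table. The values are placeholders, and the `DBGPT_DEMO__` prefix with double-underscore nesting is an invented convention for this sketch, not DB-GPT's actual environment variable scheme.

```python
import copy
import os
from typing import Any, Dict, Mapping, Optional

# Placeholder values; only the section and key names come from the table.
DEFAULTS: Dict[str, Any] = {
    "rag": {
        "storage": {"vector": {"type": "example", "uri": "localhost", "port": 19530}},
        "knowledge": {"chunk_size": 512, "chunk_overlap": 50},
    }
}


def apply_env_overrides(
    config: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
    prefix: str = "DBGPT_DEMO__",
) -> Dict[str, Any]:
    """Return a copy of config with prefixed environment values applied.

    Example: DBGPT_DEMO__RAG__KNOWLEDGE__CHUNK_SIZE=1024 overrides
    config["rag"]["knowledge"]["chunk_size"].
    """
    source = os.environ if environ is None else environ
    merged = copy.deepcopy(dict(config))
    for key, raw in source.items():
        if not key.startswith(prefix):
            continue
        path = key[len(prefix):].lower().split("__")
        node = merged
        for part in path[:-1]:
            node = node.setdefault(part, {})
        leaf = path[-1]
        current = node.get(leaf)
        # Keep integer settings as integers; everything else stays a string.
        if isinstance(current, int) and not isinstance(current, bool):
            node[leaf] = int(raw)
        else:
            node[leaf] = raw
    return merged
```

Returning a deep copy leaves the file-derived defaults untouched, which makes it easy to see which values came from the environment.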
The service layer implements comprehensive error handling and validation:
Space Validation:
Document Validation:
- KnowledgeFactory

Storage Validation:
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py46-85 packages/dbgpt-app/src/dbgpt_app/knowledge/service.py91-169