This document describes the Retrieval Augmented Generation (RAG) system in DB-GPT, which enables knowledge-based applications by augmenting LLM responses with retrieved information from various storage backends. The RAG pipeline supports document ingestion, multiple storage types (vector stores, knowledge graphs, full-text search), flexible retrieval strategies, and seamless integration with the generation layer.
For information about Multi-Agent systems and workflow orchestration that leverage RAG capabilities, see Multi-Agents and AWEL Workflows. For details about the storage architecture abstractions, see Storage Architecture and Databases.
The RAG system follows a multi-stage pipeline architecture that transforms documents into retrievable knowledge and enhances LLM responses with relevant context.
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py:1-250, packages/dbgpt-core/src/dbgpt/storage/base.py:1-250, packages/dbgpt-ext/src/dbgpt_ext/rag/assembler/embedding.py:1-100
The knowledge ingestion pipeline transforms raw documents into embedded chunks stored in various backends. The process is coordinated by the KnowledgeFactory and involves chunking, embedding, and storage operations.
The KnowledgeFactory creates Knowledge instances based on document type and handles content extraction. It supports multiple file formats including PDF, Markdown, DOCX, CSV, HTML, and URLs.
The KnowledgeType enum defines supported document types: DOCUMENT, URL, TEXT, PDF, MARKDOWN, CSV, HTML, and PPTX. Each type has a corresponding Knowledge implementation that handles content extraction.
Sources: packages/dbgpt-ext/src/dbgpt_ext/rag/knowledge/factory.py:1-100, packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:35-50
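The type-detection step can be sketched as follows. This is a hypothetical illustration, not DB-GPT's actual factory code: the `detect_knowledge_type` helper and the extension mapping are assumptions, while the `KnowledgeType` variants come from the enum described above.

```python
from enum import Enum


class KnowledgeType(Enum):
    DOCUMENT = "DOCUMENT"
    URL = "URL"
    TEXT = "TEXT"
    PDF = "PDF"
    MARKDOWN = "MARKDOWN"
    CSV = "CSV"
    HTML = "HTML"
    PPTX = "PPTX"


# Hypothetical extension-to-type mapping; the real KnowledgeFactory
# dispatches to dedicated Knowledge subclasses for content extraction.
_EXTENSION_MAP = {
    ".pdf": KnowledgeType.PDF,
    ".md": KnowledgeType.MARKDOWN,
    ".csv": KnowledgeType.CSV,
    ".html": KnowledgeType.HTML,
    ".pptx": KnowledgeType.PPTX,
    ".txt": KnowledgeType.TEXT,
    ".docx": KnowledgeType.DOCUMENT,
}


def detect_knowledge_type(source: str) -> KnowledgeType:
    """Pick a KnowledgeType from a file path or URL."""
    if source.startswith(("http://", "https://")):
        return KnowledgeType.URL
    for ext, ktype in _EXTENSION_MAP.items():
        if source.lower().endswith(ext):
            return ktype
    return KnowledgeType.DOCUMENT
```

URLs are routed to a URL knowledge type before any extension check, since a web address may carry no meaningful suffix.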
The ChunkParameters class configures text splitting behavior. It supports multiple chunking strategies defined in ChunkStrategy enum:
| Strategy | Description | Configuration |
|---|---|---|
| CHUNK_BY_SIZE | Fixed-size chunks | chunk_size, chunk_overlap |
| CHUNK_BY_SEPARATOR | Split by delimiters | separators list |
| CHUNK_BY_MARKDOWN_HEADER | Markdown structure-aware | Header-based splitting |
| CHUNK_BY_PAGE | Page-level chunks | Document pagination |
Each Chunk contains:
- content: Text content
- metadata: Dictionary with source information
- chunk_id: Unique identifier
- score: Relevance score (set during retrieval)

Sources: packages/dbgpt-ext/src/dbgpt_ext/rag/chunk_manager.py:1-100, packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py:20-30
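A minimal sketch of the CHUNK_BY_SIZE strategy makes the chunk_size/chunk_overlap interaction concrete. The `Chunk` dataclass mirrors the fields listed above; the splitting logic is an illustrative assumption, not DB-GPT's ChunkManager implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    content: str
    metadata: Dict = field(default_factory=dict)
    chunk_id: str = ""
    score: float = 0.0  # set during retrieval


def chunk_by_size(text: str, chunk_size: int = 512,
                  chunk_overlap: int = 64) -> List[Chunk]:
    """Fixed-size chunking: each chunk repeats the last
    `chunk_overlap` characters of its predecessor."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for i, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + chunk_size]
        if not piece:
            break
        chunks.append(Chunk(content=piece, chunk_id=f"chunk-{i}",
                            metadata={"start": start}))
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap preserves context that would otherwise be cut at chunk boundaries, at the cost of some duplicated storage.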
The embedding process converts text chunks into vector representations using the Embeddings interface. The system supports multiple embedding providers through EmbeddingFactory.
The embedding configuration is managed through EMBEDDING_MODEL_CONFIG and can be customized per knowledge space.
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:11-20, packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py:20-30
The storage layer provides unified interfaces for multiple backend types through the IndexStoreBase abstraction. The StorageManager acts as a factory to create and manage storage connectors.
The StorageManager.get_storage_connector() method at packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py:39-88 accepts three parameters:
- index_name: Collection/space name
- storage_type: One of the vector store types, "KnowledgeGraph", or "FullText"
- llm_model: Optional LLM model name for graph-based retrieval

Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py:18-120, packages/dbgpt-serve/src/dbgpt_serve/rag/connector.py:1-150
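The dispatch can be pictured as a small factory. This is a hypothetical stand-in for StorageManager, not its real code: the class name, the returned connector strings, and the set of vector store names are assumptions used only to show the three-way branching on storage_type.

```python
from typing import Optional


class StorageManagerSketch:
    """Illustrative factory mirroring get_storage_connector() dispatch."""

    # Assumed set of vector store backend names for illustration.
    VECTOR_TYPES = {"Chroma", "Milvus", "ElasticSearch",
                    "PGVector", "Weaviate", "OceanBase"}

    def get_storage_connector(self, index_name: str, storage_type: str,
                              llm_model: Optional[str] = None) -> str:
        if storage_type in self.VECTOR_TYPES:
            return f"vector:{storage_type}:{index_name}"
        if storage_type == "KnowledgeGraph":
            # Graph-based retrieval needs an LLM for triplet and
            # keyword extraction, so llm_model becomes mandatory here.
            if llm_model is None:
                raise ValueError("KnowledgeGraph storage requires llm_model")
            return f"graph:{index_name}:{llm_model}"
        if storage_type == "FullText":
            return f"fulltext:{index_name}"
        raise ValueError(f"Unknown storage_type: {storage_type}")
```

Centralizing this branch in one factory is what lets callers switch backends by configuration alone.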
All vector stores inherit from VectorStoreBase at packages/dbgpt-core/src/dbgpt/storage/vector_store/base.py:80-200 and implement core methods:
| Method | Purpose | Return Type |
|---|---|---|
| load_document() | Store chunks | List[str] (chunk IDs) |
| similar_search_with_scores() | Semantic search | List[Chunk] |
| delete_by_ids() | Remove chunks | List[str] |
| truncate() | Clear collection | List[str] |
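A toy in-memory store makes the contract of these four methods concrete. This is a sketch, not VectorStoreBase itself: the simplified method signatures and the cosine-similarity scoring are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple


class InMemoryVectorStore:
    """Toy stand-in for a vector store: same four core methods,
    cosine-similarity scoring over raw Python lists."""

    def __init__(self):
        # chunk_id -> (embedding vector, text content)
        self._docs: Dict[str, Tuple[List[float], str]] = {}

    def load_document(self, chunks: List[Tuple[str, List[float], str]]) -> List[str]:
        ids = []
        for chunk_id, vector, content in chunks:
            self._docs[chunk_id] = (vector, content)
            ids.append(chunk_id)
        return ids  # chunk IDs, matching the table above

    def similar_search_with_scores(self, query: List[float], topk: int = 4):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0
        scored = [(cosine(query, vec), cid, content)
                  for cid, (vec, content) in self._docs.items()]
        scored.sort(reverse=True)  # highest similarity first
        return scored[:topk]

    def delete_by_ids(self, ids: List[str]) -> List[str]:
        return [cid for cid in ids if self._docs.pop(cid, None) is not None]

    def truncate(self) -> List[str]:
        ids = list(self._docs)
        self._docs.clear()
        return ids
```

Real backends replace the linear scan with an index (HNSW, IVF, etc.), but the interface contract is the same.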
Milvus store configuration is defined at packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:102-177.
The Milvus store supports full-text search when is_support_full_text_search() returns True, creating a sparse_vector field with a BM25 function at packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:362-375.
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:102-434
Chroma stores data locally in DuckDB+Parquet format at the configured persist_path. The collection name is hashed with SHA-256 if it contains special characters or exceeds 63 characters (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/chroma_store.py:125-129).
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/chroma_store.py:44-250
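The name-hashing fallback can be sketched in a few lines. This is an assumption-laden illustration: the exact character whitelist (`[a-zA-Z0-9_-]`) and the function name are guesses, not the store's actual code; only the "SHA-256 when special characters or over 63 characters" rule comes from the description above.

```python
import hashlib
import re


def safe_collection_name(name: str, max_len: int = 63) -> str:
    """Return the name unchanged when it is short and plain ASCII;
    otherwise fall back to its SHA-256 hex digest (64 chars)."""
    # Assumed whitelist; the real rule may differ in detail.
    if len(name) <= max_len and re.fullmatch(r"[a-zA-Z0-9_-]+", name):
        return name
    return hashlib.sha256(name.encode("utf-8")).hexdigest()
```

Hashing keeps the mapping deterministic, so the same knowledge space always resolves to the same physical collection.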
Elasticsearch provides both vector similarity and full-text search capabilities. The store creates indices with dense_vector fields for semantic search and supports match queries for keyword search at packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/elastic_store.py:152-300.
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/elastic_store.py:78-400
| Store | Config Class | Key Features |
|---|---|---|
| PGVector | PGVectorConfig | PostgreSQL with pgvector extension |
| Weaviate | WeaviateVectorConfig | Cloud-native vector database |
| OceanBase | OceanBaseConfig | HNSW index on OceanBase database |
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/pgvector_store.py:42-78, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/weaviate_store.py:22-60, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/oceanbase_store.py:77-168
Knowledge graphs store entities and relationships extracted from documents. The BuiltinKnowledgeGraph class provides graph storage and retrieval capabilities.
The graph ingestion pipeline at packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py:120-200:
- TripletExtractor uses LLMs to extract (subject, predicate, object) triplets
- KeywordExtractor identifies important keywords for filtering

Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py:1-300, packages/dbgpt-ext/src/dbgpt_ext/storage/graph_store/tugraph_store.py:1-200
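The real TripletExtractor prompts an LLM to produce triplets; what a consumer then needs is a parser for the model's output. The sketch below only parses lines of the assumed form `(subject, predicate, object)` — both the function name and the output format are illustrative assumptions, not DB-GPT's actual prompt contract.

```python
import re
from typing import List, Tuple


def parse_triplets(llm_output: str) -> List[Tuple[str, str, str]]:
    """Parse '(subject, predicate, object)' lines from assumed LLM output."""
    triplets = []
    # Each group disallows commas and parentheses so the three parts
    # of a triplet cannot bleed into each other.
    for match in re.finditer(r"\(([^,()]+),([^,()]+),([^,()]+)\)", llm_output):
        subj, pred, obj = (part.strip() for part in match.groups())
        triplets.append((subj, pred, obj))
    return triplets
```

Robust pipelines also validate and deduplicate triplets before writing them to the graph store, since LLM output is not guaranteed to be well-formed.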
The CommunitySummaryKnowledgeGraph extends BuiltinKnowledgeGraph with hierarchical community detection and summarization, implementing the GraphRAG approach.
The community detection and summarization process at packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/community_summary.py:100-250:
- GraphExtractor
- CommunitySummarizer
- CommunityStore

Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/community_summary.py:1-350, packages/dbgpt-ext/src/dbgpt_ext/rag/transformer/graph_extractor.py:1-300
The retrieval system combines multiple strategies to find relevant chunks and supports metadata filtering, hybrid search, and reranking.
The KnowledgeSpaceRetriever class orchestrates retrieval across storage backends.
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/retriever/knowledge_space.py:1-200
The MetadataFilters class at packages/dbgpt/storage/vector_store/filters.py:1-150 supports filtering chunks by metadata conditions.
Supported operators in FilterOperator enum:
- EQ: Equal
- NE: Not equal
- GT: Greater than
- LT: Less than
- GTE: Greater than or equal
- LTE: Less than or equal
- IN: In list
- NIN: Not in list
- CONTAINS: String contains
- NOT_CONTAINS: String does not contain

Each vector store implements convert_metadata_filters() to translate these filters into backend-specific query syntax.
Sources: packages/dbgpt/storage/vector_store/filters.py:1-100, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:520-600
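The operator semantics can be demonstrated with a small in-Python evaluator. Note the difference from the real system: actual stores translate filters into backend query syntax via convert_metadata_filters(), whereas the `matches` helper below evaluates them directly and is purely illustrative.

```python
from enum import Enum
from typing import Any, Dict


class FilterOperator(Enum):
    EQ = "=="
    NE = "!="
    GT = ">"
    LT = "<"
    GTE = ">="
    LTE = "<="
    IN = "in"
    NIN = "nin"
    CONTAINS = "contains"
    NOT_CONTAINS = "not_contains"


def matches(metadata: Dict[str, Any], key: str,
            op: FilterOperator, value: Any) -> bool:
    """Evaluate one filter condition against a chunk's metadata dict."""
    actual = metadata.get(key)
    if op is FilterOperator.EQ:
        return actual == value
    if op is FilterOperator.NE:
        return actual != value
    if op is FilterOperator.GT:
        return actual is not None and actual > value
    if op is FilterOperator.LT:
        return actual is not None and actual < value
    if op is FilterOperator.GTE:
        return actual is not None and actual >= value
    if op is FilterOperator.LTE:
        return actual is not None and actual <= value
    if op is FilterOperator.IN:
        return actual in value
    if op is FilterOperator.NIN:
        return actual not in value
    if op is FilterOperator.CONTAINS:
        return isinstance(actual, str) and value in actual
    if op is FilterOperator.NOT_CONTAINS:
        return not (isinstance(actual, str) and value in actual)
    raise ValueError(op)
```

The `is not None` guards on the ordering operators avoid Python TypeErrors when a chunk lacks the metadata key entirely.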
The similar_search_with_scores() method retrieves chunks with relevance scores:
Score calculation varies by distance metric:
- score = 1 - distance (range: 0 to 1)
- score = 1.0 - distance / sqrt(2)
- score = -distance

The filter_by_score_threshold() method at packages/dbgpt-core/src/dbgpt/storage/vector_store/base.py:180-200 removes chunks below the threshold.
Sources: packages/dbgpt-core/src/dbgpt/storage/vector_store/base.py:130-200, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/chroma_store.py:182-219
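A minimal sketch of the score conversion and threshold filtering. The two distance-to-score formulas are taken from the list above; associating them with cosine and L2 distance respectively is an assumption, as is the simplified `(chunk_id, score)` tuple shape.

```python
import math
from typing import List, Tuple


def cosine_distance_to_score(distance: float) -> float:
    """score = 1 - distance (assumed to apply to cosine distance)."""
    return 1.0 - distance


def l2_distance_to_score(distance: float) -> float:
    """score = 1.0 - distance / sqrt(2) (assumed to apply to L2
    distance between normalized embeddings)."""
    return 1.0 - distance / math.sqrt(2)


def filter_by_score_threshold(scored_chunks: List[Tuple[str, float]],
                              threshold: float) -> List[Tuple[str, float]]:
    """Keep only chunks whose relevance score meets the threshold."""
    return [(cid, score) for cid, score in scored_chunks if score >= threshold]
```

Because each backend reports distance on a different scale, normalizing into one score range is what makes a single threshold meaningful across stores.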
The RerankEmbeddingsRanker class improves retrieval quality by reordering results using a reranking model.
The reranker is configured through RerankEmbeddingFactory and supports multiple models including cross-encoder models from Hugging Face.
Sources: packages/dbgpt/rag/retriever/rerank.py:1-150, packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:16-20
The generation layer assembles retrieved chunks into context for LLM prompts.
The EmbeddingAssembler class at packages/dbgpt-ext/src/dbgpt_ext/rag/assembler/embedding.py:1-150 coordinates the RAG pipeline from knowledge loading to retrieval.
Key methods:
- persist(): Loads knowledge into storage at packages/dbgpt-ext/src/dbgpt_ext/rag/assembler/embedding.py:40-80
- get_chunks(): Retrieves relevant chunks with optional reranking at packages/dbgpt-ext/src/dbgpt_ext/rag/assembler/embedding.py:90-130

Sources: packages/dbgpt-ext/src/dbgpt_ext/rag/assembler/embedding.py:1-180
The assembled context is passed to the LLMClient for generation.
The KnowledgeService at packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:60-250 provides high-level methods that integrate retrieval with generation:
- document_recall(): Tests retrieval quality
- query_graph(): Queries the knowledge graph
- aembedding_query(): Async retrieval with embeddings

Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:60-300, packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py:200-400
The RAG service layer manages knowledge spaces, documents, and chunks through database entities and services.
The DAO classes manage persistence:
- KnowledgeSpaceDao at packages/dbgpt-serve/src/dbgpt_serve/rag/models/models.py:1-100
- KnowledgeDocumentDao at packages/dbgpt-serve/src/dbgpt_serve/rag/models/document_db.py:1-100
- DocumentChunkDao at packages/dbgpt-serve/src/dbgpt_serve/rag/models/chunk_db.py:1-100

Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/models/models.py:1-150, packages/dbgpt-serve/src/dbgpt_serve/rag/models/document_db.py:1-150, packages/dbgpt-serve/src/dbgpt_serve/rag/models/chunk_db.py:1-150
The SyncStatus enum at packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py:56-61 tracks document processing state:
- TODO: Pending processing
- RUNNING: Currently processing
- FINISHED: Successfully processed
- FAILED: Processing failed

The synchronization process at packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py:300-500:
1. The document status is set to RUNNING
2. KnowledgeFactory extracts and chunks the content
3. StorageManager persists the embedded chunks to the configured backend
4. The status is updated to FINISHED or FAILED

Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/service/service.py:56-600
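The status lifecycle described above can be sketched as a tiny state machine. The enum members come from the documented SyncStatus; the transition table and `advance` helper are illustrative assumptions (including the assumption that FAILED documents can be retried), not the service's actual logic.

```python
from enum import Enum


class SyncStatus(Enum):
    TODO = "TODO"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


# Assumed legal transitions; FAILED -> RUNNING models a hypothetical retry.
_TRANSITIONS = {
    SyncStatus.TODO: {SyncStatus.RUNNING},
    SyncStatus.RUNNING: {SyncStatus.FINISHED, SyncStatus.FAILED},
    SyncStatus.FINISHED: set(),
    SyncStatus.FAILED: {SyncStatus.RUNNING},
}


def advance(current: SyncStatus, new: SyncStatus) -> SyncStatus:
    """Move to a new status, rejecting transitions outside the table."""
    if new not in _TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```

Encoding the transitions explicitly makes invalid state changes (e.g. FINISHED back to RUNNING) fail fast instead of silently corrupting the document record.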
The RAG system in DB-GPT provides a comprehensive framework for knowledge-based applications with:
- IndexStoreBase and VectorStoreBase abstractions enable backend switching without application code changes
- KnowledgeService and the RAG service classes manage the complete lifecycle from ingestion to retrieval

The modular design allows developers to customize each stage of the pipeline while maintaining compatibility with the broader DB-GPT ecosystem.
Sources: README.md:70-74, packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py:18-150, packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:60-100