This document describes the end-to-end RAG (Retrieval-Augmented Generation) pipeline in DB-GPT, covering knowledge ingestion, storage, and retrieval systems. The pipeline transforms raw documents into searchable knowledge through chunking, embedding, and storage in vector databases or knowledge graphs. For information about model integration and inference, see Model Integration and Inference. For retrieval strategies and reranking, see Retrieval Strategies and Reranking.
The RAG pipeline consists of five main layers: ingestion, chunking, embedding, storage, and retrieval. The StorageManager acts as a factory for creating storage connectors, while the KnowledgeService orchestrates the entire workflow.
The KnowledgeFactory creates Knowledge instances from various sources. Documents are loaded and transformed into Chunk objects with metadata.
Chunking Strategies
| Strategy | Description | Configuration |
|---|---|---|
| CHUNK_BY_SIZE | Fixed-size chunks with overlap | chunk_size, chunk_overlap |
| CHUNK_BY_SEPARATOR | Split by delimiters | Custom separators |
| CHUNK_BY_PAGE | Page-based splitting | Page boundaries |
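The CHUNK_BY_SIZE strategy can be sketched as a sliding window over the text. This is an illustrative simplification, not DB-GPT's actual splitter implementation:

```python
def chunk_by_size(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows, each sharing `chunk_overlap`
    characters with its predecessor (illustrative sketch)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step) if text[i:i + chunk_size]]
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both neighboring chunks.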
The EmbeddingAssembler orchestrates embedding generation and storage persistence. It supports both synchronous and asynchronous document loading.
The vector store architecture uses a two-level abstraction: IndexStoreConfig for configuration and IndexStoreBase for operations.
Core Classes
| Class | Purpose | Key Methods |
|---|---|---|
| IndexStoreConfig | Configuration container | create_store() |
| VectorStoreConfig | Vector-specific config | Extends IndexStoreConfig |
| IndexStoreBase | Storage operations interface | load_document(), similar_search_with_scores() |
| VectorStoreBase | Vector-specific operations | Extends IndexStoreBase |
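The two-level abstraction can be sketched as a config class that knows how to build its matching store. Method names follow the table above; everything else (the dummy classes, return types) is an assumption:

```python
from abc import ABC, abstractmethod

class IndexStoreBase(ABC):
    """Operations interface: persist chunks, run similarity search."""
    @abstractmethod
    def load_document(self, chunks: list) -> list[str]: ...
    @abstractmethod
    def similar_search_with_scores(self, query: str, top_k: int, score_threshold: float) -> list: ...

class IndexStoreConfig(ABC):
    """Configuration container that acts as a factory for its store."""
    @abstractmethod
    def create_store(self) -> IndexStoreBase: ...

# A concrete backend plugs in by extending the pair:
class DummyVectorStore(IndexStoreBase):
    def __init__(self):
        self._docs = []
    def load_document(self, chunks):
        start = len(self._docs)
        self._docs.extend(chunks)
        return [str(i) for i in range(start, len(self._docs))]
    def similar_search_with_scores(self, query, top_k, score_threshold):
        return self._docs[:top_k]

class DummyVectorStoreConfig(IndexStoreConfig):
    def create_store(self):
        return DummyVectorStore()
```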
MilvusStore uses the Milvus vector database with HNSW indexing and supports full-text search in versions >= 2.5.0.
Key Configuration:
- Fields: pk_id, vector, content, metadata
- Index type: HNSW (default)
- Similarity metric: COSINE

Full-Text Search Support:
- sparse_vector field with BM25 function
- is_support_full_text_search() checks version compatibility
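A version gate like is_support_full_text_search() reduces to a semantic-version comparison against 2.5.0. A minimal sketch (the real check reads the version from the Milvus server):

```python
def supports_full_text_search(server_version: str) -> bool:
    """Return True when the Milvus version is at least 2.5.0 (illustrative)."""
    parts = tuple(int(p) for p in server_version.lstrip("v").split(".")[:3])
    return parts >= (2, 5, 0)
```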
ChromaStore provides local persistence with cosine similarity and collection-based organization.
Key Features:
Collection Name Validation:
- Names must match the pattern ^[a-zA-Z0-9_][-a-zA-Z0-9_.]*$
- Names must not contain consecutive dots (..)
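The validation rules combine the regex with the consecutive-dot restriction. A minimal sketch (the real validator may enforce additional limits, such as name length):

```python
import re

# Pattern from the rules above: leading alphanumeric/underscore, then
# alphanumerics, underscores, hyphens, or dots.
_NAME_RE = re.compile(r"^[a-zA-Z0-9_][-a-zA-Z0-9_.]*$")

def is_valid_collection_name(name: str) -> bool:
    """Accept a name only if it matches the pattern and has no '..' run."""
    return bool(_NAME_RE.match(name)) and ".." not in name
```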
ElasticStore integrates with Elasticsearch for full-text search and vector similarity.
Configuration:
- Text field: context
- Vector field: dense_vector
- Similarity metric: COSINE
| Store | Type | Key Features |
|---|---|---|
| PGVectorStore | PostgreSQL extension | Langchain integration, SQL-based storage |
| WeaviateStore | Weaviate cloud/local | GraphQL queries, schema-based |
| OceanBaseStore | OceanBase database | HNSW index, JSON metadata, L2/cosine/inner product |
The BuiltinKnowledgeGraph extracts entities and relationships from documents using LLM-based triplet extraction.
Triplet Extraction:
The TripletExtractor uses LLM prompts to identify entities and relationships:
Input: "Alice works at OpenAI and lives in San Francisco."
Output: [(Alice, works_at, OpenAI), (Alice, lives_in, San Francisco)]
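Parsing the LLM's output into structured triplets is a small text-processing step. An illustrative parser for the "(head, relation, tail)" format shown above (DB-GPT's actual prompt and parsing may differ):

```python
import re

def parse_triplets(llm_output: str) -> list[tuple[str, str, str]]:
    """Extract (head, relation, tail) tuples from LLM output (sketch)."""
    triplets = []
    for head, rel, tail in re.findall(r"\(([^,()]+),([^,()]+),([^,()]+)\)", llm_output):
        triplets.append((head.strip(), rel.strip(), tail.strip()))
    return triplets
```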
Graph Storage Backends:
- The GraphStoreFactory creates the configured graph store backend
The CommunitySummaryKnowledgeGraph extends BuiltinKnowledgeGraph with hierarchical community detection and summarization.
Configuration Parameters:
| Parameter | Default | Description |
|---|---|---|
| kg_extract_top_k | 5 | Top K for extraction search |
| kg_extract_score_threshold | 0.3 | Score threshold for extraction |
| kg_community_top_k | 50 | Top K communities |
| kg_community_score_threshold | 0.3 | Community score threshold |
| kg_triplet_graph_enabled | True | Enable triplet graph search |
| kg_document_graph_enabled | True | Enable document graph search |
| kg_extraction_batch_size | 20 | Batch size for extraction |
| kg_community_summary_batch_size | 20 | Batch size for summaries |
The StorageManager provides a unified factory interface for creating storage connectors based on configuration.
Storage Type Resolution:
Configuration Loading:
The storage manager reads configuration from app_config.rag.storage:
- storage.vector: Vector store config
- storage.graph: Knowledge graph config
- storage.full_text: Full-text search config
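Factory-style type resolution can be sketched as a registry lookup keyed by storage type. The registry contents and error message here are illustrative; DB-GPT resolves connectors from its own config classes:

```python
# Illustrative registry mapping storage-type names to connector factories.
STORE_REGISTRY = {
    "chroma": lambda cfg: f"ChromaStore({cfg['persist_path']})",
    "milvus": lambda cfg: f"MilvusStore({cfg['uri']})",
    "elasticsearch": lambda cfg: f"ElasticStore({cfg['uri']})",
}

def create_store(storage_type: str, cfg: dict):
    """Resolve a connector factory by type name, case-insensitively."""
    factory = STORE_REGISTRY.get(storage_type.lower())
    if factory is None:
        raise ValueError(f"Unsupported storage type: {storage_type}")
    return factory(cfg)
```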
The VectorStoreConnector wraps storage implementations with connection pooling and batch operations.
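The pooling behavior can be sketched as a cache of connectors keyed by store type and collection, so repeated lookups reuse one connection. This is a simplified stand-in for the real connector class:

```python
class VectorStoreConnector:
    """Sketch: connectors cached in a pools dict keyed by
    (vector_store_type, collection_name)."""
    _pools: dict[tuple[str, str], "VectorStoreConnector"] = {}

    def __init__(self, store_type: str, collection: str):
        self.store_type, self.collection = store_type, collection

    @classmethod
    def from_pool(cls, store_type: str, collection: str) -> "VectorStoreConnector":
        key = (store_type, collection)
        if key not in cls._pools:
            cls._pools[key] = cls(store_type, collection)
        return cls._pools[key]
```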
Connection Pooling:
- Connectors are cached in a pools dictionary keyed by (vector_store_type, collection_name)

Batch Loading:
DB-GPT organizes knowledge using a three-level hierarchy: Space → Document → Chunk.
KnowledgeSpace Context:
The context field stores JSON configuration:
The sync_document() method orchestrates the complete ingestion pipeline.
Sync Status Enum:
| Status | Description |
|---|---|
| TODO | Pending synchronization |
| RUNNING | Currently processing |
| FINISHED | Successfully synced |
| FAILED | Sync error occurred |
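The statuses from the table form a simple state machine: TODO moves to RUNNING, which ends in FINISHED or FAILED. The transition helper below is illustrative, not DB-GPT's actual logic:

```python
from enum import Enum

class SyncStatus(Enum):
    TODO = "TODO"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"

def next_status(current: SyncStatus, ok: bool = True) -> SyncStatus:
    """Advance one step through the sync lifecycle (sketch)."""
    if current is SyncStatus.TODO:
        return SyncStatus.RUNNING
    if current is SyncStatus.RUNNING:
        return SyncStatus.FINISHED if ok else SyncStatus.FAILED
    return current  # FINISHED and FAILED are terminal
```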
Error Handling:
The EmbeddingAssembler creates retrievers that query the index store.
Retrieval Configuration:
The TimeWeightedEmbeddingRetriever combines semantic similarity with temporal decay.
Time Decay Formula:
combined_score = relevance_score * exp(-decay_rate * hours_passed)
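Written out as code, the formula downweights older documents exponentially: at hours_passed = 0 the semantic score passes through unchanged, and it decays toward zero as time passes.

```python
import math

def time_weighted_score(relevance_score: float, decay_rate: float, hours_passed: float) -> float:
    """The decay formula above: relevance * exp(-decay_rate * hours_passed)."""
    return relevance_score * math.exp(-decay_rate * hours_passed)
```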
External Storage Protocol:
The KnowledgeSpaceRetriever provides space-level retrieval with reranking support.
Automatic Top-K Adjustment:
Full-text search provides keyword-based retrieval alongside semantic search.
Configuration Example (dbgpt-bm25-rag.toml):
ElasticDocumentStore:
Milvus Full-Text Search (v2.5.0+):
Storage configuration is defined in .toml files under [rag.storage].
Vector Store Configuration:
Knowledge Graph Configuration:
RAG Pipeline Parameters:
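An illustrative shape for such a .toml file, covering the three storage sections named earlier. The section names follow [rag.storage] from the text, but the individual keys (type, persist_path, uri) are assumptions; consult the shipped example configs for exact names:

```toml
# Illustrative sketch only — key names are assumptions, not verified config.
[rag.storage.vector]
type = "chroma"
persist_path = "pilot/data/kb"

[rag.storage.graph]
type = "tugraph"

[rag.storage.full_text]
type = "elasticsearch"
uri = "localhost:9200"
```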
Metadata filters enable fine-grained document filtering during retrieval.
Filter Operations:
| Operator | Symbol | Description |
|---|---|---|
| EQ | == | Equal |
| NE | != | Not equal |
| GT | > | Greater than |
| LT | < | Less than |
| GTE | >= | Greater than or equal |
| LTE | <= | Less than or equal |
| IN | in | In list |
| NIN | not in | Not in list |
Example Usage:
Store-Specific Filter Conversion:
- Expression-based stores render filters as strings such as field == 'value'
- Operator-based stores map conditions to $eq, $gt, etc. operators
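The two conversion styles can be sketched side by side: one renders an expression string, the other builds a Chroma-style operator dict. The operator tables below mirror the filter table above; the function names are illustrative:

```python
# Symbol tables mirroring the filter-operator table (comparison ops only).
OPS = {"EQ": "==", "NE": "!=", "GT": ">", "LT": "<", "GTE": ">=", "LTE": "<="}
CHROMA_OPS = {"EQ": "$eq", "NE": "$ne", "GT": "$gt", "LT": "$lt", "GTE": "$gte", "LTE": "$lte"}

def to_expression(field: str, op: str, value) -> str:
    """Render a filter as an expression string, quoting string values."""
    rendered = f"'{value}'" if isinstance(value, str) else str(value)
    return f"{field} {OPS[op]} {rendered}"

def to_operator_dict(field: str, op: str, value) -> dict:
    """Render a filter as a Chroma-style {field: {$op: value}} dict."""
    return {field: {CHROMA_OPS[op]: value}}
```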