This document describes the retrieval strategies and reranking mechanisms available in DB-GPT's RAG system. It covers semantic search, full-text search (BM25), graph-based retrieval, time-weighted retrieval, hybrid retrieval approaches, and reranking techniques used to optimize retrieval quality.
For information about the overall RAG pipeline and knowledge management, see RAG Pipeline and Knowledge Management. For details on vector stores and embedding systems, see Vector Stores and Embedding Systems. For knowledge graph integration, see Knowledge Graphs and GraphRAG.
DB-GPT implements multiple retrieval strategies that can be used independently or combined through hybrid retrieval. Each strategy has distinct characteristics suited to different use cases.
Sources: packages/dbgpt-core/src/dbgpt/storage/base.py214-291 packages/dbgpt-app/src/dbgpt_app/knowledge/service.py323-394
Semantic search uses embedding similarity to find relevant documents based on semantic meaning rather than keyword matching. This is the most common retrieval strategy in DB-GPT.
All vector stores implement the similar_search() and similar_search_with_scores() methods from the IndexStoreBase interface:
| Vector Store | Similarity Metric | Index Type | Score Range |
|---|---|---|---|
| Milvus | COSINE (default) | HNSW | [0, 1] |
| Chroma | Cosine | HNSW | [0, 1] |
| Elasticsearch | Cosine | N/A | [0, ∞) |
| PGVector | Cosine | HNSW | [0, 1] |
| OceanBase | L2, Cosine, Inner Product | HNSW | Variable |
| Weaviate | Cosine | HNSW | [0, 1] |
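To make the idea of embedding similarity concrete, here is a minimal, self-contained sketch of a cosine-similarity top-k search over an in-memory index. The function names and the toy two-dimensional vectors are hypothetical and only illustrate the ranking idea; this is not the code of any DB-GPT vector store, which delegates the search to the backend's own index.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_similar_search_with_scores(
    query_vec: List[float],
    index: Dict[str, List[float]],
    topk: int,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the best topk."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topk]


# Toy index: in practice the vectors come from an embedding model.
index = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
results = toy_similar_search_with_scores([1.0, 0.0], index, topk=2)
```

A real store replaces the exhaustive loop with an approximate index such as HNSW, trading a little recall for much lower latency.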
The Milvus implementation in MilvusStore._search() performs the following steps:
1. self.embedding.embed_query(query) (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py612)
2. col.search() with COSINE metric (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py619-630)
3. Chunk objects with metadata and scores (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py631-647)

All vector stores support score threshold filtering to remove low-quality results:
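A score threshold filter can be sketched in a few lines. The helper below is hypothetical and assumes that higher scores mean better matches and that the threshold is inclusive; the actual comparison in each store may differ.

```python
from typing import List, Tuple


def filter_by_score(
    scored_chunks: List[Tuple[str, float]],
    score_threshold: float,
) -> List[Tuple[str, float]]:
    """Keep only results whose similarity score meets the threshold."""
    return [
        (text, score) for text, score in scored_chunks if score >= score_threshold
    ]


candidates = [
    ("chunk about BM25", 0.82),
    ("loosely related chunk", 0.41),
    ("off-topic chunk", 0.07),
]
kept = filter_by_score(candidates, score_threshold=0.3)
```

With a threshold of 0.0 nothing is dropped, so in this sketch the threshold only has an effect once it is raised above the lowest score returned.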
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py479-577 packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/chroma_store.py182-219 packages/dbgpt-core/src/dbgpt/storage/vector_store/base.py133-161
Full-text search uses the BM25 algorithm for keyword-based retrieval, particularly effective for exact term matching and queries with specific terminology.
Milvus 2.5.0+ supports full-text search using built-in BM25 functions:
The implementation creates a BM25 function during collection creation:
- sparse_vector field of type SPARSE_FLOAT_VECTOR (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py363)
- text_bm25_emb function mapping text to sparse vectors (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py369-375)

Elasticsearch provides native BM25 support through match queries:
The search returns results with BM25 relevance scores computed by Elasticsearch's scoring algorithm.
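For readers unfamiliar with BM25, the following is a minimal pure-Python sketch of the ranking function itself. It is not how Milvus or Elasticsearch compute scores internally: it assumes whitespace tokenization, the commonly used parameters k1 = 1.2 and b = 0.75, and the non-negative IDF variant, and the function name is hypothetical.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: str, docs: List[str], k1: float = 1.2, b: float = 0.75
) -> List[float]:
    """Return one BM25 score per document for a whitespace-tokenized query."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq: Counter = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            # Length normalization penalizes documents longer than average.
            norm = term_freq[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "milvus supports sparse vectors",
    "bm25 ranks documents by keyword overlap",
    "graph retrieval explores related entities",
]
scores = bm25_scores("bm25 keyword", docs)
```

Only the second document contains the query terms, so it is the only one with a non-zero score. This exact-term behavior is what makes BM25 a useful complement to embedding similarity.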
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py775-821 packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/elastic_store.py399-441 configs/dbgpt-bm25-rag.toml32-36
Graph-based retrieval leverages knowledge graphs to find semantically related information through graph traversal and keyword extraction.
The BuiltinKnowledgeGraph implements graph-based retrieval through:
- KeywordExtractor (packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py257)
- explore_trigraph() (packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py258-260)
- Chunk with graph context (packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py288)

The CommunitySummaryKnowledgeGraph extends basic graph retrieval with:
The GraphRetriever orchestrates these strategies based on configuration parameters.
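The basic keyword-then-traverse flow can be illustrated on a toy triplet list. This sketch assumes the keywords have already been extracted and treats the graph as plain (subject, relation, object) tuples; the function name, the depth parameter, and the data are all hypothetical and do not reproduce the graph store API.

```python
from typing import List, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)


def explore_from_keywords(
    triplets: List[Triplet], keywords: List[str], depth: int = 1
) -> List[Triplet]:
    """Collect triplets reachable within `depth` hops of any keyword entity."""
    frontier: Set[str] = {keyword.lower() for keyword in keywords}
    visited: Set[str] = set()
    found: List[Triplet] = []
    for _ in range(depth):
        next_frontier: Set[str] = set()
        for triplet in triplets:
            subj, _rel, obj = triplet
            if subj.lower() in frontier or obj.lower() in frontier:
                if triplet not in found:
                    found.append(triplet)
                next_frontier.update({subj.lower(), obj.lower()})
        visited |= frontier
        frontier = next_frontier - visited
        if not frontier:
            break
    return found


triplets = [
    ("DB-GPT", "uses", "Milvus"),
    ("Milvus", "supports", "BM25"),
    ("Chroma", "uses", "HNSW"),
]
one_hop = explore_from_keywords(triplets, ["db-gpt"], depth=1)
two_hops = explore_from_keywords(triplets, ["db-gpt"], depth=2)

# The matched triplets are flattened into text that can serve as chunk content.
graph_context = "\n".join(f"{s} {r} {o}" for s, r, o in two_hops)
```

Increasing the depth pulls in more distantly related facts, which raises recall at the cost of a longer and noisier context.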
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py245-289 packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/community_summary.py177-302
Time-weighted retrieval combines semantic similarity with temporal relevance, giving higher scores to more recently accessed or created documents.
The TimeWeightedEmbeddingRetriever applies exponential decay to similarity scores:
The combined score calculation:
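One common way to combine the two signals, used here as an assumption rather than a statement of the exact DB-GPT formula, is to add a recency term of the form (1 - decay_rate) raised to the number of hours since last access. The function name and the default decay rate below are hypothetical.

```python
from datetime import datetime, timedelta


def combined_score(
    similarity: float,
    last_accessed_at: datetime,
    now: datetime,
    decay_rate: float = 0.01,
) -> float:
    """Add an exponentially decaying recency bonus to a similarity score."""
    hours_passed = (now - last_accessed_at).total_seconds() / 3600.0
    recency = (1.0 - decay_rate) ** hours_passed
    return similarity + recency


now = datetime(2024, 1, 10, 12, 0, 0)
# A slightly less similar document that was accessed just now...
fresh = combined_score(0.6, last_accessed_at=now, now=now)
# ...versus a more similar document last accessed 200 hours ago.
stale = combined_score(0.7, last_accessed_at=now - timedelta(hours=200), now=now)
```

In this sketch the fresh document outranks the stale one despite its lower similarity. A decay rate close to 0 makes recency almost irrelevant, while a rate close to 1 makes the retriever forget documents within hours.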
The retriever maintains a memory_stream of documents with temporal metadata:
- buffer_idx in the stream (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py143)
- last_accessed_at timestamp updated on retrieval (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py140)
- created_at timestamp for initial decay calculation (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py142)
- DocumentStorage protocol for persistence (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py82-104)

Sources: packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py47-227
Hybrid retrieval combines multiple strategies to leverage their complementary strengths. DB-GPT supports flexible hybrid approaches through its modular architecture.
The recall_test() method in KnowledgeService demonstrates hybrid retrieval:
- top_k (20+) to ensure recall (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py336-350)
- KnowledgeSpaceRetriever queries different storage types (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py352-357)

The StorageManager routes queries to appropriate storage backends:
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py323-394 packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py39-88
Reranking refines initial retrieval results by applying a more sophisticated scoring model, typically using cross-encoder models that can capture query-document interactions better than bi-encoder embeddings.
The RerankEmbeddingsRanker is instantiated and applied in the knowledge service:
The reranker:
Reranking is configured in the application settings:
The system checks for reranker availability and adjusts initial retrieval top_k:
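The retrieve-wide-then-rerank pattern can be sketched as follows. Everything here is hypothetical: the retriever is a plain callable, the reranker is a toy word-overlap scorer standing in for a cross-encoder, and the widened value of 20 is taken from the "top_k (20+)" note above rather than from the service code.

```python
from typing import Callable, List, Optional

Retriever = Callable[[str, int], List[str]]
Scorer = Callable[[str, str], float]


def retrieve_then_rerank(
    query: str,
    retrieve: Retriever,
    top_k: int,
    reranker: Optional[Scorer] = None,
    widened_top_k: int = 20,
) -> List[str]:
    """Fetch extra candidates when a reranker exists, then keep the best top_k."""
    initial_k = max(top_k, widened_top_k) if reranker is not None else top_k
    candidates = retrieve(query, initial_k)
    if reranker is None:
        return candidates[:top_k]
    reranked = sorted(
        candidates, key=lambda text: reranker(query, text), reverse=True
    )
    return reranked[:top_k]


corpus = [
    "time-weighted retrieval favors recent documents",
    "bm25 is a keyword ranking function",
    "rerankers score query and document together",
    "hnsw is a graph index",
]


def toy_retrieve(query: str, k: int) -> List[str]:
    # Stand-in for a first-stage retriever: returns the first k documents.
    return corpus[:k]


def word_overlap(query: str, text: str) -> float:
    # Stand-in for a cross-encoder: counts shared words.
    return float(len(set(query.split()) & set(text.split())))


query = "keyword ranking with bm25"
without_rerank = retrieve_then_rerank(query, toy_retrieve, top_k=1)
with_rerank = retrieve_then_rerank(query, toy_retrieve, top_k=1, reranker=word_overlap)
```

Widening the first stage matters because a reranker can only promote documents that were retrieved in the first place.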
Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py344-372 configs/dbgpt-bm25-rag.toml25
Different retrieval strategies are selected based on storage type and configuration.
Key configuration parameters affecting retrieval:
| Parameter | Location | Purpose | Default |
|---|---|---|---|
| similarity_top_k | [rag] | Initial retrieval count | 5 |
| similarity_score_threshold | [rag] | Minimum similarity score | 0.0 |
| rerank_top_k | [rag] | Final count after reranking | 3 |
| max_chunks_once_load | [rag] | Batch size for loading | 10 |
| max_threads | [rag] | Parallel loading threads | 1 |
The VectorStoreConnector provides a unified interface for all strategies:
Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/connector.py168-227 packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py39-88 configs/dbgpt-bm25-rag.toml18-26
All retrieval strategies support metadata filtering to narrow search scope based on document properties.
DB-GPT supports standard filter operators through the MetadataFilters abstraction:
| Operator | Symbol | Description | Example |
|---|---|---|---|
| EQ | == | Equal to | source == "paper.pdf" |
| NE | != | Not equal to | status != "draft" |
| GT | > | Greater than | page > 10 |
| LT | < | Less than | score < 0.5 |
| GTE | >= | Greater or equal | date >= "2024-01-01" |
| LTE | <= | Less or equal | priority <= 3 |
| IN | in | In list | category in ["doc", "paper"] |
| NIN | not in | Not in list | status not in ["deleted"] |
Each storage backend converts MetadataFilters to its native format: a Milvus expression, a Chroma where clause, or an OceanBase SQL condition.
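As a rough sketch of what such conversions can look like, the hypothetical helpers below render a single filter condition in each of the three styles. ToyFilter is an illustrative stand-in, not the MetadataFilters class, and the `%s` placeholder style for SQL is an assumption about the database driver. The Chroma operator names (`$eq`, `$in`, and so on) follow Chroma's documented where-clause syntax.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class ToyFilter:
    """Hypothetical stand-in for a single metadata filter condition."""

    key: str
    operator: str  # a symbol from the operator table, e.g. "==" or "in"
    value: Any


CHROMA_OPERATORS = {
    "==": "$eq", "!=": "$ne", ">": "$gt", "<": "$lt",
    ">=": "$gte", "<=": "$lte", "in": "$in", "not in": "$nin",
}
SQL_OPERATORS = {
    "==": "=", "!=": "<>", ">": ">", "<": "<",
    ">=": ">=", "<=": "<=", "in": "IN", "not in": "NOT IN",
}


def to_milvus_expr(f: ToyFilter) -> str:
    # json.dumps double-quotes strings and renders lists in bracket form.
    return f"{f.key} {f.operator} {json.dumps(f.value)}"


def to_chroma_where(f: ToyFilter) -> Dict[str, Dict[str, Any]]:
    return {f.key: {CHROMA_OPERATORS[f.operator]: f.value}}


def to_sql_condition(f: ToyFilter) -> Tuple[str, List[Any]]:
    # Values travel as bound parameters; the key is assumed to be trusted.
    op = SQL_OPERATORS[f.operator]
    if f.operator in ("in", "not in"):
        placeholders = ", ".join(["%s"] * len(f.value))
        return f"{f.key} {op} ({placeholders})", list(f.value)
    return f"{f.key} {op} %s", [f.value]


source_filter = ToyFilter("source", "==", "paper.pdf")
category_filter = ToyFilter("category", "in", ["doc", "paper"])

milvus_expr = to_milvus_expr(source_filter)
chroma_where = to_chroma_where(source_filter)
sql_condition = to_sql_condition(category_filter)
```

Keeping the operator vocabulary abstract and translating it at the storage boundary is what lets the same filter work unchanged across backends.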
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py715-747 packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/chroma_store.py340-374 packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/oceanbase_store.py499-536
| Strategy | Latency | Precision | Recall | Use Case |
|---|---|---|---|---|
| Semantic Search | Medium | High | Medium | General semantic matching |
| BM25 Full-Text | Low | Medium | High | Keyword-specific queries |
| Graph Retrieval | High | High | Medium | Relationship-based queries |
| Time-Weighted | Medium | Medium | Medium | Recency-sensitive retrieval |
| Hybrid + Rerank | High | Very High | High | Quality-critical applications |
HNSW index parameters affect retrieval performance:
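For orientation, the dictionaries below show the usual shape of HNSW build and search parameters as Milvus accepts them. The numeric values are illustrative assumptions, not settings confirmed from the DB-GPT source, and `ef_is_valid` is a hypothetical helper.

```python
# Build-time parameters: M bounds the number of links per graph node and
# efConstruction the candidate list size during index construction.
# Larger values improve recall but cost memory and build time.
index_params = {
    "metric_type": "COSINE",
    "index_type": "HNSW",
    "params": {"M": 8, "efConstruction": 64},
}

# Search-time parameter: ef is the candidate list size during a query.
search_params = {"metric_type": "COSINE", "params": {"ef": 64}}


def ef_is_valid(params: dict, top_k: int) -> bool:
    """Milvus expects ef to be at least the number of requested results."""
    return params["params"]["ef"] >= top_k
```

Raising ef improves recall at query time without rebuilding the index, so it is usually the first parameter to tune.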
For large-scale retrieval, use batch operations with configured limits:
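A minimal sketch of such batching, assuming the two limits are the max_chunks_once_load and max_threads parameters from the configuration table. The `load_in_batches` helper and the `fake_load` callback are hypothetical and stand in for whatever call actually writes chunks to the store.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def load_in_batches(
    chunks: Sequence[str],
    load_batch: Callable[[Sequence[str]], List[str]],
    max_chunks_once_load: int = 10,
    max_threads: int = 1,
) -> List[str]:
    """Split chunks into fixed-size batches and load them on a thread pool."""
    batches = [
        chunks[i : i + max_chunks_once_load]
        for i in range(0, len(chunks), max_chunks_once_load)
    ]
    ids: List[str] = []
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map yields results in submission order, so ids stay aligned.
        for batch_ids in pool.map(load_batch, batches):
            ids.extend(batch_ids)
    return ids


batch_sizes: List[int] = []


def fake_load(batch: Sequence[str]) -> List[str]:
    # Stand-in for a store call: records the batch size and returns fake ids.
    batch_sizes.append(len(batch))
    return [chunk.upper() for chunk in batch]


chunks = [f"chunk-{i}" for i in range(25)]
ids = load_in_batches(chunks, fake_load, max_chunks_once_load=10, max_threads=2)
```

Smaller batches keep individual requests short and bound memory use, while more threads help only when the storage backend can absorb concurrent writes.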
Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py256-272 packages/dbgpt-core/src/dbgpt/storage/base.py115-157