Retrieval Strategies and Reranking

This document describes the retrieval strategies and reranking mechanisms available in DB-GPT's RAG system. It covers semantic search, full-text search (BM25), graph-based retrieval, time-weighted retrieval, hybrid retrieval approaches, and reranking techniques used to optimize retrieval quality.

For information about the overall RAG pipeline and knowledge management, see RAG Pipeline and Knowledge Management. For details on vector stores and embedding systems, see Vector Stores and Embedding Systems. For knowledge graph integration, see Knowledge Graphs and GraphRAG.

Overview of Retrieval Strategies

DB-GPT implements multiple retrieval strategies that can be used independently or combined through hybrid retrieval. Each strategy is suited to different use cases, as described in the sections below.

Sources: packages/dbgpt-core/src/dbgpt/storage/base.py:214-291, packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:323-394

Semantic Search (Embedding Retrieval)

Semantic search uses embedding similarity to find relevant documents based on semantic meaning rather than keyword matching. This is the most common retrieval strategy in DB-GPT.

Implementation Across Vector Stores

All vector stores implement the similar_search() and similar_search_with_scores() methods from the IndexStoreBase interface:

Vector Store	Similarity Metric	Index Type	Score Range
Milvus	COSINE (default)	HNSW	[0, 1]
Chroma	Cosine	HNSW	[0, 1]
Elasticsearch	Cosine	N/A	[0, ∞)
PGVector	Cosine	HNSW	[0, 1]
OceanBase	L2, Cosine, Inner Product	HNSW	Variable
Weaviate	Cosine	HNSW	[0, 1]

Milvus Semantic Search

The Milvus implementation in MilvusStore._search() performs the following steps:

Query Embedding: Convert query text to vector using self.embedding.embed_query(query) (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:612)
Index Selection: Use HNSW index with configurable parameters (M=8, efConstruction=64 by default) (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:256-272)
Similarity Search: Execute col.search() with COSINE metric (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:619-630)
Result Parsing: Convert results to Chunk objects with metadata and scores (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:631-647)
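
The steps above can be sketched without Milvus at all. The following is a minimal in-memory illustration, not the MilvusStore code: it assumes the query has already been embedded, replaces the HNSW index with a brute-force scan, and uses a hypothetical Chunk dataclass as a stand-in for DB-GPT's own Chunk type.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for DB-GPT's Chunk: text, metadata, and a score."""
    content: str
    metadata: dict = field(default_factory=dict)
    score: float = 0.0


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, indexed, top_k):
    """Rank (vector, text, metadata) rows by cosine similarity to the query."""
    scored = [
        Chunk(content=text, metadata=meta, score=cosine(query_vec, vec))
        for vec, text, meta in indexed
    ]
    # Brute-force scan; a real store would consult its HNSW index instead.
    scored.sort(key=lambda c: c.score, reverse=True)
    return scored[:top_k]
```

An approximate index such as HNSW trades a small amount of recall for much lower latency than this exhaustive scan, which is why the index parameters matter at scale.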

Score Threshold Filtering

All vector stores support score threshold filtering to remove low-quality results:
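
As a rough sketch of the idea, threshold filtering amounts to dropping every result on the wrong side of a cutoff. The function name and the (text, score) tuple shape below are illustrative assumptions, not the signature used by the vector store classes.

```python
def filter_by_score(results, score_threshold, higher_is_better=True):
    """Keep (text, score) pairs that pass the threshold.

    higher_is_better is an assumption added for illustration: similarity
    scores pass when they are >= the threshold, distances when they are <=.
    """
    if higher_is_better:
        return [(t, s) for t, s in results if s >= score_threshold]
    return [(t, s) for t, s in results if s <= score_threshold]
```

Because score ranges differ between stores (see the table above), a threshold tuned for one backend is not necessarily meaningful for another.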

Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:479-577, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/chroma_store.py:182-219, packages/dbgpt-core/src/dbgpt/storage/vector_store/base.py:133-161

Full-Text Search (BM25)

Full-text search uses the BM25 algorithm for keyword-based retrieval, particularly effective for exact term matching and queries with specific terminology.
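
To make the scoring concrete, here is a minimal sketch of the textbook Okapi BM25 formula over pre-tokenized documents. It is not the scoring code of Milvus or Elasticsearch; the function name and the k1 and b defaults are commonly used values chosen here as assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms get a larger inverse document frequency weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is length-normalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that never mention a query term score zero, which is why BM25 is strong on exact terminology and weak on paraphrases.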

Milvus Full-Text Search

Milvus 2.5.0+ supports full-text search using built-in BM25 functions.

The implementation creates a BM25 function during collection creation:

Schema Definition: Add sparse_vector field of type SPARSE_FLOAT_VECTOR (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:363)
BM25 Function: Create text_bm25_emb function mapping text to sparse vectors (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:369-375)
Index Creation: Build AUTOINDEX with BM25 metric (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:384-390)
Search Execution: Query using sparse vector field (packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:780-787)

Elasticsearch Full-Text Search

Elasticsearch provides native BM25 support through match queries. The search returns results with BM25 relevance scores computed by Elasticsearch's scoring algorithm.
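
A match query is expressed as a small JSON body. The sketch below builds one as a Python dict; the helper name and the "content" field name are assumptions for illustration and may not match the field used in elastic_store.py.

```python
def build_match_query(query, field="content", size=5):
    """Build an Elasticsearch search body for a BM25-scored match query."""
    return {"query": {"match": {field: query}}, "size": size}
```

Each hit in the response carries a _score value, which is the BM25 relevance score referred to above.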

Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:775-821, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/elastic_store.py:399-441, configs/dbgpt-bm25-rag.toml:32-36

Graph-Based Retrieval

Graph-based retrieval leverages knowledge graphs to find semantically related information through graph traversal and keyword extraction.

Keyword Extraction and Graph Exploration

The BuiltinKnowledgeGraph implements graph-based retrieval through:

Keyword Extraction: Extract keywords from query using KeywordExtractor (packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py:257)
Graph Exploration: Explore 3-hop neighborhood using explore_trigraph() (packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py:258-260)
Subgraph Formatting: Format entities and relationships as structured text (packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py:267-287)
Context Assembly: Return as single Chunk with graph context (packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py:288)
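
The exploration and formatting steps can be illustrated with a plain breadth-first traversal. This is a sketch under stated assumptions, not the explore_trigraph() implementation: the adjacency-dict graph, the explore and format_subgraph names, and the arrow-style output format are all hypothetical.

```python
from collections import deque


def explore(graph, keywords, max_hops=3):
    """Collect triplets reachable within max_hops of any keyword entity.

    graph maps an entity to a list of (relation, neighbor) edges.
    """
    seeds = [k for k in keywords if k in graph]
    seen = set(seeds)
    queue = deque((k, 0) for k in seeds)
    triplets = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbor in graph.get(node, []):
            triplets.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return triplets


def format_subgraph(triplets):
    """Render triplets as one block of structured text for the LLM context."""
    return "\n".join(f"({s})-[{r}]->({o})" for s, r, o in triplets)
```

The formatted string would then be wrapped in a single chunk, which matches the final step in the list above.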

Community Summary Knowledge Graph Retrieval

The CommunitySummaryKnowledgeGraph extends basic graph retrieval with:

Triplet Graph Search: Semantic search over entity-relation triplets
Document Graph Search: Graph structure of documents and chunks
Community Summaries: Pre-computed summaries of graph communities
Text Search: Optional full-text search over graph entities

The GraphRetriever orchestrates these strategies based on configuration parameters.

Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/knowledge_graph.py:245-289, packages/dbgpt-ext/src/dbgpt_ext/storage/knowledge_graph/community_summary.py:177-302

Time-Weighted Retrieval

Time-weighted retrieval combines semantic similarity with temporal relevance, giving higher scores to more recently accessed or created documents.

Time Decay Formula

The TimeWeightedEmbeddingRetriever applies exponential decay to similarity scores:

The combined score calculation:
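
The original formula is not reproduced on this page, so the following is only a sketch of one common time-weighted formulation: a recency term (1 - decay_rate) raised to the number of hours since last access, added to the similarity score. Whether time_weighted.py combines the terms in exactly this way is an assumption, as are the function name and the default decay_rate.

```python
from datetime import datetime, timedelta


def combined_score(similarity, last_accessed_at, now, decay_rate=0.01):
    """Add an exponentially decayed recency term to a similarity score.

    ASSUMPTION: the additive form mirrors a common time-weighted retriever
    formulation; the actual DB-GPT code may differ in detail.
    """
    hours_passed = (now - last_accessed_at).total_seconds() / 3600
    recency = (1.0 - decay_rate) ** hours_passed
    return recency + similarity
```

With a decay_rate close to 0 the recency term barely changes and ranking is driven by similarity; with a decay_rate close to 1 anything not accessed very recently loses almost all of its recency bonus.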

Memory Stream Management

The retriever maintains a memory_stream of documents with temporal metadata:

Buffer Index: Each document gets a unique buffer_idx in the stream (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py:143)
Access Tracking: last_accessed_at timestamp updated on retrieval (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py:140)
Creation Time: created_at timestamp for initial decay calculation (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py:142)
External Storage: Optional DocumentStorage protocol for persistence (packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py:82-104)

Sources: packages/dbgpt-core/src/dbgpt/rag/retriever/time_weighted.py:47-227

Hybrid Retrieval

Hybrid retrieval combines multiple strategies to leverage their complementary strengths. DB-GPT supports flexible hybrid approaches through its modular architecture.

Implementation in KnowledgeSpaceRetriever

The recall_test() method in KnowledgeService demonstrates hybrid retrieval:

Initial Retrieval: Retrieve with larger top_k (20+) to ensure recall (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:336-350)
Multiple Retrievers: KnowledgeSpaceRetriever queries different storage types (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:352-357)
Score Threshold: Apply initial threshold for quality filtering (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:336-341)
Reranking: Apply reranker if configured (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:367-372)
Final Threshold: Apply user-specified threshold (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:374-378)
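
The flow in the list above can be approximated as a short pipeline. This sketch is not the recall_test() code: retrieve and rerank are hypothetical callables that return (text, score) pairs, and candidate_k=20 simply mirrors the "20+" figure quoted above.

```python
def recall(query, retrieve, top_k, score_threshold, rerank=None,
           candidate_k=20, initial_threshold=0.0):
    """Retrieve, optionally rerank, then apply the final score threshold.

    retrieve(query, k) and rerank(query, candidates) are hypothetical
    callables that return lists of (text, score) pairs.
    """
    # Over-fetch when a reranker is available so it has candidates to reorder.
    k = max(top_k, candidate_k) if rerank else top_k
    candidates = [c for c in retrieve(query, k) if c[1] >= initial_threshold]
    if rerank:
        candidates = rerank(query, candidates)
    # The user-specified threshold applies to the final (possibly reranked) scores.
    kept = [c for c in candidates if c[1] >= score_threshold]
    return kept[:top_k]
```

Over-fetching matters because a reranker can only promote documents that survived the first stage; a candidate dropped before reranking cannot be recovered later.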

Storage Type Detection

The StorageManager routes queries to the appropriate storage backend.

Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:323-394, packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py:39-88

Reranking

Reranking refines initial retrieval results by applying a more sophisticated scoring model, typically using cross-encoder models that can capture query-document interactions better than bi-encoder embeddings.

Reranking Implementation

The RerankEmbeddingsRanker is instantiated and applied in the knowledge service.

The reranker:

Takes Initial Results: Accepts candidates with their retrieval scores (packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:367-372)
Computes Rerank Scores: Uses cross-encoder model to score query-chunk pairs
Reorders Results: Sorts by rerank scores rather than initial retrieval scores
Returns Top K: Returns specified number of best-ranked results
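
These four steps reduce to re-scoring and re-sorting. The sketch below is not the RerankEmbeddingsRanker API: score_pair is a hypothetical stand-in for a cross-encoder that scores one query-chunk pair at a time.

```python
def rerank(query, candidates, score_pair, top_k):
    """Re-score (text, retrieval_score) candidates and return the best top_k.

    score_pair(query, text) is a hypothetical stand-in for a cross-encoder.
    """
    # The original retrieval score is discarded in favor of the rerank score.
    rescored = [(text, score_pair(query, text)) for text, _ in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_k]
```

Because every candidate is scored jointly with the query, cost grows linearly with the number of candidates, which is one reason reranking is applied to a bounded candidate set rather than to the whole collection.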

Configuration

Reranking is configured in the application settings. The system checks whether a reranker is available and adjusts the initial retrieval top_k accordingly.

Sources: packages/dbgpt-app/src/dbgpt_app/knowledge/service.py:344-372, configs/dbgpt-bm25-rag.toml:25

Retrieval Strategy Selection and Configuration

Different retrieval strategies are selected based on storage type and configuration.

Configuration Parameters

Key configuration parameters affecting retrieval:

Parameter	Location	Purpose	Default
`similarity_top_k`	`[rag]`	Initial retrieval count	5
`similarity_score_threshold`	`[rag]`	Minimum similarity score	0.0
`rerank_top_k`	`[rag]`	Final count after reranking	3
`max_chunks_once_load`	`[rag]`	Batch size for loading	10
`max_threads`	`[rag]`	Parallel loading threads	1
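
Written out as a config fragment, the defaults in the table would look roughly like this. The section and key names are taken from the table; the exact layout of configs/dbgpt-bm25-rag.toml is not reproduced here and may contain additional keys.

```toml
[rag]
similarity_top_k = 5               # initial retrieval count
similarity_score_threshold = 0.0   # minimum similarity score
rerank_top_k = 3                   # final count after reranking
max_chunks_once_load = 10          # batch size for loading
max_threads = 1                    # parallel loading threads
```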

Runtime Strategy Selection

The VectorStoreConnector provides a unified interface for all strategies.

Sources: packages/dbgpt-serve/src/dbgpt_serve/rag/connector.py:168-227, packages/dbgpt-serve/src/dbgpt_serve/rag/storage_manager.py:39-88, configs/dbgpt-bm25-rag.toml:18-26

Metadata Filtering

All retrieval strategies support metadata filtering to narrow search scope based on document properties.

Filter Operations

DB-GPT supports standard filter operators through the MetadataFilters abstraction:

Operator	Symbol	Description	Example
`EQ`	`==`	Equal to	`source == "paper.pdf"`
`NE`	`!=`	Not equal to	`status != "draft"`
`GT`	`>`	Greater than	`page > 10`
`LT`	`<`	Less than	`score < 0.5`
`GTE`	`>=`	Greater or equal	`date >= "2024-01-01"`
`LTE`	`<=`	Less or equal	`priority <= 3`
`IN`	`in`	In list	`category in ["doc", "paper"]`
`NIN`	`not in`	Not in list	`status not in ["deleted"]`

Storage-Specific Filter Conversion

Each storage backend converts MetadataFilters to its native format:

Milvus Expression:

Chroma Where Clause:

OceanBase SQL:
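
The original per-backend snippets are not reproduced on this page. As a hedged illustration of what such conversions involve, the sketch below turns a list of hypothetical (key, operator, value) tuples into a Milvus boolean expression, a Chroma where clause, and a SQL-style WHERE fragment. The function names and the tuple representation are assumptions and do not reflect the MetadataFilters class or the store implementations; in particular, the real stores may address metadata through a JSON column rather than plain field names.

```python
import json

MILVUS_OPS = {"EQ": "==", "NE": "!=", "GT": ">", "LT": "<", "GTE": ">=",
              "LTE": "<=", "IN": "in", "NIN": "not in"}
CHROMA_OPS = {"EQ": "$eq", "NE": "$ne", "GT": "$gt", "LT": "$lt",
              "GTE": "$gte", "LTE": "$lte", "IN": "$in", "NIN": "$nin"}
SQL_OPS = {**MILVUS_OPS, "EQ": "=", "NE": "<>", "IN": "IN", "NIN": "NOT IN"}


def to_milvus_expr(filters, condition="and"):
    """Join (key, op, value) filters into a Milvus boolean expression."""
    parts = [f"{k} {MILVUS_OPS[op]} {json.dumps(v)}" for k, op, v in filters]
    return f" {condition} ".join(parts)


def to_chroma_where(filters, condition="and"):
    """Build a Chroma where dict; a single clause needs no $and/$or wrapper."""
    clauses = [{k: {CHROMA_OPS[op]: v}} for k, op, v in filters]
    if not clauses:
        return {}
    return clauses[0] if len(clauses) == 1 else {f"${condition}": clauses}


def to_sql_where(filters, condition="AND"):
    """Build a SQL WHERE fragment (illustrative; prefer bound parameters)."""
    def literal(value):
        if isinstance(value, list):
            return "(" + ", ".join(literal(v) for v in value) + ")"
        if isinstance(value, str):
            return "'" + value.replace("'", "''") + "'"
        return str(value)
    parts = [f"{k} {SQL_OPS[op]} {literal(v)}" for k, op, v in filters]
    return f" {condition} ".join(parts)
```

Keeping one backend-neutral filter representation and converting at the storage boundary is what lets the same query code run against different stores.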

Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:715-747, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/chroma_store.py:340-374, packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/oceanbase_store.py:499-536

Performance Considerations

Retrieval Strategy Performance

Strategy	Latency	Precision	Recall	Use Case
Semantic Search	Medium	High	Medium	General semantic matching
BM25 Full-Text	Low	Medium	High	Keyword-specific queries
Graph Retrieval	High	High	Medium	Relationship-based queries
Time-Weighted	Medium	Medium	Medium	Recency-sensitive retrieval
Hybrid + Rerank	High	Very High	High	Quality-critical applications

Index Configuration

HNSW index parameters affect retrieval performance:

M: Number of bi-directional links per node (higher = better recall, more memory)
efConstruction: Size of dynamic candidate list during index building
efSearch: Size of dynamic candidate list during search (configurable at query time)
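
For orientation, these parameters are typically passed to Milvus as plain dictionaries. The sketch below uses the M and efConstruction defaults quoted earlier on this page; the search-time value of 64 is an illustrative assumption, not a DB-GPT default.

```python
# Index-build parameters; M and efConstruction use the defaults quoted earlier.
index_params = {
    "index_type": "HNSW",
    "metric_type": "COSINE",
    "params": {"M": 8, "efConstruction": 64},
}

# Query-time parameters; Milvus names the search-time candidate list "ef".
# The value 64 is an assumption for illustration and must be at least top_k.
search_params = {"metric_type": "COSINE", "params": {"ef": 64}}
```

Raising ef improves recall at the cost of query latency, while M and efConstruction are fixed when the index is built and require a rebuild to change.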

Batch Processing

For large-scale retrieval, use batch operations with configured limits:
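
The limits in question appear to be max_chunks_once_load and max_threads from the configuration table, which govern how chunks are loaded into a store. A minimal sketch of that batching pattern follows; load_in_batches and load_batch are hypothetical names, and this is not the code in base.py.

```python
from concurrent.futures import ThreadPoolExecutor


def load_in_batches(chunks, load_batch, max_chunks_once_load=10, max_threads=1):
    """Split chunks into fixed-size batches and load them on a thread pool.

    load_batch is a hypothetical callable that stores one batch and returns
    the IDs it assigned.
    """
    batches = [
        chunks[i:i + max_chunks_once_load]
        for i in range(0, len(chunks), max_chunks_once_load)
    ]
    ids = []
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() yields results in submission order, so IDs stay aligned.
        for batch_ids in pool.map(load_batch, batches):
            ids.extend(batch_ids)
    return ids
```

Smaller batches bound the memory and request size per call, while more threads help mainly when the embedding or storage backend is I/O-bound.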

Sources: packages/dbgpt-ext/src/dbgpt_ext/storage/vector_store/milvus_store.py:256-272, packages/dbgpt-core/src/dbgpt/storage/base.py:115-157
