This document details the adapter pattern implementation for integrating external LLM APIs into DB-GPT's Service-oriented Multi-model Management Framework (SMMF). It covers the proxy adapter architecture, supported commercial LLM providers, authentication mechanisms, and how proxy models integrate with the unified model registry.
For information about local model deployment strategies (HuggingFace, vLLM, LLAMA.cpp), see Model Configuration and Deployment. For details about model workers and inference backends, see Model Workers and Inference Backends. For the broader SMMF architecture context, see Service-oriented Multi-model Management Framework.
DB-GPT employs the adapter pattern to provide a unified interface for both locally deployed models and remote API-based models. This design allows applications to switch between different model backends transparently without changing application code.
Sources: docs/docs/modules/smmf.md1-147
The adapter layer abstracts differences between local inference frameworks and remote API providers, presenting a consistent interface to the application layer. This enables applications to switch between local and remote backends, and between API providers, without changing application code.
Proxy models integrate into DB-GPT's SMMF architecture through a dedicated proxy adapter layer that translates between DB-GPT's internal model interface and external API provider protocols.
Sources: docs/docs/modules/smmf.md10-23 docs/docs/modules/smmf.md41-42
| Component | Responsibility | Notes |
|---|---|---|
| ProxyModelFactory | Instantiates proxy adapters based on provider type | Factory pattern implementation |
| Authentication Manager | Manages API keys and authentication tokens | Supports multiple auth schemes |
| Request Handler | Translates internal requests to provider-specific formats | Protocol adaptation |
| Response Parser | Normalizes provider responses to unified format | Ensures consistency |
| Retry Logic | Handles transient failures and rate limiting | Exponential backoff |
| Proxy Model Worker | Manages lifecycle of proxy model instances | Registered with Model Registry |
DB-GPT supports a comprehensive set of commercial and open-source API providers through dedicated proxy adapters.
| Provider | API Type | Models Supported | Authentication | Status |
|---|---|---|---|---|
| OpenAI | REST API | GPT-3.5, GPT-4, GPT-4 Turbo | API Key | Stable |
| Azure OpenAI | REST API | GPT-3.5, GPT-4 (Azure-hosted) | API Key + Endpoint | Stable |
| DeepSeek | REST API | DeepSeek-V2, DeepSeek-Coder | API Key | Stable |
| Qwen (Alibaba Tongyi) | REST API | Qwen-7B, Qwen-14B, Qwen-Plus | API Key | Stable |
| Ollama | REST API | Llama2, Mistral, Custom Models | None (Local) | Stable |
| Baidu Wenxin | REST API | ERNIE-Bot, ERNIE-Bot-Turbo | API Key + Secret Key | Stable |
| Google Bard | REST API | Bard/Gemini | API Key | Stable |
| Zhipu AI (ChatGLM) | REST API | GLM-3, GLM-4 | API Key | Stable |
| Xunfei Xinghuo | REST API | Spark v1.5, v2.0 | App ID + API Key + Secret | Stable |
| Baichuan | REST API | Baichuan2-53B | API Key | Stable |
Sources: docs/docs/modules/smmf.md68-76 docs/docs/modules/smmf.md41-42
The OpenAI proxy adapter serves as the reference implementation and is compatible with any OpenAI-compatible API endpoint.
Configuration Example:
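A minimal TOML sketch for an OpenAI proxy model; the provider string, table path, and key names here are illustrative, so consult the DB-GPT configuration reference for the exact schema:

```toml
# Illustrative proxy-model entry for OpenAI (field names are assumptions)
[[models.llms]]
name = "gpt-4"
provider = "proxy/openai"
api_base = "https://api.openai.com/v1"
api_key = "${OPENAI_API_KEY}"
```

Because the adapter targets the OpenAI API schema, pointing api_base at any OpenAI-compatible server works the same way.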
Key features include compatibility with any endpoint that implements the OpenAI API schema, so self-hosted OpenAI-compatible servers can be used by changing only the base URL.
Sources: examples/rag/rag_embedding_api_example.py10-12 examples/rag/rag_embedding_api_example.py38-52
DeepSeek provides high-performance models optimized for coding and reasoning tasks.
Configuration Example:
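A minimal TOML sketch; DeepSeek exposes an OpenAI-compatible endpoint, and the provider string and key names below are illustrative:

```toml
# Illustrative proxy-model entry for DeepSeek (field names are assumptions)
[[models.llms]]
name = "deepseek-chat"
provider = "proxy/deepseek"
api_base = "https://api.deepseek.com/v1"
api_key = "${DEEPSEEK_API_KEY}"
```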
Sources: docs/docs/modules/smmf.md41-42
Alibaba's Qwen series models accessed through the DashScope platform.
Configuration Example:
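A minimal TOML sketch for Qwen via DashScope; the provider string and environment variable name are illustrative:

```toml
# Illustrative proxy-model entry for Qwen/Tongyi (field names are assumptions)
[[models.llms]]
name = "qwen-plus"
provider = "proxy/tongyi"
api_key = "${DASHSCOPE_API_KEY}"
```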
Sources: docs/docs/modules/smmf.md71
Ollama provides a local API server for running open-source models with a simple REST interface.
Configuration Example:
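A minimal TOML sketch; Ollama runs locally on port 11434 by default and needs no API key. The provider string and field names are illustrative:

```toml
# Illustrative proxy-model entry for a local Ollama server (no auth required)
[[models.llms]]
name = "llama2"
provider = "proxy/ollama"
api_base = "http://localhost:11434"
```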
Because Ollama runs as a local server, no API key is required; the adapter only needs the address of the local Ollama endpoint.
Sources: .mypy.ini117-118 docs/docs/modules/smmf.md41-42
Baidu's ERNIE Bot series accessed through Wenxin Workshop platform.
Configuration Example:
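A minimal TOML sketch; Wenxin requires both an API Key and a Secret Key, and all field names below are illustrative:

```toml
# Illustrative proxy-model entry for Baidu Wenxin (dual-key auth)
[[models.llms]]
name = "ERNIE-Bot"
provider = "proxy/wenxin"
api_key = "${WENXIN_API_KEY}"
api_secret = "${WENXIN_SECRET_KEY}"
```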
Authentication uses a dual-key flow: the API Key and Secret Key are first exchanged for a short-lived access token, which is then attached to subsequent model requests.
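Baidu's dual-key authentication exchanges the API Key and Secret Key for an access token before any model call. A minimal sketch of building that token request (the endpoint shown is Baidu's public OAuth token URL; token caching and refresh handling are omitted):

```python
from urllib.parse import urlencode

TOKEN_ENDPOINT = "https://aip.baidubce.com/oauth/2.0/token"

def build_token_request(api_key: str, secret_key: str) -> str:
    """Return the URL used to exchange the key pair for an access token."""
    params = {
        "grant_type": "client_credentials",
        "client_id": api_key,        # Baidu calls the API Key "client_id"
        "client_secret": secret_key, # and the Secret Key "client_secret"
    }
    return f"{TOKEN_ENDPOINT}?{urlencode(params)}"
```

The access token returned by this endpoint is then appended to model requests until it expires.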
Sources: docs/docs/modules/smmf.md73
Zhipu AI's GLM series models through their open platform.
Configuration Example:
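A minimal TOML sketch for Zhipu AI's GLM models; provider string and key names are illustrative:

```toml
# Illustrative proxy-model entry for Zhipu AI (field names are assumptions)
[[models.llms]]
name = "glm-4"
provider = "proxy/zhipu"
api_key = "${ZHIPU_API_KEY}"
```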
Sources: docs/docs/modules/smmf.md74
Proxy adapters require secure management of API credentials. DB-GPT provides multiple mechanisms for authentication configuration.
Sources: examples/rag/rag_embedding_api_example.py42-51
API keys can be specified through environment variables, which are resolved at runtime:
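For example, following the {PROVIDER}_API_KEY naming pattern (OPENAI_API_KEY appears in the cited example; the other names are illustrative):

```bash
export OPENAI_API_KEY="sk-..."                       # read by the OpenAI proxy adapter
export DEEPSEEK_API_KEY="..."                        # illustrative
export OPENAI_API_BASE="https://api.openai.com/v1"   # optional endpoint override
```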
Sources: examples/rag/rag_embedding_api_example.py42-47
TOML configuration files support variable substitution using ${VAR_NAME} syntax; the ${VAR_NAME:-default} form supplies a fallback value when the environment variable is not set.
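For example (the key names are illustrative):

```toml
# Resolved at load time; the second value falls back to the default when unset
api_key = "${OPENAI_API_KEY}"
api_base = "${OPENAI_API_BASE:-https://api.openai.com/v1}"
```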
Sources: examples/rag/rag_embedding_api_example.py38-52
| Auth Type | Providers | Implementation |
|---|---|---|
| Bearer Token | OpenAI, DeepSeek, Qwen | Authorization: Bearer {api_key} header |
| API Key Header | Custom APIs | X-API-Key: {api_key} or API-Key: {api_key} |
| OAuth 2.0 | Azure, Google | Token acquisition flow with refresh |
| Dual Key | Baidu, Xunfei | API Key + Secret Key → Access Token |
| No Auth | Ollama (local) | No authentication required |
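The header construction implied by the table above can be sketched as follows; the function and scheme names are illustrative, not DB-GPT internals:

```python
def build_auth_headers(scheme, api_key=None):
    """Map an auth scheme to the HTTP headers it produces (illustrative)."""
    if scheme == "bearer":          # OpenAI, DeepSeek, Qwen
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "api_key_header":  # custom APIs
        return {"X-API-Key": api_key}
    if scheme == "none":            # local Ollama: no auth required
        return {}
    raise ValueError(f"unsupported auth scheme: {scheme}")
```

OAuth 2.0 and dual-key schemes add a token-acquisition step before the request headers can be built.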
A request passes through a proxy adapter in the following stages:
Sources: docs/docs/modules/smmf.md10-23
1. Request Preparation
2. Authentication Injection
3. Protocol Translation (e.g., max_new_tokens → max_tokens)
4. Response Normalization
5. Error Handling
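The protocol-translation step maps internal parameter names onto the provider's wire format. A minimal sketch, with an illustrative mapping table (the actual per-provider mappings live in each adapter):

```python
# Internal-name -> provider-name mapping (entries are illustrative)
PARAM_MAP = {"max_new_tokens": "max_tokens", "stop_words": "stop"}

def translate_params(internal: dict) -> dict:
    """Rename mapped parameters; pass unmapped ones through unchanged."""
    return {PARAM_MAP.get(k, k): v for k, v in internal.items()}
```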
Proxy models register with the Model Registry alongside local models, enabling unified model management and discovery.
Sources: docs/docs/modules/smmf.md89-93
Each registered proxy model maintains metadata in the registry:
| Field | Type | Description |
|---|---|---|
| model_name | String | User-facing model identifier (e.g., "gpt-4") |
| model_type | String | Provider type (e.g., "openai", "deepseek") |
| worker_id | String | Unique worker instance identifier |
| api_base | URL | Base URL for API endpoint |
| healthy | Boolean | Health status from last check |
| last_heartbeat | Timestamp | Last successful heartbeat |
| capabilities | Dict | Supported features (streaming, functions, etc.) |
| rate_limits | Dict | Rate limit configuration |
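The registry entry fields above can be modeled as a simple record; this dataclass mirrors the table but is an illustrative shape, not the actual DB-GPT class:

```python
from dataclasses import dataclass, field

@dataclass
class ProxyModelRegistration:
    """Illustrative shape of a proxy model's registry entry."""
    model_name: str                 # user-facing identifier, e.g. "gpt-4"
    model_type: str                 # provider type, e.g. "openai"
    worker_id: str                  # unique worker instance identifier
    api_base: str                   # base URL for the API endpoint
    healthy: bool = True            # health status from the last check
    last_heartbeat: float = 0.0     # timestamp of last successful heartbeat
    capabilities: dict = field(default_factory=dict)  # streaming, functions, ...
    rate_limits: dict = field(default_factory=dict)   # rate limit configuration
```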
Proxy workers maintain health status through periodic heartbeat checks.
Unhealthy workers are temporarily removed from the available instance pool until health is restored.
Sources: docs/docs/modules/smmf.md103-115
DB-GPT's modular design allows selective installation of proxy dependencies based on required providers.
| Provider | Python Package | Installation Extra | Notes |
|---|---|---|---|
| OpenAI | openai>=1.0.0 | [openai] | Official SDK |
| Ollama | ollama | [ollama] | Optional for enhanced features |
| Qianfan (Baidu) | qianfan | [qianfan] | Baidu SDK |
| Others | requests, httpx | [core] | REST client libraries |
Sources: docs/docs/modules/smmf.md116-142 .mypy.ini126-127
The mypy configuration treats the proxy provider libraries as optional external imports; these settings indicate that they are not required for core DB-GPT functionality.
Sources: .mypy.ini117-127
DB-GPT follows a consistent naming convention for provider environment variables, typically {PROVIDER}_API_KEY (for example, OPENAI_API_KEY).
Sources: examples/rag/rag_embedding_api_example.py10-12 examples/rag/rag_embedding_api_example.py42-50
Proxy adapters also support embedding models through external APIs, enabling RAG applications without local embedding model deployment.
The OpenAPIEmbeddings class provides a generic adapter for OpenAI-compatible embedding endpoints.
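A sketch of the OpenAI-compatible embeddings wire format that such an adapter targets; the helper functions here are illustrative, not the DB-GPT OpenAPIEmbeddings class itself:

```python
def build_embedding_request(model, texts):
    """Request body sent to an OpenAI-style POST /v1/embeddings endpoint."""
    return {"model": model, "input": texts}

def parse_embedding_response(resp):
    """Return vectors in input order using each item's 'index' field."""
    items = sorted(resp["data"], key=lambda d: d["index"])
    return [item["embedding"] for item in items]
```

Sorting by index matters because some servers may return embedding items out of order.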
Sources: examples/rag/rag_embedding_api_example.py1-89
| Provider | Model Examples | Dimensions | Notes |
|---|---|---|---|
| OpenAI | text-embedding-ada-002, text-embedding-3-small | 1536, 1536 | Production-ready |
| Azure OpenAI | text-embedding-ada-002 | 1536 | Enterprise deployment |
| DB-GPT API Server | text2vec, bge-large | Varies | Self-hosted option |
| Cohere | embed-english-v3.0 | 1024 | Multilingual support |
Sources: examples/rag/rag_embedding_api_example.py1-24
Proxy adapters implement robust error handling for common API failure scenarios.
| Error Type | HTTP Code | Handling Strategy |
|---|---|---|
| Rate Limit | 429 | Exponential backoff with retry |
| Authentication | 401, 403 | Fail fast, log credential issue |
| Server Error | 500, 502, 503 | Retry with backoff (transient) |
| Timeout | Connection timeout | Retry with increased timeout |
| Invalid Request | 400, 422 | No retry, propagate to caller |
| Model Not Found | 404 | No retry, check configuration |
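The retry policy in the table above can be sketched as a status-code classifier plus exponential backoff with jitter; the function names and constants are illustrative, not DB-GPT internals:

```python
import random

RETRYABLE = {429, 500, 502, 503}        # transient: retry with backoff
FATAL = {400, 401, 403, 404, 422}       # fail fast, propagate to caller

def should_retry(status: int) -> bool:
    return status in RETRYABLE

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter, capped at `cap` seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```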
Proxy models introduce network latency compared to local models. Key optimization strategies:
| Strategy | Description | Impact |
|---|---|---|
| Connection Pooling | Reuse HTTP connections across requests | Reduces TCP handshake overhead |
| Request Batching | Combine multiple requests when possible | Improves throughput |
| Async I/O | Non-blocking API calls | Increases concurrency |
| Caching | Cache responses for identical prompts | Eliminates redundant API calls |
| Regional Endpoints | Use geographically closer API endpoints | Reduces network latency |
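The caching strategy can be sketched with a memoized wrapper around the provider call; the stand-in call_provider below counts invocations to show cache hits, and caching like this is only safe for deterministic settings (temperature 0):

```python
from functools import lru_cache

CALL_COUNT = {"n": 0}

def call_provider(model, prompt):
    """Stand-in for the real API call (illustrative)."""
    CALL_COUNT["n"] += 1
    return f"response to {prompt!r} from {model}"

@lru_cache(maxsize=1024)
def cached_completion(model: str, prompt: str) -> str:
    """Identical (model, prompt) pairs hit the cache instead of the API."""
    return call_provider(model, prompt)
```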
In practice these strategies compound: pooled asynchronous connections reduce per-request overhead, while caching and regional endpoints cut latency for repeated and geographically distributed workloads.
DB-GPT's proxy adapter architecture provides a unified interface over heterogeneous API providers, centralized credential management, consistent request/response translation, and built-in retry handling.
This architecture enables DB-GPT to leverage the latest commercial LLMs while maintaining the flexibility to switch between providers or deploy locally without application code changes.
Sources: docs/docs/modules/smmf.md1-147