This document details the adapter pattern implementation for integrating external LLM APIs into DB-GPT's Service-oriented Multi-model Management Framework (SMMF). It covers the proxy adapter architecture, supported commercial LLM providers, authentication mechanisms, and how proxy models integrate with the unified model registry.
For information about local model deployment strategies (HuggingFace, vLLM, LLAMA.cpp), see Model Configuration and Deployment. For details about model workers and inference backends, see Model Workers and Inference Backends. For the broader SMMF architecture context, see Service-oriented Multi-model Management Framework.
DB-GPT employs the adapter pattern to provide a unified interface for both locally deployed models and remote API-based models. This design allows applications to switch between different model backends transparently without changing application code.
Sources: docs/docs/modules/smmf.md1-147
The adapter layer abstracts differences between local inference frameworks and remote API providers, presenting a consistent interface to the application layer. This enables applications to switch between local and remote backends, and between API providers, without changing application code.
Proxy models integrate into DB-GPT's SMMF architecture through a dedicated proxy adapter layer that translates between DB-GPT's internal model interface and external API provider protocols.
Sources: docs/docs/modules/smmf.md10-23 docs/docs/modules/smmf.md41-42
| Component | Responsibility | Notes |
|---|---|---|
| ProxyModelFactory | Instantiates proxy adapters based on provider type | Factory pattern implementation |
| Authentication Manager | Manages API keys and authentication tokens | Supports multiple auth schemes |
| Request Handler | Translates internal requests to provider-specific formats | Protocol adaptation |
| Response Parser | Normalizes provider responses to unified format | Ensures consistency |
| Retry Logic | Handles transient failures and rate limiting | Exponential backoff |
| Proxy Model Worker | Manages lifecycle of proxy model instances | Registered with Model Registry |
DB-GPT supports a comprehensive set of commercial and open-source API providers through dedicated proxy adapters.
| Provider | API Type | Models Supported | Authentication | Status |
|---|---|---|---|---|
| OpenAI | REST API | GPT-3.5, GPT-4, GPT-4 Turbo | API Key | Stable |
| Azure OpenAI | REST API | GPT-3.5, GPT-4 (Azure-hosted) | API Key + Endpoint | Stable |
| DeepSeek | REST API | DeepSeek-V2, DeepSeek-Coder | API Key | Stable |
| Qwen (Alibaba Tongyi) | REST API | Qwen-7B, Qwen-14B, Qwen-Plus | API Key | Stable |
| Ollama | REST API | Llama2, Mistral, Custom Models | None (Local) | Stable |
| Baidu Wenxin | REST API | ERNIE-Bot, ERNIE-Bot-Turbo | API Key + Secret Key | Stable |
| Google Bard | REST API | Bard/Gemini | API Key | Stable |
| Zhipu AI (ChatGLM) | REST API | GLM-3, GLM-4 | API Key | Stable |
| Xunfei Xinghuo | REST API | Spark v1.5, v2.0 | App ID + API Key + Secret | Stable |
| Baichuan | REST API | Baichuan2-53B | API Key | Stable |
Sources: docs/docs/modules/smmf.md68-76 docs/docs/modules/smmf.md41-42
The OpenAI proxy adapter serves as the reference implementation and is compatible with any OpenAI-compatible API endpoint.
Configuration Example:
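A minimal TOML sketch for an OpenAI proxy model; the provider string, table path, and key names here are illustrative, so consult the DB-GPT configuration reference for the exact schema:

```toml
# Illustrative proxy-model entry for OpenAI (field names are assumptions)
[[models.llms]]
name = "gpt-4"
provider = "proxy/openai"
api_base = "https://api.openai.com/v1"
api_key = "${OPENAI_API_KEY}"
```

Because the adapter targets the OpenAI API schema, pointing api_base at any OpenAI-compatible server works the same way.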
Key features include compatibility with any endpoint that implements the OpenAI API schema, so self-hosted OpenAI-compatible servers can be used by changing only the base URL.
Sources: examples/rag/rag_embedding_api_example.py10-12 examples/rag/rag_embedding_api_example.py38-52
DeepSeek provides high-performance models optimized for coding and reasoning tasks.
Configuration Example:
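A minimal TOML sketch; DeepSeek exposes an OpenAI-compatible endpoint, and the provider string and key names below are illustrative:

```toml
# Illustrative proxy-model entry for DeepSeek (field names are assumptions)
[[models.llms]]
name = "deepseek-chat"
provider = "proxy/deepseek"
api_base = "https://api.deepseek.com/v1"
api_key = "${DEEPSEEK_API_KEY}"
```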
Sources: docs/docs/modules/smmf.md41-42
Alibaba's Qwen series models accessed through the DashScope platform.
Configuration Example:
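A minimal TOML sketch for Qwen via DashScope; the provider string and environment variable name are illustrative:

```toml
# Illustrative proxy-model entry for Qwen/Tongyi (field names are assumptions)
[[models.llms]]
name = "qwen-plus"
provider = "proxy/tongyi"
api_key = "${DASHSCOPE_API_KEY}"
```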
Sources: docs/docs/modules/smmf.md71
Ollama provides a local API server for running open-source models with a simple REST interface.
Configuration Example:
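A minimal TOML sketch; Ollama runs locally on port 11434 by default and needs no API key. The provider string and field names are illustrative:

```toml
# Illustrative proxy-model entry for a local Ollama server (no auth required)
[[models.llms]]
name = "llama2"
provider = "proxy/ollama"
api_base = "http://localhost:11434"
```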
Because Ollama runs as a local server, no API key is required; the adapter only needs the address of the local Ollama endpoint.
Sources: .mypy.ini117-118 docs/docs/modules/smmf.md41-42
Baidu's ERNIE Bot series accessed through Wenxin Workshop platform.
Configuration Example:
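A minimal TOML sketch; Wenxin requires both an API Key and a Secret Key, and all field names below are illustrative:

```toml
# Illustrative proxy-model entry for Baidu Wenxin (dual-key auth)
[[models.llms]]
name = "ERNIE-Bot"
provider = "proxy/wenxin"
api_key = "${WENXIN_API_KEY}"
api_secret = "${WENXIN_SECRET_KEY}"
```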
Authentication uses a dual-key flow: the API Key and Secret Key are first exchanged for a short-lived access token, which is then attached to subsequent model requests.
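Baidu's dual-key authentication exchanges the API Key and Secret Key for an access token before any model call. A minimal sketch of building that token request (the endpoint shown is Baidu's public OAuth token URL; token caching and refresh handling are omitted):

```python
from urllib.parse import urlencode

TOKEN_ENDPOINT = "https://aip.baidubce.com/oauth/2.0/token"

def build_token_request(api_key: str, secret_key: str) -> str:
    """Return the URL used to exchange the key pair for an access token."""
    params = {
        "grant_type": "client_credentials",
        "client_id": api_key,        # Baidu calls the API Key "client_id"
        "client_secret": secret_key, # and the Secret Key "client_secret"
    }
    return f"{TOKEN_ENDPOINT}?{urlencode(params)}"
```

The access token returned by this endpoint is then appended to model requests until it expires.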
Sources: docs/docs/modules/smmf.md73
Zhipu AI's GLM series models through their open platform.
Configuration Example:
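A minimal TOML sketch for Zhipu AI's GLM models; provider string and key names are illustrative:

```toml
# Illustrative proxy-model entry for Zhipu AI (field names are assumptions)
[[models.llms]]
name = "glm-4"
provider = "proxy/zhipu"
api_key = "${ZHIPU_API_KEY}"
```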
Sources: docs/docs/modules/smmf.md74
Proxy adapters require secure management of API credentials. DB-GPT provides multiple mechanisms for authentication configuration.
Sources: examples/rag/rag_embedding_api_example.py42-51
API keys can be specified through environment variables, which are resolved at runtime:
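For example, following the {PROVIDER}_API_KEY naming pattern (OPENAI_API_KEY appears in the cited example; the other names are illustrative):

```bash
export OPENAI_API_KEY="sk-..."                       # read by the OpenAI proxy adapter
export DEEPSEEK_API_KEY="..."                        # illustrative
export OPENAI_API_BASE="https://api.openai.com/v1"   # optional endpoint override
```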
Sources: examples/rag/rag_embedding_api_example.py42-47
TOML configuration files support variable substitution using ${VAR_NAME} syntax; the ${VAR_NAME:-default} form supplies a fallback value when the environment variable is not set.
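For example (the key names are illustrative):

```toml
# Resolved at load time; the second value falls back to the default when unset
api_key = "${OPENAI_API_KEY}"
api_base = "${OPENAI_API_BASE:-https://api.openai.com/v1}"
```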
Sources: examples/rag/rag_embedding_api_example.py38-52
| Auth Type | Providers | Implementation |
|---|---|---|
| Bearer Token | OpenAI, DeepSeek, Qwen | Authorization: Bearer {api_key} header |
| API Key Header | Custom APIs | X-API-Key: {api_key} or API-Key: {api_key} |
| OAuth 2.0 | Azure, Google | Token acquisition flow with refresh |
| Dual Key | Baidu, Xunfei | API Key + Secret Key → Access Token |
| No Auth | Ollama (local) | No authentication required |
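The header construction implied by the table above can be sketched as follows; the function and scheme names are illustrative, not DB-GPT internals:

```python
def build_auth_headers(scheme, api_key=None):
    """Map an auth scheme to the HTTP headers it produces (illustrative)."""
    if scheme == "bearer":          # OpenAI, DeepSeek, Qwen
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "api_key_header":  # custom APIs
        return {"X-API-Key": api_key}
    if scheme == "none":            # local Ollama: no auth required
        return {}
    raise ValueError(f"unsupported auth scheme: {scheme}")
```

OAuth 2.0 and dual-key schemes add a token-acquisition step before the request headers can be built.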
A request passes through a proxy adapter in the following stages:
Sources: docs/docs/modules/smmf.md10-23
1. Request Preparation
2. Authentication Injection
3. Protocol Translation (e.g., max_new_tokens → max_tokens)
4. Response Normalization
5. Error Handling
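The protocol-translation step maps internal parameter names onto the provider's wire format. A minimal sketch, with an illustrative mapping table (the actual per-provider mappings live in each adapter):

```python
# Internal-name -> provider-name mapping (entries are illustrative)
PARAM_MAP = {"max_new_tokens": "max_tokens", "stop_words": "stop"}

def translate_params(internal: dict) -> dict:
    """Rename mapped parameters; pass unmapped ones through unchanged."""
    return {PARAM_MAP.get(k, k): v for k, v in internal.items()}
```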
Proxy models register with the Model Registry alongside local models, enabling unified model management and discovery.
Sources: docs/docs/modules/smmf.md89-93
Each registered proxy model maintains metadata in the registry:
| Field | Type | Description |
|---|---|---|
| model_name | String | User-facing model identifier (e.g., "gpt-4") |
| model_type | String | Provider type (e.g., "openai", "deepseek") |
| worker_id | String | Unique worker instance identifier |
| api_base | URL | Base URL for API endpoint |
| healthy | Boolean | Health status from last check |
| last_heartbeat | Timestamp | Last successful heartbeat |
| capabilities | Dict | Supported features (streaming, functions, etc.) |
| rate_limits | Dict | Rate limit configuration |
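The registry entry fields above can be modeled as a simple record; this dataclass mirrors the table but is an illustrative shape, not the actual DB-GPT class:

```python
from dataclasses import dataclass, field

@dataclass
class ProxyModelRegistration:
    """Illustrative shape of a proxy model's registry entry."""
    model_name: str                 # user-facing identifier, e.g. "gpt-4"
    model_type: str                 # provider type, e.g. "openai"
    worker_id: str                  # unique worker instance identifier
    api_base: str                   # base URL for the API endpoint
    healthy: bool = True            # health status from the last check
    last_heartbeat: float = 0.0     # timestamp of last successful heartbeat
    capabilities: dict = field(default_factory=dict)  # streaming, functions, ...
    rate_limits: dict = field(default_factory=dict)   # rate limit configuration
```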
Proxy workers maintain health status through periodic heartbeat checks.
Unhealthy workers are temporarily removed from the available instance pool until health is restored.
Sources: docs/docs/modules/smmf.md103-115
DB-GPT's modular design allows selective installation of proxy dependencies based on required providers.
| Provider | Python Package | Installation Extra | Notes |
|---|---|---|---|
| OpenAI | openai>=1.0.0 | [openai] | Official SDK |
| Ollama | ollama | [ollama] | Optional for enhanced features |
| Qianfan (Baidu) | qianfan | [qianfan] | Baidu SDK |
| Others | requests, httpx | [core] | REST client libraries |
Sources: docs/docs/modules/smmf.md116-142 .mypy.ini126-127
The mypy configuration treats the proxy provider libraries as optional external imports; these settings indicate that they are not required for core DB-GPT functionality.
Sources: .mypy.ini117-127
DB-GPT follows a consistent naming convention for provider environment variables, typically {PROVIDER}_API_KEY (for example, OPENAI_API_KEY).
Sources: examples/rag/rag_embedding_api_example.py10-12 examples/rag/rag_embedding_api_example.py42-50
Proxy adapters also support embedding models through external APIs, enabling RAG applications without local embedding model deployment.
The OpenAPIEmbeddings class provides a generic adapter for OpenAI-compatible embedding endpoints.
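A sketch of the OpenAI-compatible embeddings wire format that such an adapter targets; the helper functions here are illustrative, not the DB-GPT OpenAPIEmbeddings class itself:

```python
def build_embedding_request(model, texts):
    """Request body sent to an OpenAI-style POST /v1/embeddings endpoint."""
    return {"model": model, "input": texts}

def parse_embedding_response(resp):
    """Return vectors in input order using each item's 'index' field."""
    items = sorted(resp["data"], key=lambda d: d["index"])
    return [item["embedding"] for item in items]
```

Sorting by index matters because some servers may return embedding items out of order.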
Sources: examples/rag/rag_embedding_api_example.py1-89
| Provider | Model Examples | Dimensions | Notes |
|---|---|---|---|
| OpenAI | text-embedding-ada-002, text-embedding-3-small | 1536, 1536 | Production-ready |
| Azure OpenAI | text-embedding-ada-002 | 1536 | Enterprise deployment |
| DB-GPT API Server | text2vec, bge-large | Varies | Self-hosted option |
| Cohere | embed-english-v3.0 | 1024 | Multilingual support |
Sources: examples/rag/rag_embedding_api_example.py1-24
Proxy adapters implement robust error handling for common API failure scenarios.
| Error Type | HTTP Code | Handling Strategy |
|---|---|---|
| Rate Limit | 429 | Exponential backoff with retry |
| Authentication | 401, 403 | Fail fast, log credential issue |
| Server Error | 500, 502, 503 | Retry with backoff (transient) |
| Timeout | Connection timeout | Retry with increased timeout |
| Invalid Request | 400, 422 | No retry, propagate to caller |
| Model Not Found | 404 | No retry, check configuration |
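The retry policy in the table above can be sketched as a status-code classifier plus exponential backoff with jitter; the function names and constants are illustrative, not DB-GPT internals:

```python
import random

RETRYABLE = {429, 500, 502, 503}        # transient: retry with backoff
FATAL = {400, 401, 403, 404, 422}       # fail fast, propagate to caller

def should_retry(status: int) -> bool:
    return status in RETRYABLE

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter, capped at `cap` seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```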
Proxy models introduce network latency compared to local models. Key optimization strategies:
| Strategy | Description | Impact |
|---|---|---|
| Connection Pooling | Reuse HTTP connections across requests | Reduces TCP handshake overhead |
| Request Batching | Combine multiple requests when possible | Improves throughput |
| Async I/O | Non-blocking API calls | Increases concurrency |
| Caching | Cache responses for identical prompts | Eliminates redundant API calls |
| Regional Endpoints | Use geographically closer API endpoints | Reduces network latency |
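The caching strategy can be sketched with a memoized wrapper around the provider call; the stand-in call_provider below counts invocations to show cache hits, and caching like this is only safe for deterministic settings (temperature 0):

```python
from functools import lru_cache

CALL_COUNT = {"n": 0}

def call_provider(model, prompt):
    """Stand-in for the real API call (illustrative)."""
    CALL_COUNT["n"] += 1
    return f"response to {prompt!r} from {model}"

@lru_cache(maxsize=1024)
def cached_completion(model: str, prompt: str) -> str:
    """Identical (model, prompt) pairs hit the cache instead of the API."""
    return call_provider(model, prompt)
```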
In practice these strategies compound: pooled asynchronous connections reduce per-request overhead, while caching and regional endpoints cut latency for repeated and geographically distributed workloads.
DB-GPT's proxy adapter architecture provides a unified interface over heterogeneous API providers, centralized credential management, consistent request/response translation, and built-in retry handling.
This architecture enables DB-GPT to leverage the latest commercial LLMs while maintaining the flexibility to switch between providers or deploy locally without application code changes.
Sources: docs/docs/modules/smmf.md1-147