This page documents the ModelClient and ModelClientSession components that handle all communication with model provider APIs (OpenAI, Ollama, etc.). These types encapsulate HTTP and WebSocket transports, request construction, response streaming, WebSocket preconnect, turn state management, and automatic fallback behavior.
For overall session lifecycle and how ModelClient is created, see Codex Interface and Session Lifecycle. For prompt construction logic that feeds into the client, see Turn Execution and Prompt Construction. For how response events are processed, see Event Processing and State Management.
The client layer uses a two-tier design: ModelClient is session-scoped and shared across all turns, while ModelClientSession is turn-scoped and manages per-turn state like WebSocket connections and sticky routing tokens.
Sources: codex-rs/core/src/client.rs119-226 codex-rs/core/src/client.rs228-278
ModelClient holds configuration and state that remains stable for the lifetime of a Codex session. It is created once during Session::new() and cloned cheaply to create turn-scoped sessions.
The client wraps an Arc<ModelClientState> containing:
| Field | Type | Purpose |
|---|---|---|
| auth_manager | Option<Arc<AuthManager>> | Auth token provider for API requests |
| conversation_id | ThreadId | Unique session identifier sent as session_id header |
| provider | ModelProviderInfo | Provider configuration (base URL, API type, timeouts) |
| session_source | SessionSource | Session origin (TUI, Exec, SubAgent, etc.) for telemetry |
| model_verbosity | Option<VerbosityConfig> | Model verbosity preference |
| enable_responses_websockets | bool | Feature flag for WebSocket transport |
| enable_responses_websockets_v2 | bool | Feature flag for WebSocket V2 protocol |
| enable_request_compression | bool | Feature flag for request compression |
| include_timing_metrics | bool | Whether to request timing metrics from server |
| beta_features_header | Option<String> | Comma-separated experimental feature keys |
| disable_websockets | AtomicBool | Session-sticky fallback flag |
| preconnect | Mutex<Option<PreconnectTask>> | Single-use preconnected socket |
Sources: codex-rs/core/src/client.rs119-164 codex-rs/core/src/client.rs228-262
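The cheap-clone property follows from wrapping the state in an Arc. The sketch below illustrates the idea with a reduced, hypothetical subset of the fields from the table; the String stand-in for ThreadId and the two helper methods are assumptions made for illustration and are not taken from client.rs.

```rust
use std::sync::Arc;

/// Illustrative subset of the session-scoped state; the real struct has more fields.
struct ModelClientState {
    conversation_id: String, // stand-in for ThreadId (assumption)
    enable_responses_websockets: bool,
}

/// Session-scoped handle. Cloning copies only the Arc pointer, not the state.
#[derive(Clone)]
struct ModelClient {
    state: Arc<ModelClientState>,
}

impl ModelClient {
    fn new(conversation_id: &str, enable_responses_websockets: bool) -> Self {
        Self {
            state: Arc::new(ModelClientState {
                conversation_id: conversation_id.to_string(),
                enable_responses_websockets,
            }),
        }
    }

    /// Hypothetical helper: true when both handles point at the same state.
    fn shares_state_with(&self, other: &ModelClient) -> bool {
        Arc::ptr_eq(&self.state, &other.state)
    }

    /// Hypothetical helper: number of live handles to the shared state.
    fn handle_count(&self) -> usize {
        Arc::strong_count(&self.state)
    }
}
```

A clone made for a turn observes the same configuration as the original handle, so session-level settings never need to be copied per turn.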
The client is constructed with session-level settings derived from Config:
The beta_features_header is sent as x-codex-beta-features to enable server-side experimental features.
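As a rough illustration of how a comma-separated header value might be produced, the sketch below joins a list of enabled feature keys and omits the header when the list is empty. Both function names and the example keys are hypothetical, and treating an empty list as "no header" is an assumption suggested by the Option<String> field type rather than something the source states.

```rust
/// Hypothetical helper: joins enabled feature keys into one header value.
/// Returns None when nothing is enabled so the header can be omitted.
fn beta_features_header(enabled_keys: &[&str]) -> Option<String> {
    if enabled_keys.is_empty() {
        None
    } else {
        Some(enabled_keys.join(","))
    }
}

/// Builds the (name, value) header pair only if there is a value to send.
fn beta_header_pair(value: &Option<String>) -> Option<(&'static str, String)> {
    value
        .as_ref()
        .map(|v| ("x-codex-beta-features", v.clone()))
}
```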
Sources: codex-rs/core/src/codex.rs706-726 codex-rs/core/src/codex.rs949-1007
Each Codex turn creates a fresh ModelClientSession via client.new_session(). This type manages per-turn state including WebSocket connections, incremental request tracking, and sticky routing tokens.
| Field | Type | Purpose |
|---|---|---|
| client | ModelClient | Parent session-scoped client (cheap clone) |
| connection | Option<ApiWebSocketConnection> | Live WebSocket connection for this turn |
| websocket_last_items | Vec<ResponseItem> | Previous request's input items |
| websocket_last_response_id | Option<String> | Previous response ID for V2 protocol |
| websocket_last_response_id_rx | Option<oneshot::Receiver<String>> | Async receiver for response ID |
| turn_state | Arc<OnceLock<String>> | Sticky routing token from server |
Sources: codex-rs/core/src/client.rs209-226
Sources: codex-rs/core/src/client.rs269-278 codex-rs/core/src/client.rs777-832
The client supports two transport mechanisms: HTTP with Server-Sent Events (SSE) and WebSocket. WebSocket is preferred when enabled and supported by the provider, with automatic fallback to HTTP on persistent failures.
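The selection rule described above can be sketched as a pure function. The Transport enum, the function name, and the three boolean inputs are illustrative assumptions; the actual decision in client.rs may consult additional state.

```rust
/// The two transports described in this section.
#[derive(Debug, PartialEq)]
enum Transport {
    WebSocket,
    HttpSse,
}

/// Hypothetical decision function: WebSocket is used only when the feature
/// flag is on, the provider supports it, and session fallback is not active.
fn choose_transport(
    websockets_enabled: bool,
    provider_supports_websockets: bool,
    fallback_active: bool,
) -> Transport {
    if websockets_enabled && provider_supports_websockets && !fallback_active {
        Transport::WebSocket
    } else {
        Transport::HttpSse
    }
}
```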
HTTP transport uses POST /v1/responses with stream=true and processes SSE events:
Sources: codex-rs/core/src/client.rs849-921
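For readers unfamiliar with the SSE wire format, the following is a minimal, standalone sketch of splitting a response body into events using the standard "event:" and "data:" fields. It is not the parser used by the client; the SseEvent type and parse_sse function are hypothetical, and fields such as "id:" and "retry:" are ignored.

```rust
/// One parsed Server-Sent Event: optional event name plus joined data lines.
#[derive(Debug, PartialEq)]
struct SseEvent {
    event: Option<String>,
    data: String,
}

/// Splits a complete SSE body into events. A blank line dispatches the
/// event collected so far; lines starting with ':' are comments.
fn parse_sse(body: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event: Option<String> = None;
    let mut data: Vec<String> = Vec::new();

    for line in body.lines() {
        if line.is_empty() {
            if data.is_empty() {
                // Nothing to dispatch; discard any pending event name.
                event = None;
            } else {
                events.push(SseEvent {
                    event: event.take(),
                    data: data.join("\n"),
                });
                data.clear();
            }
            continue;
        }
        if line.starts_with(':') {
            continue; // comment, often used as a keep-alive
        }
        // "field: value"; a single leading space in the value is stripped.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event = Some(value.to_string()),
            "data" => data.push(value.to_string()),
            _ => {} // other fields are ignored in this sketch
        }
    }
    events
}
```

An event that is not terminated by a blank line is dropped, which matches how SSE consumers treat a truncated final frame.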
WebSocket transport establishes a persistent connection and sends JSON payloads:
Sources: codex-rs/core/src/client.rs496-545 codex-rs/core/src/client.rs658-768
When input items are an incremental extension of the previous request, the client can optimize by sending only new items:
V1 protocol: Sends response.append with incremental items.
V2 protocol: Sends response.create with previous_response_id and incremental items.
Sources: codex-rs/core/src/client.rs658-671 codex-rs/core/src/client.rs733-768
Preconnect warms a WebSocket connection during session initialization to reduce first-turn latency. The connection is established but no prompt is sent until a turn starts.
Key properties:
Sources: codex-rs/core/src/client.rs280-304 codex-rs/core/src/client.rs316-351 codex-rs/core/src/turn_metadata.rs1-81
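The preconnect field is typed Mutex<Option<PreconnectTask>>, which suggests a take-once slot. A minimal sketch of that pattern follows; PreconnectSlot and its methods are hypothetical names, and the generic parameter C stands in for whatever the real PreconnectTask holds.

```rust
use std::sync::Mutex;

/// Hypothetical single-use slot for a warmed connection.
struct PreconnectSlot<C> {
    slot: Mutex<Option<C>>,
}

impl<C> PreconnectSlot<C> {
    fn empty() -> Self {
        Self { slot: Mutex::new(None) }
    }

    /// Stores a warmed connection, replacing any earlier one.
    fn store(&self, connection: C) {
        *self.slot.lock().expect("preconnect mutex poisoned") = Some(connection);
    }

    /// Hands the connection to the first caller; later callers get None
    /// and would have to connect on their own.
    fn take(&self) -> Option<C> {
        self.slot.lock().expect("preconnect mutex poisoned").take()
    }
}
```

Because Option::take leaves None behind, a warmed socket cannot be handed to two turns.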
The x-codex-turn-state header implements sticky routing: the server sends an opaque token on turn start, and the client must replay it for all subsequent requests within the same turn.
| Phase | Client Behavior | Server Expectation |
|---|---|---|
| Turn start | Omit header | Server generates new routing token |
| Turn continuation | Send x-codex-turn-state: <token> | Server uses token for sticky routing |
| Next turn | Omit header | Server generates fresh token |
The turn_state field is an Arc<OnceLock<String>> shared between handshake processing and request construction:
Sources: codex-rs/core/src/client.rs209-226 codex-rs/core/src/client.rs515-545 codex-rs/core/tests/suite/turn_state.rs1-126
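The three phases in the table map naturally onto OnceLock semantics: empty until the server supplies a token, then fixed for the rest of the turn. The sketch below is an illustration under that reading; TurnSession and its method names are hypothetical, and only the header name and the Arc<OnceLock<String>> type come from the text above.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical turn-scoped holder for the sticky routing token.
struct TurnSession {
    turn_state: Arc<OnceLock<String>>,
}

impl TurnSession {
    /// Each turn starts with an empty lock, so the header is omitted.
    fn new() -> Self {
        Self { turn_state: Arc::new(OnceLock::new()) }
    }

    /// Records the token from the server. OnceLock keeps the first value;
    /// a later call leaves the stored token unchanged.
    fn record_server_token(&self, token: &str) {
        let _ = self.turn_state.set(token.to_string());
    }

    /// Header to attach to a request, if a token has been recorded.
    fn turn_state_header(&self) -> Option<(&'static str, &str)> {
        self.turn_state
            .get()
            .map(|token| ("x-codex-turn-state", token.as_str()))
    }
}
```

Creating a new TurnSession for the next turn yields a fresh, empty lock, which corresponds to the "Next turn" row.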
When WebSocket connections persistently fail, the client activates session-sticky HTTP fallback. Once enabled, all subsequent turns in the session use HTTP instead of attempting WebSocket.
The disable_websockets flag is an AtomicBool in ModelClientState, shared by all turns via Arc. Once set, it remains true for the session lifetime:
This ensures subsequent turns don't waste retry budget on WebSocket attempts.
Sources: codex-rs/core/src/client.rs464-469 codex-rs/core/src/client.rs571-578 codex-rs/core/src/client.rs923-994 codex-rs/core/tests/suite/websocket_fallback.rs1-107
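A session-sticky flag of this kind can be sketched with an AtomicBool behind an Arc. The SessionState type is a hypothetical reduction of ModelClientState, and having activate_http_fallback() report whether it was the call that flipped the flag is an assumption made here for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical reduction of the session state to the fallback flag alone.
struct SessionState {
    disable_websockets: AtomicBool,
}

impl SessionState {
    fn new() -> Arc<Self> {
        Arc::new(Self { disable_websockets: AtomicBool::new(false) })
    }

    /// Sets the flag and returns true only for the call that changed it.
    /// There is deliberately no way to reset the flag in this sketch.
    fn activate_http_fallback(&self) -> bool {
        !self.disable_websockets.swap(true, Ordering::Relaxed)
    }

    fn websockets_disabled(&self) -> bool {
        self.disable_websockets.load(Ordering::Relaxed)
    }
}
```

Every turn holding a clone of the Arc sees the flag as soon as any one of them sets it.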
The Prompt struct (codex-rs/core/src/client_common.rs25-65) contains:
- input: conversation history as Vec<ResponseItem>
- tools: available tool specifications
- parallel_tool_calls: whether parallel execution is allowed
- base_instructions: system instructions
- personality: optional personality preference
- output_schema: optional JSON schema for structured output

ModelClientSession::build_responses_request() converts this to codex_api::Prompt.
Sources: codex-rs/core/src/client_common.rs25-65 codex-rs/core/src/client.rs580-584
build_responses_options() constructs ApiResponsesOptions with:
- reasoning: effort and summary settings (if model supports reasoning)
- include: ["reasoning.encrypted_content"] when reasoning is enabled
- text: verbosity controls or output schema (mutually exclusive)
- prompt_cache_key: conversation ID for caching
- conversation_id: session identifier
- session_source: session origin for telemetry
- extra_headers: beta features and turn state
- compression: request compression mode
- turn_state: sticky routing lock

Sources: codex-rs/core/src/client.rs586-656
ResponseStream (codex-rs/core/src/client_common.rs225-236) is a Stream implementation wrapping mpsc::Receiver<Result<ResponseEvent>>. It emits events parsed from SSE or WebSocket messages.
ResponseEvent variants (codex-api crate, re-exported via codex-rs/core/src/client_common.rs4):
- Created: response started
- ContentDelta: incremental text or reasoning
- ItemStarted: new response item (message, function call, etc.)
- ItemDone: item completed
- Completed: response finished (includes token usage)
- Failed: response failed with error
- RateLimits: rate limit snapshot
- ModelsEtag: models list version
- ServerReasoningIncluded: reasoning support flag

Sources: codex-rs/core/src/client_common.rs225-236 codex-rs/core/src/client.rs849-921 codex-rs/core/src/client.rs996-1216
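To show how a consumer might drain such a stream, the sketch below uses a blocking std::sync::mpsc channel as a stand-in for the receiver that ResponseStream wraps. The reduced ResponseEvent enum reuses four variant names from the list above, but the payload shapes, the String error type, and the collect_text function are assumptions.

```rust
use std::sync::mpsc::Receiver;

/// Reduced, hypothetical version of the event enum; payloads are simplified.
#[derive(Debug)]
enum ResponseEvent {
    Created,
    ContentDelta(String),
    Completed { total_tokens: u64 },
    Failed(String),
}

/// Accumulates text deltas until the response completes or fails.
/// Returns the full text and the reported token count on success.
fn collect_text(
    events: Receiver<Result<ResponseEvent, String>>,
) -> Result<(String, u64), String> {
    let mut text = String::new();
    // Iterating a Receiver blocks until an item arrives or all senders drop.
    for event in events {
        match event? {
            ResponseEvent::Created => {}
            ResponseEvent::ContentDelta(delta) => text.push_str(&delta),
            ResponseEvent::Completed { total_tokens } => return Ok((text, total_tokens)),
            ResponseEvent::Failed(message) => return Err(message),
        }
    }
    Err("stream closed before Completed".to_string())
}
```

Treating a channel that closes without a Completed event as an error keeps a dropped connection from being mistaken for a finished response.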
The client emits OpenTelemetry metrics via ApiTelemetry (codex-rs/core/src/client.rs1077-1216), including timing metrics when include_timing_metrics is enabled. These metrics feed into OtelManager runtime metrics tracking.
Sources: codex-rs/core/src/client.rs1077-1216 codex-rs/core/tests/suite/client_websockets.rs157-186
Retries are configured per-provider in ModelProviderInfo:

- request_max_retries: unary request retry limit
- stream_max_retries: streaming request retry limit
- stream_idle_timeout_ms: timeout for idle streams

The client performs exponential backoff between retries (codex-rs/core/src/client.rs996-1074), with special handling for:

- 401 Unauthorized: triggers auth refresh via UnauthorizedRecovery
- 429 Rate Limit: respects Retry-After header

When stream retries are exhausted for WebSocket:

- The client calls activate_http_fallback() to set the session flag

Sources: codex-rs/core/src/client.rs923-1074 codex-rs/core/tests/suite/websocket_fallback.rs14-57
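A minimal sketch of the delay calculation, assuming a doubling schedule with a cap and no jitter. The base delay, the cap, and both function names are placeholders chosen for illustration and are not taken from client.rs; only the rule that a server-supplied Retry-After value takes precedence comes from the text above.

```rust
use std::time::Duration;

/// Hypothetical backoff: doubles from a base delay up to a cap, unless the
/// server supplied a Retry-After value, which is used as-is.
fn backoff_delay(attempt: u32, retry_after: Option<Duration>) -> Duration {
    const BASE_MS: u64 = 200; // placeholder value
    const MAX_MS: u64 = 30_000; // placeholder value

    if let Some(delay) = retry_after {
        return delay;
    }
    // 2^attempt, saturating instead of overflowing for large attempt counts.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    Duration::from_millis(BASE_MS.saturating_mul(factor).min(MAX_MS))
}

/// Hypothetical check against a per-provider limit such as stream_max_retries.
fn should_retry(attempt: u32, max_retries: u32) -> bool {
    attempt < max_retries
}
```

Once should_retry returns false for a WebSocket stream, the fallback path described above would take over.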
| Component | Scope | Key Responsibilities |
|---|---|---|
| ModelClient | Session | Auth, provider config, preconnect, fallback state |
| ModelClientSession | Turn | WebSocket connection, request tracking, turn state |
| Prompt | Request | Input items, tools, instructions, output schema |
| ResponseStream | Response | Event streaming to consumer |
| HTTP transport | Fallback | POST /v1/responses with SSE |
| WebSocket transport | Preferred | Persistent connection with JSON messages |
| Preconnect | Optimization | Warm connection during session init |
| Turn state | Routing | Sticky routing token for multi-request turns |
| Fallback | Resilience | Session-sticky HTTP after WebSocket failures |
Sources: codex-rs/core/src/client.rs1-1216 codex-rs/core/src/client_common.rs1-347