Model Client and API Communication

Purpose and Scope

This page documents the ModelClient and ModelClientSession components that handle all communication with model provider APIs (OpenAI, Ollama, etc.). These types encapsulate HTTP and WebSocket transports, request construction, response streaming, connection preconnect, turn state management, and automatic fallback behavior.

For overall session lifecycle and how ModelClient is created, see Codex Interface and Session Lifecycle. For prompt construction logic that feeds into the client, see Turn Execution and Prompt Construction. For how response events are processed, see Event Processing and State Management.

Architecture Overview

The client layer uses a two-tier design: ModelClient is session-scoped and shared across all turns, while ModelClientSession is turn-scoped and manages per-turn state like WebSocket connections and sticky routing tokens.

Client Layer Structure

Sources: codex-rs/core/src/client.rs:119-226 codex-rs/core/src/client.rs:228-278

ModelClient (Session-Scoped)

ModelClient holds configuration and state that remains stable for the lifetime of a Codex session. It is created once during Session::new() and cloned cheaply to create turn-scoped sessions.

State and Configuration

The client wraps an Arc<ModelClientState> containing:

Field	Type	Purpose
`auth_manager`	`Option<Arc<AuthManager>>`	Auth token provider for API requests
`conversation_id`	`ThreadId`	Unique session identifier sent as `session_id` header
`provider`	`ModelProviderInfo`	Provider configuration (base URL, API type, timeouts)
`session_source`	`SessionSource`	Session origin (TUI, Exec, SubAgent, etc.) for telemetry
`model_verbosity`	`Option<VerbosityConfig>`	Model verbosity preference
`enable_responses_websockets`	`bool`	Feature flag for WebSocket transport
`enable_responses_websockets_v2`	`bool`	Feature flag for WebSocket V2 protocol
`enable_request_compression`	`bool`	Feature flag for request compression
`include_timing_metrics`	`bool`	Whether to request timing metrics from server
`beta_features_header`	`Option<String>`	Comma-separated experimental feature keys
`disable_websockets`	`AtomicBool`	Session-sticky fallback flag
`preconnect`	`Mutex<Option<PreconnectTask>>`	Single-use preconnected socket

Sources: codex-rs/core/src/client.rs:119-164 codex-rs/core/src/client.rs:228-262
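
To make the two-tier design concrete, the following is a minimal sketch of a client that wraps shared state in an Arc and hands out turn-scoped sessions by cloning itself. Only the type names, the two field names, and new_session() come from this page; the constructor signature, the shares_state_with helper, and the reduced field set are assumptions made for illustration, not the actual definitions in client.rs.

```rust
use std::sync::Arc;

/// Session-scoped state (a hypothetical subset of the fields listed above).
struct ModelClientState {
    conversation_id: String,
    enable_responses_websockets: bool,
}

/// Cloning copies only the `Arc` pointer, never the state behind it.
#[derive(Clone)]
struct ModelClient {
    state: Arc<ModelClientState>,
}

/// Turn-scoped handle holding its own clone of the parent client.
struct ModelClientSession {
    client: ModelClient,
}

impl ModelClient {
    /// Hypothetical constructor; the real one takes more settings.
    fn new(conversation_id: &str, enable_responses_websockets: bool) -> Self {
        let state = ModelClientState {
            conversation_id: conversation_id.to_string(),
            enable_responses_websockets,
        };
        Self { state: Arc::new(state) }
    }

    /// Creates a turn-scoped session that shares this client's state.
    fn new_session(&self) -> ModelClientSession {
        ModelClientSession { client: self.clone() }
    }
}

impl ModelClientSession {
    /// Illustrative helper: true when both handles point at the same state.
    fn shares_state_with(&self, other: &ModelClient) -> bool {
        Arc::ptr_eq(&self.client.state, &other.state)
    }
}

fn main() {
    let client = ModelClient::new("thread-123", true);
    let turn_a = client.new_session();
    let turn_b = client.new_session();
    assert!(turn_a.shares_state_with(&client));
    assert!(turn_b.shares_state_with(&client));
    // One reference for `client` plus one per live session.
    assert_eq!(Arc::strong_count(&client.state), 3);
    println!(
        "session {} (websockets enabled: {})",
        client.state.conversation_id, client.state.enable_responses_websockets
    );
}
```

Because every session holds the same Arc, a flag flipped on the shared state during one turn is visible to all later turns without any copying.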

Creation and Initialization

The client is constructed with session-level settings derived from Config.

The beta_features_header is sent as x-codex-beta-features to enable server-side experimental features.

Sources: codex-rs/core/src/codex.rs:706-726 codex-rs/core/src/codex.rs:949-1007
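
As a rough illustration of the header described above, this sketch joins a list of feature keys into a comma-separated value and attaches it under x-codex-beta-features only when at least one key is present. The function names and the example keys are hypothetical; only the header name and the Option<String> shape come from this page.

```rust
/// Header name taken from the text above.
const BETA_FEATURES_HEADER: &str = "x-codex-beta-features";

/// Joins non-empty feature keys with commas; returns `None` when nothing is enabled.
fn beta_features_header(keys: &[&str]) -> Option<String> {
    let cleaned: Vec<&str> = keys
        .iter()
        .map(|key| key.trim())
        .filter(|key| !key.is_empty())
        .collect();
    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned.join(","))
    }
}

/// Builds the extra header list; the header is omitted entirely when unset.
fn extra_headers(beta_features: Option<&str>) -> Vec<(String, String)> {
    beta_features
        .map(|value| (BETA_FEATURES_HEADER.to_string(), value.to_string()))
        .into_iter()
        .collect()
}

fn main() {
    // "feature_a" and "feature_b" are made-up keys for the example.
    let header = beta_features_header(&["feature_a", " feature_b "]);
    for (name, value) in extra_headers(header.as_deref()) {
        println!("{name}: {value}");
    }
}
```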

ModelClientSession (Turn-Scoped)

Each Codex turn creates a fresh ModelClientSession via client.new_session(). This type manages per-turn state including WebSocket connections, incremental request tracking, and sticky routing tokens.

Turn-Local State

Field	Type	Purpose
`client`	`ModelClient`	Parent session-scoped client (cheap clone)
`connection`	`Option<ApiWebSocketConnection>`	Live WebSocket connection for this turn
`websocket_last_items`	`Vec<ResponseItem>`	Previous request's input items
`websocket_last_response_id`	`Option<String>`	Previous response ID for V2 protocol
`websocket_last_response_id_rx`	`Option<oneshot::Receiver<String>>`	Async receiver for response ID
`turn_state`	`Arc<OnceLock<String>>`	Sticky routing token from server

Sources: codex-rs/core/src/client.rs:209-226

Session Lifecycle

Sources: codex-rs/core/src/client.rs:269-278 codex-rs/core/src/client.rs:777-832

Transport Layer

The client supports two transport mechanisms: HTTP with Server-Sent Events (SSE) and WebSocket. WebSocket is preferred when enabled and supported by the provider, with automatic fallback to HTTP on persistent failures.

HTTP Transport (SSE)

HTTP transport sends POST /v1/responses with stream=true and processes the SSE events returned on the response body.

Sources: codex-rs/core/src/client.rs:849-921
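
For readers unfamiliar with the SSE wire format, the sketch below parses a text/event-stream body into events. It follows the general SSE framing rules (fields separated by a colon, events terminated by a blank line, lines starting with a colon treated as comments) and is not the parser used by the client; the SseEvent type, the parse_sse function, and the sample payload are assumptions for illustration.

```rust
/// One dispatched SSE event: optional `event:` name plus joined `data:` lines.
#[derive(Debug, PartialEq)]
struct SseEvent {
    event: Option<String>,
    data: String,
}

/// Parses a complete SSE body. An event that is not terminated by a blank
/// line is dropped, matching how SSE consumers treat incomplete events.
fn parse_sse(body: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event_name: Option<String> = None;
    let mut data_lines: Vec<String> = Vec::new();

    for line in body.lines() {
        if line.is_empty() {
            // Blank line: dispatch if any data was collected, then reset.
            if data_lines.is_empty() {
                event_name = None;
            } else {
                events.push(SseEvent {
                    event: event_name.take(),
                    data: data_lines.join("\n"),
                });
                data_lines.clear();
            }
            continue;
        }
        if line.starts_with(':') {
            continue; // comment line, often used as a keep-alive
        }
        // Split "field: value"; a single leading space in the value is stripped.
        let (field, value) = match line.split_once(':') {
            Some((field, value)) => (field, value.strip_prefix(' ').unwrap_or(value)),
            None => (line, ""),
        };
        match field {
            "event" => event_name = Some(value.to_string()),
            "data" => data_lines.push(value.to_string()),
            _ => {} // other fields (id, retry) are ignored in this sketch
        }
    }
    events
}

fn main() {
    // Illustrative payload, not captured from a real server.
    let body = "event: response.created\ndata: {\"id\":\"resp_1\"}\n\n\
                : keep-alive\n\n\
                event: response.completed\ndata: {\"id\":\"resp_1\"}\n\n";
    for event in parse_sse(body) {
        println!("{:?}", event);
    }
}
```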

WebSocket Transport

WebSocket transport establishes a persistent connection and sends each request as a JSON payload over it.

Sources: codex-rs/core/src/client.rs:496-545 codex-rs/core/src/client.rs:658-768

Incremental Requests (response.append)

When input items are an incremental extension of the previous request, the client can optimize by sending only new items:

V1 protocol: Sends response.append with incremental items.
V2 protocol: Sends response.create with previous_response_id and incremental items.

Sources: codex-rs/core/src/client.rs:658-671 codex-rs/core/src/client.rs:733-768
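
The incremental check can be sketched as a prefix comparison: if the previous request's items are a strict prefix of the current input, only the tail needs to be sent. In this sketch, items are plain strings, and WsRequest, incremental_items, and plan_request are hypothetical names. Falling back to a full response.create when no previous response ID is known under V2 is also an assumption, not documented behavior.

```rust
/// Hypothetical message shapes for the two protocol versions.
#[derive(Debug, PartialEq)]
enum WsRequest {
    /// `response.create`, optionally chained to a previous response (V2).
    Create { input: Vec<String>, previous_response_id: Option<String> },
    /// `response.append` with only the new items (V1).
    Append { input: Vec<String> },
}

/// Returns the new tail when `current` strictly extends a non-empty `previous`.
fn incremental_items<'a, T: PartialEq>(previous: &[T], current: &'a [T]) -> Option<&'a [T]> {
    if !previous.is_empty() && current.len() > previous.len() && current.starts_with(previous) {
        Some(&current[previous.len()..])
    } else {
        None
    }
}

/// Chooses between an incremental and a full request.
fn plan_request(
    previous: &[String],
    current: &[String],
    use_v2: bool,
    last_response_id: Option<&str>,
) -> WsRequest {
    match (incremental_items(previous, current), use_v2, last_response_id) {
        (Some(delta), false, _) => WsRequest::Append { input: delta.to_vec() },
        (Some(delta), true, Some(id)) => WsRequest::Create {
            input: delta.to_vec(),
            previous_response_id: Some(id.to_string()),
        },
        // Not an extension, or V2 without a known response ID: send everything.
        _ => WsRequest::Create { input: current.to_vec(), previous_response_id: None },
    }
}

fn main() {
    let previous = vec!["user: hi".to_string()];
    let current = vec!["user: hi".to_string(), "tool: output".to_string()];
    println!("{:?}", plan_request(&previous, &current, false, None));
    println!("{:?}", plan_request(&previous, &current, true, Some("resp_1")));
    println!("{:?}", plan_request(&current, &previous, true, Some("resp_1")));
}
```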

Preconnect Mechanism

Preconnect warms a WebSocket connection during session initialization to reduce first-turn latency. The connection is established but no prompt is sent until a turn starts.

Preconnect Flow

Key properties:

Single-use: Once the first turn consumes the preconnected socket, subsequent turns establish fresh connections
Best-effort: Preconnect failures are handled silently; the first turn falls back to a normal connection
Timeout-aware: Turn metadata construction uses a 250 ms timeout to avoid blocking startup

Sources: codex-rs/core/src/client.rs:280-304 codex-rs/core/src/client.rs:316-351 codex-rs/core/src/turn_metadata.rs:1-81
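
The single-use and best-effort properties map naturally onto a Mutex<Option<...>> slot that the first turn empties with take(). The sketch below shows that pattern with a placeholder Connection type. The Preconnect wrapper and the connection_for_turn function are hypothetical, and the real preconnect field stores a PreconnectTask rather than a finished connection.

```rust
use std::sync::Mutex;

/// Placeholder for a live WebSocket connection.
#[derive(Debug, PartialEq)]
struct Connection {
    origin: &'static str,
}

/// Single-use slot for a warmed connection (hypothetical wrapper).
struct Preconnect {
    slot: Mutex<Option<Connection>>,
}

impl Preconnect {
    fn empty() -> Self {
        Self { slot: Mutex::new(None) }
    }

    /// Stores a warmed connection; a poisoned lock is ignored (best-effort).
    fn store(&self, connection: Connection) {
        if let Ok(mut guard) = self.slot.lock() {
            *guard = Some(connection);
        }
    }

    /// Removes and returns the warmed connection, leaving the slot empty.
    fn take(&self) -> Option<Connection> {
        let mut guard = self.slot.lock().ok()?;
        guard.take()
    }
}

/// Uses the warmed connection if one is available, else opens a fresh one.
fn connection_for_turn(preconnect: &Preconnect) -> Connection {
    preconnect
        .take()
        .unwrap_or(Connection { origin: "fresh" })
}

fn main() {
    let preconnect = Preconnect::empty();
    preconnect.store(Connection { origin: "preconnected" });
    println!("turn 1 uses {:?}", connection_for_turn(&preconnect));
    println!("turn 2 uses {:?}", connection_for_turn(&preconnect));
}
```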

Turn State Management

The x-codex-turn-state header implements sticky routing: the server sends an opaque token on turn start, and the client must replay it for all subsequent requests within the same turn.

Turn State Lifecycle

Turn State Contract

Phase	Client Behavior	Server Expectation
Turn start	Omit header	Server generates new routing token
Turn continuation	Send `x-codex-turn-state: <token>`	Server uses token for sticky routing
Next turn	Omit header	Server generates fresh token

The turn_state field is an Arc<OnceLock<String>> shared between handshake processing and request construction. Because each turn creates a fresh ModelClientSession, a token never carries over into the next turn.

Sources: codex-rs/core/src/client.rs:209-226 codex-rs/core/src/client.rs:515-545 codex-rs/core/tests/suite/turn_state.rs:1-126
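
The contract in the table can be sketched with std::sync::OnceLock, which accepts a value once and ignores later writes. The TurnSession type and its methods are hypothetical; only the header name and the Arc<OnceLock<String>> shape come from this page.

```rust
use std::sync::{Arc, OnceLock};

/// Header name taken from the text above.
const TURN_STATE_HEADER: &str = "x-codex-turn-state";

/// Hypothetical stand-in for the turn-scoped session.
struct TurnSession {
    turn_state: Arc<OnceLock<String>>,
}

impl TurnSession {
    fn new() -> Self {
        Self { turn_state: Arc::new(OnceLock::new()) }
    }

    /// Records the server's token; only the first value in a turn is kept.
    fn record_from_response(&self, header_value: Option<&str>) {
        if let Some(token) = header_value {
            let _ = self.turn_state.set(token.to_string());
        }
    }

    /// Headers for the next request: empty at turn start, token afterwards.
    fn request_headers(&self) -> Vec<(&'static str, String)> {
        self.turn_state
            .get()
            .map(|token| (TURN_STATE_HEADER, token.clone()))
            .into_iter()
            .collect()
    }
}

fn main() {
    let turn = TurnSession::new();
    println!("turn start:        {:?}", turn.request_headers());
    turn.record_from_response(Some("opaque-token-1"));
    println!("turn continuation: {:?}", turn.request_headers());

    // A new turn gets a new session, so the header is omitted again.
    let next_turn = TurnSession::new();
    println!("next turn:         {:?}", next_turn.request_headers());
}
```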

Fallback Mechanism

When WebSocket connections persistently fail, the client activates session-sticky HTTP fallback. Once enabled, all subsequent turns in the session use HTTP instead of attempting WebSocket.

Fallback Activation

Session-Sticky Property

The disable_websockets flag is an AtomicBool in ModelClientState, shared by all turns via Arc. Once set, it remains true for the session lifetime, so subsequent turns don't waste retry budget on WebSocket attempts.

Sources: codex-rs/core/src/client.rs:464-469 codex-rs/core/src/client.rs:571-578 codex-rs/core/src/client.rs:923-994 codex-rs/core/tests/suite/websocket_fallback.rs:1-107
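
A minimal sketch of the session-sticky flag, assuming the flag is flipped with an atomic swap so that only the first caller observes the transition. The Client and Transport types, the boolean return value of activate_http_fallback(), and the memory ordering are assumptions; the source only confirms the method name and the AtomicBool field.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Transport {
    WebSocket,
    Http,
}

/// Hypothetical subset of the shared session state.
struct SharedState {
    enable_responses_websockets: bool,
    disable_websockets: AtomicBool,
}

#[derive(Clone)]
struct Client {
    state: Arc<SharedState>,
}

impl Client {
    fn new(enable_responses_websockets: bool) -> Self {
        Self {
            state: Arc::new(SharedState {
                enable_responses_websockets,
                disable_websockets: AtomicBool::new(false),
            }),
        }
    }

    /// Sets the sticky flag; returns true only for the call that flipped it.
    fn activate_http_fallback(&self) -> bool {
        self.state.enable_responses_websockets
            && !self.state.disable_websockets.swap(true, Ordering::Relaxed)
    }

    /// Transport a new turn would pick given the current flags.
    fn transport(&self) -> Transport {
        let disabled = self.state.disable_websockets.load(Ordering::Relaxed);
        if self.state.enable_responses_websockets && !disabled {
            Transport::WebSocket
        } else {
            Transport::Http
        }
    }
}

fn main() {
    let client = Client::new(true);
    let later_turn = client.clone(); // shares the same flag through the Arc
    println!("before fallback: {:?}", later_turn.transport());
    println!("newly activated: {}", client.activate_http_fallback());
    println!("after fallback:  {:?}", later_turn.transport());
}
```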

Request Construction

Prompt to API Request

The Prompt struct (codex-rs/core/src/client_common.rs:25-65) contains:

input: conversation history as Vec<ResponseItem>
tools: available tool specifications
parallel_tool_calls: whether parallel execution is allowed
base_instructions: system instructions
personality: optional personality preference
output_schema: optional JSON schema for structured output

ModelClientSession::build_responses_request() converts this to codex_api::Prompt.

Sources: codex-rs/core/src/client_common.rs:25-65 codex-rs/core/src/client.rs:580-584

Request Options and Headers

build_responses_options() constructs ApiResponsesOptions with:

reasoning: effort and summary settings (if model supports reasoning)
include: ["reasoning.encrypted_content"] when reasoning is enabled
text: verbosity controls or output schema (mutually exclusive)
prompt_cache_key: conversation ID for caching
conversation_id: session identifier
session_source: session origin for telemetry
extra_headers: beta features and turn state
compression: request compression mode
turn_state: sticky routing lock

Sources: codex-rs/core/src/client.rs:586-656
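
Two of the rules above lend themselves to a small sketch: the include list depends on whether reasoning is enabled, and the text field carries either verbosity or an output schema, never both. The types and functions below are hypothetical. Giving the output schema precedence when both are supplied is an assumption made so that the example is total; this page states only that the two are mutually exclusive.

```rust
/// The `text` option holds exactly one of the two controls.
#[derive(Debug, PartialEq)]
enum TextControls {
    Verbosity(String),
    OutputSchema(String),
}

/// `include` entries requested from the server.
fn include_fields(reasoning_enabled: bool) -> Vec<&'static str> {
    if reasoning_enabled {
        vec!["reasoning.encrypted_content"]
    } else {
        Vec::new()
    }
}

/// Picks the single text control to send.
/// Assumption: an output schema takes precedence over verbosity.
fn text_controls(verbosity: Option<&str>, output_schema: Option<&str>) -> Option<TextControls> {
    match (output_schema, verbosity) {
        (Some(schema), _) => Some(TextControls::OutputSchema(schema.to_string())),
        (None, Some(level)) => Some(TextControls::Verbosity(level.to_string())),
        (None, None) => None,
    }
}

fn main() {
    println!("include: {:?}", include_fields(true));
    println!("text:    {:?}", text_controls(Some("low"), None));
    println!("text:    {:?}", text_controls(Some("low"), Some("{\"type\":\"object\"}")));
}
```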

Response Streaming

ResponseStream and ResponseEvent

ResponseStream (codex-rs/core/src/client_common.rs:225-236) is a Stream implementation wrapping mpsc::Receiver<Result<ResponseEvent>>. It emits events parsed from SSE or WebSocket messages.

ResponseEvent variants (codex-api crate, re-exported via codex-rs/core/src/client_common.rs:4):

Created: response started
ContentDelta: incremental text or reasoning
ItemStarted: new response item (message, function call, etc.)
ItemDone: item completed
Completed: response finished (includes token usage)
Failed: response failed with error
RateLimits: rate limit snapshot
ModelsEtag: models list version
ServerReasoningIncluded: reasoning support flag

Streaming Flow

Sources: codex-rs/core/src/client_common.rs:225-236 codex-rs/core/src/client.rs:849-921 codex-rs/core/src/client.rs:996-1216
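
The producer/consumer shape of the stream can be sketched with a channel of Result values: a background task pushes parsed events, and the consumer drains them until Completed or an error. To stay dependency-free, this sketch uses the blocking std::sync::mpsc channel and a thread instead of an async Stream. The variant payloads, the String error type, and both function names are assumptions, and only three of the variants listed above are modeled.

```rust
use std::sync::mpsc;
use std::thread;

/// Reduced, hypothetical version of the event enum.
#[derive(Debug, PartialEq)]
enum ResponseEvent {
    Created,
    ContentDelta(String),
    Completed { total_tokens: u64 },
}

type EventResult = Result<ResponseEvent, String>;

/// Spawns a producer that sends a fixed event sequence, as a transport task might.
fn spawn_producer(deltas: Vec<&'static str>, total_tokens: u64) -> mpsc::Receiver<EventResult> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut events = vec![Ok(ResponseEvent::Created)];
        events.extend(deltas.into_iter().map(|d| Ok(ResponseEvent::ContentDelta(d.to_string()))));
        events.push(Ok(ResponseEvent::Completed { total_tokens }));
        for event in events {
            // Stop early if the consumer dropped the receiver.
            if tx.send(event).is_err() {
                break;
            }
        }
    });
    rx
}

/// Drains the stream, concatenating deltas until `Completed` arrives.
fn collect_text(rx: mpsc::Receiver<EventResult>) -> Result<(String, u64), String> {
    let mut text = String::new();
    for event in rx {
        match event? {
            ResponseEvent::Created => {}
            ResponseEvent::ContentDelta(delta) => text.push_str(&delta),
            ResponseEvent::Completed { total_tokens } => return Ok((text, total_tokens)),
        }
    }
    Err("stream closed before Completed".to_string())
}

fn main() {
    let rx = spawn_producer(vec!["Hel", "lo"], 7);
    match collect_text(rx) {
        Ok((text, tokens)) => println!("text={text:?} tokens={tokens}"),
        Err(error) => println!("stream failed: {error}"),
    }
}
```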

Telemetry Integration

The client emits OpenTelemetry metrics via ApiTelemetry (codex-rs/core/src/client.rs:1077-1216):

API call counts and durations
Streaming event counts
WebSocket call/event counts
ResponsesAPI timing metrics (when include_timing_metrics is enabled)

These metrics feed into OtelManager runtime metrics tracking.

Sources: codex-rs/core/src/client.rs:1077-1216 codex-rs/core/tests/suite/client_websockets.rs:157-186

Error Handling and Retry

Retry Strategy

Retries are configured per-provider in ModelProviderInfo:

request_max_retries: unary request retry limit
stream_max_retries: streaming request retry limit
stream_idle_timeout_ms: timeout for idle streams

The client performs exponential backoff between retries (codex-rs/core/src/client.rs:996-1074), with special handling for:

401 Unauthorized: triggers auth refresh via UnauthorizedRecovery
429 Rate Limit: respects Retry-After header
Transient network errors: retries with backoff
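
The retry rules above can be sketched as a pure decision function. The base delay, the cap, the absence of jitter, and all type and function names here are assumptions chosen for the example; the actual constants and control flow in client.rs may differ.

```rust
use std::time::Duration;

/// Hypothetical classification of a failed request.
#[derive(Debug, PartialEq)]
enum Failure {
    Unauthorized,
    RateLimited { retry_after: Option<Duration> },
    Transient,
    Fatal,
}

/// What the retry loop should do next.
#[derive(Debug, PartialEq)]
enum Action {
    RefreshAuthThenRetry,
    RetryAfter(Duration),
    GiveUp,
}

/// Exponential backoff: 200 ms doubled per attempt, capped at 10 s (assumed values).
fn backoff(attempt: u32) -> Duration {
    let base_ms: u64 = 200;
    let shift = attempt.min(16); // keep the shift well inside u64 range
    Duration::from_millis((base_ms << shift).min(10_000))
}

/// Maps a failure and the attempt count to the next action.
fn next_action(failure: &Failure, attempt: u32, max_retries: u32) -> Action {
    if attempt >= max_retries {
        return Action::GiveUp;
    }
    match failure {
        Failure::Unauthorized => Action::RefreshAuthThenRetry,
        // A server-provided Retry-After wins over the computed delay.
        Failure::RateLimited { retry_after: Some(delay) } => Action::RetryAfter(*delay),
        Failure::RateLimited { retry_after: None } | Failure::Transient => {
            Action::RetryAfter(backoff(attempt))
        }
        Failure::Fatal => Action::GiveUp,
    }
}

fn main() {
    for attempt in 0..4 {
        println!("attempt {attempt}: {:?}", next_action(&Failure::Transient, attempt, 3));
    }
    let limited = Failure::RateLimited { retry_after: Some(Duration::from_secs(2)) };
    println!("rate limited: {:?}", next_action(&limited, 0, 3));
}
```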

Fallback on Exhaustion

When stream retries are exhausted on the WebSocket transport, the client:

Checks that WebSocket is enabled and fallback is not yet active
Calls activate_http_fallback() to set the session flag
Retries the request using HTTP transport
Uses HTTP automatically for all subsequent turns

Sources: codex-rs/core/src/client.rs:923-1074 codex-rs/core/tests/suite/websocket_fallback.rs:14-57
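
The exhaustion steps can be sketched as a small control-flow function that takes the two transports as closures. Backoff delays and HTTP-side retries are left out to keep the flow visible. The Session type, the stream_with_fallback function, and the closure signatures are hypothetical; only activate_http_fallback() and stream_max_retries are names from this page.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical subset of the session flags.
struct Session {
    websockets_enabled: bool,
    disable_websockets: AtomicBool,
}

impl Session {
    fn new(websockets_enabled: bool) -> Self {
        Self { websockets_enabled, disable_websockets: AtomicBool::new(false) }
    }

    fn use_websocket(&self) -> bool {
        self.websockets_enabled && !self.disable_websockets.load(Ordering::Relaxed)
    }

    /// Sets the sticky flag; returns true only when this call flipped it.
    fn activate_http_fallback(&self) -> bool {
        self.websockets_enabled && !self.disable_websockets.swap(true, Ordering::Relaxed)
    }
}

/// Tries WebSocket up to the retry budget, then switches the session to HTTP.
fn stream_with_fallback(
    session: &Session,
    stream_max_retries: u32,
    mut try_websocket: impl FnMut() -> Result<String, String>,
    mut try_http: impl FnMut() -> Result<String, String>,
) -> Result<String, String> {
    if session.use_websocket() {
        // One initial attempt plus `stream_max_retries` retries.
        for _ in 0..=stream_max_retries {
            if let Ok(output) = try_websocket() {
                return Ok(output);
            }
        }
        // Budget exhausted: flip the session flag before retrying over HTTP.
        session.activate_http_fallback();
    }
    try_http()
}

fn main() {
    let session = Session::new(true);
    let mut websocket_attempts = 0;
    let first = stream_with_fallback(
        &session,
        2,
        || {
            websocket_attempts += 1;
            Err("connection reset".to_string())
        },
        || Ok("served over HTTP".to_string()),
    );
    println!("turn 1: {first:?} after {websocket_attempts} WebSocket attempts");
    println!("turn 2 uses WebSocket: {}", session.use_websocket());
}
```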

Summary Table

Component	Scope	Key Responsibilities
`ModelClient`	Session	Auth, provider config, preconnect, fallback state
`ModelClientSession`	Turn	WebSocket connection, request tracking, turn state
`Prompt`	Request	Input items, tools, instructions, output schema
`ResponseStream`	Response	Event streaming to consumer
HTTP transport	Fallback	POST /v1/responses with SSE
WebSocket transport	Preferred	Persistent connection with JSON messages
Preconnect	Optimization	Warm connection during session init
Turn state	Routing	Sticky routing token for multi-request turns
Fallback	Resilience	Session-sticky HTTP after WebSocket failures

Sources: codex-rs/core/src/client.rs:1-1216 codex-rs/core/src/client_common.rs:1-347
