This page documents the abstract base classes and factory functions that define how language models are integrated in LangChain: BaseLanguageModel, BaseChatModel, BaseLLM, and the init_chat_model factory. It covers the class hierarchy, abstract methods implementors must provide, and runtime features such as caching, rate limiting, streaming control, and output versioning.
For the Runnable base interface that all language models inherit, see 2.1. For specific partner integrations (OpenAI, Anthropic, etc.), see 3.1 and 3.2. For provider-agnostic patterns across integrations, see 3.5.
All language model wrappers inherit from BaseLanguageModel, which itself extends RunnableSerializable. Two concrete branches exist:
- BaseChatModel: accepts structured message lists and returns AIMessage. Used by all modern providers.
- BaseLLM: accepts raw strings and returns str. Used by older completion-style APIs.

Language Model Inheritance
Sources: libs/core/langchain_core/language_models/base.py139-141 libs/core/langchain_core/language_models/chat_models.py246-294 libs/core/langchain_core/language_models/llms.py292-296
Defined in libs/core/langchain_core/language_models/base.py139-373
BaseLanguageModel is a generic abstract class parameterized by its output type (AIMessage for chat models, str for LLMs). It defines fields and utilities shared by all model wrappers.
| Field | Type | Default | Description |
|---|---|---|---|
| cache | BaseCache \| bool \| None | None | Response caching. True = global cache, False = no cache, None = use global if set, BaseCache instance = that cache. |
| verbose | bool | global setting | Whether to print response text. |
| callbacks | Callbacks | None | Callbacks attached to all runs from this instance. |
| tags | list[str] \| None | None | Tags added to all run traces. |
| metadata | dict[str, Any] \| None | None | Metadata added to all run traces. |
| custom_get_token_ids | Callable[[str], list[int]] \| None | None | Optional custom tokenizer for token counting. |
Sources: libs/core/langchain_core/language_models/base.py148-174
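The four cache settings in the table can be read as a small resolution function. The following is a minimal sketch of that logic, not the library's implementation: DictCache and resolve_cache are hypothetical names, and raising ValueError when cache=True is used without a global cache is an assumption.

```python
from typing import Optional, Union


class DictCache:
    """Hypothetical stand-in for a BaseCache implementation."""

    def __init__(self) -> None:
        self.store: dict = {}


def resolve_cache(
    cache: Union[DictCache, bool, None],
    global_cache: Optional[DictCache],
) -> Optional[DictCache]:
    """Pick the cache to use for one call, following the table above."""
    if isinstance(cache, DictCache):
        return cache  # an explicit instance always wins
    if cache is None:
        return global_cache  # use the global cache only if one is set
    if cache is True:
        if global_cache is None:
            # Assumption: requesting a global cache that was never set is an error.
            raise ValueError("cache=True requires a global cache to be set")
        return global_cache
    return None  # cache is False: caching disabled
```

With this model, resolve_cache(False, some_global_cache) returns None, so a single instance can opt out of a globally configured cache.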
| Method | Description |
|---|---|
| get_token_ids(text) | Returns an ordered list of token IDs. Uses custom_get_token_ids if set, otherwise falls back to the GPT-2 tokenizer. |
| get_num_tokens(text) | Returns the number of tokens in text. |
| get_num_tokens_from_messages(messages, tools) | Returns the total token count across a list of BaseMessage objects. |
The fallback tokenizer (_get_token_ids_default_method) uses the GPT-2 tokenizer from the transformers package and emits a warning noting that counts may be inaccurate for non-GPT-2 models. Provider subclasses should override these methods with model-specific tokenizers.
Sources: libs/core/langchain_core/language_models/base.py307-373
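The relationship between the custom tokenizer hook and the fallback can be sketched as follows. This is an illustrative model only: TokenCounter is a hypothetical class, and a whitespace splitter stands in for the GPT-2 tokenizer so the sketch needs no third-party packages.

```python
import warnings
from typing import Callable, Optional


class TokenCounter:
    """Illustrative model of the token-counting fallback; not the library class."""

    def __init__(
        self, custom_get_token_ids: Optional[Callable[[str], list[int]]] = None
    ) -> None:
        self.custom_get_token_ids = custom_get_token_ids

    def _default_token_ids(self, text: str) -> list[int]:
        # Stand-in for the GPT-2 fallback: one fake ID per whitespace-separated word.
        warnings.warn("Using a generic tokenizer; counts may be inaccurate.")
        return [sum(map(ord, word)) % 50_000 for word in text.split()]

    def get_token_ids(self, text: str) -> list[int]:
        # The custom tokenizer, when provided, takes priority over the fallback.
        if self.custom_get_token_ids is not None:
            return self.custom_get_token_ids(text)
        return self._default_token_ids(text)

    def get_num_tokens(self, text: str) -> int:
        return len(self.get_token_ids(text))
```

Passing a byte-level function such as lambda s: list(s.encode("utf-8")) as custom_get_token_ids changes the counts without touching get_num_tokens, which is the same reason a provider subclass would supply its own tokenizer.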
BaseLanguageModel requires subclasses to implement:
- generate_prompt(prompts, stop, callbacks, **kwargs) -> LLMResult
- agenerate_prompt(prompts, stop, callbacks, **kwargs) -> LLMResult
- with_structured_output(schema, **kwargs) -> Runnable (optional, raises NotImplementedError by default)

| Alias | Value | Description |
|---|---|---|
| LanguageModelInput | PromptValue \| str \| Sequence[MessageLikeRepresentation] | Accepted input types |
| LanguageModelOutput | BaseMessage \| str | Output type |
| LanguageModelLike | Runnable[LanguageModelInput, LanguageModelOutput] | Duck-type for anything that behaves like a model |
Sources: libs/core/langchain_core/language_models/base.py122-132
Defined in libs/core/langchain_core/language_models/chat_models.py246
BaseChatModel extends BaseLanguageModel[AIMessage]. Partner integrations subclass this and implement a small set of methods.
| Field | Type | Default | Description |
|---|---|---|---|
| rate_limiter | BaseRateLimiter \| None | None | Controls request throughput. acquire()/aacquire() is called before each API call, including streaming calls. |
| disable_streaming | bool \| Literal["tool_calling"] | False | Controls when streaming is bypassed (see below). |
| output_version | str \| None | env LC_OUTPUT_VERSION | Content format stored in AIMessage. 'v1' = standardized blocks; 'v0' = provider-specific. |
| profile | ModelProfile \| None | None | Beta. Model capability profile (context window, supported modalities, etc.). Auto-loaded from partner package if available. |
Sources: libs/core/langchain_core/language_models/chat_models.py296-358
Custom chat models must implement _generate and the _llm_type property. All other methods are optional.
| Method | Required | Signature | Description |
|---|---|---|---|
| _generate | Yes | (messages, stop, run_manager, **kwargs) -> ChatResult | Core invocation logic. |
| _llm_type | Yes (property) | -> str | Unique string identifier for logging. |
| _identifying_params | No (property) | -> Mapping[str, Any] | Model parameters for tracing. |
| _stream | No | (messages, stop, run_manager, **kwargs) -> Iterator[ChatGenerationChunk] | Sync streaming. |
| _agenerate | No | (messages, stop, run_manager, **kwargs) -> ChatResult | Native async generation. |
| _astream | No | (messages, stop, run_manager, **kwargs) -> AsyncIterator[ChatGenerationChunk] | Async streaming. |
Sources: libs/core/langchain_core/language_models/chat_models.py280-294
Method dispatch and call flow
Sources: libs/core/langchain_core/language_models/chat_models.py388-733
disable_streaming
The _should_stream method (libs/core/langchain_core/language_models/chat_models.py439-477) decides whether to use the streaming API for a given call. The disable_streaming field controls this:
| Value | Effect |
|---|---|
| False (default) | Always use streaming if _stream/_astream is implemented. |
| True | Always bypass streaming; stream()/astream() fall back to invoke()/ainvoke(). |
| "tool_calling" | Bypass streaming only when a tools keyword argument is provided. |
Additionally, if a _StreamingCallbackHandler is present in the active callback managers, streaming will be activated even if not explicitly requested.
Sources: libs/core/langchain_core/language_models/chat_models.py299-315 libs/core/langchain_core/language_models/chat_models.py439-477
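The decision described above can be condensed into a single predicate. This is a simplified sketch under stated assumptions, not the body of _should_stream: the function name and its boolean parameters (has_stream_impl, tools_passed, stream_requested, has_streaming_callback) are hypothetical inputs that summarize state the real method reads from the instance, kwargs, and callback managers.

```python
from typing import Literal, Union


def should_stream(
    disable_streaming: Union[bool, Literal["tool_calling"]],
    *,
    has_stream_impl: bool,
    tools_passed: bool = False,
    stream_requested: bool = False,
    has_streaming_callback: bool = False,
) -> bool:
    """Simplified decision table for whether a call uses the streaming API."""
    if not has_stream_impl:
        return False  # no _stream/_astream to stream with
    if disable_streaming is True:
        return False  # streaming is always bypassed
    if disable_streaming == "tool_calling" and tools_passed:
        return False  # bypass only when tools are part of the call
    # Streaming is used when requested, or when a streaming callback is attached.
    return stream_requested or has_streaming_callback
```

The last line reflects the note about _StreamingCallbackHandler: an attached streaming handler switches streaming on even for a plain invoke-style call, unless one of the earlier bypass rules applies.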
output_version
The output_version field controls the format of content stored in AIMessage.content for streamed responses. When set to 'v1', the _update_message_content_to_blocks utility is called on each chunk to rewrite content into a standardized block format consistent with AIMessage.content_blocks. The 'v0' format preserves provider-specific content representations. This can also be set via the environment variable LC_OUTPUT_VERSION.
Sources: libs/core/langchain_core/language_models/chat_models.py317-338
Caching applies at the generate() level (not streaming). The cache key is built by _get_llm_string(), which serializes the model configuration and call parameters. On a cache hit, stored ChatGeneration objects are returned; on a cache miss, _generate() is called and the result is stored.
For cached responses, the total_cost field in usage_metadata is set to 0 (libs/core/langchain_core/language_models/chat_models.py740-778).
Sources: libs/core/langchain_core/language_models/chat_models.py830-843 libs/core/tests/unit_tests/language_models/chat_models/test_cache.py42-101
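The hit/miss flow can be illustrated with a plain dictionary as the cache. This is a minimal sketch, assuming a JSON-serialized key and dict-shaped results; llm_string and generate_with_cache are hypothetical helpers that mirror the roles of _get_llm_string() and the cached generate path, not their actual code.

```python
import json
from typing import Callable


def llm_string(model_config: dict, call_params: dict) -> str:
    """Serialize configuration and call parameters into a stable key component."""
    return json.dumps({"config": model_config, "params": call_params}, sort_keys=True)


def generate_with_cache(
    cache: dict,
    prompt: str,
    model_config: dict,
    call_params: dict,
    generate: Callable[[str], dict],
) -> dict:
    key = (prompt, llm_string(model_config, call_params))
    if key in cache:
        hit = dict(cache[key])  # copy so the stored entry is not mutated
        # A cached response costs nothing, so its total_cost is reported as 0.
        hit["usage_metadata"] = {**hit.get("usage_metadata", {}), "total_cost": 0}
        return hit
    result = generate(prompt)  # cache miss: call the model and store the result
    cache[key] = result
    return result
```

Because the key includes the call parameters, the same prompt sent with a different stop sequence or temperature is treated as a separate entry.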
| Method | Description |
|---|---|
| bind_tools(tools, **kwargs) | Returns a new Runnable that always passes tools to the model. |
| with_structured_output(schema, **kwargs) | Returns a Runnable that coerces model output to schema. The default implementation relies on bind_tools, so subclasses must implement bind_tools or override this method. |
| with_retry(**kwargs) | Wraps the model with retry logic (inherited from Runnable). |
| with_fallbacks(fallbacks, **kwargs) | Wraps the model with fallback models on failure. |
| configurable_fields(**kwargs) | Makes specified init fields configurable at runtime. |
| configurable_alternatives(which, **kwargs) | Makes the model swappable at runtime. |
_get_ls_params() produces a LangSmithParams dict for tracing. It auto-extracts model/model_name, temperature, and max_tokens from instance attributes or kwargs. The ls_model_type is always 'chat' for BaseChatModel.
The LangSmithParams TypedDict is defined in libs/core/langchain_core/language_models/base.py49-72
Defined in libs/core/langchain_core/language_models/llms.py292
BaseLLM extends BaseLanguageModel[str]. It accepts and returns plain strings. Its invoke() returns str; its stream() yields str chunks.
| Method | Required | Description |
|---|---|---|
| _generate(prompts, stop, run_manager, **kwargs) -> LLMResult | Yes | Takes a list of string prompts and returns an LLMResult. |
| _stream(prompt, stop, run_manager, **kwargs) -> Iterator[GenerationChunk] | No | Sync streaming. Raises NotImplementedError by default. |
| _agenerate(prompts, stop, run_manager, **kwargs) -> LLMResult | No | Async generation. Defaults to running _generate in a thread executor. |
| _astream(prompt, stop, run_manager, **kwargs) -> AsyncIterator[GenerationChunk] | No | Async streaming. Defaults to wrapping _stream in an executor. |
The LLM helper class (a concrete subclass of BaseLLM) simplifies implementation by exposing a single _call(prompt, stop, run_manager, **kwargs) -> str method.
BaseLLM.batch() overrides the default Runnable batching behavior. If max_concurrency is not set in the config, it calls generate_prompt() once with all prompts (enabling providers that support native batching). If max_concurrency is set, it splits inputs into chunks and processes them sequentially.
Sources: libs/core/langchain_core/language_models/llms.py416-461
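The two batching paths can be sketched with a plain function. This is a simplified model of the behavior described above, not the BaseLLM.batch() source: batch_prompts is a hypothetical name, and generate_prompt here is any callable that maps a list of prompts to a list of outputs.

```python
from typing import Callable, Optional


def batch_prompts(
    prompts: list[str],
    generate_prompt: Callable[[list[str]], list[str]],
    max_concurrency: Optional[int] = None,
) -> list[str]:
    if not prompts:
        return []
    if max_concurrency is None:
        # One call with every prompt, so a provider can batch natively.
        return generate_prompt(prompts)
    results: list[str] = []
    # Otherwise process fixed-size chunks one after another.
    for start in range(0, len(prompts), max_concurrency):
        chunk = prompts[start : start + max_concurrency]
        results.extend(generate_prompt(chunk))
    return results
```

With five prompts and max_concurrency=2, the callable is invoked three times with chunks of 2, 2, and 1 prompts, and the outputs keep the input order.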
Defined in libs/langchain_v1/langchain/chat_models/base.py208-488
init_chat_model is the primary entry point for provider-agnostic model initialization. It is exported from the langchain package as langchain.chat_models.init_chat_model.
Sources: libs/langchain_v1/langchain/chat_models/base.py208-499
| Parameter | Type | Description |
|---|---|---|
| model | str \| None | Model name, optionally prefixed with "provider:" (e.g. "openai:gpt-4o"). |
| model_provider | str \| None | Explicit provider key. Inferred from model name if omitted. |
| configurable_fields | None \| "any" \| list[str] \| tuple[str, ...] | Which fields can be overridden at runtime via RunnableConfig. |
| config_prefix | str \| None | Prefix for configurable keys, e.g. "foo" makes model configurable as "foo_model". |
| **kwargs | Any | Passed directly to the provider's chat model constructor (e.g. temperature, max_tokens). |
The _attempt_infer_model_provider function (libs/langchain_v1/langchain/chat_models/base.py502-566) maps model name prefixes to providers:
| Model prefix | Inferred provider |
|---|---|
| gpt-, o1, o3, chatgpt, text-davinci | openai |
| claude | anthropic |
| command | cohere |
| accounts/fireworks | fireworks |
| gemini | google_vertexai |
| amazon., anthropic., meta. | bedrock |
| mistral, mixtral | mistralai |
| deepseek | deepseek |
| grok | xai |
| sonar | perplexity |
| solar | upstage |
The provider:model format (e.g. "anthropic:claude-opus-4-1") is also parsed directly and takes precedence over inference.
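The prefix table and the provider:model rule combine roughly as shown below. This is a sketch built only from the table in this section: infer_provider and parse_model are hypothetical names, lowercasing the model name before matching is an assumption, and treating a "prefix:" as a provider only when the prefix is a known provider key is also an assumption.

```python
from typing import Optional

# Prefix table copied from the section above.
_PREFIX_TO_PROVIDER: list[tuple[tuple[str, ...], str]] = [
    (("gpt-", "o1", "o3", "chatgpt", "text-davinci"), "openai"),
    (("claude",), "anthropic"),
    (("command",), "cohere"),
    (("accounts/fireworks",), "fireworks"),
    (("gemini",), "google_vertexai"),
    (("amazon.", "anthropic.", "meta."), "bedrock"),
    (("mistral", "mixtral"), "mistralai"),
    (("deepseek",), "deepseek"),
    (("grok",), "xai"),
    (("sonar",), "perplexity"),
    (("solar",), "upstage"),
]
_KNOWN_PROVIDERS = {provider for _, provider in _PREFIX_TO_PROVIDER}


def infer_provider(model: str) -> Optional[str]:
    """Return the provider whose prefix matches the model name, if any."""
    name = model.lower()  # assumption: matching is case-insensitive
    for prefixes, provider in _PREFIX_TO_PROVIDER:
        if name.startswith(prefixes):
            return provider
    return None


def parse_model(
    model: str, model_provider: Optional[str] = None
) -> tuple[str, Optional[str]]:
    """Split an explicit 'provider:model' string, else fall back to inference."""
    if model_provider is None and ":" in model:
        prefix, rest = model.split(":", 1)
        if prefix in _KNOWN_PROVIDERS:  # explicit prefix takes precedence
            return rest, prefix
    return model, model_provider or infer_provider(model)
```

For example, parse_model("anthropic:claude-opus-4-1") yields the pair ("claude-opus-4-1", "anthropic"), while a name with no recognized prefix yields None as the provider, which a caller would have to treat as an error or resolve explicitly.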
_BUILTIN_PROVIDERS Registry
_BUILTIN_PROVIDERS (libs/langchain_v1/langchain/chat_models/base.py38-98) is a dict mapping provider keys to (module_path, class_name, creator_func) tuples. It is used by _get_chat_model_creator() (LRU-cached) to import and instantiate the appropriate class on first use.
| Provider key | Module | Class |
|---|---|---|
| openai | langchain_openai | ChatOpenAI |
| anthropic | langchain_anthropic | ChatAnthropic |
| azure_openai | langchain_openai | AzureChatOpenAI |
| google_vertexai | langchain_google_vertexai | ChatVertexAI |
| groq | langchain_groq | ChatGroq |
| mistralai | langchain_mistralai | ChatMistralAI |
| ollama | langchain_ollama | ChatOllama |
| deepseek | langchain_deepseek | ChatDeepSeek |
| xai | langchain_xai | ChatXAI |
| perplexity | langchain_perplexity | ChatPerplexity |
| fireworks | langchain_fireworks | ChatFireworks |
| (and more) | | |
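The lazy-import pattern behind the registry can be sketched with the standard library alone. This is an illustration of the technique, not the library's code: _REGISTRY and get_creator are hypothetical, the tuples are simplified to (module_path, class_name), and stdlib modules stand in for partner packages so the sketch runs without any integrations installed.

```python
import importlib
from functools import lru_cache
from typing import Any, Callable

# Stand-in registry: stdlib modules replace the partner packages.
_REGISTRY: dict[str, tuple[str, str]] = {
    "fraction": ("fractions", "Fraction"),
    "decimal": ("decimal", "Decimal"),
}


@lru_cache(maxsize=None)
def get_creator(provider: str) -> Callable[..., Any]:
    """Import the provider's module on first use and return its class."""
    if provider not in _REGISTRY:
        raise ValueError(f"Unsupported provider: {provider!r}")
    module_path, class_name = _REGISTRY[provider]
    module = importlib.import_module(module_path)  # deferred until requested
    return getattr(module, class_name)
```

Deferring the import means a missing optional package only fails when that provider is actually requested, and the LRU cache avoids repeating the lookup on later calls.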
| Condition | Return type |
|---|---|
| model given, configurable_fields is None | A fully-initialized BaseChatModel instance. |
| model not given (or configurable_fields set) | A _ConfigurableModel that defers initialization until invoke() is called with a config. |
_ConfigurableModel
_ConfigurableModel (libs/langchain_v1/langchain/chat_models/base.py607) is a Runnable that acts as a proxy. It stores default parameters and a queue of deferred declarative operations (e.g. bind_tools, with_structured_output). On each invoke()/stream() call, it:

- Merges the stored defaults with config["configurable"] values.
- Instantiates the concrete BaseChatModel via _init_chat_model_helper.

This design allows calling bind_tools() or with_structured_output() on a configurable model before the provider is known. The queued operations are applied to the concrete model once it exists.
Security note: Setting configurable_fields="any" allows runtime override of fields like api_key and base_url. Use explicit field lists when accepting untrusted configurations.
Sources: libs/langchain_v1/langchain/chat_models/base.py607-659 libs/langchain_v1/tests/unit_tests/chat_models/test_chat_models.py114-236
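The proxy-with-queue idea can be shown in a few lines. This is a minimal sketch under stated assumptions rather than the _ConfigurableModel source: EchoModel and ConfigurableProxy are hypothetical classes, only bind_tools is queued, and details such as config_prefix and the configurable_fields allow-list are left out.

```python
from typing import Any, Optional


class EchoModel:
    """Hypothetical concrete model used only for this sketch."""

    def __init__(self, model: str, temperature: float = 0.0) -> None:
        self.model = model
        self.temperature = temperature
        self.tools: list[str] = []

    def bind_tools(self, tools: list[str]) -> "EchoModel":
        bound = EchoModel(self.model, self.temperature)
        bound.tools = list(tools)
        return bound

    def invoke(self, prompt: str) -> str:
        return f"{self.model}(t={self.temperature}, tools={self.tools}): {prompt}"


class ConfigurableProxy:
    def __init__(self, defaults: Optional[dict] = None, queued: tuple = ()) -> None:
        self._defaults = dict(defaults or {})
        self._queued = queued  # (method_name, args, kwargs) triples

    def bind_tools(self, *args: Any, **kwargs: Any) -> "ConfigurableProxy":
        # Record the call instead of running it; no concrete model exists yet.
        op = ("bind_tools", args, kwargs)
        return ConfigurableProxy(self._defaults, self._queued + (op,))

    def _build(self, config: Optional[dict]) -> EchoModel:
        overrides = (config or {}).get("configurable", {})
        model = EchoModel(**{**self._defaults, **overrides})  # merge, then create
        for name, args, kwargs in self._queued:  # replay the deferred operations
            model = getattr(model, name)(*args, **kwargs)
        return model

    def invoke(self, prompt: str, config: Optional[dict] = None) -> str:
        return self._build(config).invoke(prompt)
```

A proxy created with defaults {"temperature": 0.5} can have bind_tools(["search"]) called on it immediately; the model name is supplied later through config={"configurable": {"model": "m1"}} at invoke time. Because every override is passed to the constructor, this sketch also shows why an unrestricted "any" setting is risky with untrusted configs.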
Generation wrapper types used internally
BaseChatModel produces ChatResult internally; public invoke() unwraps this to return the AIMessage directly. The streaming methods yield AIMessageChunk objects.
Sources: libs/core/langchain_core/language_models/chat_models.py54-63
Minimum required implementation:
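```python
from typing import Any

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


class MyChatModel(BaseChatModel):
    model: str

    @property
    def _llm_type(self) -> str:
        # Identifier used in logs and traces.
        return "my-chat-model"

    def _generate(
        self,
        messages: list[BaseMessage],
        stop: list[str] | None = None,
        run_manager: CallbackManagerForLLMRun | None = None,
        **kwargs: Any,
    ) -> ChatResult:
        # Call your API here; call_my_api is a placeholder for your own client code.
        response_text = call_my_api(messages)
        return ChatResult(
            generations=[ChatGeneration(message=AIMessage(content=response_text))]
        )
```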
To add streaming, implement _stream yielding ChatGenerationChunk objects wrapping AIMessageChunk. To add async, implement _agenerate and/or _astream.
Sources: libs/core/langchain_core/language_models/chat_models.py280-294 libs/core/tests/unit_tests/language_models/chat_models/test_base.py186-215
langchain_core ships several fake implementations for unit tests:
| Class | Module | Description |
|---|---|---|
| FakeListChatModel | fake_chat_models.py | Cycles through a list of string responses. |
| FakeMessagesListChatModel | fake_chat_models.py | Cycles through a list of BaseMessage responses. |
| FakeChatModel | fake_chat_models.py | Always returns "fake response". |
| GenericFakeChatModel | fake_chat_models.py | Accepts an Iterator[AIMessage]; streams by splitting on word boundaries. |
| ParrotFakeChatModel | (exported from language_models) | Returns the last message in the input unchanged. |
| FakeListLLM | fake_llms.py | BaseLLM equivalent of FakeListChatModel. |
Sources: libs/core/langchain_core/language_models/fake_chat_models.py21-390