This page documents the HuggingFace provider implementations, which enable access to models hosted on HuggingFace's platform. Three distinct providers offer different access patterns:
- HuggingChat: Authenticated access to HuggingFace's chat UI with conversation management, web search, and image generation tools
- HuggingFaceInference: Direct API access to HuggingFace's Inference API, supporting text and image generation
- HuggingSpace and HuggingFaceMedia: Specialized providers for Space-hosted models and video generation

These providers demonstrate dynamic model discovery, conversation state management, and flexible authentication patterns used throughout the g4f system.
Related documentation: Provider Base Classes, Provider Selection & AnyProvider, Authentication Overview
The HuggingFace ecosystem consists of three provider groups (four classes in total) with distinct capabilities and authentication requirements:
Provider Comparison
| Provider | Base Class | Authentication | Model Discovery | Streaming | Image Support | Conversation State |
|---|---|---|---|---|---|---|
| HuggingChat | AsyncAuthedProvider | Required (cookies/nodriver) | API + static fallback | Yes | Yes (via tool) | Yes (per-model tracking) |
| HuggingFaceInference | AsyncGeneratorProvider | Optional (API key) | API + dynamic trending | Yes | Yes (native + router) | No |
| HuggingFaceMedia | AsyncGeneratorProvider | No | Static video models | No | Video generation | No |
| HuggingSpace | Provider | No | Static | No | Varies by space | No |
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py33-47 g4f/Provider/needs_auth/hf/HuggingFaceInference.py24-33 g4f/providers/any_provider.py27-30
HuggingChat is an AsyncAuthedProvider that interfaces with HuggingFace's chat UI at https://huggingface.co/chat. It requires authentication and extends the provider base classes for authenticated providers and model management:
Class Definition:
The provider supports conversation persistence with per-model conversation tracking, web search integration, image generation through tools, and reasoning model support.
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py33-47 g4f/Provider/needs_auth/hf/HuggingChat.py62-80
HuggingChat maintains conversation state using a custom Conversation class that extends JsonConversation. The system creates per-model conversations and tracks message IDs for continuation.
Conversation State Structure
The Conversation class (defined at g4f/Provider/needs_auth/hf/HuggingChat.py29-31) stores model-specific conversation data, enabling conversation continuity across multiple exchanges:
This architecture allows switching between models within the same client session while maintaining separate conversation contexts.
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py29-31 g4f/Provider/needs_auth/hf/HuggingChat.py82-176 g4f/Provider/needs_auth/hf/HuggingChat.py177-193 g4f/Provider/needs_auth/hf/HuggingChat.py195-209
HuggingChat uses curl_cffi for HTTP requests with multipart form data for file uploads. The response is a Server-Sent Events (SSE) stream with JSON objects.
Request Data Structure
The request payload sent to HuggingFace's conversation endpoint:
The tools array includes "000000000000000000000001" when using image generation models to enable the built-in image generation tool.
Response Event Types
The SSE stream from HuggingFace returns JSON objects with different type fields:
| Type | Description | Response Object Yielded |
|---|---|---|
| stream | Text token | String token (with null bytes stripped) |
| finalAnswer | Generation complete | FinishReason("stop") |
| file | Generated image | ImageResponse(url, prompt, cookies) |
| webSearch | Search results | Sources(search_results) |
| title | Conversation title | TitleGeneration(title) |
| reasoning | Reasoning trace | Reasoning(token, status) |
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py117-147 g4f/Provider/needs_auth/hf/HuggingChat.py149-176
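The event table above can be read as a dispatch on the `type` field. The following is a minimal sketch of that idea, not the provider's code: the `Event` class is a stand-in for g4f's own response objects, and the `token` field name on `stream` events is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass
class Event:
    """Stand-in for g4f response objects (hypothetical, for this sketch only)."""
    kind: str        # which row of the event table matched
    payload: object  # raw data carried by the event


def parse_chat_events(lines: Iterable[str]) -> Iterator[Union[str, Event]]:
    """Map each JSON line of the stream to a text token or a tagged Event."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            item = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore partial or non-JSON lines
        kind = item.get("type")
        if kind == "stream":
            # Text token; null bytes are stripped, as the table describes.
            # The "token" key is an assumed field name.
            yield item.get("token", "").replace("\u0000", "")
        elif kind == "finalAnswer":
            yield Event("finish", "stop")
        elif kind in ("file", "webSearch", "title", "reasoning"):
            yield Event(kind, item)
```

A caller would typically concatenate the `str` items into the reply text and handle `Event` items (images, sources, titles, reasoning) separately.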
Media files are encoded as base64 and uploaded via multipart form data with special filename prefixes:
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py134-139 g4f/Provider/needs_auth/hf/HuggingChat.py167-169
HuggingChat performs dynamic model discovery from the API with static fallback support:
Dynamic Discovery:
Static Fallback:
When API discovery fails, fallback models from g4f/Provider/needs_auth/hf/models.py9-20 are used:
This two-tier approach ensures the provider remains operational even when the HuggingFace API is unavailable.
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py49-60 g4f/Provider/needs_auth/hf/models.py1-58
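The two-tier approach can be sketched as a function that tries the API first and falls back to a static list. This is an illustration under stated assumptions: `discover_models`, the injected `fetch` callable, and the contents of `FALLBACK_MODELS` are hypothetical and do not mirror the provider's actual method.

```python
from typing import Callable, List

# Hypothetical fallback list for this sketch; the real one lives in models.py.
FALLBACK_MODELS: List[str] = ["meta-llama/Llama-3.3-70B-Instruct"]


def discover_models(fetch: Callable[[], List[str]],
                    fallback: List[str] = FALLBACK_MODELS) -> List[str]:
    """Return models from the API, or the static list if discovery fails."""
    try:
        models = fetch()
    except Exception:
        # Network errors, auth errors and malformed responses all land here.
        return list(fallback)
    # An empty result is treated as a failed discovery as well (assumption).
    return models if models else list(fallback)
```

Passing the fetch step in as a callable keeps the fallback logic testable without network access.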
HuggingFaceInference is an AsyncGeneratorProvider that directly interfaces with HuggingFace's Inference API. Unlike HuggingChat, it does not require authentication for most models but supports API key authentication for improved rate limits and access to gated models.
Class Definition:
The provider interfaces with three primary endpoints:
- api-inference.huggingface.co/models/{model} - Standard inference
- huggingface.co/api/models - Model discovery
- router.huggingface.co/together/v1/images/generations - Together router for FLUX models

Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py24-36
HuggingFaceInference implements trending-based model discovery to automatically surface popular models:
Discovery Process:
Priority Order:
models.py)

This approach ensures users have access to both stable, well-tested models and newly popular models from the community.
Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py37-53
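One way to picture trending-based discovery is a merge of a static list with API entries that pass a popularity threshold. The sketch below is an assumption-laden illustration: the `trendingScore` field name, the threshold of 10, and the static-first ordering are all assumed here rather than taken from the provider.

```python
from typing import Dict, Iterable, List


def merge_trending(static_models: List[str],
                   api_entries: Iterable[Dict],
                   min_score: float = 10) -> List[str]:
    """Combine static models with trending ones, without duplicates."""
    # Keep only API entries whose (assumed) trendingScore passes the threshold.
    trending = [entry["id"] for entry in api_entries
                if entry.get("trendingScore", 0) >= min_score]
    merged: List[str] = []
    for name in static_models + trending:  # assumed order: static first
        if name not in merged:
            merged.append(name)
    return merged
```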
The provider uses model metadata to determine the appropriate request format and parameters:
Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py55-64 g4f/Provider/needs_auth/hf/HuggingFaceInference.py109-161
The provider uses model-specific prompt formats based on the model_type and tokenizer configuration:
| Model Type | Format Function | Template Example |
|---|---|---|
| gpt2, gpt_neo, gemma | format_prompt() | Standard Q&A format |
| mistral (mistralai author) | format_prompt_mistral() | `<s>[INST]...[/INST]...</s>` |
| qwen | format_prompt_qwen() | `<\|im_start\|>role\ncontent<\|im_end\|>` |
| llama | format_prompt_llama() | `<\|begin_of_text\|><\|start_header_id\|>...` |
| Custom (by eos_token) | format_prompt_custom() | `<\|role\|>\ncontent{eos_token}` |
Format Examples:
Mistral format (Mistral author models):
<s>[INST]What is machine learning? [/INST] Machine learning is...</s>
<s>[INST] Can you explain more? [/INST]
Qwen format (Qwen models):
<|im_start|>system
You are a helpful assistant
<|im_end|>
<|im_start|>user
What is machine learning?
<|im_end|>
<|im_start|>assistant
Llama format (Meta Llama models):
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant
<|eot_id|>
<|start_header_id|>user<|end_header_id|>
What is machine learning?
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
The formatting functions are defined at g4f/Provider/needs_auth/hf/HuggingFaceInference.py192-255
Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py192-255
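To make the Qwen and Llama layouts concrete, here is a small sketch that reproduces the two format examples above. The function names `sketch_format_qwen` and `sketch_format_llama` are hypothetical and deliberately differ from the provider's own functions; exact whitespace follows the examples on this page and may differ from the real implementation.

```python
from typing import Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}


def sketch_format_qwen(messages: List[Message]) -> str:
    """Render messages in the Qwen layout and open an assistant turn."""
    turns = [f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>\n"
             for m in messages]
    return "".join(turns) + "<|im_start|>assistant\n"


def sketch_format_llama(messages: List[Message]) -> str:
    """Render messages in the Llama layout and open an assistant turn."""
    turns = [f"<|start_header_id|>{m['role']}<|end_header_id|>\n"
             f"{m['content']}\n<|eot_id|>\n"
             for m in messages]
    return ("<|begin_of_text|>" + "".join(turns)
            + "<|start_header_id|>assistant<|end_header_id|>\n")
```

Both functions end the prompt with an open assistant header so that the model's completion becomes the next assistant message.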
For text generation, the provider handles streaming responses by parsing Server-Sent Events:
SSE Data Format:
Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py163-191
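A streaming text response of this kind can be consumed by filtering for `data:` lines and decoding each JSON payload. The sketch below assumes the payload shape used by text-generation streaming endpoints, `{"token": {"text": ..., "special": ...}}`; that shape and the function name `iter_stream_text` are assumptions, not details confirmed by this page.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from SSE lines of the form b'data:{...}'."""
    for line in lines:
        if not line.startswith(b"data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len(b"data:"):].strip()
        if not payload:
            continue
        data = json.loads(payload)
        token = data.get("token", {})  # assumed payload shape
        if token.get("special"):
            continue  # drop special tokens such as end-of-sequence markers
        text = token.get("text", "")
        if text:
            yield text
```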
For certain popular image models, the provider routes requests through Together's router:
This enables faster inference for FLUX models by leveraging Together's optimized infrastructure.
Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py19-22 g4f/Provider/needs_auth/hf/HuggingFaceInference.py110-123
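The routing decision can be sketched as a lookup against a set of routed models. The two URLs come from the endpoint list on this page (the `https://` scheme is added here as an assumption); `ROUTED_IMAGE_MODELS` and its single entry are hypothetical placeholders for whatever list the provider module defines.

```python
# Endpoint templates based on the endpoints listed on this page.
INFERENCE_URL = "https://api-inference.huggingface.co/models/{model}"
TOGETHER_IMAGE_URL = "https://router.huggingface.co/together/v1/images/generations"

# Hypothetical routing set for this sketch only.
ROUTED_IMAGE_MODELS = {"black-forest-labs/FLUX.1-dev"}


def select_endpoint(model: str) -> str:
    """Pick the Together router for routed image models, else standard inference."""
    if model in ROUTED_IMAGE_MODELS:
        return TOGETHER_IMAGE_URL
    return INFERENCE_URL.format(model=model)
```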
The HuggingFace providers use a centralized model registry in g4f/Provider/needs_auth/hf/models.py:
Aliases map user-friendly names to full model identifiers:
Sources: g4f/Provider/needs_auth/hf/models.py1-58
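Alias resolution amounts to a dictionary lookup that falls through to the original name. In this sketch the single alias pair is the one quoted in the request flow on this page; the names `MODEL_ALIASES` and `resolve_model` are hypothetical.

```python
from typing import Dict

# Alias pair quoted on this page; the real registry holds more entries.
MODEL_ALIASES: Dict[str, str] = {
    "llama-3.3-70b": "meta-llama/Llama-3.3-70B-Instruct",
}


def resolve_model(name: str, aliases: Dict[str, str] = MODEL_ALIASES) -> str:
    """Return the full model identifier for an alias, or the name unchanged."""
    return aliases.get(name, name)
```

Returning unknown names unchanged lets callers pass full identifiers directly.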
HuggingFace providers are registered in the global model map through any_provider.py:
This enables automatic model discovery and provider selection when users request models by name.
Sources: g4f/providers/any_provider.py27-30 g4f/providers/any_provider.py151-183
HuggingSpace provides access to models hosted on HuggingFace Spaces, which are interactive demos and applications built by the community.
Integration in Provider System:
- Listed in PROVIDERS_LIST_3 at g4f/providers/any_provider.py27-30
- Provides a get_models() method for model discovery
- Has its working status checked before provider selection

Usage Pattern: HuggingSpace accesses specific Space endpoints based on the Space ID and model name, allowing integration of custom models and demos hosted on HuggingFace's platform.
HuggingFaceMedia is a specialized provider for video generation models, extending HuggingFaceInference:
Integration:
The provider is handled specially in model map creation at g4f/providers/any_provider.py159-160:
This separation allows video generation models to be categorized and routed appropriately while reusing the inference infrastructure from HuggingFaceInference.
Sources: g4f/providers/any_provider.py27-30 g4f/providers/any_provider.py151-183
HuggingChat handles authentication failures with specific error types:
When authentication fails during conversation creation, the error is propagated to trigger re-authentication or nodriver fallback.
Both providers raise ModelNotFoundError when models are unavailable:
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py185-193 g4f/Provider/needs_auth/hf/HuggingFaceInference.py59-64 g4f/Provider/needs_auth/hf/HuggingFaceInference.py160-161
HuggingFaceInference handles context length issues by truncating message history:
Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py148-154
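History truncation can be illustrated by dropping the oldest non-system messages until the conversation fits a budget. This is one possible strategy, not necessarily the provider's: the character-based limit, its default of 4096, and the function name are assumptions made for the sketch.

```python
from typing import Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}


def truncate_history(messages: List[Message],
                     max_chars: int = 4096) -> List[Message]:
    """Drop the oldest non-system messages until the history fits max_chars."""
    def size(msgs: List[Message]) -> int:
        return sum(len(m["content"]) for m in msgs)

    kept = list(messages)
    while size(kept) > max_chars:
        # Look for the oldest message that is neither a system prompt
        # nor the final (most recent) message.
        for i, msg in enumerate(kept[:-1]):
            if msg["role"] != "system":
                del kept[i]
                break
        else:
            break  # nothing left that may be dropped
    return kept
```

The final message is always kept so the model still sees the request it has to answer.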
HuggingFace providers integrate into the global model map through AnyProvider.create_model_map():
Model Map Registration Flow:
Data Structures:
Sources: g4f/providers/any_provider.py151-183
Complete request flow for a HuggingFace model:
Flow Steps:
1. Alias resolution: "llama-3.3-70b" → "meta-llama/Llama-3.3-70B-Instruct"
2. Candidate providers: HuggingChat and HuggingFaceInference
3. IterListProvider tries providers sequentially (free-first when no API key)
4. HuggingChat authenticates via cookies or nodriver if needed

Sources: g4f/providers/any_provider.py367-401 g4f/providers/retry_provider.py113-174
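The sequential, free-first behaviour can be sketched with plain callables standing in for providers. Everything here is hypothetical scaffolding: the `(name, needs_auth, call)` tuple, `order_providers`, and `try_providers` only illustrate the ordering and fallback idea and are not the IterListProvider API.

```python
from typing import Callable, List, Tuple

# (name, needs_auth, call): a hypothetical stand-in for a provider.
Provider = Tuple[str, bool, Callable[[str], str]]


def order_providers(providers: List[Provider], has_api_key: bool) -> List[Provider]:
    """Without an API key, move providers that need no auth to the front."""
    if has_api_key:
        return list(providers)
    # sorted() is stable, so the original order is kept within each group.
    return sorted(providers, key=lambda p: p[1])


def try_providers(providers: List[Provider], prompt: str,
                  has_api_key: bool = False) -> Tuple[str, str]:
    """Return (provider name, result) from the first provider that succeeds."""
    last_error = None
    for name, _needs_auth, call in order_providers(providers, has_api_key):
        try:
            return name, call(prompt)
        except Exception as exc:
            last_error = exc  # remember the failure and try the next provider
    raise RuntimeError("all providers failed") from last_error
```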
For HuggingFaceInference, API keys can be provided via:
- api_key="hf_..."
- HUGGINGFACE_API_KEY (loaded by AuthManager)

HuggingChat reads cookies from:
- get_cookies(domain=".huggingface.co")
- har_and_cookies/ directory

Sources: g4f/Provider/needs_auth/hf/HuggingChat.py62-80
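The two API key sources can be combined in a small resolver. This sketch assumes that `HUGGINGFACE_API_KEY` is read from the process environment and that an explicit `api_key` argument takes precedence; neither the precedence nor the `resolve_api_key` helper is confirmed by this page.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer an explicit key, then the environment, else None."""
    if api_key:
        return api_key
    # Assumed environment variable name, taken from the list above.
    return env.get("HUGGINGFACE_API_KEY") or None
```

Returning `None` when no key is found matches the optional authentication described for HuggingFaceInference.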
Both providers cache model metadata to avoid repeated API calls:
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py49-60 g4f/Provider/needs_auth/hf/HuggingFaceInference.py55-64
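Metadata caching of this kind is commonly a dictionary keyed by model name that is filled on first access. The class below is a hypothetical sketch of that pattern with an injected `fetch` callable; it does not reproduce how the providers store their cache.

```python
from typing import Callable, Dict


class ModelInfoCache:
    """Fetch metadata for a model once and reuse it on later lookups."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch                 # performs the actual API call
        self._data: Dict[str, dict] = {}    # model name -> cached metadata

    def get(self, model: str) -> dict:
        if model not in self._data:
            self._data[model] = self._fetch(model)
        return self._data[model]
```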
HuggingChat supports web search through the web_search parameter:
When enabled, the response includes a webSearch type with sources:
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py117-125 g4f/Provider/needs_auth/hf/HuggingChat.py170-171
HuggingFaceInference supports continuing previous generations via the action="continue" parameter. This uses special prompt formatting to append to the last assistant message:
Sources: g4f/Provider/needs_auth/hf/HuggingFaceInference.py133 g4f/Provider/needs_auth/hf/HuggingFaceInference.py220-226
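The continuation idea can be shown by leaving the last assistant message open instead of closing it with an end marker. The sketch uses the Qwen-style markers from the format examples on this page purely for illustration; the function name and the `do_continue` flag are hypothetical, and the provider's own formatting may differ.

```python
from typing import Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}


def sketch_format_continue(messages: List[Message],
                           do_continue: bool = False) -> str:
    """Build a prompt; when continuing, leave the last assistant turn open."""
    def turn(m: Message) -> str:
        return f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>\n"

    if do_continue and messages and messages[-1]["role"] == "assistant":
        *history, partial = messages
        head = "".join(turn(m) for m in history)
        # No end marker after the partial answer, so generation resumes there.
        return head + "<|im_start|>assistant\n" + partial["content"]
    return "".join(turn(m) for m in messages) + "<|im_start|>assistant\n"
```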
HuggingChat handles reasoning models that emit thinking tokens:
The Reasoning object contains the internal thought process which can be displayed separately from the main response.
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py174-175
The HuggingFace provider ecosystem in g4f offers three complementary approaches to accessing HuggingFace models:
Key architectural patterns:
Sources: g4f/Provider/needs_auth/hf/HuggingChat.py g4f/Provider/needs_auth/hf/HuggingFaceInference.py g4f/Provider/needs_auth/hf/models.py g4f/providers/any_provider.py27-183