This document describes the component system architecture that forms the foundation of RAGFlow's Canvas workflow engine. Components are the building blocks of Canvas workflows—self-contained, configurable units that perform specific tasks such as LLM invocation, retrieval, categorization, and tool execution. This page covers the base classes, lifecycle management, dynamic loading, and input/output patterns that enable component composition.
For information about how components are orchestrated in workflows, see Canvas Engine and DSL. For details on specific built-in components, see Built-in Components. For tool integration patterns, see Agent Tools and ReAct Loop.
The component system is built on two foundational abstract classes that all components inherit from: ComponentParamBase, which handles configuration, and ComponentBase, which handles execution.
Sources: agent/component/base.py40-363 agent/component/llm.py33-80 agent/component/agent_with_tools.py38-80 agent/tools/base.py77-124 agent/component/categorize.py29-95
ComponentParamBase defines the parameter schema, validation rules, and configuration management for components. Every component has an associated parameter class that inherits from this base.
| Responsibility | Description | Key Methods |
|---|---|---|
| Parameter Schema | Define input/output structure with type information | inputs, outputs dictionaries |
| Validation | Enforce constraints on parameter values | check(), validate() |
| Configuration Updates | Merge runtime configuration with defaults | update(conf) |
| Metadata Management | Store component description, retry settings | description, max_retries, delay_after_error |
| Serialization | Convert parameters to/from JSON | as_dict(), __str__() |
The update() method performs recursive parameter merging with validation:
1. Load configuration dictionary
2. Recursively traverse parameter tree (respects PARAM_MAXDEPTH=10)
3. For each attribute:
- Check if deprecated (log warning)
- Validate type compatibility
- Update value or recurse into nested objects
4. Track user-modified parameters in _USER_FEEDED_PARAMS
5. Reject redundant parameters if not allowed
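The merge steps above can be sketched as follows. This is a minimal illustration, not the RAGFlow implementation: the Params class, the merge function, the deprecated set, and the fed set (standing in for _USER_FEEDED_PARAMS) are hypothetical names chosen for the example.

```python
import logging

PARAM_MAXDEPTH = 10  # depth limit quoted in the text


class Params:
    """Plain attribute holder; a hypothetical stand-in for a parameter class."""

    deprecated = frozenset()  # hypothetical: names that only trigger a warning

    def __init__(self, **defaults):
        self.__dict__.update(defaults)


def merge(target, conf, fed, allow_redundant=False, depth=0, prefix=""):
    """Recursively copy values from conf onto target, recording touched paths in fed."""
    if depth > PARAM_MAXDEPTH:
        raise ValueError("parameter nesting exceeds PARAM_MAXDEPTH")
    for key, value in conf.items():
        path = prefix + key
        if not hasattr(target, key):
            if allow_redundant:
                continue  # silently ignore unknown keys
            raise ValueError(f"redundant parameter: {path}")
        if key in target.deprecated:
            logging.warning("parameter %s is deprecated", path)
        current = getattr(target, key)
        if isinstance(current, Params):
            # Nested parameter object: recurse one level deeper.
            if not isinstance(value, dict):
                raise TypeError(f"{path} expects a nested mapping")
            merge(current, value, fed, allow_redundant, depth + 1, path + ".")
        else:
            # Simple type-compatibility check against the default value.
            if current is not None and not isinstance(value, type(current)):
                raise TypeError(f"{path} expects {type(current).__name__}")
            setattr(target, key, value)
            fed.add(path)


params = Params(max_retries=0, llm=Params(temperature=0.1))
fed = set()
merge(params, {"max_retries": 3, "llm": {"temperature": 0.7}}, fed)
```

After the call, fed holds the dotted paths of every value the user supplied, which is one way to distinguish user-modified parameters from defaults.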
Sources: agent/component/base.py40-200 agent/component/llm.py33-80
ComponentBase is the abstract base class that all executable components inherit from. It provides the execution framework, input/output management, and integration with the Canvas graph.
Every component is initialized with three required parameters.
The _canvas reference enables components to:
- Access global variables (sys.*, env.* variables)
Sources: agent/component/base.py384-391
Components follow a strict lifecycle managed by the Canvas execution engine.
Components support both synchronous and asynchronous execution patterns:
| Pattern | Method | Use Case | Implementation |
|---|---|---|---|
| Sync | invoke(**kwargs) | Simple operations, blocking tools | Calls _invoke() directly |
| Async | invoke_async(**kwargs) | I/O-bound operations, LLM calls | Prefers _invoke_async() if defined, falls back to thread_pool_exec(_invoke) |
The async pattern enables concurrent execution of multiple components within a batch (see Canvas batch execution in agent/canvas.py422-469).
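The dispatch rule from the table can be sketched as follows. Method names follow the table, but the bodies are assumptions: asyncio.to_thread stands in for thread_pool_exec, and BlockingTool, AsyncLLM, and run_batch are hypothetical names used only for illustration.

```python
import asyncio
import time


class ComponentSketch:
    """Illustrative base class; not the RAGFlow ComponentBase."""

    def _invoke(self, **kwargs):
        raise NotImplementedError

    def invoke(self, **kwargs):
        # Sync path: call the blocking implementation directly.
        return self._invoke(**kwargs)

    async def invoke_async(self, **kwargs):
        # Prefer a native coroutine if the subclass defines one.
        native = getattr(self, "_invoke_async", None)
        if native is not None:
            return await native(**kwargs)
        # Otherwise run the blocking _invoke in a worker thread.
        return await asyncio.to_thread(self._invoke, **kwargs)


class BlockingTool(ComponentSketch):
    def _invoke(self, **kwargs):
        time.sleep(0.05)  # simulated blocking work
        return f"tool:{kwargs['q']}"


class AsyncLLM(ComponentSketch):
    async def _invoke_async(self, **kwargs):
        await asyncio.sleep(0.05)  # simulated network wait
        return f"llm:{kwargs['q']}"


async def run_batch(components, **kwargs):
    # All components in one batch are awaited together.
    return await asyncio.gather(*(c.invoke_async(**kwargs) for c in components))


results = asyncio.run(run_batch([BlockingTool(), AsyncLLM()], q="hi"))
```

Because the blocking component runs in a worker thread, both components wait at the same time instead of one after the other.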
Sources: agent/component/base.py407-451 agent/canvas.py422-469
RAGFlow uses a dynamic class discovery mechanism to register components without explicit imports.
The component_class(class_name) function searches three namespaces in order and returns the first match.
This allows the Canvas DSL to reference components by name string (e.g., "LLM", "Retrieval") which are resolved to classes at runtime.
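A name-to-class lookup of this kind might look like the sketch below. The function signature with an explicit namespaces argument is an assumption made so the example can run on its own, and standard-library modules stand in for the actual component namespaces, which are not listed here.

```python
import importlib


def component_class(class_name, namespaces):
    """Return the first attribute named class_name found in the given modules."""
    for module_name in namespaces:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # namespace not importable; try the next one
        found = getattr(module, class_name, None)
        if found is not None:
            return found
    raise LookupError(f"cannot resolve component {class_name!r}")


# Placeholder namespaces: stdlib modules, searched in order.
NAMESPACES = ["json", "collections", "decimal"]
resolved = component_class("OrderedDict", NAMESPACES)
```

The search order matters when two namespaces export the same name: the earlier namespace wins.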
Sources: agent/component/__init__.py22-59 agent/canvas.py94-106
Components define inputs through parameter templates and access outputs from other components via variable references.
Components extract input requirements by scanning parameter strings for variable references.
The variable reference pattern supports three namespaces:
| Pattern | Example | Resolution |
|---|---|---|
| Component Output | {llm:0@content} | Get output "content" from component "llm:0" |
| System Variable | {sys.query} | Get system variable (query, user_id, conversation_turns, etc.) |
| Environment Variable | {env.api_key} | Get environment/global variable defined in Canvas |
Variable Reference Pattern:
\{* *\{([a-zA-Z:0-9]+@[A-Za-z0-9_.-]+|sys\.[A-Za-z0-9_.]+|env\.[A-Za-z0-9_.]+)\} *\}*
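The pattern above can be tried out with Python's re module. The prompt string and the component ID Retrieval:0 are made up for the example. Note that the component-ID part of the pattern accepts only letters, digits, and colons.

```python
import re

# The variable reference pattern quoted above, compiled as-is.
VARIABLE_REF = re.compile(
    r"\{* *\{([a-zA-Z:0-9]+@[A-Za-z0-9_.-]+|sys\.[A-Za-z0-9_.]+|env\.[A-Za-z0-9_.]+)\} *\}*"
)

# Hypothetical prompt template containing one reference from each namespace.
prompt = "Answer {sys.query} using {Retrieval:0@formalized_content} and key {env.api_key}."

# findall returns the captured group, i.e. the reference without its braces.
refs = VARIABLE_REF.findall(prompt)
```

Scanning a parameter string this way yields the list of inputs a component depends on before it runs.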
Sources: agent/component/base.py368-511 agent/canvas.py164-235
Components store outputs in a typed dictionary structure.
Special reserved output keys:
- _ERROR - Error message if component failed
- _created_time - Timestamp when invocation started
- _elapsed_time - Duration of component execution
- _next - Next component IDs for control flow (Categorize, Switch)
Sources: agent/component/base.py453-476
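To illustrate how the reserved keys sit alongside regular outputs, here is a minimal sketch. OutputStore, its method names, the run helper, and the value/type slot layout are assumptions for the example; the real storage layout may differ.

```python
import time


class OutputStore:
    """Illustrative output container with a value slot and a type label per key."""

    def __init__(self, declared):
        self._outputs = {k: {"value": None, "type": t} for k, t in declared.items()}

    def set_output(self, key, value):
        slot = self._outputs.setdefault(key, {"value": None, "type": "unknown"})
        slot["value"] = value

    def output(self, key):
        return self._outputs.get(key, {}).get("value")

    def error(self):
        return self.output("_ERROR")


def run(store, fn):
    """Run fn and record its result plus the reserved bookkeeping keys."""
    started = time.perf_counter()
    store.set_output("_created_time", started)
    try:
        store.set_output("content", fn())
    except Exception as exc:
        store.set_output("_ERROR", str(exc))
    store.set_output("_elapsed_time", time.perf_counter() - started)
    return store


ok = run(OutputStore({"content": "string"}), lambda: "hello")
bad = run(OutputStore({"content": "string"}), lambda: 1 / 0)
```

Downstream code can then check error() first and read content only when no error was recorded.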
The Canvas resolves variable references at runtime so that data can flow between components.
Variables support dot notation and array indexing for complex data structures:
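{component_id@variable.field[0].property}
Here component_id and variable are placeholders for the component ID and root variable name, and .field[0].property is the nested path.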
The resolution algorithm (agent/canvas.py208-235):
1. Split the reference on @ to get component_id and root variable name
Sources: agent/canvas.py191-267 agent/component/base.py478-511
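One way to resolve a reference with dot notation and array indexing is sketched below. It is an illustration under assumptions: the resolve function, the outputs mapping of component ID to output dictionary, and the choice to return None for any missing step are not taken from the source.

```python
import re

# One path step is either a name (between dots) or a bracketed list index.
_STEP = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve(reference, outputs):
    """Resolve 'component_id@root.path[0].leaf' against {component_id: {root: value}}."""
    component_id, _, path = reference.partition("@")
    steps = _STEP.findall(path)
    if not steps:
        return None
    root_name = steps[0][0]
    value = outputs.get(component_id, {}).get(root_name)
    for name, index in steps[1:]:
        if index:
            # Bracketed step: index into a list if it is long enough.
            in_range = isinstance(value, (list, tuple)) and int(index) < len(value)
            value = value[int(index)] if in_range else None
        elif isinstance(value, dict):
            value = value.get(name)
        else:
            value = getattr(value, name, None)
        if value is None:
            return None
    return value


# Hypothetical component outputs used only for this example.
outputs = {"Retrieval:0": {"result": {"items": [{"title": "RAG"}]}}}
title = resolve("Retrieval:0@result.items[0].title", outputs)
```

Returning None instead of raising keeps a missing field from aborting the whole workflow, at the cost of hiding typos in reference paths.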
Components implement multi-layered error handling with retry logic and graceful degradation.
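A retry loop with a fallback value might look like the following sketch. The parameter names max_retries and delay_after_error come from the parameter table earlier on this page; invoke_with_retry and the fallback argument are hypothetical, and the real component logic may differ.

```python
import time


def invoke_with_retry(fn, max_retries=2, delay_after_error=0.0, fallback=None):
    """Call fn up to max_retries + 1 times; report failure through an _ERROR key."""
    outputs = {}
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            outputs["content"] = fn()
            return outputs
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                time.sleep(delay_after_error)  # wait before the next attempt
    # Graceful degradation: record the error and use a fallback if one was given.
    outputs["_ERROR"] = str(last_error)
    if fallback is not None:
        outputs["content"] = fallback
    return outputs


calls = {"n": 0}


def flaky():
    # Fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("boom")
    return "ok"


recovered = invoke_with_retry(flaky, max_retries=2)
failed = invoke_with_retry(lambda: 1 / 0, max_retries=1, fallback="n/a")
```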
Components check cancellation status at multiple points using is_canceled() and check_if_canceled().
Cancellation can be triggered via the API endpoint PUT /canvas/cancel/{task_id}.
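Cooperative cancellation of this kind can be sketched as follows. The method names is_canceled() and check_if_canceled() come from the text; CancelRegistry (an in-memory stand-in for whatever shared store holds the cancel flags), Worker, and the error message are assumptions.

```python
class CancelRegistry:
    """In-memory stand-in for a shared store of canceled task IDs."""

    def __init__(self):
        self._canceled = set()

    def cancel(self, task_id):
        self._canceled.add(task_id)

    def is_canceled(self, task_id):
        return task_id in self._canceled


class Worker:
    def __init__(self, registry, task_id):
        self.registry = registry
        self.task_id = task_id
        self.outputs = {}

    def is_canceled(self):
        return self.registry.is_canceled(self.task_id)

    def check_if_canceled(self):
        # Record the cancellation as an error so callers can see why work stopped.
        if self.is_canceled():
            self.outputs["_ERROR"] = "Task has been canceled"
            return True
        return False

    def run(self, steps):
        """Run steps in order, checking the cancel flag before each one."""
        for step in steps:
            if self.check_if_canceled():
                return False
            step()
        return True


registry = CancelRegistry()
worker = Worker(registry, "task-1")
log = []
steps = [lambda: log.append("a"), lambda: registry.cancel("task-1"), lambda: log.append("c")]
finished = worker.run(steps)
```

Checking between steps means a running step is never interrupted midway; cancellation takes effect at the next checkpoint.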
Sources: agent/component/base.py393-447 agent/canvas.py269-278 api/apps/canvas_app.py261-268
Long-running components use the @timeout decorator to enforce execution limits.
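A decorator with this behavior could be written as below. This is a generic sketch built on concurrent.futures, not the decorator RAGFlow ships; its argument and the exception it raises are assumptions.

```python
import concurrent.futures
import functools
import time


def timeout(seconds):
    """Decorator sketch: raise TimeoutError if the call takes longer than `seconds`."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
            future = pool.submit(fn, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except concurrent.futures.TimeoutError:
                raise TimeoutError(f"{fn.__name__} exceeded {seconds}s") from None
            finally:
                # Do not block on an overrunning worker; it finishes in the background.
                pool.shutdown(wait=False)

        return wrapper

    return decorator


@timeout(1)
def fast():
    return "ok"


@timeout(0.05)
def slow():
    time.sleep(0.3)
    return "late"
```

A thread-based timeout stops waiting for the call but cannot kill it, so the wrapped function should still check for cancellation on its own if it holds resources.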
Sources: agent/component/base.py449-451 agent/component/llm.py365-366
ToolBase extends ComponentBase for components that serve as tools in Agent workflows.
Key differences from base ComponentBase:
- get_meta() returns OpenAI-compatible function definitions for LLM tool calling
- _retrieve_chunks() helper for adding retrieval results to Canvas references
- Works with LLMToolPluginCallSession, which handles callback tracking
Sources: agent/tools/base.py77-216
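For reference, an OpenAI-style function definition has the shape built by the sketch below. The standalone get_meta function and its flat parameter spec (type, description, required) are assumptions for illustration; only the output shape follows the OpenAI tool-calling format.

```python
def get_meta(name, description, parameters):
    """Build an OpenAI-style function definition from a flat parameter spec."""
    properties, required = {}, []
    for param_name, spec in parameters.items():
        properties[param_name] = {
            "type": spec["type"],
            "description": spec.get("description", ""),
        }
        if spec.get("required", False):
            required.append(param_name)
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


# Hypothetical tool description used only for this example.
meta = get_meta(
    "search",
    "Search the knowledge base.",
    {
        "query": {"type": "string", "description": "Search text", "required": True},
        "top_k": {"type": "integer"},
    },
)
```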
The LLM component adds vision model support, prompt formatting, and streaming capabilities.
Image handling: LLM components detect data:image/ base64 strings in inputs and automatically switch from LLMType.CHAT to LLMType.IMAGE2TEXT model type.
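The detection rule can be sketched as follows. The enum member names come from the text; the enum values, the pick_model_type function, and the flat inputs dictionary are assumptions.

```python
from enum import Enum


class LLMType(Enum):
    # Member names follow the text; the string values are assumed.
    CHAT = "chat"
    IMAGE2TEXT = "image2text"


def pick_model_type(inputs):
    """Return the model type to use and the list of base64 image inputs found."""
    images = [
        value
        for value in inputs.values()
        if isinstance(value, str) and value.startswith("data:image/")
    ]
    model_type = LLMType.IMAGE2TEXT if images else LLMType.CHAT
    return model_type, images


# Hypothetical inputs: one text field and one data-URL image.
model_type, images = pick_model_type(
    {"query": "What is in this picture?", "image": "data:image/png;base64,iVBORw0KGgo="}
)
text_only_type, _ = pick_model_type({"query": "Hello"})
```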
Sources: agent/component/llm.py82-447
When a Canvas is loaded from DSL JSON, components are instantiated dynamically.
DSL Component Structure:
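The sketch below shows one plausible shape for a DSL document and how components could be instantiated from it. The key names (components, obj, component_name, params, downstream, upstream), the component IDs, and the REGISTRY lookup are assumptions for illustration and should be checked against agent/canvas.py.

```python
import json

# Assumed DSL layout: each component ID maps to its class name, params, and edges.
DSL = json.loads("""
{
  "components": {
    "begin": {
      "obj": {"component_name": "Begin", "params": {}},
      "downstream": ["LLM:0"],
      "upstream": []
    },
    "LLM:0": {
      "obj": {"component_name": "LLM", "params": {"temperature": 0.2}},
      "downstream": [],
      "upstream": ["begin"]
    }
  }
}
""")


class Node:
    """Minimal component stand-in that just remembers its ID and params."""

    def __init__(self, component_id, params):
        self.component_id = component_id
        self.params = params


class Begin(Node):
    pass


class LLM(Node):
    pass


# Stand-in for name-based class discovery.
REGISTRY = {"Begin": Begin, "LLM": LLM}


def load(dsl):
    """Instantiate one object per DSL entry, resolving classes by name."""
    components = {}
    for component_id, node in dsl["components"].items():
        cls = REGISTRY[node["obj"]["component_name"]]
        components[component_id] = cls(component_id, node["obj"]["params"])
    return components


components = load(DSL)
```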
Sources: agent/canvas.py92-107 api/apps/canvas_app.py161-165
Components support debug mode for testing individual component execution without running the full workflow.
Debug mode special handling:
- Uses debug_inputs instead of resolving from Canvas state
Sources: api/apps/canvas_app.py332-366
Components share a class-level semaphore to limit concurrent LLM operations.
This prevents overwhelming external APIs when multiple components execute in parallel within a Canvas batch.
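A class-level limit of this kind can be sketched with asyncio.Semaphore. The class name, the limit of 2, and the active/peak counters are assumptions added so the effect is observable; the real semaphore type and limit may differ.

```python
import asyncio


class LimitedComponent:
    # One semaphore shared by every instance, created lazily inside the running loop.
    _semaphore = None
    _limit = 2
    active = 0
    peak = 0

    @classmethod
    def _sem(cls):
        if cls._semaphore is None:
            cls._semaphore = asyncio.Semaphore(cls._limit)
        return cls._semaphore

    async def call_llm(self, prompt):
        async with self._sem():
            # Track how many calls are in flight at once.
            LimitedComponent.active += 1
            LimitedComponent.peak = max(LimitedComponent.peak, LimitedComponent.active)
            await asyncio.sleep(0.01)  # simulated LLM latency
            LimitedComponent.active -= 1
            return prompt.upper()


async def main():
    # Six components start together, but only two calls may run at a time.
    return await asyncio.gather(
        *(LimitedComponent().call_llm(f"p{i}") for i in range(6))
    )


answers = asyncio.run(main())
```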
Components use the Canvas-provided thread pool for blocking operations.
Sources: agent/component/base.py365-367 agent/canvas.py89-90 agent/canvas.py438
The component system provides a robust foundation for Canvas workflows through:
- ComponentParamBase handles configuration, ComponentBase handles execution
This architecture enables RAGFlow to support 20+ built-in components while allowing developers to add custom components that integrate with the Canvas execution engine.
Sources: agent/component/base.py agent/canvas.py40-831 agent/component/__init__.py api/apps/canvas_app.py