This document covers the build pipeline, dependency management, Docker image construction, and continuous integration testing for RAGFlow. The build system uses modern tools including uv for Python dependency management, multi-stage Docker builds for optimized images, and GitHub Actions for automated testing.
For runtime deployment configuration, see Docker Compose Deployment. For frontend compilation details, see UI Component Architecture.
The RAGFlow build system consists of:

- Python dependency management with uv, driven by pyproject.toml and uv.lock
- Multi-stage Docker builds that produce the production image
- GitHub Actions workflows for continuous integration testing

The system supports both x86_64 and aarch64 architectures and can optionally use China-based mirrors for builds in restricted network environments.
Sources: pyproject.toml:1-282, Dockerfile:1-215, .github/workflows/tests.yml:1-598
RAGFlow uses uv as the Python package manager, configured via pyproject.toml. The project requires Python 3.12-3.14 and declares 120+ production dependencies plus separate test dependency groups.
The [[tool.uv.index]] section configures Tsinghua University's PyPI mirror for builds in China:
Sources: pyproject.toml:1-159, pyproject.toml:179-181
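A configuration along these lines would direct uv at the Tsinghua mirror. This is a sketch: the index name and the `default` flag are assumptions, not the project's actual entry.

```toml
# pyproject.toml -- sketch of a [[tool.uv.index]] entry pointing at a
# China-based PyPI mirror; "name" and "default" values are assumptions.
[[tool.uv.index]]
name = "tsinghua"
url = "https://pypi.tuna.tsinghua.edu.cn/simple"
default = true
```

Marking an index as `default` makes uv resolve all packages against it instead of pypi.org.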
The uv.lock file pins exact dependency versions with cryptographic hashes for reproducible builds. It supports multiple Python versions and platforms through resolution markers:
```toml
resolution-markers = [
    "python_full_version >= '3.14' and sys_platform == 'darwin'",
    "python_full_version < '3.13' and platform_machine == 'aarch64' and sys_platform == 'linux'",
    ...
]
```
During Docker builds, the lock file is conditionally rewritten to use different mirrors based on the NEED_MIRROR build argument.
Sources: uv.lock:1-14, Dockerfile:158-163
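The conditional rewrite could look roughly like the following. This is a sketch, not the actual Dockerfile contents; the sed expression and the mirror host are assumptions.

```dockerfile
# Sketch: swap package-index URLs in uv.lock for a China mirror when
# NEED_MIRROR=1. The exact URLs rewritten by the real build are assumptions.
ARG NEED_MIRROR=0
RUN if [ "$NEED_MIRROR" = "1" ]; then \
        sed -i 's|https://pypi.org/simple|https://pypi.tuna.tsinghua.edu.cn/simple|g' uv.lock; \
    fi
```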
The base Docker stage installs uv from pre-downloaded tarballs and configures it for the build environment:
Sources: Dockerfile:68-82
The Docker build uses a three-stage architecture to minimize final image size while caching build artifacts efficiently.
Sources: Dockerfile:1-215
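The three stages can be sketched as follows. Stage names other than base/builder/production and the copy paths are assumptions for illustration.

```dockerfile
# Sketch of the three-stage layout; copy paths are illustrative assumptions.
FROM ubuntu:24.04 AS base
# system packages, nginx, uv, Node.js, Rust toolchain, ODBC drivers ...

FROM base AS builder
# uv sync of Python dependencies, npm build of the frontend, VERSION file ...

FROM base AS production
# copy only runtime artifacts out of the builder stage
COPY --from=builder /ragflow/.venv /ragflow/.venv
COPY --from=builder /ragflow/web/dist /ragflow/web/dist
```

Because the production stage starts from base rather than builder, compilers, npm caches, and intermediate build artifacts never reach the final image.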
The base stage (FROM ubuntu:24.04) installs all system-level dependencies including:
| Component | Purpose | Installation |
|---|---|---|
| libglib2.0-0, libglx-mesa0, libgl1 | OpenCV dependencies | apt install |
| default-jdk | Tika Java server | apt install |
| libatk-bridge2.0-0 | Selenium Chrome driver | apt install |
| nginx | Web server (v1.29.5) | apt from nginx.org repository |
| uv | Python package manager | Extract from tarball |
| Node.js 20.x | Frontend build tool | apt from nodesource.com |
| Rust toolchain | Compile Rust dependencies | rustup install |
| ODBC drivers | SQL Server connectivity | msodbcsql17/msodbcsql18 |
The stage also copies pre-downloaded resources from infiniflow/ragflow_deps:latest:
Chrome and ChromeDriver for Selenium are extracted from pre-downloaded zips:
Sources: Dockerfile:1-145, Dockerfile:12-23, Dockerfile:130-137
The builder stage compiles the application:
Python Dependencies:
Frontend Build:
Version Generation:
Sources: Dockerfile:147-180
The production stage copies only runtime necessities from the builder:
Sources: Dockerfile:182-214
The React frontend is built during the builder stage using npm. The build process involves:
1. npm install downloads packages from package.json
2. npm run build invokes Vite to bundle the React application into web/dist/

Memory allocation is increased to 4GB to prevent out-of-memory errors during compilation:
The compiled web/dist/ directory is copied to the production stage and served by nginx.
Sources: Dockerfile:168-172
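Raising Node's V8 heap limit via NODE_OPTIONS is the standard way to achieve the 4GB allocation described above; the exact value and paths in this sketch are assumptions.

```dockerfile
# Sketch: raise Node's heap limit to 4 GB before invoking the Vite build.
# The directory layout and the exact NODE_OPTIONS value are assumptions.
RUN cd web && npm install && \
    NODE_OPTIONS="--max-old-space-size=4096" npm run build
```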
The download_deps.py script pre-downloads all external dependencies for offline builds:
URL Selection Based on Mirror Flag:
HuggingFace Model Download:
Sources: download_deps.py:1-82
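Mirror-aware URL selection can be illustrated with a small sketch. The function name, the proxy host, and the prefix mapping below are illustrative assumptions, not the actual contents of download_deps.py.

```python
# Sketch of mirror-aware URL selection, in the spirit of download_deps.py.
# The mapping and proxy host below are assumptions for illustration.

MIRROR_MAP = {
    # assumed GitHub proxy for restricted-network builds
    "https://github.com": "https://ghproxy.net/https://github.com",
}

def select_url(url: str, use_mirror: bool) -> str:
    """Return a mirrored URL when use_mirror is set, else the original."""
    if not use_mirror:
        return url
    for prefix, mirror in MIRROR_MAP.items():
        if url.startswith(prefix):
            return url.replace(prefix, mirror, 1)
    return url

if __name__ == "__main__":
    src = "https://github.com/example/tool/releases/download/v1/tool.tgz"
    print(select_url(src, use_mirror=False))
    print(select_url(src, use_mirror=True))
```

Keeping the mirror decision in one function means every download in the script behaves consistently when the mirror flag flips.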
The Dockerfile.deps creates a minimal image containing all pre-downloaded artifacts:
This image is pushed to infiniflow/ragflow_deps:latest and mounted during the main build using --mount=type=bind, avoiding layer bloat.
Sources: Dockerfile.deps:1-11
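The bind-mount pattern can be sketched as follows; the mounted paths and file names are assumptions, not the actual Dockerfile instructions.

```dockerfile
# Sketch: mount the deps image read-only during a RUN step instead of
# COPYing its contents into a layer. Paths and file names are assumptions.
RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/deps,target=/deps \
    cp /deps/chrome-linux64.zip /tmp/ && \
    unzip -q /tmp/chrome-linux64.zip -d /opt/
```

Because a bind mount exists only for the duration of the RUN instruction, the multi-gigabyte dependency payload never becomes a layer of the final image.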
The test workflow runs on:
Concurrency Control:
Sources: .github/workflows/tests.yml:5-29
Sources: .github/workflows/tests.yml:38-598
The workflow builds a RAGFlow Docker image tagged with the GitHub run ID:
Test level is determined by event type:
Sources: .github/workflows/tests.yml:140-154
The workflow allocates unique ports for each runner to support parallel execution:
These computed ports are appended to docker/.env for Docker Compose to consume.
Sources: .github/workflows/tests.yml:157-198
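The per-runner port scheme can be illustrated with a small sketch. The base ports, the stride, and the variable names are assumptions, not the workflow's actual numbers.

```python
# Sketch: derive non-conflicting host ports from a runner index so several
# CI runners can start Docker Compose stacks side by side. The base ports
# and per-runner stride below are illustrative assumptions.

BASE_PORTS = {"SVR_HTTP_PORT": 9380, "ES_PORT": 9200, "MYSQL_PORT": 3306}
STRIDE = 100  # assumed per-runner offset

def allocate_ports(runner_index: int) -> dict[str, int]:
    """Shift every base port by a runner-specific offset."""
    return {name: port + runner_index * STRIDE for name, port in BASE_PORTS.items()}

def render_env(ports: dict[str, int]) -> str:
    """Render KEY=VALUE lines suitable for appending to docker/.env."""
    return "\n".join(f"{key}={value}" for key, value in ports.items())

if __name__ == "__main__":
    print(render_env(allocate_ports(2)))
```

Appending the rendered lines to docker/.env lets Docker Compose pick up the shifted ports without any change to the compose files themselves.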
The CI pipeline executes four categories of tests against both Elasticsearch and Infinity document engines:
| Test Type | Command | Markers | Coverage |
|---|---|---|---|
| Unit Tests | python3 run_tests.py | pytest markers | Common utilities |
| SDK Tests | pytest test/testcases/test_sdk_api | --level=${HTTP_API_TEST_LEVEL} | sdk/python/ragflow_sdk |
| Web API Tests | pytest test/testcases/test_web_api | --level=${HTTP_API_TEST_LEVEL} | Internal API endpoints |
| HTTP API Tests | pytest test/testcases/test_http_api | --level=${HTTP_API_TEST_LEVEL} | Public REST API |
| CLI Tests | admin/client/ragflow_cli.py | N/A | CLI functionality |
Test Level Markers:
Sources: pyproject.toml:212-216, .github/workflows/tests.yml:132-139
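An invocation following the table above might look like this sketch; the example level name is an assumption, while the path and --level flag come from the table.

```shell
# Sketch: run one API suite at the level chosen by the workflow trigger.
# The level name "p2" is an assumed example value.
export HTTP_API_TEST_LEVEL=p2
pytest test/testcases/test_http_api --level=${HTTP_API_TEST_LEVEL}
```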
Sources: .github/workflows/tests.yml:211-218
The CLI tests validate end-to-end functionality including user creation, dataset management, document parsing, and retrieval benchmarks:
Error detection uses regex matching:
Sources: .github/workflows/tests.yml:238-337
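Log-based error detection can be sketched as follows. The workflow's actual regex is not reproduced here, so the pattern below is an assumption.

```python
import re

# Sketch of regex-based error detection over CLI test output; the exact
# pattern used by the real workflow is an assumption.
ERROR_PATTERN = re.compile(r"ERROR|Traceback \(most recent call last\)|FAILED")

def has_errors(log: str) -> bool:
    """Return True if any line of the CLI output matches the error pattern."""
    return any(ERROR_PATTERN.search(line) for line in log.splitlines())

if __name__ == "__main__":
    ok_log = "INFO parsed 10 documents\nINFO retrieval benchmark done"
    bad_log = "INFO starting\nERROR dataset creation failed"
    print(has_errors(ok_log), has_errors(bad_log))
```

Scanning line by line keeps the check cheap and makes it easy to print the offending line when the step fails.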
The workflow patches entrypoint.sh to run the server under coverage:
After tests complete, coverage is saved by sending SIGINT to the server process:
Sources: .github/workflows/tests.yml:206, .github/workflows/tests.yml:339-352
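The coverage-under-SIGINT pattern can be sketched as follows; the file paths and process handling are assumptions, not the workflow's actual commands.

```shell
# Sketch: run the server under coverage, then stop it gently so the
# .coverage data file gets flushed. Paths and process names are assumptions.
coverage run --data-file=/ragflow/.coverage ragflow_server.py &
SERVER_PID=$!
# ... run the test suites against the live server ...
kill -INT "$SERVER_PID"   # SIGINT lets coverage write its data on shutdown
wait "$SERVER_PID"
```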
The .coverage file is extracted from the container and converted to XML format with path mapping:
Separate coverage reports are generated for:
- coverage-es-server.xml - Server code tested with Elasticsearch
- coverage-es-sdk.xml - SDK tests with Elasticsearch
- coverage-infinity-server.xml - Server code tested with Infinity
- coverage-infinity-sdk.xml - SDK tests with Infinity

Sources: .github/workflows/tests.yml:354-374, pyproject.toml:234-282
The pyproject.toml defines coverage settings:
Sources: pyproject.toml:234-282
| Argument | Purpose | Values |
|---|---|---|
NEED_MIRROR | Use China mirrors | 0 (default), 1 (enabled) |
HTTP_PROXY | Proxy for build-time downloads | URL |
HTTPS_PROXY | Secure proxy for downloads | URL |
Usage:
Sources: Dockerfile:6, .github/workflows/tests.yml:146
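Combining the arguments from the table, a build command might look like the following sketch; the proxy URL and image tag are placeholder examples.

```shell
# Sketch: enable China mirrors and a build-time proxy. The proxy URL and
# the image tag are placeholder examples, not real endpoints.
docker build \
  --build-arg NEED_MIRROR=1 \
  --build-arg HTTPS_PROXY=http://proxy.example.com:8080 \
  -t infiniflow/ragflow:nightly .
```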
Dockerfile Environment:
CI/CD Environment:
Sources: Dockerfile:25, Dockerfile:83-84, Dockerfile:188-192, .github/workflows/tests.yml:240
Pytest Options:
Sources: pyproject.toml:202-231
The build produces the following image tags:

- infiniflow/ragflow:{GITHUB_RUN_ID} (CI builds)
- infiniflow/ragflow:v0.24.0 (tagged releases)
- infiniflow/ragflow:latest (main branch)

Version Extraction:
The VERSION file is included in the production image and can be queried via the admin API.
Sources: Dockerfile:176-179, .github/workflows/tests.yml:143
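One plausible way to generate such a file at build time is from git metadata; this is a sketch, and the actual command in the Dockerfile may differ.

```shell
# Sketch: derive a VERSION file from git tags during the build.
# The exact command used by the real Dockerfile is an assumption.
git describe --tags --always > VERSION
```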
The Helm chart references the Docker image in values.yaml:
Sources: helm/values.yaml:78-82
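Such a reference typically follows standard Helm conventions; this sketch assumes the common repository/tag/pullPolicy keys, which may not match the chart's exact field names.

```yaml
# Sketch of the image block in helm/values.yaml; the key names follow
# common Helm chart conventions and are assumptions.
image:
  repository: infiniflow/ragflow
  tag: v0.24.0
  pullPolicy: IfNotPresent
```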
The Python SDK has its own isolated build configuration in sdk/python/:
During CI, the SDK is installed in editable mode for testing:
Sources: sdk/python/pyproject.toml:1-32, .github/workflows/tests.yml:209