The VM Benchmarking System measures the performance of all fuel-vm operations (300+ opcodes) to generate accurate gas costs for the network's consensus parameters. It uses the Criterion benchmarking framework to execute VM instructions under controlled conditions, collects timing data, and processes the results through linear regression to produce both constant-time and throughput-dependent cost models.
For information about executing transactions in the VM during normal operation, see Core Execution Engine. For details on how generated gas costs are applied in consensus parameters, see Consensus Parameters and Gas Costs.
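The benchmarking system performs three primary functions: executing VM instructions under controlled conditions, collecting timing data, and processing the results into gas costs for ConsensusParameters.
The system outputs gas costs in multiple formats, including Rust code, YAML, JSON, and directly as ConsensusParameters objects.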
Sources: benches/src/lib.rs1-570 benches/src/bin/collect.rs1-859
Diagram: VM Benchmarking Architecture - The system flows from benchmark configuration through execution to statistical analysis and output generation. VmBench configures each test, Criterion measures performance, and collect.rs processes results into gas costs.
Sources: benches/src/lib.rs110-187 benches/benches/vm.rs1-110 benches/src/bin/collect.rs32-249
Diagram: Benchmark Execution Sequence - Shows the complete flow from benchmark setup through measurement to cost generation. The key insight is that each benchmark executes the target instruction once to record state changes, then resets to that state for accurate repeated measurement.
Sources: benches/src/lib.rs369-569 benches/benches/vm.rs28-76
The VmBench struct is the primary configuration builder for VM benchmarks. It provides a fluent API to set up complex benchmark scenarios with contracts, storage state, and execution context.
| Field | Type | Purpose |
|---|---|---|
| params | ConsensusParameters | VM execution parameters (gas costs set to free for measurement) |
| gas_limit | Word | Script gas limit (default: u64::MAX - 1001) |
| memory | Option<MemoryInstance> | Pre-initialized VM memory (default: 123-byte pattern) |
| prepare_script | Vec<Instruction> | Instructions executed before measurement |
| post_call | Vec<Instruction> | Instructions executed after contract call setup |
| instruction | Instruction | The target instruction being benchmarked |
| db | Option<VmStorage<...>> | Database instance with deployed contracts |
| contract_code | Option<ContractCode> | Contract code to deploy |
| blob | Option<BlobCode> | Blob data to insert |
| inputs | Vec<Input> | Transaction inputs |
| outputs | Vec<Output> | Transaction outputs |
| witnesses | Vec<Witness> | Transaction witnesses |
Sources: benches/src/lib.rs110-131
Diagram: VmBench Builder Pattern - The fluent API allows composing complex benchmark scenarios. contract() and contract_using_db() are convenience methods that pre-configure contract deployment.
Sources: benches/src/lib.rs146-363
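The fluent style described above can be illustrated with a minimal, dependency-free sketch. `BenchConfig`, its fields, and its setter names are simplified stand-ins invented for this illustration and are not the actual VmBench definition; only the method-chaining shape and the default gas limit from the table are carried over.

```rust
/// Hypothetical, simplified stand-in for a benchmark configuration.
/// Instructions are modeled as plain strings to avoid external dependencies.
#[derive(Debug, Clone, PartialEq)]
pub struct BenchConfig {
    pub gas_limit: u64,
    pub prepare_script: Vec<String>,
    pub instruction: String,
}

impl BenchConfig {
    /// Start from the instruction under test, with defaults for everything else.
    pub fn new(instruction: &str) -> Self {
        Self {
            gas_limit: u64::MAX - 1001,
            prepare_script: Vec::new(),
            instruction: instruction.to_string(),
        }
    }

    /// Each setter takes and returns `self`, which is what allows chaining.
    pub fn with_gas_limit(mut self, gas_limit: u64) -> Self {
        self.gas_limit = gas_limit;
        self
    }

    /// Instructions to run before the measured one (hypothetical setter name).
    pub fn with_prepare_script(mut self, script: Vec<&str>) -> Self {
        self.prepare_script = script.into_iter().map(String::from).collect();
        self
    }
}
```

A call chain such as `BenchConfig::new("noop").with_gas_limit(1_000)` then reads as a single expression, which is the main appeal of the pattern when a benchmark needs many optional settings.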
The TryFrom<VmBench> implementation transforms configuration into a VmBenchPrepared instance ready for measurement:
Diagram: VmBench Preparation Flow - The preparation process ensures the VM is in the exact state needed for benchmarking. The key innovation is recording the instruction's state changes once, then resetting to that state for each measurement iteration.
Sources: benches/src/lib.rs369-569
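The record-once, reset-many idea can be sketched on a toy state machine. Everything below (`ToyVm`, `StateDiff`, `record_diff`, `reset_state`) is a hypothetical simplification and not the fuel-vm API; it assumes the instruction only overwrites existing registers.

```rust
/// Hypothetical, simplified VM state: a flat array of registers.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyVm {
    pub registers: Vec<u64>,
}

/// The slots an instruction touched, paired with their values before it ran.
#[derive(Debug, Clone, PartialEq)]
pub struct StateDiff {
    pub before: Vec<(usize, u64)>,
}

impl ToyVm {
    /// Run `instruction` once and record only the registers it changed.
    pub fn record_diff(&mut self, instruction: impl Fn(&mut ToyVm)) -> StateDiff {
        let snapshot = self.registers.clone();
        instruction(self);
        let before = snapshot
            .iter()
            .zip(self.registers.iter())
            .enumerate()
            .filter(|(_, (old, new))| old != new)
            .map(|(index, (old, _))| (index, *old))
            .collect();
        StateDiff { before }
    }

    /// Undo the instruction by restoring just the recorded slots.
    pub fn reset_state(&mut self, diff: &StateDiff) {
        for &(index, old_value) in &diff.before {
            self.registers[index] = old_value;
        }
    }
}
```

Because the diff holds only the touched slots, the cost of a reset scales with what the instruction changed rather than with the size of the whole state.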
BenchDb provides a production-like RocksDB environment for benchmarking storage-dependent operations:
Diagram: BenchDb Structure and Initialization - BenchDb creates a temporary RocksDB instance with realistic contract state. The _tmp_dir ensures cleanup on drop.
Sources: benches/benches/vm_set/blockchain.rs73-188
The BenchDb::new() method initializes contract storage with a configurable size:
- STATE_SIZE entries with sequential Bytes32 keys starting from zero
- STATE_SIZE balance entries, alternating between:
  - asset IDs derived from the contract (contract.asset_id(&sub_id))
  - asset IDs built directly from the sub ID (AssetId::new(*sub_id))

This setup ensures benchmarks test realistic storage access patterns with sufficient data to avoid cache effects.
Sources: benches/benches/vm_set/blockchain.rs81-143
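As a rough illustration of the initialization described above, the sketch below generates sequential 32-byte keys and alternates between two asset kinds. The key encoding (big-endian index in the first eight bytes), the small `STATE_SIZE` value, the `AssetKind` enum, and the parity rule are all assumptions made for this example; the actual code in blockchain.rs may differ.

```rust
/// Number of entries to create. The value here is arbitrary for the sketch.
pub const STATE_SIZE: u64 = 8;

/// Build a 32-byte key from an index. Placing the big-endian index in the
/// first eight bytes is an assumption of this sketch.
pub fn sequential_key(index: u64) -> [u8; 32] {
    let mut key = [0u8; 32];
    key[..8].copy_from_slice(&index.to_be_bytes());
    key
}

/// Which kind of asset ID a balance entry uses (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AssetKind {
    /// Derived from the contract and a sub ID.
    ContractDerived,
    /// Built directly from the raw sub ID bytes.
    Raw,
}

/// Alternate between the two kinds; which parity maps to which kind is arbitrary.
pub fn asset_kind_for(index: u64) -> AssetKind {
    if index % 2 == 0 {
        AssetKind::Raw
    } else {
        AssetKind::ContractDerived
    }
}

/// Produce the full list of (key, asset kind) pairs in insertion order.
pub fn build_state() -> Vec<([u8; 32], AssetKind)> {
    (0..STATE_SIZE)
        .map(|i| (sequential_key(i), asset_kind_for(i)))
        .collect()
}
```

With a big-endian encoding, numeric order and byte-wise key order agree, so the generated keys are already sorted.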
The to_vm_database() method creates a VmStorage instance suitable for VM execution:
This creates a transactional view at the specified block height, allowing benchmarks to simulate transaction execution against historical state.
Sources: benches/benches/vm_set/blockchain.rs168-187
The run_group_ref() function implements the core measurement loop using Criterion's custom iteration timing:
Diagram: Criterion Measurement Loop - Each iteration measures only the target instruction execution, with VM and database state reset between iterations. The nested transaction structure simulates production block processing.
Sources: benches/benches/vm.rs28-76
Nested Transaction Simulation: Benchmarks create three levels of storage transactions to match production:
original_db -> block_database_tx -> relayer_database_tx -> tx_database_tx
This ensures the benchmark measures the overhead of transactional storage access as it occurs during real block execution.
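To make the cost of nesting concrete, here is a small sketch of layered storage in which a read falls through each transaction layer until it finds a value. The `Layer` type and its methods are hypothetical stand-ins, not the fuel-core storage types; the sketch only shows why three nested transactions add lookup steps.

```rust
use std::collections::HashMap;

/// A hypothetical storage layer: pending writes on top of an optional parent.
pub struct Layer {
    pub changes: HashMap<String, u64>,
    pub parent: Option<Box<Layer>>,
}

impl Layer {
    /// The bottom layer, holding the committed data.
    pub fn root(data: HashMap<String, u64>) -> Self {
        Layer { changes: data, parent: None }
    }

    /// Wrap this layer in a new, empty transaction layer.
    pub fn into_transaction(self) -> Layer {
        Layer { changes: HashMap::new(), parent: Some(Box::new(self)) }
    }

    /// Check local changes first, then fall through to each parent.
    /// Returns the value and how many layers were visited.
    pub fn get(&self, key: &str) -> Option<(u64, usize)> {
        let mut layer = self;
        let mut depth = 1;
        loop {
            if let Some(value) = layer.changes.get(key) {
                return Some((*value, depth));
            }
            layer = layer.parent.as_deref()?;
            depth += 1;
        }
    }

    /// Drop pending writes in this layer only.
    pub fn reset_changes(&mut self) {
        self.changes.clear();
    }
}
```

In this model, a key that exists only in the base data is found after visiting four layers (three transactions plus the root), whereas a freshly written key is found in the first.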
High-Resolution Timing: Uses quanta::Clock, which reads the CPU timestamp counter where supported, for low-overhead, nanosecond-scale measurements that avoid a system call per reading.
State Reset Optimization: The diff from preparation contains only the changes made by the target instruction, allowing fast state reset via reset_vm_state() instead of full VM reconstruction.
Sources: benches/benches/vm.rs41-75
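The shape of such a custom-timed loop can be sketched with the standard library alone. This example substitutes `std::time::Instant` for quanta::Clock and uses a hypothetical helper, `time_iterations`, so it is an approximation of the pattern rather than the code in vm.rs.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Time `op` for `iters` iterations, excluding the cost of `reset`.
/// Only the operation sits between the two clock reads; the state is
/// restored outside that window so every iteration starts identically.
pub fn time_iterations<S>(
    state: &mut S,
    iters: u64,
    mut op: impl FnMut(&mut S),
    mut reset: impl FnMut(&mut S),
) -> Duration {
    let mut total = Duration::ZERO;
    for _ in 0..iters {
        let start = Instant::now();
        // black_box discourages the optimizer from eliding the work.
        op(black_box(&mut *state));
        total += start.elapsed();
        // Not timed: undo the operation's effects.
        reset(&mut *state);
    }
    total
}
```

Returning the accumulated duration, rather than letting the framework time the whole closure, is what keeps the reset work out of the reported figure.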
The collect.rs binary processes Criterion's JSON output to generate gas costs:
Criterion outputs two types of JSON events:
- benchmark-complete: contains timing statistics for a single benchmark
- group-complete: lists all benchmarks in a group
Sources: benches/src/bin/collect.rs277-325
Diagram: collect.rs Processing Pipeline - The collector accumulates timing data, performs statistical analysis, and generates gas costs in multiple output formats.
Sources: benches/src/bin/collect.rs142-249 benches/src/bin/collect.rs374-600
Operations with throughput measurements (e.g., memory copy with varying byte counts) require calculating cost as a function of input size. The linear_regression() function implements this:
Diagram: Linear Regression and Cost Classification - The algorithm analyzes the relationship between throughput (x) and execution time (y) to determine if an operation is light (many units per gas) or heavy (many gas per unit).
Sources: benches/src/bin/collect.rs608-758
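For readers unfamiliar with the technique, the following is a minimal ordinary least squares fit of execution time against throughput. It is a generic textbook formulation under the hypothetical name `fit_line`, and the linear_regression() function in collect.rs may compute or post-process its result differently.

```rust
/// Fit y = intercept + slope * x by ordinary least squares.
/// Returns `None` when fewer than two points are given or all x are equal.
pub fn fit_line(points: &[(f64, f64)]) -> Option<(f64, f64)> {
    if points.len() < 2 {
        return None;
    }
    let n = points.len() as f64;
    let mean_x = points.iter().map(|p| p.0).sum::<f64>() / n;
    let mean_y = points.iter().map(|p| p.1).sum::<f64>() / n;

    // Sum of squared deviations of x, and of the cross products with y.
    let sxx: f64 = points.iter().map(|p| (p.0 - mean_x).powi(2)).sum();
    if sxx == 0.0 {
        return None;
    }
    let sxy: f64 = points
        .iter()
        .map(|p| (p.0 - mean_x) * (p.1 - mean_y))
        .sum();

    let slope = sxy / sxx;
    let intercept = mean_y - slope * mean_x;
    Some((intercept, slope))
}
```

Here x would be the number of units processed (for example, bytes copied) and y the measured time, so the slope is the marginal cost per unit and the intercept approximates the fixed base cost.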
The dependent_cost() function classifies operations into three curve types:
| Type | Characteristic | Cost Model |
|---|---|---|
| Linear | first.amount() ≈ last.amount() (within 20% of regression) | Base cost + constant units per gas |
| Logarithm | first.price() > last.price() (efficiency improves with size) | Base at inflection point + linear after |
| Exponential | first.price() < last.price() (efficiency degrades) | Treated as logarithm with warning |
The function then determines if the operation is:
- Light: units_per_gas > 1.0 (e.g., can copy 493 bytes per unit of gas)
- Heavy: units_per_gas ≤ 1.0 (e.g., requires 515 gas per cleared storage slot)

Sources: benches/src/bin/collect.rs640-758
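The light/heavy split can be sketched as a small classification function. The `DependentCost` enum below is a simplified local stand-in, and the rounding and the assumption of a positive slope are choices made for this example, not a description of dependent_cost() itself.

```rust
/// Simplified local stand-in for a size-dependent gas cost.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DependentCost {
    /// Cheap per unit: one gas pays for several units.
    Light { base: u64, units_per_gas: u64 },
    /// Expensive per unit: each unit costs one or more gas.
    Heavy { base: u64, gas_per_unit: u64 },
}

/// Classify from a fitted slope expressed in gas per unit.
/// Assumes `gas_per_unit` is positive; rounding is a choice of this sketch.
pub fn classify(base: u64, gas_per_unit: f64) -> DependentCost {
    let units_per_gas = 1.0 / gas_per_unit;
    if units_per_gas > 1.0 {
        DependentCost::Light {
            base,
            units_per_gas: units_per_gas.round() as u64,
        }
    } else {
        DependentCost::Heavy {
            base,
            gas_per_unit: gas_per_unit.round() as u64,
        }
    }
}
```

Storing whichever of the two ratios is at least one keeps both cases representable with integers, which is why the boundary sits at exactly one unit per gas.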
The collect.rs binary supports four output formats via the --format flag:
- YAML: human-readable format for inspection
- JSON: machine-readable format with tagged unions
- Rust: generated Rust code for direct inclusion in the codebase
- ConsensusParameters: complete ConsensusParameters JSON with gas costs embedded
Sources: benches/src/bin/collect.rs220-248 benches/src/bin/collect.rs374-492
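One way a format selector like this could be parsed is shown below. The `OutputFormat` enum and the accepted strings ("yaml", "json", "rust", "consensus-parameters") are assumptions made for illustration; the exact values accepted by the --format flag should be checked against collect.rs.

```rust
use std::str::FromStr;

/// Hypothetical model of the four output formats.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OutputFormat {
    Yaml,
    Json,
    Rust,
    ConsensusParameters,
}

impl FromStr for OutputFormat {
    type Err = String;

    /// Parse a flag value, ignoring ASCII case.
    /// The accepted spellings are assumptions of this sketch.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "yaml" => Ok(Self::Yaml),
            "json" => Ok(Self::Json),
            "rust" => Ok(Self::Rust),
            "consensus-parameters" => Ok(Self::ConsensusParameters),
            other => Err(format!("unknown output format: {other}")),
        }
    }
}
```

Implementing `FromStr` lets callers write `value.parse::<OutputFormat>()` and handle an unknown format as an ordinary error.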
Sources: benches/src/bin/collect.rs32-58
The generated gas costs integrate into fuel-core's consensus parameters:
Diagram: Integration Flow - Benchmark results flow into the codebase through generated Rust code, which is then referenced in chain configuration and tests.
Sources: benches/src/default_gas_costs.rs1-208
The generated default_gas_costs() function provides baseline costs for network initialization:
These costs are typically loaded into ConsensusParameters during genesis block creation or test setup. Networks may adjust these values based on their specific performance characteristics.
Sources: benches/src/default_gas_costs.rs3-207
The VM benchmarks reuse memory instances to avoid allocation overhead during measurements. The VmBench::memory field pre-initializes memory with a test pattern, which is cloned for each VM instance.
The measurement loop creates nested transactions that match production behavior:
This ensures benchmarks measure realistic transaction overhead.
Rather than reconstructing the VM for each iteration, the system:
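- Records the target instruction's state changes once using vm.add_recording() and storage_diff()
- Restores VM state after each iteration with reset_vm_state(&diff)
- Discards accumulated database changes with reset_changes()

This approach reduces measurement overhead while maintaining accurate results.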
Sources: benches/benches/vm.rs32-75 benches/src/lib.rs522-561
The benchmark suite uses jemalloc as the global allocator for consistent memory allocation performance:
Jemalloc provides:
Sources: benches/benches/vm.rs24-26