Skip to main content

Erigon's monolithic node: 25% faster execution, 2x slower commits

· 19 min read
Stefan Kobrc
Founder RockLogic
StereumLabs AI
Artificial Intelligence

Before The Merge in September 2022, an Ethereum node was a single Proof-of-Work client. Geth, Parity, Erigon, Nethermind, or Besu, each handling everything from networking and the EVM to mining and chain selection inside one binary. The Merge split that role in two: an execution layer (EL) for the EVM, transactions, mempool, and world state, and a consensus layer (CL) for Proof-of-Stake fork choice, finality, attestations, and block proposals. The two halves talk to each other over the engine API, a JSON-RPC channel authenticated with a JWT secret.

Most operators run those two layers as two separate binaries that talk over the engine API. Caplin v3.3.10 collapses them back into a single Erigon binary. We added a Caplin standalone host to the StereumLabs fleet on April 8, 2026, and let it run alongside the classic Erigon plus CC pairings under identical mainnet conditions. This post reports what we measured: where the monolithic architecture wins, where it pays a tax, and where it leaves observability holes.

Caplin standalone vs classic Erigon and CC split: architectural diagram

Setup

The standalone deployment runs on a single bare-metal VM in our Vienna NDC2 cluster. Caplin runs as part of the Erigon binary with the consensus and execution layers sharing one executable, one MDBX database, and one combined libp2p plus devp2p networking stack. The host has 16 vCPU, 31.3 GB RAM, and 1.9 TB of NVMe storage, running Ubuntu 22.04.

The classic Erigon combos run as two separate binaries on two VMs per pairing. We split EC and CC by design across two hosts so that node-exporter measurements give us clean per-layer resource consumption. The EC side gets 12 vCPU, 15.6 GB RAM, and 1.9 TB. The CC side gets 8 vCPU, 11.7 GB RAM, and 527.8 GB. Combined, that is 20 vCPU, 27.3 GB RAM, and roughly 2.4 TB of disk per pairing.

ProfileArchitectureCPURAMDisk
Caplin standaloneEL plus CL combined in one Erigon binary16 vCPU31.3 GB1.9 TB
Classic Erigon plus CCTwo separate binaries (EL and CL)20 vCPU27.3 GB2.4 TB

That asymmetry matters. The standalone has 20% fewer vCPUs and 21% less disk, but 15% more RAM concentrated on one host. Any direct comparison has to account for it. We summed the resource consumption across the CC and EC VMs of each classic pairing and compared the total to the standalone single host.

We measured resource and stability metrics over a 7 day window using avg_over_time(...[7d:1h]) against our cold Prometheus datasource. Block processing latency comes from parsing 50,165 head updated log lines extracted via Elasticsearch, covering the trailing 24 hours across the standalone and all six Erigon plus CC pairings.

All six classic Erigon EC instances are paired with the same CC versions: Lighthouse v8.1.3, Lodestar v1.42.0, Nimbus v26.3.1, Prysm v7.1.3, Teku 26.4.0, and Grandine 2.0.4. Both the standalone and the classic ECs run Erigon v3.3.10. For the StereumLabs label conventions and metric availability per client, see Client metrics scope.

Resource footprint

The first question is whether one binary holding two layers consumes more or less than two binaries holding one layer each.

Resource footprint comparison: Caplin standalone vs sum of classic Erigon plus CC

MetricCaplin standaloneErigon plus CC rangeErigon plus CC median
CPU (vCPU cores)1.210.68 to 2.201.20
RAM used (GB)13.011.35 to 14.3213.09
Disk read (MB/s)5.05.17 to 17.015.82
Disk write (MB/s)18.114.69 to 31.4216.07
Network RX (MB/s)0.730.77 to 2.041.75
Network TX (MB/s)0.730.46 to 1.701.23
Swap usage (GB)3.832.47 to 4.583.61

Caplin standalone lands at or near the median of the Erigon plus CC pairings on every dimension except network, where it consumes well under half what the classic combos do. That gap is structural. In our split deployment, every engine_forkchoiceUpdated, engine_newPayload, and engine_getBlobsV2 call traverses the network between the CC and EC VMs. In the standalone binary, those calls become in-process function invocations that never touch a NIC.

Memory deserves a closer look. Caplin's container log reports Rss=26.1GB Pss=26.1GB Anonymous=11.7GB Swap=3.7GB, which seems to contradict the 13 GB we measured from node_memory_MemTotal_bytes minus node_memory_MemAvailable_bytes. Both numbers are correct. MDBX is memory-mapped, and roughly 14.5 GB of Erigon's RSS is shared-clean page cache that the kernel can reclaim instantly under pressure. MemAvailable correctly counts those pages as free. The 13 GB figure represents the working set that cannot be evicted without forcing reads from disk.

The 3.83 GB swap is also nothing unusual. Every classic Erigon EC swaps between 2.4 and 2.8 GB on its own, and the paired CC adds another 0 to 2.1 GB. Combined, the Erigon plus CC swap usage ranges from 2.5 to 4.6 GB, putting Caplin's 3.83 GB right in the middle. This is normal MDBX cold-page eviction behavior.

The peer story

Caplin standalone runs two separate P2P networks from a single binary. devp2p for execution layer peers, libp2p for the consensus layer. Each has its own peer count.

On the execution side, devp2p reports 64 connected peers, split evenly across the eth/68 and eth/69 protocol versions. The classic Erigon EC instances on the same fleet report 42 to 43 peers each. Same Erigon binary, same network, same configuration philosophy, but the standalone consistently maintains roughly 50% more EC peers. The likely cause is the larger CPU and memory budget on the standalone host. For more on how the various ECs behave at the P2P layer, see our EC P2P peering deep dive.

On the consensus side, the picture is harder to extract. Caplin v3.3.10 does not expose a current libp2p peer count gauge in Prometheus. The metric appears only in the periodic container log line P2P app=caplin peers=127. We had to switch to log parsing to recover the value.

CClibp2p peer count
Grandine201
Lodestar200
Lighthouse198 to 200
Teku98 to 100
Caplin standalone127
Nimbus81
Prysm~73

Caplin sits in the upper-middle band, below the aggressive peer collectors (Grandine, Lodestar, Lighthouse) but well above the more conservative implementations.

Observability gap

Caplin v3.3.10 ships a comprehensive libp2p_rcmgr_* metric family covering connection lifetime counters, but no current peers gauge. Operators have to scrape the P2P app=caplin peers=N log line, which is emitted once per minute. This is the most significant observability gap we encountered.

A side benefit of Prysm's connected_libp2p_peers{agent="..."} metric is that we can ask the Prysm fleet how many Caplin nodes it sees in the wider network. The answer is sobering: across all five working Prysm pairings combined, only 7 peers identify as agent="erigon/caplin". Caplin nodes remain rare in the mainnet peer graph. Some of those 7 may even be our own standalone connecting to multiple Prysm instances on the same datacenter network. Standalone Caplin is still a young deployment pattern.

Block processing: where Caplin wins

To compare execution performance precisely, we parsed every head updated log line from the trailing 24 hours on the standalone and all six classic Erigon EC instances. The line format gives us four numeric fields per imported block: arrival age relative to slot start, EVM execution time, MDBX commit time, and the moving average gas throughput.

After parsing, we have between 5,113 and 7,174 samples per host, totaling roughly 50,000 blocks. The dataset is large enough that the percentile estimates are stable.

EVM execution time

This is the time from the block being received and verified to its transactions being fully executed by the EVM, before the database commit phase.

Block execution time percentiles across Caplin standalone and six classic Erigon pairings

Hostnp50 (ms)p90 (ms)p99 (ms)
Caplin standalone714286519555233
Erigon plus Lodestar715798120495239
Erigon plus Lighthouse7118101421025396
Erigon plus Grandine7141102621615268
Erigon plus Nimbus7164103722085587
Erigon plus Prysm6847103921705532
Erigon plus Teku5113113526375371

At the median, Caplin executes blocks 12% faster than the next-fastest pairing (Lodestar) and 31% faster than the slowest (Teku). At the 90th percentile the lead is similar. At the 99th percentile the standalone still leads, though by a smaller margin.

The same story shows up in throughput. Mgas per second is the metric Erigon reports as a 12-block moving average inside the head updated line.

Hostp50 (Mgas/s)p90 (Mgas/s)p99 (Mgas/s)
Caplin standalone33.649.489.6
Erigon plus Lodestar29.343.080.7
Erigon plus Lighthouse28.342.074.8
Erigon plus Grandine28.241.475.1
Erigon plus Nimbus27.741.373.9
Erigon plus Prysm27.741.177.4
Erigon plus Teku25.838.793.3

Caplin processes 14% to 30% more gas per second than any classic Erigon pairing across the percentile range. The Erigon binary is identical on every host, so the speedup is some combination of three things. The standalone has four extra vCPUs (16 vs 12). It avoids engine API marshaling because the CL hands payloads to the EL through in-process calls instead of JSON-RPC. And state lookups during execution skip serialization across the engine API boundary. Our data cannot fully separate the architectural contribution from the hardware contribution, but both push in the same direction.

Block processing: where Caplin pays

Now the unflattering side. After execution, Erigon writes state changes to MDBX. In a classic deployment, this commit phase covers only EL state. In the standalone, the same phase also has to persist consensus-layer updates from the imported block, including beacon block headers, attestation aggregations, sync committee participation, and validator state changes.

Commit time percentiles: Caplin standalone vs classic Erigon pairings

Hostp50 (ms)p90 (ms)p99 (ms)
Erigon plus Nimbus118163234
Erigon plus Lighthouse120167240
Erigon plus Lodestar121166246
Erigon plus Teku121172297
Erigon plus Grandine123172294
Erigon plus Prysm126175274
Caplin standalone262362519

Caplin's commit takes more than twice as long. The penalty is consistent across percentiles and time of day. The most plausible explanation is workload size: the standalone has to write more state to MDBX during the same block boundary because it owns both layers. Whether that happens as one MDBX transaction, two, or several sequential ones is something we did not verify at the source level. The metric is the total time the commit phase reports, regardless of how many transactions sit inside it.

End-to-end: a wash at the median, slightly long-tailed

Adding execution and commit gives the total time to fully process a new head block. This is the number that matters for downstream signaling, RPC freshness, and validator readiness.

End to end head update time at p50: stacked execution plus commit

HostExec p50 + Commit p50 (ms)
Erigon plus Lodestar1102
Caplin standalone1127
Erigon plus Lighthouse1134
Erigon plus Grandine1149
Erigon plus Nimbus1155
Erigon plus Prysm1165
Erigon plus Teku1256

At the 50th percentile end-to-end, Caplin is essentially tied with the fastest classic pairings. Lodestar plus Erigon edges it out by 25 ms, which is well within sampling noise. Caplin beats every other pairing by 7 to 129 ms.

At the 99th percentile, the picture inverts slightly. The standalone's longer commit pushes its tail above some of the classic pairings, even though its execution tail is among the fastest. The total p99 spreads from 5485 ms (Lodestar plus Erigon) to 5821 ms (Nimbus plus Erigon), with Caplin at 5752 ms.

So the honest framing is this: the monolithic node executes faster, commits slower, and ends up roughly equivalent at the median. The execution win is real but the commit penalty cancels most of it. Caplin's advantage is that it gets there with one binary instead of two.

Block age, the time from slot start to when the EC first observes the block via gossip, is identical across all hosts: p50 of 3 to 4 seconds, p99 of 9 to 12 seconds. The network is not the differentiator here. Whatever happens after a block arrives is what we are measuring.

Stability and log hygiene

Over the 7 day window, the Caplin standalone process did not restart once. Its current uptime is 9.7 days. Internal RPC failure counter (rpc_failure) reads zero. The single warning we found in the entire week was a forkchoice was invalid message at 06:53 UTC on May 8, an isolated event with no follow-up symptoms.

We compared container log volumes and warning rates across the standalone and all six classic Erigon EC hosts. The classic ECs ship between 123,000 and 154,000 log lines per week. The standalone ships 100,744. Less log volume on a setup that handles both layers seems counterintuitive until you look at what the classic ECs are saying.

Container log volume and warning rate per host over 7 days

HostLines / 7dWARNERRORPanicWARN rate
Caplin standalone100,7441000.001%
Erigon plus Lodestar123,02016000.013%
Erigon plus Nimbus137,8572000.001%
Erigon plus Prysm139,585365000.261%
Erigon plus Lighthouse140,86522000.016%
Erigon plus Grandine141,02666000.047%
Erigon plus Teku154,00613000.008%

The dominant Erigon log pattern that appears in classic ECs but not in the standalone is [NewPayload] Handling new payload, fired once per engine API call from the CC. On the Teku-paired Erigon EC we measured 28 of these in 5 minutes. Extrapolated, that accounts for roughly 336 lines per hour, or about 56,000 lines per week of pure engine API chatter. Take that out and the classic ECs fall close to Caplin's volume.

Zero ERROR-level lines and zero panics across the entire Erigon family for 7 days. Both architectures are equally calm in steady state. The only meaningful difference is that the standalone has fewer log lines to ignore.

Database and storage efficiency

The Caplin standalone reports an MDBX chaindata size of 34.34 GB. Five of the six classic Erigon EC pairings sit between 20.05 GB (Lodestar pair, recently resynced) and 39.94 GB (Prysm pair, with extra unprune'd cache), with most clustering around 34.4 GB.

That is striking. The standalone holds the entire EL state plus the entire CL beacon state, including beacon blocks, attestations, sync committees, validator state, and randao mixes. Yet its database is the same size as an EC-only Erigon. Either MDBX's append-only B-tree representation is exceptionally efficient for the beacon-state schema, or the standalone backfill is not yet complete. Caplin v3.3.10 does not expose a backfill progress metric, so we cannot tell from outside the process which case applies.

This is the second observability gap worth flagging. Operators upgrading or restarting a Caplin standalone have no externally visible signal that historical CL state has finished downloading.

Caplin's CL-side health

The consensus side of Caplin exposes a different metric set than dedicated CCs, but the few metrics that exist look healthy:

MetricCaplin value
aggregate_quality_500.995
attestation_block_processing_time65 ms
active_validators_count913,144
committee_size438
current_epoch446,409

The attestation aggregation quality at the 50th percentile is 99.5%, meaning Caplin is folding nearly every unique-validator attestation it sees into the largest available aggregate. Attestation block processing time of 65 ms is faster than most CC implementations report for similar work.

Block proposal metrics (block_producer_delay) read NaN. The Caplin standalone host is one of the few in our fleet that runs without attached validator keys, so it never proposes blocks. The classic Erigon plus CC pairings on Vienna NDC2 do have validators attached on their CC sides, but that workload is on the CC, not on the EC, and it does not influence the EC commit numbers we measured. Comparing block proposal performance directly between the architectures requires a future investigation that runs Caplin standalone with attached validators. The Nimbus block-building work in our Nimbus v26.3.1 post gives a sense of what that comparison would look like once we have the data.

Architectural trade-offs that don't show up in metrics

Two operational properties of the standalone are real but not measurable from the outside.

The first is failure-domain coupling. With separate binaries (whether on one VM or two), an EC restart leaves the CC running and able to continue gossiping, observing the chain head, and serving its own RPC clients. The CC may flag the EC as offline, but it does not crash. In the standalone, an EC restart is a CC restart by definition. There is no degraded mode where one layer keeps operating while the other recovers.

The second is upgrade granularity. With separate binaries, you can update the EC while leaving the CC stable, or vice versa. This matters when one layer ships a critical fix and the other is in a regression window. The standalone bundles the upgrade. There is no path to upgrade only the EL or only the CL while keeping the other version pinned.

Neither point is a defect, both are intentional consequences of the design choice. They are worth weighing against the resource and stability gains.

Who is this for?

Erigon has a strong reputation as the most efficient archive-class Ethereum execution client. Its compact MDBX storage and pipelined sync turn full historical state into a workable handful of terabytes where many other clients require multiples of that. We documented some of that elsewhere in Execution client sync speed. Caplin standalone extends that strength by adding the consensus layer to the same binary. The operational unit becomes a single binary holding everything: full historical EVM state, beacon chain history, and a complete view of Ethereum from genesis to tip.

That makes Caplin standalone a particularly good match for:

  • Archive node operators wanting the smallest possible disk and CPU footprint for a complete node.
  • RPC providers prioritizing operational simplicity, predictable resource ceilings, and a single point of telemetry.
  • Indexers, subgraph operators, and data extractors that need to query both EL state and CL beacon data from the same host without engineering two-binary coordination.
  • MEV searchers and analytics platforms consuming real-time chain state plus historical data.
  • Solo stakers who are comfortable accepting failure-domain coupling in exchange for a smaller hardware bill and simpler runbooks.

It is less appropriate for:

  • High-availability validator infrastructure where you want the CC to keep functioning while the EC is being restarted, replaced, or upgraded.
  • Operators with strict change-management policies that pin EL and CL upgrades on independent cadences, particularly during regression windows.
  • Multi-CC redundancy patterns where running a different CC alongside the primary protects against bugs in one implementation. Caplin replaces the CC role, it cannot run alongside one.

For pure staking-only setups without external RPC consumers, the case for the standalone is genuinely close. The resource efficiency is real, the stability so far has been very good, and the operational surface area is smaller. The case against it comes down to one question: how much do you value being able to recover one layer while the other keeps working? If the answer is "a lot", the split deployment still has a meaningful edge. If the answer is "less than I value the simpler operations", the standalone earns its place.

Summary

DimensionResult
EVM execution speed🟢 Caplin +12 to +31% faster at p50
Gas throughput🟢 Caplin +14 to +30% higher across percentiles
MDBX commit time🔴 Caplin 2x slower (more state per block boundary)
End-to-end head update at p50⚪ Roughly equivalent to fastest classic pair
End-to-end at p99⚪ Mid-pack, slightly long-tailed
EL peer count🟢 Caplin +50% vs classic Erigon EC
CL peer count🟢 Upper-middle band (127 vs 73 to 201)
Resource footprint🟢 At or below median of combo sums; minus 40 to 60% network
Log volume🟢 18 to 35% less than classic ECs
Warning rate (7d)🟢 0.001% (tied with Nimbus pairing for lowest)
Stability🟢 0 restarts, 9.7d continuous uptime
libp2p peer gauge🔴 Not exposed in Prometheus, log-only
Backfill progress🔴 Not exposed
Failure-domain isolation🔴 Single binary for both layers
Update granularity🔴 EL and CL upgrade together

We will revisit this comparison once Caplin v3.3.10 has accumulated enough peer adoption to escape the seven-peers-in-our-Prysm-fleet floor, and once we can measure block proposal performance with attached validators on a standalone host. Until then, the standalone holds up well as a production candidate, the trade-offs are exactly where the architecture says they should be, and the audience it fits best is broader than the staking-only crowd usually assumes.

Methodology notes

Resource and stability metrics queried from our cold Prometheus datasource using avg_over_time(...[7d:1h]) instant queries evaluated at the end of the 7 day window. Host-level metrics filtered with device!~"lo|veth.*|docker.*|br.*" to exclude virtual interfaces. Process-level metrics filtered by job to separate Caplin and Erigon scrapes from node_exporter.

Block processing latencies extracted from Filebeat-shipped container logs via Elasticsearch. We parsed head updated log lines using regex matching for execution=, commit=, age=, and mgas/s= fields. Sample sizes are 5,113 to 7,174 blocks per host over 24 hours. ANSI color codes stripped before parsing.

Peer counts on the consensus side use a mix of Prometheus gauges and log-extracted values. Caplin standalone CL peers come from the P2P app=caplin peers=N log line emitted once per minute, since v3.3.10 does not expose a current libp2p peer gauge. Cross-fleet Caplin visibility uses Prysm's connected_libp2p_peers{agent="erigon/caplin"}.

For the StereumLabs label conventions and how to point your own dashboards at our data, see Build your own dashboards. For prior context on EC behavior under similar conditions, see our posts on EC P2P peering, EC sync speed, and Teku cross-version analysis.

If you want a custom version of this analysis for your own fleet or a deeper look at any single dimension, write to us at contact@stereumlabs.com.