Ethereum EC pruning: disk size and the 4-second deadline
A Geth node and an Erigon node sit in the same rack, follow the same Ethereum mainnet, take the same engine-API traffic from their paired consensus clients. Today the Geth host's root filesystem reports ~1,688 GB used. The Erigon host reports ~509 GB. That is a 3.3× spread, on the same chain, with no archive mode involved. Both are full nodes, both synced to mainnet head.
Disk is only half of it. The same pruning that bounds those footprints also runs on the engine-API hot path, and on some consensus-client pairings it pushes engine_newPayload past the 4-second sync-committee deadline. This post covers both halves: how big each client gets and whether you can see it pruning, then what that pruning costs in latency once a node is at the tip.
- Absolute disk numbers are upper bounds, not steady state. Every EC host here was rebuilt in the last 7 to 16 days (we checked boot times and on-chain history). A fresh node has just finished an initial sync, whose on-disk layout is looser than a long-running, compacted one. The relative ordering (Erigon < Ethrex < Nethermind ≈ Besu < Reth < Geth) comes from architecture, not host age; read "Geth ~1.7 TB" as "heaviest by a wide margin," not a capacity-planning figure.
- We run no validators on this fleet. The latency half measures engine_newPayload, the EC's slice of a validator's slot budget, not actual missed duties. Read those figures as a risk indicator, not a miss count.
- newPayload P99 is observer-dependent. Each CC calls notifyNewPayload at a different point in the slot, so the same EC reads differently per CC. Nimbus is the headline; the full matrix is in the latency section.
- Reth never finished syncing on this fleet, so it is excluded from the synced comparisons throughout (details in its section).
Key findings:
- 3.3× footprint spread on the same chain (root-FS used per host): Erigon ~509 GB, Ethrex ~598 GB, Nethermind ~1,225 GB, Besu ~1,261 GB, Reth ~1,352 GB (not synced), Geth ~1,688 GB.
- Five of six ECs grew 15 to 19 GB in the 7-day window (~2–3 GB/day, normal block ingestion). The cumulative spread comes from retention policy, not the past week's growth.
- Each EC exposes a different pruning surface, and Ethrex exposes none. Erigon's domain pruner fires ~501 k blocks events per host per week; Nethermind's Hybrid pruner runs continuously at ~1,497 nodes/sec/host; Geth path-mode prunes implicitly; Ethrex has no pruning metric at all.
- Reth is still in initial staged sync: its checkpoint sits ~70 to 130 k blocks behind the tip and is not advancing, so it is excluded from the synced comparisons.
- From Nimbus, three of five synced ECs cross the 4-second deadline on P99 engine_newPayload (Nethermind 4.44 s, Geth 4.67 s, Ethrex ≥5.00 s), but the ranking flips by consensus client. It is a pairing property, not a fixed EC property.
- Nethermind's spikes trace to two deliberate v1.37.1 choices. nethermind_pruning_time spikes past 5 s, over the deadline, entirely from the in-memory pruner.
- Sync-committee duties get hit before attestations. A late sync-committee message misses the next block's sync_aggregate outright; a late attestation still has one slot of grace.
What we measured: the StereumLabs Ethereum execution-client fleet
The bare-metal NDC2 fleet in Vienna runs the full pairing matrix: every consensus client (Grandine, Lighthouse, Lodestar, Nimbus, Prysm, Teku) paired with every execution client (Besu, Erigon, Ethrex, Geth, Nethermind, Reth). The GCP comparator cohort runs only Geth on the EC side. Host counts: Geth 13 (7 NDC2 + 6 GCP), Besu 7, all others 6. Versions current at query time: Geth v1.17.3, Nethermind 1.37.2 (PruningMode=Hybrid), Reth v2.2.0, Besu 26.5.0, Erigon v3.4.1, Ethrex 13.0.0. Hardware setup is documented in the StereumLabs measurement stack post.
All counters come from our prometheus-cold datasource over a rolling 7-day window ending around 2026-05-28 12:00 UTC. Per-host averages divide each EC's totals by its live host count. Queries are in the methodology section. (Host ages range 7 to 16 days, per the caveat above; Besu and Ethrex hosts are youngest at ~8 days.)
Every EC exposes its own pruning vocabulary on Prometheus. No two share names, and one exposes nothing at all:
- Geth (path-mode): pathdb_gc_node_count, pathdb_history_state_bytes_data, pathdb_history_state_time. No explicit prune-event counter; pruning is structural to path-mode.
- Nethermind (Hybrid): nethermind_pruned_persisted_nodes_count, nethermind_deep_pruned_persisted_nodes_count, nethermind_pruning_time, nethermind_state_db_in_pruning_writes, nethermind_pruning_cutoff_blocknumber. The most complete surface, with an explicit continuous-vs-deep split.
- Besu (Bonsai): besu_pruner_trie_log_added_to_prune_queue_total, besu_pruner_trie_log_pruned_from_queue_total, besu_pruner_trie_log_pruned_orphan_total, plus the besu_executors_ethscheduler_chaindatapruner_* thread-pool view. Trie-log queue only.
- Erigon: prune_seconds_count (per type label), domain_prune_size, domain_prunable, domain_pruning_progress, domain_prune_took_bucket. Domain-based, the cleanest cross-cutting surface.
- Reth: reth_pruner_duration_seconds, reth_pruner_segments_duration_seconds, reth_pruner_segments_highest_pruned_block. Per-segment timing; only SenderRecovery is populated here.
- Ethrex: no pruning metric at all. Operators derive pruning state from datadir_size_bytes.
So "did my client prune in the last hour" is answerable for some ECs and unanswerable for others, before we look at a single number.
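As a rough illustration of that asymmetry, the lookup below maps each EC to a counter a rate() query could be built on, using only metric names listed above. The function name and the default window are our own choices for the sketch, and Reth's counter is included even though it read 0 or 1 per host on this fleet.

```python
from typing import Optional

# Prune-event counters per EC, taken from the metric lists above.
# None means the client exposes no counter that a rate() can be built on.
PRUNE_EVENT_COUNTER = {
    "geth": None,  # path-mode: pruning is structural, no event counter
    "nethermind": "nethermind_pruned_persisted_nodes_count",
    "besu": "besu_pruner_trie_log_pruned_from_queue_total",
    "erigon": "prune_seconds_count",
    "reth": "reth_pruner_duration_seconds_count",  # 0 or 1 per host here
    "ethrex": None,  # no pruning metric at all
}


def prune_rate_query(ec_client: str, window: str = "1h") -> Optional[str]:
    """Return a PromQL rate() expression, or None if no counter exists."""
    metric = PRUNE_EVENT_COUNTER[ec_client.lower()]
    if metric is None:
        return None
    return f"rate({metric}[{window}])"
```

A None result is the "unanswerable" case: for those clients the only fallback is a disk-size signal.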
How big are the datadirs?
The numbers below come from node_filesystem_* (root-FS used per host), not the EC-internal size metrics. Those internal metrics each report only a slice of the datadir and understate it by hundreds of GB; the methodology has the per-client gap analysis (it is what produced the wrong 26 GB Erigon figure). On these hosts the EC datadir lives on / and the CC runs elsewhere, so root-FS is EC datadir plus ~5 to 10 GB of OS and logs.
Every EC's own size metric understates its real disk use. Erigon's db_table_size_bytes sees 26 GB of a 509 GB datadir; Besu's RocksDB metric reports 0.
| EC | root-FS used (current) | range across hosts | 7-day net delta | Sync state |
|---|---|---|---|---|
| Erigon v3.4.1 | ~509 GB | 444 – 548 GB | +17.7 GB | synced ✓ |
| Ethrex 13.0.0 | ~598 GB | 597 – 598 GB | +15.3 GB | synced ✓ |
| Nethermind 1.37.2 | ~1,225 GB | 1,218 – 1,226 GB | +16.2 GB | synced ✓ |
| Besu 26.5.0 | ~1,261 GB | 1,254 – 1,262 GB | +18.6 GB | synced ✓ |
| Reth v2.2.0 | ~1,352 GB | 1,214 – 1,416 GB | +112 GB | not synced (staged sync, ~127k behind) |
| Geth v1.17.3 | ~1,688 GB | 1,633 – 1,733 GB | +16.2 GB | synced ✓ |
A head-block metric is not a reliable sign that an EC is actually synced (a client can follow the consensus layer's forkchoice head optimistically while still syncing), so we confirmed each one from its own container logs, not just its metrics. Five of six are importing at the tip in real time: Geth logs Imported new potential chain segment, Nethermind Synced Chain Head to 25213…, Besu AbstractEngineNewPayload | Imported #25,213,…, Erigon Post-Forkchoice prune … initialCycle=false, and Ethrex [METRIC] BLOCK 25213… exec … 99% BOTTLENECK (slow per block, but genuinely executing at the tip). Reth is the exception and gets its own section below; read its 1,352 GB as mid-sync state.
The 3.3× spread comes from retention architecture: Erigon drops account, storage, and block-body history once fork-choice no longer needs it and keeps no ancient store, while Geth path-mode adds a continuously growing ancient store on top of its bounded state journal. The per-week growth is the same ~16 GB across the synced clients (normal block ingestion); only the starting footprint differs. The per-client sections below break down each mechanism. It is the storage-side version of what the EC sync-speed comparison found on the same fleet: same hardware, same chain, very different footprints.
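A quick way to check that the spread cannot come from recent growth is to express the Geth-to-Erigon gap in weeks of ingestion. The sketch below uses the table's figures for the synced clients; the helper names are ours, and dividing by the current weekly delta is a simplification (growth has not been constant over the chain's history).

```python
# Root-FS used and 7-day net delta per synced EC, from the table above (GB).
ROOT_FS_GB = {"Erigon": 509, "Ethrex": 598, "Nethermind": 1225, "Besu": 1261, "Geth": 1688}
WEEKLY_DELTA_GB = {"Erigon": 17.7, "Ethrex": 15.3, "Nethermind": 16.2, "Besu": 18.6, "Geth": 16.2}


def footprint_spread(sizes_gb):
    """Ratio of the largest footprint to the smallest."""
    return max(sizes_gb.values()) / min(sizes_gb.values())


def daily_growth_gb(weekly_gb):
    """Per-day growth implied by each 7-day net delta."""
    return {ec: delta / 7 for ec, delta in weekly_gb.items()}


def gap_in_weeks(sizes_gb, weekly_gb, big, small):
    """How many weeks of `big`'s current growth the size gap represents."""
    return (sizes_gb[big] - sizes_gb[small]) / weekly_gb[big]


if __name__ == "__main__":
    print(f"spread: {footprint_spread(ROOT_FS_GB):.1f}x")
    print(f"gap: {gap_in_weeks(ROOT_FS_GB, WEEKLY_DELTA_GB, 'Geth', 'Erigon'):.0f} weeks")
```

At the current rate the gap is worth more than a year of block ingestion, which is why one week of near-identical deltas says nothing about the cumulative difference.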
Geth: path-mode prunes implicitly
Geth v1.17.3 runs path-based storage. Historical trie nodes for each state transition flow into a bounded "state history" journal and drop out as new history is written. There is no prune event; the boundedness is the prune. Three metrics show it:
- pathdb_gc_node_count: ~300 M nodes garbage-collected per host since startup (291 M to 301 M across our 13 hosts), growing monotonically as path-mode GC trims old tries.
- pathdb_history_state_bytes_data: 1.33 GB per host, barely moving. The state-history journal, bounded by --history.state (default 90,000 blocks).
- pathdb_history_state_time p99: 510 µs to 1.29 ms per host (one GCP host at 1.66 ms). Time to write a single state-history block.
Geth exposes no per-cycle prune counter and no cutoff block. The only health signal is pathdb_history_state_time p99 climbing over hours, which means the pruner is falling behind writes, though it will not tell you whether the pruner or the writes are the slow part.
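One way to turn "p99 climbing over hours" into something checkable is a least-squares slope over hourly p99 samples. This is a minimal sketch: the sample format and the 0.05 ms/hour threshold are hypothetical, not values from Geth or from our dashboards, and should be tuned against your own baseline.

```python
def slope_per_hour(samples):
    """Least-squares slope of (hour, p99_ms) pairs, in ms per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one timestamp")
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return num / den


def is_climbing(samples, min_slope_ms_per_hour=0.05):
    """True if p99 trends upward faster than the (hypothetical) threshold."""
    return slope_per_hour(samples) > min_slope_ms_per_hour
```

A slope test is less noisy than comparing first and last samples, but it still cannot say whether the pruner or the write load is the slow part.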
Geth's +16.2 GB/week is ancient-store growth, not state. The ancient store holds frozen headers, bodies, and receipts that path-mode never touches, and it makes up the bulk of Geth's ~1,688 GB footprint. Bounding total Geth disk needs the separate --history.transactions and --history.logs flags, left at defaults here.
Nethermind: the most complete pruning surface
Nethermind 1.37.2 with PruningMode=Hybrid is the only EC that exposes both continuous and deep pruning as distinct counters. The series that matter:
- nethermind_pruned_persisted_nodes_count: ~1.13 billion increments in 7 days, ~1,497 nodes/sec/host (range 1,478 to 1,517).
- nethermind_deep_pruned_persisted_nodes_count: full-prune count. Zero on every host.
- nethermind_state_db_in_pruning_writes: 0.33 to 0.41, so the state DB is in a pruning write roughly a third of the time.
- nethermind_pruning_cutoff_blocknumber: 0 across every host. On Hybrid, that means the deep-prune branch has never fired.
- nethermind_pruning_time, nethermind_pruning_cutoff_timestamp: cycle duration and cutoff in epoch seconds.
So the continuous in-memory pruner does all the work in our window and the deep prune has not been needed. On nethermind_db_size the state DB stays roughly flat, but the full datadir (receipts, indexes, logs) still grows ~16 GB/week, so monitor root-FS for capacity, not nethermind_db_size. That continuous pruner is also what makes engine_newPayload spike on this version, covered in the latency section below.
Nethermind is the easiest EC to alert on: rate(nethermind_pruned_persisted_nodes_count[5m]) + rate(nethermind_deep_pruned_persisted_nodes_count[5m]) from the public Nethermind dashboard gives pruning throughput in nodes/sec. That rate dropping while disk grows means the pruner stalled.
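The stall condition can be sketched outside Prometheus as well. The helpers below compute a per-second rate from two counter samples (treating a drop as a restart, similar in spirit to how rate() handles resets) and flag a stall when throughput is near zero while disk still grows. The function names and the 1 node/sec floor are assumptions for illustration.

```python
def counter_rate(prev, curr, seconds):
    """Per-second rate between two samples of a monotonically increasing counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = curr - prev
    if delta < 0:
        # Counter went backwards: assume a restart and count from zero.
        delta = curr
    return delta / seconds


def pruner_stalled(prune_rate, disk_delta_bytes, min_rate=1.0):
    """Stalled = almost no pruning throughput while the disk keeps growing."""
    return prune_rate < min_rate and disk_delta_bytes > 0
```

With a 5-minute window, a healthy host on this fleet would show roughly 1,497 nodes/sec, far above any sensible floor.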
Reth: still in initial sync, so its pruning is not yet observable
Reth v2.2.0 has not finished its initial staged sync on any of our six hosts, and still had not as of this writing. Its container logs show the sync::stages pipeline still running rather than importing at the tip, and the metrics agree: the staged-sync checkpoint sits ~75 to 135 k blocks behind the tip and has not advanced in 12 hours on any host, the final pipeline stages (Finish, TransactionLookup, the history indexes) are still at 0, reth_blockchain_tree_canonical_chain_height is 0, and every newPayload returns SYNCING, never VALID. The process is alive (CPU and disk I/O are non-zero, with static-file rewrites both adding and reclaiming tens of GB), it just is not catching up. We cannot say from one fleet whether the stall is the client or something in our deployment, so we are not drawing any conclusion about Reth beyond "not comparable here."
So Reth's pruning is genuinely unmeasured. Its pruner has only ever run the one startup pass on the SenderRecovery segment (reth_pruner_duration_seconds_count is 0 or 1 per host). The ~1,352 GB on disk is mid-sync state, not a steady-state footprint, and Reth's documented engine backpressure (reth_consensus_engine_beacon_backpressure_active) never activates because the node never reaches steady state. We will rerun the comparison once these hosts finish syncing.
Besu: trie-log pruning is the only window into Bonsai cleanup
Besu 26.5.0 runs Bonsai. Its pruning surface is narrower than Nethermind's and focused on the trie-log queue, the per-block diffs Bonsai keeps for reorg handling and historical queries:
- besu_pruner_trie_log_pruned_from_queue_total: ~352 k trie logs pruned in 7 days, ~50,343 per host per day.
- besu_pruner_trie_log_added_to_prune_queue_total: logs enqueued. The queue-minus-pruned difference is the backlog.
- besu_pruner_trie_log_pruned_orphan_total: logs marked for deletion with no matching entry. A corruption signal; expect near zero.
- besu_executors_ethscheduler_chaindatapruner_*: thread-pool view, confirms the pruner scheduler is alive.
Besu exposes no Bonsai state size, no cutoff block, and no total disk figure (besu_rocksdb_rocks_db_files_size_bytes returns 0 here, an instrumentation gap). You read total Besu disk from node_filesystem_*. And the 50 k logs/day figure does not convert to bytes, since trie-log size varies per block with transaction mix, so it is a heartbeat, not a capacity input.
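The backlog heartbeat described above reduces to a subtraction and a trend check. In this sketch each sample is a hypothetical (added_total, pruned_total) pair scraped at successive times; the strict "every step grows" rule is our own simplification.

```python
def trie_log_backlog(added_total, pruned_total):
    """Trie logs enqueued for pruning but not yet pruned."""
    return added_total - pruned_total


def backlog_growing(samples):
    """True if the backlog rises across every consecutive pair of samples."""
    backlogs = [trie_log_backlog(added, pruned) for added, pruned in samples]
    return len(backlogs) >= 2 and all(
        later > earlier for earlier, later in zip(backlogs, backlogs[1:])
    )
```

A flat backlog with both counters advancing is the healthy shape; a backlog that only rises means the pruner is not keeping up with enqueues.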
Erigon: the smallest footprint in the fleet, by a wide margin
Erigon v3.4.1 runs domain-based pruning across three named domains: blocks (block bodies and headers), state (account and storage state), and bor (Polygon-specific; always 0 on Ethereum mainnet). The Prometheus surface follows that decomposition:
- prune_seconds_count{type=blocks}: ~501 k prune events per host in 7 days (460 k to 528 k), about one every 1.2 seconds.
- prune_seconds_count{type=state}: ~50,775 events per host in 7 days, about one every 12 seconds.
- prune_seconds averages: ~1 ms per blocks prune, ~12 ms per state prune.
- domain_prune_size, domain_prunable, domain_pruning_progress: per-domain queue gauges, for confirming the backlog stays bounded.
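The intervals quoted above follow directly from the 7-day event counts, and the same inputs give a rough estimate of how much wall time the pruner occupies. A small worked sketch, with helper names of our own choosing and the average durations taken as the rounded ~1 ms and ~12 ms figures:

```python
WINDOW_S = 7 * 24 * 3600  # the 7-day window, in seconds


def mean_interval_s(events, window_s=WINDOW_S):
    """Average gap between prune events over the window."""
    return window_s / events


def busy_fraction(events_and_avg_s, window_s=WINDOW_S):
    """Share of wall time spent pruning, given (event_count, avg_seconds) pairs."""
    return sum(count * avg_s for count, avg_s in events_and_avg_s) / window_s


if __name__ == "__main__":
    print(f"blocks: one prune every {mean_interval_s(501_000):.1f} s")
    print(f"state:  one prune every {mean_interval_s(50_775):.1f} s")
    share = busy_fraction([(501_000, 0.001), (50_775, 0.012)])
    print(f"pruning occupies about {share:.2%} of wall time")
```

On these rounded inputs the pruner is busy for well under 1% of the window, which fits the picture of many small prunes rather than occasional long ones.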
Erigon's root-FS sits at ~509 GB, ~3× below the heaviest EC. Its weekly growth (+17.7 GB) matches everyone else's; only the cumulative footprint is smaller, because of the aggressive history retention. (The 26 GB an earlier draft quoted was the MDBX-only db_table_size_bytes; the snapshot and history files make up the rest.)
A Caplin standalone host (Erigon's built-in CL) shows the same prune_seconds_count series under job="caplin", and its root-FS sits at 525 GB, only ~16 GB above the ~509 GB plain-Erigon average despite carrying the CL in the same binary. Details in the Erigon + Caplin standalone vs classic post.
Ethrex: pruning is invisible from Prometheus
Ethrex 13.0.0 exposes no pruning metric. list_prometheus_metric_names for ethrex_.* returns 31 series (peer counts, sync stages, discovery), none describing pruning, retention, or reclamation. The only disk signal is datadir_size_bytes, which matched node_filesystem_* to within ~100 GB and grew +15.3 GB/week per host.
Root-FS sits at ~598 GB, smaller than every EC except Erigon, so some retention bound is clearly in effect. But you cannot see when Ethrex prunes, how much it reclaims, or whether the pruner stalled. The Ethrex dashboard has one disk panel ("DB Size On Disk Total") and no pruning section.
For operators that is the tradeoff: a young Rust client with the smallest metrics surface and no first-class pruning observability. It is fixable by adding instrumentation to the existing metrics endpoint, not a storage-layer limitation. We are happy to coordinate the metric shape with the Ethrex team.
Pruning vs the 4-second sync-committee deadline
The footprints above are the static picture. The operational question is what the pruner does to the engine-API hot path while a node is at the tip. Sync-committee messages and attestations both need to be ready ~4 seconds into the slot. A P99 engine_newPayload above that ceiling means a tail of slots where the CC builds duties against a stale head. Measured from each Nimbus host on engine_api_request_duration_seconds_bucket{request="newPayload"}, 7-day window:
| EC | Nimbus P99 newPayload | vs. 4 s deadline | Sync state |
|---|---|---|---|
| Reth v2.2.0 | 0.23 s | excluded | not synced (returns SYNCING) |
| Besu 26.5.0 | 1.07 s | safe | synced |
| Erigon v3.4.1 | 2.19 s | safe | synced |
| Nethermind 1.37.2 | 4.44 s | over deadline | synced |
| Geth v1.17.3 | 4.67 s | over deadline | synced |
| Ethrex 13.0.0 | ≥5.00 s | top-bucket capped, worse | synced |
Three of the five synced clients sit over 4 s from Nimbus. Ethrex's value is at the histogram's top bucket (5.00 s), so the true P99 is higher; we write ≥5.00 s, and it is a slow-block-validation problem rather than pruning per se. The order tracks the per-client mechanisms above: Besu prunes off-thread, Erigon prunes in a fast stage, and Nethermind, Geth, and Ethrex all touch the engine thread during pruning. Reth's 0.23 s is the SYNCING return path, not block execution, so it is excluded.
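The top-bucket cap is a property of how a quantile is estimated from cumulative buckets: when the requested rank falls in the +Inf bucket, the estimate is the highest finite bound. The sketch below re-implements that linear-interpolation estimate in the style of Prometheus' histogram_quantile. Only the 5.0 s top bound comes from our data; the other bounds and all counts are made up for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) buckets.

    The list must include a math.inf bucket holding the total count.
    """
    buckets = sorted(buckets)
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        raise ValueError("need at least one finite bucket plus +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[idx]
    if upper == math.inf:
        # Rank lies beyond the last finite bound: the estimate is capped there.
        return buckets[-2][0]
    lower, prev_count = buckets[idx - 1] if idx > 0 else (0.0, 0)
    in_bucket = count - prev_count
    if in_bucket == 0:
        return upper
    # Linear interpolation inside the bucket that contains the rank.
    return lower + (upper - lower) * (rank - prev_count) / in_bucket


# Hypothetical layouts: 2% of calls above 5 s (capped) vs 0.5% above 5 s.
CAPPED = [(1.0, 900), (2.0, 950), (5.0, 980), (math.inf, 1000)]
RESOLVED = [(1.0, 900), (2.0, 950), (5.0, 995), (math.inf, 1000)]
```

In the capped layout more than 1% of observations exceed 5 s, so P99 reads exactly 5.0 no matter how slow the tail really is, which is why the Ethrex figure is written with ≥.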
The ranking flips by observer
Four CCs export a newPayload histogram, under three metric names: Nimbus (engine_api_request_duration_seconds), Lighthouse (execution_layer_request_times), and Prysm/Lodestar (new_payload_v1_latency_milliseconds). Teku does not. Each calls notifyNewPayload at a different point relative to gossip validation, so the same EC reads differently:
| EC | Nimbus P99 | Lighthouse P99 | Prysm P99 | Notes |
|---|---|---|---|---|
| Reth | 0.23 s | 0.97 s | 4.00 s (capped) | all not synced, SYNCING on every CC |
| Besu | 1.07 s | 4.86 s | 3.91 s | synced |
| Erigon | 2.19 s | n/a | 3.49 s | synced |
| Nethermind | 4.44 s | 1.40 s | 1.05 s | synced |
| Geth | 4.67 s | 4.63 s | 3.41 s | synced |
| Ethrex | ≥5.00 s | 4.49 s | 3.55 s | synced |
The same EC lands on opposite sides of the 4 s line depending on the CC watching. Nethermind breaches on Nimbus only; Besu breaches on Lighthouse.
Across the synced clients:
- Nethermind: 4.44 s on Nimbus, ~1 s on Lighthouse and Prysm. Nimbus calls newPayload later in the slot, into Nethermind's in-memory pruning window between blocks.
- Besu: 1.07 s on Nimbus, ~4 to 5 s on Lighthouse. The Bonsai path meets the Lighthouse calling pattern differently.
- Geth: over 3.4 s on every CC. Its PBSS commit cost lands on the engine thread whenever the call arrives.
- Ethrex: 3.55 s to ≥5.00 s on every CC. Different cause (slow block validation, not pruning), same outcome for duty timing.
So "is my EC fast enough?" depends on the pairing, not the EC alone. The rule: if any CC↔EC pair holds a P99 above 4 s, that pair is the one to investigate for late sync-committee messages. Re-measure with your own CC's histogram first. Latency that swings this much by consensus client is a recurring pattern on this fleet; we saw the same shape in blob verification latency across consensus clients.
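Applying that rule to the cross-CC table is a simple filter. The sketch below encodes the synced rows as (CC, EC) pairs, omits Reth (not synced) and the Erigon/Lighthouse cell (n/a), and records the capped Ethrex/Nimbus value as 5.00 even though the true figure is higher.

```python
DEADLINE_S = 4.0

# P99 newPayload in seconds per (CC, EC) pair, from the cross-CC table above.
P99_S = {
    ("Nimbus", "Besu"): 1.07, ("Lighthouse", "Besu"): 4.86, ("Prysm", "Besu"): 3.91,
    ("Nimbus", "Erigon"): 2.19, ("Prysm", "Erigon"): 3.49,
    ("Nimbus", "Nethermind"): 4.44, ("Lighthouse", "Nethermind"): 1.40,
    ("Prysm", "Nethermind"): 1.05,
    ("Nimbus", "Geth"): 4.67, ("Lighthouse", "Geth"): 4.63, ("Prysm", "Geth"): 3.41,
    ("Nimbus", "Ethrex"): 5.00, ("Lighthouse", "Ethrex"): 4.49, ("Prysm", "Ethrex"): 3.55,
}


def pairs_over_deadline(p99_s, deadline_s=DEADLINE_S):
    """Return the (CC, EC) pairs whose P99 exceeds the deadline, sorted."""
    return sorted(pair for pair, value in p99_s.items() if value > deadline_s)


if __name__ == "__main__":
    for cc, ec in pairs_over_deadline(P99_S):
        print(f"{cc} + {ec}: {P99_S[(cc, ec)]:.2f} s")
```

On these figures every breach involves Nimbus or Lighthouse as the observer, and no EC breaches on all three, which is the pairing point in compact form.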
Why Nethermind ships this trade-off (the v1.37.1 PRs)
Nethermind 1.37.2 inherited two 1.37.1 changes that trade engine-thread serialization for lower queue latency:
- nethermind#10479, "Run newPayload inline" (Feb 2026). Enables AllowSynchronousContinuations, so newPayload runs on the engine thread when the queue is empty. It saves a context switch, but any in-flight prune on that thread now extends newPayload directly. This is why a Nethermind behind a CC that calls at the wrong moment hits 4 to 7 s spikes.
- nethermind#10247, "Bump default pruning cache by 512MB" (Jan 2026). Pruning.CacheMb 1280 → 1792, Pruning.DirtyCacheMb 1024 → 1536. The PR's own note: "does take a bunch longer; but also is between blocks so should be ok." On our fleet that assumption breaks; the longer cycle does not always finish between blocks under load.
- nethermind#10227 added the two knobs operators now need: Pruning.PruneDelayMilliseconds (default 75 ms, gap between prune iterations) and Merge.PostBlockGcDelayMs (default SecondsPerSlot/8 = 1500 ms).
Fleet evidence: nethermind_pruning_time baselines at ~700 ms with regular spikes to 4,000 to 6,300 ms, and nethermind_pruning_cutoff_blocknumber = 0 for the whole window. No deep prune ran; every spike is the in-memory pruner alone. The Nethermind pairing turning up as the slow one is a repeat finding on this fleet: our Teku version comparison flagged persistently elevated block-import delays on the Teku + Nethermind pairing across three releases, so this is not purely a Nimbus artifact.
Over 24 h on one Nethermind host, the baseline is harmless; the spikes are not. The 5.06 s cycle alone overruns the slot budget, and a newPayload landing in that window inherits the delay.
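A toy model makes the "inherits the delay" point concrete. It assumes, as a deliberate simplification, that a newPayload arriving during a prune waits for the whole remaining cycle and then executes; the real pruner yields between iterations, so treat this as an upper-bound sketch. The 0.3 s execution time is hypothetical.

```python
def observed_new_payload_s(call_at_s, exec_s, prune_start_s, prune_len_s):
    """Latency the CC would see if newPayload waits out an in-flight prune."""
    prune_end_s = prune_start_s + prune_len_s
    in_flight = prune_start_s <= call_at_s < prune_end_s
    wait_s = prune_end_s - call_at_s if in_flight else 0.0
    return wait_s + exec_s


if __name__ == "__main__":
    # Baseline ~0.7 s cycle vs the 5.06 s spike, call arriving 0.5 s into the prune.
    for prune_len in (0.7, 5.06):
        latency = observed_new_payload_s(0.5, 0.3, 0.0, prune_len)
        print(f"prune {prune_len:.2f} s -> newPayload {latency:.2f} s")
```

Under this model the baseline cycle adds a fraction of a second, while the long cycle alone pushes the call past 4 s regardless of how fast the block executes.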
Open Nethermind tracker items:
- nethermind#10557 (open, stability): production validator infrastructure issue for this pattern.
- nethermind#6893 (open): FCU/newPayload timeouts with Lighthouse after restart.
- nethermind#8433 (open since 2025-03): memory-config recommendations, no conclusion.
- nethermind#9149 (closed via PR #9588): proposed pausing in-memory pruning; the merged PR only marks finalized blocks, it does not pause.
In-flight PRs that may help in a later release: nethermind#11539 (hot-path allocations), nethermind#10952 (PersistenceManager prune leak), nethermind#11636 (double execution in newPayloadWithWitness).
Where each client runs pruning relative to the engine thread
| EC | Where pruning runs | Engine-thread impact | Design rank |
|---|---|---|---|
| Reth | Persistence task gated by explicit backpressure | None unless backpressure trips (unmeasured here, hosts unsynced) | 1 (design) |
| Besu | Bonsai trie-log pruning on a separate thread pool | None on engine path | 2 |
| Erigon | Own stage in the execution goroutine | Bounded, ~2 s | 3 |
| Geth | PBSS state-history writes on engine path | Direct, spike-prone under commit pressure | 4 |
| Nethermind | In-memory trie prune, optionally inline newPayload (1.37.1) | Direct, blocks during prune iterations | 5 |
| Ethrex | No state-trie pruning shipped | n/a (different problem) | n/a |
Sources: Reth's backpressure is a first-class metric (reth_consensus_engine_beacon_backpressure_active) with a documented --engine.persistence-backpressure-threshold knob. Besu's off-thread choice is in besu#4476, with PR #4409 dropping p95 block processing from 2.98 s to 0.81 s. Erigon's bounded stage is what remained after erigon#6584 closed as not-planned. Geth's pattern is geth#28317.
This is a design ranking, not a measured leaderboard: the cross-CC table shows the measured order moves with the observer, and Reth's rank-1 design is untested because its hosts never synced.
What an operator can tune today
Starting points to test on a canary node, not blind production changes. We applied the Nethermind set on our own nodes; the rest come from each client's docs and source, linked inline so you can check the exact key names and defaults yourself.
Nethermind 1.37.2 has the highest measured impact, so start here. The lever is making each prune iteration yield to the engine thread, not growing the cache:
# Pruning
PruneDelayMilliseconds = 150 # default 75; the main lever, more yield to the engine thread
DirtyCacheMb = 1280 # default 1536; smaller dirty cache = shorter prune cycles
# Merge
PostBlockGcDelayMs = 900 # default SecondsPerSlot/8 = 1500
Pruning.PruneDelayMilliseconds and Merge.PostBlockGcDelayMs were added in nethermind#10227 (defaults 75 ms and SecondsPerSlot/8); the Pruning.CacheMb 1280→1792 and Pruning.DirtyCacheMb 1024→1536 default bumps are nethermind#10247. Do not raise Pruning.CacheMb to fix the latency: a bigger cache makes each prune cycle take longer (that PR's own note: "does take a bunch longer"), the opposite of what you want when the prune collides with newPayload. Shorter, more frequent iterations with a higher PruneDelayMilliseconds keep the engine thread responsive. These keys live in Nethermind's JSON/.cfg config or --Pruning.PruneDelayMilliseconds-style flags, not YAML. Validate against your own P99 first.
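If you template node arguments from code, the settings above can be rendered as Section.Key flags with a guard for the one change advised against. This is a sketch only: the helper is hypothetical, and rendering each flag as a separate "--Section.Key" token followed by its value is an assumption to verify against your Nethermind version's CLI help.

```python
DEFAULT_CACHE_MB = 1792  # current Pruning.CacheMb default, per nethermind#10247

# The canary settings from this section, keyed as Section.Key.
TUNING = {
    "Pruning.PruneDelayMilliseconds": 150,
    "Pruning.DirtyCacheMb": 1280,
    "Merge.PostBlockGcDelayMs": 900,
}


def to_cli_args(settings):
    """Render Section.Key settings as a flat list of CLI tokens (assumed form)."""
    args = []
    for key, value in settings.items():
        section, _, name = key.partition(".")
        if not section or not name:
            raise ValueError(f"expected Section.Key, got {key!r}")
        if key == "Pruning.CacheMb" and value > DEFAULT_CACHE_MB:
            # A bigger cache lengthens each prune cycle, the opposite of the goal.
            raise ValueError("do not raise Pruning.CacheMb to fix newPayload latency")
        args += [f"--{key}", str(value)]
    return args
```

Keeping the guard next to the rendering step means a well-meant cache bump fails loudly in review rather than quietly on a validator.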
Geth v1.17.x (command-line options): --cache 8192 (default 4096) if memory allows, and raise --cache.trie (default 15, the percentage of --cache given to the trie). Do not reach for --gcmode archive on a validator: it stops pruning but turns the node into a multi-TB growing archive, a different deployment, not a knob. Ancient-store growth is bounded separately by --history.transactions and --history.logs (both default 2,350,000 blocks), and path-mode state by --history.state (default 90,000).
Reth v2.2.0: no synced host here, so no measured advice. By design it gates pruning behind engine backpressure (--engine.persistence-backpressure-threshold, default 16, introduced in Reth 2.0); confirm against your own synced node's P99.
Besu 26.5.0 (limit-trie-logs docs): keep --bonsai-limit-trie-logs-enabled=true (default in 26.5+); the prune batch is --bonsai-trie-logs-pruning-window-size (default 30000). Besu was the slowest from the Lighthouse observer (~4.9 s), so test against your actual CC.
Erigon v3.4.x (options): little to tune; accept the ~2 s baseline. --prune.mode defaults to full; archive does the opposite of saving disk (it keeps all state history), so only use it if you need archive data.
Ethrex 13.0.0: not validator-grade in our measurements (≥3.5 s on every CC); track upstream and raise your CC's per-EL timeout if you must run it.
Why sync-committee duties get hit before attestations
Both duties share the slot-start + ~4 s deadline. But an attestation has one slot of inclusion grace via the next aggregator, while a sync-committee message is aggregated into the next block's sync_aggregate and a late one misses that inclusion outright. So the same newPayload tail that dents attestation effectiveness dents sync-committee participation more, in percentage terms. That matches what we and other Nethermind operators see: sync-committee periods feel disproportionately bad during pruning windows, even when attestation effectiveness barely moves.
Cross-client cheat sheet
By the question you are actually asking:
How much disk does my EC use? Use (node_filesystem_size_bytes − node_filesystem_avail_bytes) on the datadir mount. The EC-internal size metrics all understate it (see methodology); node_exporter is the portable answer.
Did my EC prune in the last 5 minutes? Yes on Erigon (rate(prune_seconds_count[5m])), Nethermind (rate(nethermind_pruned_persisted_nodes_count[5m])), Besu (rate(besu_pruner_trie_log_pruned_from_queue_total[5m])). No on Geth (no event counter), Reth (one event total, so any rate is noise), Ethrex (no metric).
Has the deep/full prune fired? Only Nethermind tells you, via nethermind_pruning_cutoff_blocknumber. It is 0 here, so the deep prune has not run.
Is this EC fast enough for my CC? Compute P99 engine_newPayload from your CC's histogram and compare to 4 s. The answer is per pairing, not per EC (see the latency section); re-measure on your own stack before trusting any ranking.
Why did my pruner stop? Nethermind: nethermind_state_db_in_pruning_writes hits zero while disk grows. Erigon: domain_prunable rises while domain_prune_size stalls. Besu: the added-minus-pruned trie-log backlog grows. Geth: pathdb_history_state_time p99 climbs over hours. Reth and Ethrex: no usable signal, check the config or the disk.
This mirrors what the reorg accounting post found on the consensus side, where beacon_reorgs_total reads 0 on Lodestar and 6,011 on Prysm over 90 days, and what the EC P2P peering deep dive found in how differently each client reports the network. One shared concept, six instrumentation choices, and on Ethrex, silence.
How Glamsterdam history expiry will change EC pruning
EIP-4444 (history expiry) lets execution clients drop pre-expiry headers, bodies, and receipts instead of keeping them forever in ancient stores. Glamsterdam is the activation vehicle, in devnet stabilisation per the EF's April 2026 Checkpoint, with no fixed mainnet slot yet. Two consequences for the numbers here:
- Geth's and Ethrex's footprints are mostly historical retention. Once operators flip on expiry, the ~1,688 GB Geth and ~598 GB Ethrex numbers should fall toward Erigon's ~509 GB. Read today's figures as the pre-expiry baseline.
- The observability gap gets more expensive. A stalled pruner you can see (Erigon, Nethermind, Besu) is a quick fix; a stalled pruner on a client with no pruning metric (Ethrex) is invisible until the disk fills. If you cannot answer "is my EC pruning" today, you will not be able to answer "is it observing the post-expiry bounds" after the fork. The instrumentation needs to land first.
Frequently asked questions about Ethereum EC pruning
Why is Erigon's datadir 3.3× smaller than Geth's on the same chain?
Storage architecture. Erigon prunes account, storage, and block-body history once fork-choice no longer needs it and keeps no ancient store. Geth path-mode keeps a bounded ~1.33 GB state journal plus a continuously growing ancient store. Both serve the same engine-API traffic; they just retain different amounts of history. EIP-4444 should narrow the gap.
How do I check if my Geth path-mode node is pruning correctly?
Watch two metrics: pathdb_history_state_bytes_data should stay roughly flat at your --history.state window (default 90,000 blocks), and pathdb_history_state_time p99 should stay under ~2 ms. There is no prune-cycle counter; the bounded journal is the prune. Ancient-store growth is separate and bounded by --history.transactions and --history.logs.
Why does Nethermind's pruning_cutoff_blocknumber stay at 0?
On PruningMode=Hybrid, Nethermind runs a continuous in-memory pruner and only escalates to a disk-mode deep prune when needed. The deep prune has not been needed in our window, so the cutoff and nethermind_deep_pruned_persisted_nodes_count stay at 0 while the continuous pruner does the work at ~1,497 nodes/sec/host. It keeps the state DB flat on nethermind_db_size, but the full datadir still grows ~16 GB/week, so monitor root-FS for capacity.
Why does Nethermind miss sync-committee duties during pruning?
Its 1.37.1 inline-newPayload change runs engine_newPayload on the engine thread, so an in-flight in-memory prune now serializes with it. nethermind_pruning_time spikes to 4 to 6 s, which pushes P99 past 4 s on Nimbus pairings. On Lighthouse and Prysm it stays ~1 s, so it is the Nimbus calling pattern meeting the inline path. Mitigate with PruneDelayMilliseconds up and DirtyCacheMb down (see the tuning section).
Why is the same EC fast on one CC and slow on another?
Each CC calls notifyNewPayload at a different point in the slot (Prysm early, Lighthouse mid, Nimbus late), so a CC that calls late meets more pruning contention. Nethermind is the clearest case: 4.44 s on Nimbus, ~1 s on Lighthouse and Prysm, same EC and load. The rule: if any CC↔EC pair holds a P99 above 4 s, investigate that pair, and measure with your own CC's histogram.
Is Reth synced in this comparison?
No. Every Reth host is still in initial staged sync: the checkpoint sits ~70 to 130 k blocks behind the tip and has not advanced in 12 hours, every newPayload returns SYNCING, and the final pipeline stages are still at 0. So its disk figure is mid-sync state, its 0.23 s newPayload is the SYNCING return path, and its pruning is unmeasured. We will re-run once these hosts finish syncing.
How do I monitor Ethrex pruning without Prometheus metrics?
Watch datadir_size_bytes and its 7-day delta. If it grows faster than mainnet block data implies (roughly 65 to 80 GB/month per client, extrapolating the 15 to 19 GB/week measured on this fleet), investigate. There is no prune-event metric today; ask for one upstream, or write to contact@stereumlabs.com.
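A minimal sketch of that check, assuming you can read datadir_size_bytes now and seven days ago. The 19 GB/week reference is the top of the growth range measured on this fleet, and the 1.5× tolerance is a hypothetical margin, not an Ethrex recommendation.

```python
GB = 1e9


def weekly_delta_gb(size_7d_ago_bytes, size_now_bytes):
    """Net datadir growth over the last seven days, in GB."""
    return (size_now_bytes - size_7d_ago_bytes) / GB


def growth_suspicious(delta_gb, expected_max_gb_per_week=19.0, tolerance=1.5):
    """True if weekly growth exceeds the expected ceiling by the tolerance factor."""
    return delta_gb > expected_max_gb_per_week * tolerance
```

This only catches runaway growth; a pruner that stalls while growth stays near normal would still go unnoticed without a real prune-event metric.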
Will EIP-4444 history expiry change these numbers?
Yes, for the cumulative footprint. Geth's ~1,688 GB and Ethrex's ~598 GB are mostly historical block data that EIP-4444 lets clients drop; Erigon at ~509 GB already approximates the post-expiry state. (Reth is excluded; its hosts are still syncing.)
Methodology
All queries ran against our prometheus-cold datasource over windows ending around 2026-05-28 12:00 UTC.
Per-EC root-FS used (the ground-truth for on-disk size), 7-day rolling average, and its 7-day net delta:
avg by (ec_client) (
  avg_over_time(
    (node_filesystem_size_bytes{mountpoint="/", fstype!="rootfs", role="ec"}
      - node_filesystem_avail_bytes{mountpoint="/", fstype!="rootfs", role="ec"})[7d:1h]
  )
)

avg by (ec_client) (
  (node_filesystem_size_bytes{mountpoint="/", fstype!="rootfs", role="ec"}
    - node_filesystem_avail_bytes{mountpoint="/", fstype!="rootfs", role="ec"})
  -
  (node_filesystem_size_bytes{mountpoint="/", fstype!="rootfs", role="ec"} offset 7d
    - node_filesystem_avail_bytes{mountpoint="/", fstype!="rootfs", role="ec"} offset 7d)
)
Per-EC pruning-event rates, plus the Geth path-mode and Nethermind in-pruning signals:
```promql
sum by (ec_client) (rate(besu_pruner_trie_log_pruned_from_queue_total[7d])) * 86400
sum by (ec_client) (rate(nethermind_pruned_persisted_nodes_count[1h]))
sum by (ec_client, type) (increase(prune_seconds_count[7d]))
pathdb_gc_node_count
avg_over_time(nethermind_state_db_in_pruning_writes[7d])
nethermind_pruning_cutoff_blocknumber
```
Reth sync-state check (confirming the hosts have not finished syncing, so their disk and newPayload numbers are mid-sync):
```promql
reth_blockchain_tree_canonical_chain_height
reth_consensus_engine_beacon_new_payload_valid
max by (instance) (reth_sync_checkpoint)  # vs chain_head_block; flat over 12h = no progress
```
We did not rely on metrics alone for sync state. Every EC was cross-checked against its container logs in Elasticsearch: block-import lines such as Geth's Imported new potential chain segment or Besu's Imported #… for the synced clients, versus Reth's running sync::stages pipeline. The reason is that a head-block metric can reflect the forkchoice head an EC is following optimistically while it is still syncing.
P99 engine_newPayload from each CC observer, 7-day rolling, and the Nethermind prune-cycle latency it tracks:
```promql
# Nimbus
histogram_quantile(0.99, sum by (cc_client, ec_client, le) (rate(engine_api_request_duration_seconds_bucket{request="newPayload"}[7d])))

# Lighthouse
histogram_quantile(0.99, sum by (cc_client, ec_client, le) (rate(execution_layer_request_times_bucket{method="new_payload"}[7d])))

# Prysm/Lodestar (Teku does not export this histogram in our window)
# Per the metric name, this series is in milliseconds, so the result is in ms, not s.
histogram_quantile(0.99, sum by (cc_client, ec_client, le) (rate(new_payload_v1_latency_milliseconds_bucket[7d])))

# Nethermind prune-cycle latency
nethermind_pruning_time
```
Host counts, taken from count by (ec_client) (count by (ec_client, instance) (up{role="ec", job!="node_exporter"})): Geth 13 (7 NDC2 + 6 GCP comparator), Besu 7, Erigon 6, Ethrex 6, Nethermind 6, Reth 6. The job!="node_exporter" filter avoids double-counting, since every EC host also runs node_exporter under role="ec". The two Grandine custom-image hosts (cc_version="unstable-465c727") are kept because the EC underneath them is a production image.
Why we use node_exporter, not the EC-internal size metrics. Each EC-internal metric reports only a slice of the datadir. This is the gap that produced the wrong 26 GB Erigon figure:
| EC | EC-internal metric | Covers | Root-FS truth |
|---|---|---|---|
| Erigon | db_table_size_bytes | MDBX tables (~26 GB) | ~509 GB |
| Nethermind | nethermind_db_size | state DB (~778 GB) | ~1,225 GB |
| Reth | reth_db_table_size | MDBX tables (~201 GB) | ~1,352 GB (syncing) |
| Besu | besu_rocksdb_rocks_db_files_size_bytes | returns 0 | ~1,261 GB |
| Geth | eth_db_chaindata_disk_size + ..._ancient_size | chaindata + ancient (~1,624 GB) | ~1,688 GB |
| Ethrex | datadir_size_bytes | datadir (~499 GB) | ~598 GB |
Geth lands within ~60 GB of the root-FS figure and Ethrex within ~100 GB; the rest understate by roughly 450 GB to more than 1.1 TB, and Besu's series reports nothing at all. So we use node_filesystem_*, which adds only ~5 to 10 GB of OS and logs on top of the datadir.
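The gaps in the table can be recomputed directly. The sketch below uses only the rounded figures from the table (Besu's internal metric entered as 0) and reports, per client, how many GB the EC-internal metric misses and what share of the root-FS figure it covers. The variable and function names are ours.

```python
# (EC-internal metric, root-FS used) in GB, copied from the table above.
SIZES_GB = {
    "Erigon": (26, 509),
    "Nethermind": (778, 1225),
    "Reth": (201, 1352),  # still syncing
    "Besu": (0, 1261),    # internal metric returns 0
    "Geth": (1624, 1688),
    "Ethrex": (499, 598),
}

def gap_gb(internal, root_fs):
    """GB present on disk but not reported by the EC-internal metric."""
    return root_fs - internal

def coverage_pct(internal, root_fs):
    """Share of the root-FS figure that the EC-internal metric accounts for."""
    return 100.0 * internal / root_fs

# Print clients from smallest to largest gap.
for client, (internal, root_fs) in sorted(SIZES_GB.items(), key=lambda kv: gap_gb(*kv[1])):
    print(f"{client:<11} gap {gap_gb(internal, root_fs):>5} GB, "
          f"covers {coverage_pct(internal, root_fs):5.1f}%")
```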
Top-bucket caveat. Ethrex's ≥5.00 s (and Reth's 4.00 s on Prysm) sit at the upper bound of the histogram's top finite bucket, so the true P99 lies somewhere in the +Inf tail and the reported value is only a lower bound. We do not rank a capped value against a measured one. The read-this-first caveats (no validators, per-CC-observer) apply to every latency figure; where a single P99 is unattributed it is from Nimbus.
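To see why a capped value is only a lower bound, here is a simplified Python re-implementation of how Prometheus estimates a quantile from a classic histogram. It interpolates linearly inside a finite bucket, and returns the upper bound of the second-highest bucket when the quantile falls in +Inf. This is a sketch of the documented behaviour, not the Prometheus source. The bucket layout and counts are hypothetical and do not reflect any client's real buckets.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate quantile q (0 < q <= 1) from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by bound,
    ending with (math.inf, total_count).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile lies in +Inf: only a lower bound is known.
                return buckets[-2][0]
            # Linear interpolation inside the finite bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-2][0]

# Hypothetical layouts with a top finite bucket at 5 s.
capped = [(0.5, 600), (1.0, 800), (2.0, 900), (5.0, 970), (math.inf, 1000)]
measured = [(0.5, 600), (1.0, 800), (2.0, 900), (5.0, 995), (math.inf, 1000)]

print("capped P99:", histogram_quantile(0.99, capped))
print("measured P99:", round(histogram_quantile(0.99, measured), 2))
```

In the capped layout, 3% of observations exceed 5 s, so the 99th percentile lies in the +Inf bucket and the function returns exactly 5.0 no matter how slow those calls were. That is why a value equal to the top finite bound should be read as "at least this much".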
The window covers post-Fusaka mainnet conditions only. Pre-Fusaka data (before December 3, 2025) is in cold storage but excluded here, because PeerDAS changed fork-choice timing significantly and combining the two periods would mix distinct populations.
Dashboard panel definitions for every metric used here are linked from the EC-specific dashboards in our public catalog at /docs/dashboards/list/catalog. The per-client dashboards on grafana.stereumlabs.com carry the pruning panels above; Ethrex is the exception and ships without a pruning section.
If you maintain an Ethereum execution client and want to clarify what we measured (the Besu RocksDB size series, Reth's intended pruner cadence, the Ethrex Prometheus surface, or Nethermind's inline-newPayload path), write to contact@stereumlabs.com. We will append a correction with attribution.

