Execution Clients Dashboards
This page defines the standardized structure for Stereum’s Execution Layer dashboards, outlining the client-specific configurations required for each implementation and the guiding principles that ensure cross-client consistency and comparability.
1. Introduction
This document specifies the planned structure of the execution layer client dashboards that will be integrated into the Stereum monitoring ecosystem, extending its existing telemetry capabilities with deeper instrumentation and standardized cross-client comparability.
These Grafana dashboards present a comprehensive visualization of key Execution Layer (EL) metrics while the node operates in conjunction with a selected Consensus Layer (CL) client. Because each execution client exposes its own set of Prometheus metrics, with variations in naming conventions, scope, and granularity, every EL implementation requires a dedicated dashboard configuration.
Nevertheless, to preserve consistency and facilitate navigation, all dashboards adhere to a common structural template. This uniform layout organizes metrics into coherent sections, ensuring that equivalent concepts are grouped together across different clients. Such consistency simplifies interpretation and comparison between client dashboards.
2. Dashboard Sections Overview
Each dashboard is divided into the following major sections:
2.1 Basic System Resource Usage (CPU, Memory, Network)
This section monitors the fundamental system-level resources utilized by both the Execution Layer and the Consensus Layer clients. Metrics such as CPU utilization, resident memory consumption, and network throughput provide immediate insights into the node's hardware and process performance. They are particularly useful for diagnosing resource bottlenecks, identifying abnormal spikes, and ensuring that both clients operate within expected system limits.
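For reference, the panels in this section can be reproduced with standard PromQL queries. The sketch below assumes a local Prometheus instance at http://localhost:9090 and the usual node_exporter and Prometheus client-library metric names; both the endpoint and the metric names are assumptions that may differ in a given Stereum installation.

```python
# Sketch: querying the system-resource panels described above from a local
# Prometheus instance. The endpoint and metric names (node_exporter and
# Prometheus client-library conventions) are assumptions.
import requests

PROMETHEUS_URL = "http://localhost:9090"  # assumed query endpoint

SYSTEM_PANELS = {
    # CPU utilization as a fraction of all cores
    "cpu_usage": '1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m]))',
    # resident memory of each monitored process
    "process_memory": "process_resident_memory_bytes",
    # network throughput in bytes per second
    "net_rx": "rate(node_network_receive_bytes_total[5m])",
    "net_tx": "rate(node_network_transmit_bytes_total[5m])",
}

def prom_query(expr: str) -> list:
    """Run an instant PromQL query and return the raw result vector."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": expr})
    resp.raise_for_status()
    return resp.json()["data"]["result"]

if __name__ == "__main__":
    for panel, expr in SYSTEM_PANELS.items():
        for sample in prom_query(expr):
            print(panel, sample["metric"], sample["value"][1])
```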
2.2 Execution Layer (EL) Statistics
The EL Statistics section provides a high-level overview of the execution client's synchronization and block-processing performance.
Typical metrics include:
- Current chain height (the latest block known to the client)
- Blocks behind (difference from the network head, indicating sync lag)
- Average block time (mean time between consecutive blocks)
- Time since last block (how long ago the last block was processed)
Collectively, these indicators allow the operator to quickly assess the synchronization health of the node and detect potential delays in block propagation or execution.
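As an illustration, the sketch below derives these indicators from two hypothetical head-height values; the gauge names, the 12-second slot time, and the sample numbers are assumptions, since each client reports chain height under its own metric name.

```python
# Sketch: deriving the EL Statistics indicators from hypothetical head-height
# values. "local_head" and "network_head" stand in for client-specific gauges;
# the 12 s slot time applies to Ethereum mainnet.
import time

def el_stats(local_head: int, network_head: int, last_block_ts: float,
             slot_time: float = 12.0) -> dict:
    blocks_behind = max(network_head - local_head, 0)
    return {
        "chain_height": local_head,
        "blocks_behind": blocks_behind,
        # rough sync-lag estimate, assuming one block per slot
        "estimated_lag_s": blocks_behind * slot_time,
        "time_since_last_block_s": time.time() - last_block_ts,
    }

# example: node is two blocks behind, last block processed 8 s ago
print(el_stats(local_head=20_000_000, network_head=20_000_002,
               last_block_ts=time.time() - 8))
```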
2.3 Networking
The Networking section focuses on peer-to-peer connectivity and network stability. It displays metrics such as the number of connected peers, connection and disconnection rates, and peer churn frequency. These metrics help evaluate how well the client maintains connections to the network, which directly affects its ability to receive and propagate blocks and transactions.
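A minimal sketch of churn-oriented PromQL expressions for these panels is shown below; the peer-count gauge name is a Geth-style placeholder and should be replaced with each client's actual metric.

```python
# Sketch: churn-oriented PromQL expressions. "p2p_peers" is a Geth-style
# placeholder for the peer-count gauge; substitute each client's real metric.
NETWORKING_PANELS = {
    "connected_peers": "p2p_peers",
    # how often the peer count changed over the last 10 minutes (churn)
    "peer_churn": "changes(p2p_peers[10m])",
    # net gain or loss of peers over the same window
    "peer_delta": "delta(p2p_peers[10m])",
}

for name, expr in NETWORKING_PANELS.items():
    print(f"{name}: {expr}")
```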
2.4 Execution Engine
The Execution Engine section displays metrics that describe the interaction between the Execution Layer and the Consensus Layer through the Engine API.
Two critical API operations are often monitored:
- engine_newPayload latency: Measures the time taken by the execution client to validate and process a new payload (block body) provided by the consensus client. This reflects block validation performance.
- engine_forkchoiceUpdated latency: Measures the time required to update the chain's canonical head (fork choice) according to instructions from the consensus client.
These metrics are important for assessing the responsiveness and reliability of the EL–CL interface, which directly affects block proposal and finalization latency in the proof-of-stake protocol.
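If a client exposes these timings as Prometheus histograms, percentile panels can be built with histogram_quantile. The expressions below are a sketch using hypothetical metric names; the real names (and whether a histogram exists at all) vary per client.

```python
# Sketch: 95th-percentile latency queries for the two Engine API calls,
# assuming the client exposes Prometheus histograms for them. The metric
# names are hypothetical placeholders.
NEW_PAYLOAD_P95 = (
    "histogram_quantile(0.95, "
    "sum by (le) (rate(engine_newpayload_duration_seconds_bucket[5m])))"
)
FORKCHOICE_P95 = (
    "histogram_quantile(0.95, "
    "sum by (le) (rate(engine_forkchoice_updated_duration_seconds_bucket[5m])))"
)

print(NEW_PAYLOAD_P95)
print(FORKCHOICE_P95)
```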
2.5 Transaction Pool
The Transaction Pool section provides visibility into how the execution client manages pending transactions before they are included in a block.
Typical metrics include:
- The total number of transactions currently in the pool
- Transaction types (e.g., EIP-1559, access list, blob transactions)
- Transaction addition and removal rates
- Pending pool memory usage
This section allows operators to evaluate the load on the mempool, detect congestion, and monitor how quickly transactions are processed and propagated through the network.
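The pool gauges can also be cross-checked against the client's JSON-RPC interface. The sketch below uses the Geth-style txpool_status method and an assumed local RPC endpoint; not every execution client exposes this namespace.

```python
# Sketch: cross-checking the pool gauges via JSON-RPC. txpool_status is a
# Geth-style method returning hex counts of pending and queued transactions;
# the endpoint is an assumption and not every client exposes this namespace.
import requests

RPC_URL = "http://localhost:8545"  # assumed EL JSON-RPC endpoint

payload = {"jsonrpc": "2.0", "method": "txpool_status", "params": [], "id": 1}
result = requests.post(RPC_URL, json=payload).json()["result"]

pending = int(result["pending"], 16)  # executable transactions
queued = int(result["queued"], 16)    # future / not-yet-executable transactions
print(f"pending={pending} queued={queued}")
```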
2.6 Pruning
The Pruning section tracks the state cleanup processes performed by the execution client to remove outdated state data and control on-disk storage growth.
Metrics may include:
- Number of pruning cycles completed.
- Nodes or keys removed during pruning.
- Pending pruning backlog size.
- Duration and timing of recent pruning operations.
Monitoring these values helps ensure that pruning is running efficiently and that the node's state database remains optimized for performance and disk usage.
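As a rough illustration of the "backlog should not grow" rule, the sketch below flags a growing pruning backlog from a series of sampled gauge values; the window and threshold are arbitrary assumptions.

```python
# Sketch: flagging a growing pruning backlog from sampled gauge values
# (modeled on a "waiting to be pruned" counter). Window and threshold are
# arbitrary assumptions.
def pruning_backlog_growing(samples: list[float], window: int = 12,
                            max_growth: float = 0.0) -> bool:
    """Return True if the backlog grew over the last `window` samples."""
    if len(samples) < window:
        return False
    return samples[-1] - samples[-window] > max_growth

# a shrinking backlog is healthy; sustained growth means pruning is falling behind
print(pruning_backlog_growing([1200, 1150, 1100, 1080, 1050, 1020,
                               990, 960, 940, 920, 900, 880]))
```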
2.7 Storage
The Storage section provides visibility into the client's database subsystem, typically built on key-value stores such as RocksDB or LevelDB.
Representative metrics include:
- Total database size and on-disk file count.
- Read/write throughput and latency.
- Compaction time and pending compaction data volume.
- Memory usage of internal caches or buffers.
These metrics are critical for diagnosing performance bottlenecks in block import, transaction execution, or state access operations.
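Alongside the database metrics, a plain disk-usage check on the node's data volume is often useful. The sketch below uses a hypothetical data path; point it at the actual client data directory.

```python
# Sketch: a plain disk check on the node's data directory, complementing the
# database-size panels. The path is a hypothetical location.
import shutil

DATA_DIR = "/var/lib/stereum"  # assumed, purely illustrative

usage = shutil.disk_usage(DATA_DIR)
print(f"total={usage.total / 1e9:.1f} GB "
      f"used={usage.used / 1e9:.1f} GB "
      f"free={usage.free / 1e9:.1f} GB")
```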
2.8 Language-Specific Metrics
Each execution client is implemented in a different programming language and runtime environment, such as Java (for Besu), Go (for Geth and Erigon), C# (for Nethermind), or Rust (for Reth). This section exposes language-specific runtime metrics, including:
- Memory usage: heap and non-heap regions or language-specific allocators.
- Garbage collection or memory reclamation metrics.
- Thread or goroutine counts and scheduling behavior.
- CPU usage per process or thread.
Tracking these values helps with client-specific tuning and performance optimization, particularly for adjusting JVM heap sizes, Go GC thresholds, or Rust memory limits.
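The sketch below lists runtime queries built from standard Prometheus client-library metric names (JVM simpleclient and Go client conventions). Whether a particular execution client actually exports these depends on its instrumentation, so treat the names as assumptions to verify against each client's /metrics endpoint.

```python
# Sketch: runtime panels built from standard Prometheus client-library metric
# names (JVM simpleclient and Go client conventions). Verify against each
# client's /metrics endpoint before relying on them.
RUNTIME_PANELS = {
    "java_heap_used": 'jvm_memory_bytes_used{area="heap"}',
    "java_gc_time": "rate(jvm_gc_collection_seconds_sum[5m])",
    "go_goroutines": "go_goroutines",
    "go_gc_pause_p75": 'go_gc_duration_seconds{quantile="0.75"}',
    "process_cpu": "rate(process_cpu_seconds_total[5m])",
}

for name, expr in RUNTIME_PANELS.items():
    print(f"{name}: {expr}")
```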
3. Charts Explained and Examples
Each EL dashboard provides a detailed view of client-specific metrics while allowing the user to correlate data with the CL that the EL is operating alongside. At the top of the dashboard, a selection panel allows you to choose the associated CL client. When multiple versions are available, a drop-down list enables you to select the desired configuration. This ensures that the metrics displayed correspond to the exact EL–CL pairing currently being monitored.
It is recommended to maintain a minimum dashboard time range of 24 hours when visualizing metrics. A shorter time window may lead to incomplete datasets and render certain charts empty due to metric collection intervals or temporary data gaps. Maintaining a longer range provides a more stable visualization and allows for easier identification of performance trends over time.
To enhance readability and facilitate the comparison of metrics across different clients, comparable metrics are consistently represented using the same color coding. However, it is important to note that the scope and implementation of a given metric may vary between clients. For example, two clients may employ different strategies for managing the size of the pruning list, while a third client may not implement pruning at all. These variations in metric scope should be taken into account when interpreting the data.
3.1 Basic System Resource Usage (CPU, Memory, Network)

This section displays fundamental system resource metrics for the node running both the EL and the CL clients. These metrics are identical to those presented in the System Resource Utilization (EL & CL) chapter.
3.2 EL Stats

The Overview panel displays synchronization health and key performance indicators such as:
- Block Time (5m avg): The average time between consecutive blocks observed by the execution client over the last five minutes. On Ethereum mainnet, a healthy node typically shows an average near 12 seconds, which reflects normal block production cadence.
- Blocks Behind: Indicates how many blocks the local client is behind the network's canonical head. A value of 0 signifies that the client is fully synchronized.
- Peer Count: The number of peers currently connected to the Execution Layer client. A stable peer count with sufficient connections (typically 20 or more) contributes to reliable block propagation and synchronization.
A well-performing node generally exhibits a Block Time (5m avg) close to 12 seconds, 0 Blocks Behind, and a steady peer count within the expected range for its network configuration.
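For clarity, the sketch below shows how the Block Time (5m avg) stat can be derived from two samples of a chain-height gauge taken five minutes apart; the sample heights are illustrative and the gauge name differs per client.

```python
# Sketch: deriving the 5-minute average block time from two chain-height
# samples. The heights are illustrative; the gauge name differs per client.
def avg_block_time(height_now: int, height_5m_ago: int,
                   window_s: float = 300.0) -> float:
    produced = height_now - height_5m_ago
    if produced <= 0:
        return float("inf")  # no new blocks observed: node is likely stalled
    return window_s / produced

print(avg_block_time(20_000_125, 20_000_100))  # -> 12.0 s on a healthy mainnet node
```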


3.3 Networking
The Connected Peers chart shows the number of active peer connections established by the Execution Layer client, along with the configured upper limit for maximum peer connections, if available.

The Transactions Per Second (TPS) metric measures how many transactions are processed every second. In a healthy state, TPS should be stable and meet demand without causing delays or congestion.

Network throughput, measured in Mgas/s (millions of gas per second), indicates the total gas processed per second, reflecting the client's capacity to handle transaction and contract execution.
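Both figures can be sanity-checked against raw block data over JSON-RPC. The sketch below assumes a local RPC endpoint and uses the standard eth_getBlockByNumber method to compute TPS and Mgas/s from the latest block and its parent.

```python
# Sketch: computing TPS and Mgas/s from raw block data. The RPC endpoint is an
# assumption; eth_getBlockByNumber is standard across execution clients.
import requests

RPC_URL = "http://localhost:8545"  # assumed EL JSON-RPC endpoint

def get_block(tag) -> dict:
    payload = {"jsonrpc": "2.0", "method": "eth_getBlockByNumber",
               "params": [tag, False], "id": 1}
    return requests.post(RPC_URL, json=payload).json()["result"]

latest = get_block("latest")
parent = get_block(hex(int(latest["number"], 16) - 1))

dt = max(int(latest["timestamp"], 16) - int(parent["timestamp"], 16), 1)
tps = len(latest["transactions"]) / dt          # transaction hashes only
mgas_per_s = int(latest["gasUsed"], 16) / dt / 1e6

print(f"block={int(latest['number'], 16)} tps={tps:.1f} throughput={mgas_per_s:.1f} Mgas/s")
```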

Besu-specific metrics
The Connected Total visualization displays the per-second rate of new peer connections over a defined time interval. A rate close to zero indicates a stable and consistent peer set, meaning the node maintains its existing connections effectively.

The Connected Peers by Client panel provides a breakdown of connected peers categorized by client implementation (e.g., Geth, Nethermind, Besu, Erigon). This distribution offers insight into the network's client diversity and may be useful for identifying interoperability or compatibility trends across peers.

3.4 Execution Engine
Two key metrics related to the Engine API, which are critical for validating the communication between the Execution Layer (EL) and the Consensus Layer (CL), are:
- engine_newPayloadV4: This metric tracks the number of engine_newPayloadV4 RPC requests received by the Execution Client. It indicates that the Consensus Layer is successfully submitting new payloads (blocks) for validation and execution. A steady rate of updates suggests healthy synchronization between the CL and EL.
- engine_forkchoiceUpdatedV3: This metric represents the number of engine_forkchoiceUpdatedV3 requests received. These requests inform the Execution Client of the current fork choice state, i.e., which chain head the Consensus Layer considers canonical. Regular updates imply that the CL is properly communicating finalized and justified blocks to the EL.
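Since every imported block should produce a newPayload call followed by a forkchoiceUpdated, the two counters are expected to increase together. The sketch below is a simple consistency check over two hypothetical counter deltas taken between consecutive scrapes.

```python
# Sketch: a consistency check over two counter deltas taken between scrapes.
# The delta values are placeholders; the interpretation strings summarize the
# healthy and unhealthy patterns described above.
def engine_api_health(new_payload_delta: int, fcu_delta: int) -> str:
    if new_payload_delta == 0 and fcu_delta == 0:
        return "no Engine API traffic: check the CL connection and JWT secret"
    if new_payload_delta == 0:
        return "forkchoiceUpdated only: CL is connected but no payloads arrive"
    return "healthy: payloads and fork-choice updates are both flowing"

print(engine_api_health(new_payload_delta=5, fcu_delta=6))
```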

3.5 Transaction Pool
Monitoring the transaction pool helps assess the node's health, network connectivity, and responsiveness to transaction flow.
Key metrics include:
- Total Transaction Count: Represents the overall number of transactions currently in the pool. This value fluctuates as new transactions are received and mined into blocks. A healthy node typically maintains a stable transaction count, with periodic decreases as blocks are produced.

- Total Future Transaction Count: Indicates the number of transactions in the queued pool, i.e., transactions not yet eligible for processing (e.g., because of nonce gaps or an insufficient gas price).

Besu-specific metrics
- Total Transaction Count by Type: Breaks down the pool contents by transaction type (e.g., legacy, EIP-1559, blob, etc.), allowing you to verify that the client correctly accepts all supported transaction formats.

- Unique Senders: Indicates the number of distinct accounts currently submitting transactions to the pool. A consistent number of active senders reflects healthy peer connectivity and participation in the network.

- Rejected Transactions: Tracks transactions that were dropped or rejected due to invalid signatures, insufficient fees, or nonce conflicts. Occasional rejections are normal, but a persistently high rate may indicate misconfigured peers, network spam, or local validation issues.

3.6 Pruning Metrics
The Pruning section monitors the state of the EL cleanup processes responsible for managing and reducing on-disk state size. Pruning ensures that obsolete or unnecessary state data (e.g., old account or storage trie entries) is periodically removed, optimizing disk usage and improving long-term node performance.
In this section, establishing comparable metrics is particularly challenging, as each client implements its own pruning strategy and exposes different metric sets. Generally, a healthy node executes pruning operations periodically, which is reflected by a stable or decreasing pruning list size and a consistent number of items being removed over time.
Geth and Ethrex do not provide pruning metrics because of differences in their storage models. Geth does not perform state pruning, relying instead on its snapshot mechanism, while Ethrex uses an append-only database with offline compaction rather than incremental pruning. In both cases, there are no live pruning queues or deletions to track, so meaningful runtime pruning metrics are not exposed.
Besu-specific metrics
- Trie Logs Added To Prune Queue Total: Counts the total number of trie logs (state entries) added to the pruning queue. This metric indicates how much data is scheduled for cleanup.

- Trie Logs Pruned Total: Tracks the total number of trie logs successfully pruned from the queue. A steady increase over time reflects active and effective cleanup operations.

- Trie Logs Waiting To Be Pruned Total: Represents the current size of the pruning backlog. This value should remain relatively stable or decrease over time; a growing backlog may signal that pruning cannot keep up with the rate of new state additions, potentially indicating I/O bottlenecks or insufficient pruning frequency.

Erigon-specific metrics
- Added To Prune Queue Total: Indicates the amount of state data eligible for pruning, reported as a total and categorized by domain (e.g., state, receipts), by history (transaction or execution history), and per table (e.g., accounts, contract code, or storage entries).

- Pruned Total: The number of items actually deleted during pruning operations, reported as a total and broken down by domain, index structures, or historical data.

Nethermind-specific metrics
- Pruned Nodes: Number of state nodes successfully pruned.
- Deep Pruned Nodes: Number of deeper historical or less frequently accessed nodes pruned.

- Pruning Time: Duration of the most recent pruning operation. Shorter times indicate efficient pruning; spikes may indicate network load or storage bottlenecks.

3.7 Storage
Monitoring storage metrics is essential to ensure healthy node operation and prevent disk-related issues. The main metric is database size, which reflects the total disk space used by the node's blockchain data, including state, blocks, receipts, and indexes. No storage metrics are available for Besu because RocksDB metrics are not explicitly enabled in the configuration.
Change control for this page: material edits will be logged in the global Changelog with a short rationale and effective date.