
EthCC[9] talk recap: AI-powered observability for Ethereum staking

8 min read
Stefan Kobrc, Founder RockLogic
David Mühlbacher, B2B Relations

A recap of our EthCC[9] presentation in Cannes: what StereumLabs is, how we use AI on top of our monitoring data, and what Fusaka actually did to hardware across 36 client pairings.

On April 2, 2026 we presented StereumLabs at EthCC[9] in Cannes. The talk covered what we've built, why AI on raw metrics alone produces useless output, and concrete Fusaka/PeerDAS runtime data from our bare-metal fleet.

This post is a written companion to that talk. If you prefer watching, the recording is embedded below. The slide deck is available as a PDF download.

Video

The StereumLabs platform

StereumLabs is our observability and analytics platform for Ethereum staking infrastructure. We run every relevant client combination on dedicated bare-metal hardware: 6 execution layer clients (Geth, Nethermind, Besu, Erigon, Reth, Ethrex), 6 consensus layer clients (Lighthouse, Prysm, Teku, Nimbus, Lodestar, Grandine), plus the standalone Erigon + Caplin pairing. That's 37 combinations, monitored 24/7 with 90-day rolling metrics.

All nodes run on isolated bare metal. No shared cloud instances, no noisy-neighbor effects. When we measure performance differences between clients, the data is reproducible and directly comparable.

The platform provides 20+ dashboards covering resource consumption (CPU, RAM, disk, network), client-specific metrics (attestation rates, block processing times, peer counts, GC behavior), and system logs. Client development teams already have free access to the dashboards. The project is supported by an Ethereum Foundation grant.

But dashboards have limits. And that's where AI comes in.

AI Chatbot: natural language meets live data

We built an AI chatbot connected directly to our full monitoring dataset. Instead of navigating dashboards and writing queries, users ask questions in plain English:

  • "Compare disk growth between Geth and Erigon over the last 30 days"
  • "How did the Prysm update from v7.1.1 to v7.1.2 affect resource usage?"
  • "Which consensus client uses the most bandwidth as a supernode?"

The result is a structured analysis with actual numbers, across all EL pairings, in seconds.
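To make this concrete, here is a minimal sketch of how a question like the first one might bottom out in PromQL. The metric and label names (`stereum_disk_used_bytes`, `el_client`) are hypothetical stand-ins; the actual StereumLabs schema is not public.

```python
# Hypothetical intent-to-PromQL mapping: the chatbot's real pipeline is
# more involved, but each question ultimately resolves to queries like these.

QUERY_TEMPLATES = {
    # Doubled braces escape PromQL's label selector inside str.format().
    "disk_growth": 'delta(stereum_disk_used_bytes{{el_client="{client}"}}[{window}])',
}

def build_query(intent: str, client: str, window: str = "30d") -> str:
    """Render the PromQL that answers one extracted intent."""
    return QUERY_TEMPLATES[intent].format(client=client, window=window)

def compare_disk_growth(clients: list[str], window: str = "30d") -> dict[str, str]:
    """One query per client; the results can then be diffed and summarized."""
    return {c: build_query("disk_growth", c, window) for c in clients}

queries = compare_disk_growth(["geth", "erigon"])
# queries["geth"] → 'delta(stereum_disk_used_bytes{el_client="geth"}[30d])'
```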

The Instruction Set

Here's what most people get wrong about AI in infrastructure monitoring: the model alone doesn't produce useful results. If you point a language model at raw Prometheus metrics, it doesn't know which queries to run, what normal ranges look like, or how to interpret differences between client architectures.

That's why we've built a continuously evolving Instruction Set. It encodes which metrics matter for which client combination, what normal ranges look like per pairing, how to interpret architectural differences (Go vs. Java GC behavior, Rust memory models), and which queries to run in which order to build a meaningful analysis.

Without the Instruction Set, the AI produces generic answers. With it, it produces the kind of analysis that would take an experienced engineer hours to assemble manually. We expand it continuously as we encounter new patterns, new client versions, and new edge cases. It's built from years of running these clients professionally.
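As a sketch of the idea, one Instruction Set entry could pair per-combination normal ranges with the architectural context the model needs. The structure, metric names, and ranges below are illustrative assumptions, not the actual StereumLabs format.

```python
from dataclasses import dataclass

# Hypothetical shape of one Instruction Set entry: expected bands per
# client pairing, plus interpretation notes for the model.

@dataclass
class MetricRule:
    metric: str
    normal_range: tuple[float, float]  # expected band for this pairing
    note: str                          # architecture context for the model

INSTRUCTION_SET = {
    ("prysm", "geth"): [
        MetricRule("rss_bytes_gb", (8.0, 14.0),
                   "Go runtime: RSS swings with GC cycles are normal."),
        MetricRule("peer_count", (60.0, 85.0),
                   "Sustained drops below range suggest networking issues."),
    ],
}

def out_of_range(pairing: tuple[str, str], samples: dict[str, float]) -> list[str]:
    """Return the metrics whose current value falls outside the normal band."""
    flagged = []
    for rule in INSTRUCTION_SET.get(pairing, []):
        lo, hi = rule.normal_range
        value = samples.get(rule.metric)
        if value is not None and not (lo <= value <= hi):
            flagged.append(rule.metric)
    return flagged
```

The point of the structure: the same RSS value can be normal for one pairing and anomalous for another, so ranges are keyed by pairing rather than by metric alone.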

Proof: Prysm v7.1.1 to v7.1.2

When the Prysm team shipped v7.1.2, we asked the chatbot one question and got a full resource impact analysis across all 6 EL pairings:

  • Memory (RSS): dropped 5.1% on average. Biggest improvement with the Geth pairing (-8.8%)
  • Block processing: improved 25% overall. The Erigon pairing went from 403 ms to 90 ms (-78%)
  • Peer count: stable at ~71 across both versions. No regression
  • CPU: mixed results, EL-dependent. Reth dropped 21%, Besu increased 28%

This analysis would normally take hours of manual work. The chatbot produced it from a single question. The full report is published on our blog: Prysm v7.1.1 & 7.1.2 resources.

AI Alerting: from "something is wrong" to "here's why"

The chatbot is great for proactive analysis. But what about when things go wrong at 3am? That's where AI Alerting comes in.

Two-stage architecture

Stage 1: Near-real-time threshold alerts. Monitors hard thresholds like attestation rate, disk usage, peer count, and missed blocks. Fires within seconds. No AI inference delay, no additional cost. If the AI layer is slow or unavailable, the basic alert still arrives.

Stage 2: AI root-cause analysis. When a threshold alert fires, a webhook triggers the AI. It pulls the relevant metrics and logs, correlates them against our neutral baseline from all 37 client combinations, and delivers a root-cause analysis with actionable next steps, typically within 5 to 15 seconds.

The result: operators don't just get "attestation rate dropped below 95%." They get: "Your attestation rate dropped because Geth's peer count fell to 3, likely due to a network partition. Your Prysm instance is healthy. Recommended action: check firewall rules and restart the EL client."

Built-in baseline: instant context

What makes our alerting especially useful is the neutral baseline dataset from our own fleet. When an alert fires, the AI automatically compares against data from all 37 client combinations.

Every alert answers three questions: Is this happening across the Ethereum network right now? Is it specific to this client version? Or is it unique to your local environment?

That distinction between "the whole network is seeing elevated block processing times after a fork" and "your Geth instance is the only one with this problem" is the difference between waiting it out and taking immediate action.
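A minimal sketch of that three-way triage, assuming the baseline is available as per-node values of the same metric from our reference fleet. The node names, the 80% cutoff, and the function shape are illustrative assumptions.

```python
# Classify an anomaly as network-wide, version-specific, or local, given
# your node's value and the (hypothetical) fleet baseline for one metric.

def triage(your_value: float, fleet: dict[str, float],
           same_version_nodes: list[str], threshold: float) -> str:
    anomalous = {node for node, v in fleet.items() if v > threshold}
    if len(anomalous) > len(fleet) * 0.8:
        return "network-wide"        # most of the reference fleet sees it too
    if anomalous and anomalous <= set(same_version_nodes):
        return "version-specific"    # only nodes on this client version
    if your_value > threshold and not anomalous:
        return "local"               # unique to your environment
    return "inconclusive"
```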

Security monitoring

The same two-stage architecture applies to security events:

  • SSH login checks — authorized key? Expected source IP? Expected time window?
  • Service restart analysis — when an execution client restarts, the AI verifies that fee recipient addresses haven't been changed. A compromised operator could redirect staking rewards without anyone noticing for days.
  • Configuration drift detection — unauthorized processes, unexpected port openings, validator key access patterns.

Traditional monitoring tells you "Geth restarted." Our AI layer tells you "Geth restarted, fee recipient address changed from 0xABC to 0xDEF, this was not initiated through the operator's usual deployment pipeline, severity: critical."
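The restart-time fee recipient check can be sketched like this. The addresses, field names, and the `change_was_deployed` flag (standing in for a lookup against the operator's deployment pipeline) are illustrative.

```python
# Sketch of the fee recipient check that runs when an execution client
# restarts. Severity wording mirrors the example alert in the text.

def check_fee_recipient(before: str, after: str, change_was_deployed: bool) -> dict:
    """Flag a fee recipient change that did not come through the usual pipeline."""
    if before == after:
        return {"event": "restart", "severity": "info"}
    if change_was_deployed:
        return {"event": "fee_recipient_changed", "severity": "notice"}
    return {
        "event": "fee_recipient_changed",
        "severity": "critical",
        "detail": f"changed from {before} to {after} outside the deployment pipeline",
    }
```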

For operators staking millions in ETH, the difference between detecting a compromised reward address in minutes versus days is the difference between a security incident and a financial disaster.

Fusaka + PeerDAS: what the hardfork did to hardware

A significant portion of the talk covered our Fusaka hardfork measurements. We compared two 14-day windows (before and after the December 3, 2025 activation) across all 36 non-supernode client pairings.

The fleet-level headline numbers:

  • Network RX: -60%. PeerDAS in action: nodes sample slices instead of downloading full blobs
  • CPU: +30%. Expected trade-off: sampling routines cost compute
  • Memory: -8%. Less blob data held in RAM
  • Disk reads: -53%. Fewer full-blob fetches from disk
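This kind of before/after comparison can be reproduced against any Prometheus instance. The sketch below assumes a local Prometheus and an illustrative node_exporter metric; the actual queries behind our numbers are in the linked blog post.

```python
# Sketch: average a metric over two 14-day windows (ending at the fork
# timestamp and 14 days after it) and compute the percentage change.
import json
import urllib.parse
import urllib.request

PROM = "http://localhost:9090"  # assumed Prometheus address

def window_avg(promql: str, end_ts: int) -> float:
    """Evaluate an instant query at end_ts; the range is baked into promql."""
    url = f"{PROM}/api/v1/query?" + urllib.parse.urlencode(
        {"query": promql, "time": end_ts})
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return float(data["data"]["result"][0]["value"][1])

def pct_change(before: float, after: float) -> float:
    """Signed percentage change, as reported in the table above."""
    return (after - before) / before * 100.0

# Example (not executed here), with fork_ts = the activation timestamp:
# rx = "avg(rate(node_network_receive_bytes_total[14d]))"
# change = pct_change(window_avg(rx, fork_ts),
#                     window_avg(rx, fork_ts + 14 * 86400))
```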

Notable client outliers: Nimbus CPU jumped +257% (most compute-intensive PeerDAS implementation), Lighthouse was the only consensus client to reduce CPU (-13%), and Besu saw the largest memory drop (-35%).

The worst pairing post-fork: Nimbus + Reth at 16.78% CPU.

The full analysis with per-client breakdowns, heatmaps, daily trends, and PromQL queries is available as a dedicated blog post: Fusaka hardfork: hardware impact on non-supernodes.

Deployment models

One thing we hear constantly from professional operators: "I'm interested, but I can't send my metrics to your cloud." That's why StereumLabs supports multiple deployment options:

  • SaaS — hosted by us in our ISO 27001 certified environment. Best for smaller operators and researchers.
  • Alerting-as-a-Service (pull model) — you expose a Prometheus endpoint and we scrape it. Your data is never persisted in our systems; it only meets our baseline data at the AI inference layer.
  • On-premise — the entire StereumLabs stack deployed on your infrastructure. Your own API keys. Data never leaves your network.

The key message: your infrastructure data doesn't become someone else's competitive intelligence.

Current status

  • Dashboards (20+): live, covering all 37 client combinations
  • AI Chatbot: working proof of concept against live data
  • AI Alerting: planned for Q2 2026
  • Security Monitoring: planned for Q2 2026
  • On-premise deployment: ready

Get in touch

We're looking for node operators who want to try the AI chatbot, client development teams interested in automated cross-EL impact analysis, and staking protocols looking for monitoring and security standards across their operator ecosystem.

Reach out at contact@stereumlabs.com or visit stereumlabs.com.

Slides

The full slide deck from the talk is available for download:

📄 Download slides (PDF)