feat(grafana): refresh Walrus storage node dashboard by i-love-to-code · Pull Request #5 · bartosian/walrus-tools

i-love-to-code · 2026-02-14T06:24:01Z

This PR updates the Walrus storage node Grafana dashboard so it matches current node metrics and fixes a broken panel. It also includes an updated screenshot for the docs.

Fixes

HTTP Request Duration: Switched to http_server_request_duration_seconds_* (replacing non-existent http_request_duration_seconds_*) and fixed legend to use real labels (http_request_method, http_route, http_response_status_code) so the legend no longer shows “Name:”.
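For reviewers who want to sanity-check the renamed metric outside Grafana, here is a minimal sketch of how a query and legend for this panel could be assembled. The metric prefix and the three label names come from this PR; the `_bucket` suffix, the 0.95 quantile, the `5m` window, and the helper names are assumptions for illustration, and the dashboard JSON may use a different expression.

```python
# Sketch only: builds a PromQL string and a Grafana legend template.
# The quantile, window, and function names are illustrative assumptions,
# not values taken from the dashboard JSON.

METRIC = "http_server_request_duration_seconds_bucket"
LABELS = ("http_request_method", "http_route", "http_response_status_code")


def duration_quantile_query(quantile: float = 0.95, window: str = "5m") -> str:
    """Return a histogram_quantile expression grouped by the real labels."""
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    # 'le' must be kept in the grouping so histogram_quantile sees the buckets.
    group_by = ", ".join(("le",) + LABELS)
    return (
        f"histogram_quantile({quantile}, "
        f"sum by ({group_by}) (rate({METRIC}[{window}])))"
    )


def legend_format() -> str:
    """Return a Grafana legend template that references each label by name."""
    return " ".join("{{" + label + "}}" for label in LABELS)


if __name__ == "__main__":
    print(duration_quantile_query())
    print(legend_format())
```

Referencing labels that actually exist on the series is what keeps the legend from falling back to a generic name.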

New panels

Overview: Shards Owned, Runtime Lag, Checkpoint Parsing Errors.
Garbage collection: GC last completed epoch, expired blobs deleted, blob data deletion attempts (by status).
Recovery: Ongoing Blob Syncs, Recover Blob Progress, Sync Metadata Progress; Blob Recovery Backlog timeline set to full width.
Sui RPC & backlogs: Sui RPC call duration (time series), Sui RPC retries (Primary / Failover), Pending Metadata Cache, Pending Sliver Cache.
RocksDB: Background errors, Compaction pending, Block cache usage (%).

Descriptions and clarity

Runtime vs checkpoint lag: Panel descriptions updated to spell out that checkpoint lag is from the checkpoint downloader (ingestion vs full node) and runtime lag is from runtime monitoring (health/slashing), and why they can differ.
Checkpoint Lag Over Time: Description updated to state it’s the same metric as the Checkpoint Lag stat and how to interpret spikes.
Sui RPC Retries: Description explains Primary vs Failover; display names set to “Primary” and “Failover”; threshold-based coloring removed.
Latency Distribution: Title set to “Latency Distribution (95th percentile)”; legend shortened to span name only.
RocksDB Compaction Pending: Threshold-based coloring removed.
Block Read Rate / Get Read Bytes: Descriptions updated to note they are counters, rate in MiB/s, and that the “metric might not be a counter” warning can be ignored when the scrape exposes counters.
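To illustrate the counter-to-rate conversion that the Block Read Rate and Get Read Bytes descriptions refer to, the following is a simplified sketch of turning raw byte-counter samples into MiB/s. It is not Prometheus's exact `rate()` algorithm (which also extrapolates to the window boundaries); the function name and sample layout are hypothetical.

```python
# Sketch only: approximate per-second rate of a byte counter, in MiB/s.
# Simplified relative to Prometheus's rate(): no boundary extrapolation.

MIB = 1024 * 1024


def counter_rate_mib_per_s(samples):
    """samples: list of (timestamp_seconds, counter_bytes), ordered by time."""
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            # Counter reset (e.g. node restart): count from zero again.
            increase += cur
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return 0.0
    return increase / elapsed / MIB


if __name__ == "__main__":
    # 45 MiB read over 30 seconds.
    print(counter_rate_mib_per_s([(0, 0), (15, 15 * MIB), (30, 45 * MIB)]))
```

Because a counter only ever increases between resets, applying a rate to it is meaningful, which is why the warning can be ignored in that case.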

Align dashboard with current Prometheus metrics:
* Fix broken HTTP panel.
* Add GC, recovery, Sui RPC, and RocksDB panels.
* Improve descriptions and legends.

feat(grafana): refresh Walrus storage node dashboard

9553712


i-love-to-code wants to merge 1 commit into bartosian:main from veera-dao:feat/update-storage-dashboard
