Expose application metrics in /metrics endpoint #25

Open
oltarasenko opened this issue Nov 21, 2024 · 1 comment
Comments

@oltarasenko
Collaborator

We want system metrics to be pulled by an external monitoring stack (for example, a scraper feeding Grafana dashboards).

The task is to select essential node metrics and expose them via HTTP. Right now, it looks like we should start with:

  1. Memory footprint
  2. WASM metrics
  3. CPU and reductions
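
As a rough illustration of what the endpoint could return, here is a minimal sketch that renders a few samples in the Prometheus text exposition format. It is written in Python only to keep it self-contained. The choice of the Prometheus format, the `Sample` type, the `render_metrics` helper and every metric name (`hb_memory_total_bytes`, `hb_reductions_total`) are assumptions made for the example, not something decided in this issue.

```python
from typing import Iterable, NamedTuple


class Sample(NamedTuple):
    """One metric sample. All names here are hypothetical."""
    name: str       # metric name, e.g. "hb_memory_total_bytes"
    kind: str       # "gauge" for point-in-time values, "counter" for totals
    help_text: str  # one-line description shown to the scraper
    value: int      # current value


def render_metrics(samples: Iterable[Sample]) -> str:
    """Render samples as Prometheus text exposition format."""
    lines = []
    for s in samples:
        lines.append(f"# HELP {s.name} {s.help_text}")
        lines.append(f"# TYPE {s.name} {s.kind}")
        lines.append(f"{s.name} {s.value}")
    # The format is line-oriented, so end with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics([
        Sample("hb_memory_total_bytes", "gauge", "Total memory allocated by the VM.", 1024),
        Sample("hb_reductions_total", "counter", "Reductions executed since node start.", 42),
    ]), end="")
```

Memory is modelled as a gauge because it goes up and down, while reductions only ever increase and so fit a counter.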

@twilson63 added this to the Audit Release milestone Nov 21, 2024
@samcamwilliams
Collaborator

LGTM. Those are good to add, but it would be good to share the same base as Arweave wherever we can, with additional HB-specific metrics on top.

Other things to track:

  • Members in the pg groups for scheduling and compute
  • Ideally live message-pushing processes too, but we don't register those with pg right now (nor should we, ideally). We could achieve this by having the logger process generate an event when a push starts and another when its number of registered workers drops to zero.
  • Size of the message cache (either in element count or bytes if necessary -- just as long as it is performant)
  • The same for the subdirectories of the computed outputs directory (specifically, /computed/*/*/[slot_number])
  • Stretch goal, only if it will take <1 day: Counts of the atoms used as the first element of tuples sent to ?event(...) -- regardless of whether or not debug logging is enabled.
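
To make the stretch goal concrete, here is a minimal sketch of counting event tags independently of debug logging. It is Python rather than Erlang, so strings stand in for atoms. `record_event`, `event_counts` and the example tags are hypothetical names for illustration, not the actual ?event(...) implementation.

```python
from collections import Counter

# Tag -> number of times it was seen as the first element of an event tuple.
event_counts: Counter = Counter()


def record_event(payload, debug_enabled: bool = False, log=print) -> None:
    """Count the leading tag of a tuple event, then log only if debugging."""
    # Counting happens before (and independently of) the debug check,
    # so the totals are the same whether or not logging is on.
    if isinstance(payload, tuple) and payload and isinstance(payload[0], str):
        event_counts[payload[0]] += 1
    if debug_enabled:
        log(payload)


if __name__ == "__main__":
    record_event(("push_started", "msg-1"))
    record_event(("push_started", "msg-2"), debug_enabled=True)
    record_event(("cache_write", "msg-1"))
    print(dict(event_counts))
```

Only the tag is kept, never the rest of the tuple, so the number of distinct counters stays bounded by the number of distinct tags in the codebase.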

I think the base Arweave stuff should cover all of the core HTTP metrics (counts per response type, average response time, etc.), but if not, please add them. Please be careful not to break metrics down by endpoint using the cowboy router patterns at the moment, as the HTTP API is in flux (see the feat/device-rework branch). In the ideal case (and maybe cowboy works like this anyway?) we would be able to match on simple string patterns (wildcards but no regex) rather than the normal route structures, which won't map well onto the devices underneath.
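
A minimal sketch of the wildcard idea, assuming request paths are bucketed into a small fixed set of labels before being counted. It is Python for brevity. The patterns, the labels and the `endpoint_label` helper are all hypothetical, and say nothing about how cowboy itself matches routes.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, label) pairs; the first match wins.
# Both the patterns and the labels are made up for this example.
PATTERNS = [
    ("/metrics", "metrics"),
    ("/computed/*", "computed"),
    ("/*/schedule", "schedule"),
]


def endpoint_label(path: str, patterns=PATTERNS, default: str = "other") -> str:
    """Map a request path to a coarse metric label using shell-style wildcards."""
    for pattern, label in patterns:
        # fnmatchcase treats "*" as "any run of characters", including "/".
        # It also gives "?" and "[...]" special meaning, so keep patterns simple.
        if fnmatchcase(path, pattern):
            return label
    return default


if __name__ == "__main__":
    for p in ["/metrics", "/computed/a/b/7", "/proc-1/schedule", "/unknown"]:
        print(p, "->", endpoint_label(p))
```

Falling back to a single "other" label keeps the number of distinct label values small even while the HTTP API is still changing.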

MVVVP first, then we can see where we are and iterate.

GL!
