Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change metric naming, unit, and http fields #213

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

phbnf
Copy link
Collaborator

@phbnf phbnf commented Mar 24, 2025

Towards #105.

Following recommendations from: https://prometheus.io/docs/practices/naming/

Prefix all the metrics with ct_, and suffix with units. Also, use seconds for the SCT timestamp unit since this is what guidelines recommend.... even if an SCT timestamp is in milliseconds... I'm a bit on the fence for this one, but these metrics are meant for graphs and human consumption, so seconds seem to more adapted indeed (and for anything more fancy, milliseconds are still available since the number exported is a float).

Note that I'm explicitly specifying the ct_ prefix manually and not using a wrapper, not to break code searching: I find that searching for a metric name is very useful when investigating an outage.

@phbnf phbnf marked this pull request as ready for review March 24, 2025 15:36
@phbnf phbnf changed the title Change metric naming and unit Change metric naming, unit, and http fields Mar 24, 2025
@roger2hk roger2hk requested a review from AlCutter March 24, 2025 18:56
@@ -68,48 +68,48 @@ var (
lastSCTTimestamp *prometheus.GaugeVec // origin => value
reqsCounter *prometheus.CounterVec // origin, op => value
rspsCounter *prometheus.CounterVec // origin, op, code => value
rspLatency *prometheus.HistogramVec // origin, op, code => value
rspDuration *prometheus.HistogramVec // origin, op, code => value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what's the thinking behind the name change?
What I think this metric represents is called latency distribution in, er, other automatic monitoring systems we're intimately familiar with :)

If you're intent on changing it, "response duration" sounds a bit odd to my ears, but how about rspTime? Also, the description of the metric still explains it in terms of Latency of responses, so I guess that needs updating to match.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, this PR lacked something very important: a link to https://prometheus.io/docs/practices/naming/, I've now added it. I'm 100% onboard with your comments and this looked a bit odd to me at first sight, being used to other monitoring systems... but it seems to be the prometheus way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent of the naming doc you linked was the http_ prefix there, and the ..._request_duration_seconds was just an example.

I found this "counter" (hah!) example in the histogram section of the prom docs:

A common example would be the time it takes to reply to a request, called latency.

So I think there's latitude to fit in with styles we're familiar with.

prometheus.HistogramOpts{
Name: "http_latency",
Name: "ct_http_request_duration_seconds",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: variable naming is in terms of "response", but metric name is now in terms of "request" - would be good to keep them consistent if possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, good point!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants