Provide standardized traffic metrics #554
Comments
Hey @tomkerkhove, thanks for bringing this up! We're always interested in finding any overlap with SMI Spec or any other related projects and trying to develop a standard that works for all use cases. In this case, metrics are something that we haven't prioritized yet, but something that could be in scope in the future. This actually came up in Slack a couple weeks ago and I raised it at our community meeting as well. A few quick follow up questions related to SMI metrics:
|
Thanks for your reply! In terms of which metrics, I do not have much of a preference, but the number of requests is the minimum for me. A CRD is ideal for us/me as we do not want to force Prometheus on everyone. For example, some companies rely on their cloud provider for that, so they don't need it (including me). In terms of standardization, I'd hope every gateway API has to/will provide these metrics, hopefully in a way that is compatible with SMI, so that we can have the same metric experience for service meshes, ingresses/gateways and, eventually, service-to-service traffic as well. |
I would suggest success/failure vs. raw count, since success rate or # of requests can both be calculated from it. |
Yes, but ideally they are there out of the box so no computation needs to be done which makes it easier to use |
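To make the trade-off in the two comments above concrete, here is a minimal Python sketch. It assumes a gateway exposes only success and failure counters. The `TrafficCounters` type and its field names are hypothetical; they are not taken from SMI, Gateway API, or any implementation. It only shows that the raw request count and the success rate can both be derived from those two numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrafficCounters:
    """Hypothetical per-route counters; not part of any real spec."""

    success: int
    failure: int

    @property
    def total_requests(self) -> int:
        # The raw request count is the sum of the two counters.
        return self.success + self.failure

    @property
    def success_rate(self) -> Optional[float]:
        # Undefined (None) when no traffic has been observed yet.
        total = self.total_requests
        return self.success / total if total else None


counters = TrafficCounters(success=95, failure=5)
print(counters.total_requests)
print(counters.success_rate)
```

If the total were provided out of the box, as requested above, each consumer would be spared this small computation; the sketch only shows that the information is not lost when a gateway reports success/failure alone.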
Given the SMI experience, are there any issues with using CRs (which are backed by the etcd database) to store metrics that are fast-changing and tend to be ephemeral? I know that K8S metrics resources have special handling to avoid overwhelming the API server. (https://github.com/kubernetes/metrics/blob/master/pkg/apis/metrics/v1beta1/types.go) |
Other art happening concurrently: I'd share bowei's concern about the apiserver not necessarily being the best place for fast-moving metrics. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-contributor-experience at kubernetes/community. |
Closing as discussed in standup that this is not on the radar for now. |
Since some time has passed, I'm wondering if there are plans to integrate Traffic Metrics into the spec? If so, it would be nice to re-use some of the SMI spec or the OpenTelemetry ones. |
I am admittedly not very familiar with them, but my understanding is that OpenTelemetry has also defined some standardized request metric schemas, which could be a reasonable API if this project decides to 'endorse' a metrics scheme. |
That's definitely correct, but what if it's not merely endorsed and providing these is instead part of the spec? This is what SMI does, and it allows end-users to rely on a standard way of getting metrics, regardless of which standard is used for the semantics. Tooling knows they are there, for every Gateway API. |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
@k8s-triage-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Given the current state of the project, I think it's time to reconsider this for consistent metrics across implementations. /reopen |
@tomkerkhove: Reopened this issue. In response to this:
@k8s-triage-robot: Closing this issue. In response to this:
@tomkerkhove Some of the folks who have been involved in SMI are trying to get a new working group started for discussing E/W mesh applications of Gateway API, which sounds like it may be of interest to you: kubernetes/community#6724. In early discussions, though, we decided not to focus on a spec for telemetry at this time. It has been an under-implemented and under-adopted part of SMI, and the divergence in implementations, vendors and evolving standards has made it challenging to build the consensus needed for a standard to become widely adopted. It has been encouraging to see some of the work OpenTelemetry has been doing, and I think for the near future it would be best to focus on implementation/adoption within that group, with the goal of laying the groundwork to eventually enable broader adoption in projects like Gateway API, rather than starting a parallel effort. |
Thanks for the update. I strongly believe Gateway API is about more than service meshes (treating it as mesh-only is a common misconception), but I will jump to that thread and see what gives, because in the end, if it's purely focusing on service meshes, then SMI already covered that. |
I one thousand percent agree that Gateway API is more than service meshes, but I'm supportive of the new WG because the core API has so many todos already that we're not going to have bandwidth to properly address service mesh use cases for some time. Having a WG that works on the service mesh problems and how to integrate the work SMI has already done with Gateway API and report back will be super useful. |
For anyone still interested in this - there's a related discussion in OpenTelemetry now: open-telemetry/semantic-conventions#1675 |
What would you like to be added:
Provide standardized traffic metrics for all gateways to implement so that other tools can rely on a common way to get the metrics.
Ideally, this would fully align with the metrics that the SMI spec uses, to make it even easier to integrate.
I've proposed extending SMI beyond service meshes, but then this project started, so it might be better to just align instead of re-inventing.
Why is this needed:
Tools & platforms need a unified way to get traffic metrics for all gateways, regardless of which gateway is actually in use.
In my case, we are building HTTP-based autoscaling for KEDA (experimental), so we will rely on things such as SMI, but here we were hoping to rely on a standard/SDK for getting the metrics as well, instead of re-implementing this for every gateway.
/cc @michelleN @bridgetkromhout