A89: Backend Service Metric Label #471
Merged
Changes from 5 commits
14 commits
- 4ece413 A89: xDS Cluster Metric Label (ejona86)
- a7d61a3 Add discussion link (ejona86)
- 90e6217 No more clunter (ejona86)
- de24e20 More commas (ejona86)
- 296f5ec Fix A78 link (ejona86)
- f3c244f Change metric name to grpc.lb.backend_service (ejona86)
- 72894ea Adjust descriptions to "backend service"; add WRR (ejona86)
- e2c6de4 Rename file to A89-backend-service-metric-label.md (ejona86)
- a5beefc _Metrics_ for deadline/unavailable (ejona86)
- 5271d32 Mention A75's impact (or lack thereof) to the design (ejona86)
- f97758d Update last updated (ejona86)
- 1791fa7 Status: Ready for Implementation (ejona86)
- 7480168 Move plumbing to cds when A75 is implemented (ejona86)
- 96f07cb Add links to A66 and A78 (ejona86)
@@ -0,0 +1,75 @@

A89: xDS Cluster Metric Label
----
* Author(s): [Eric Anderson](https://github.com/ejona86)
* Approver: @markdroth
* Status: {Draft, In Review, Ready for Implementation, Implemented}
* Implemented in: <language, ...>
* Last updated: 2025-01-10
* Discussion at: https://groups.google.com/g/grpc-io/c/s4tm26RiMyI

## Abstract

Add a new optional label to per-call metrics that identifies the xDS cluster
used for the RPC.

## Background

[gRFC A78][] added the `grpc.lb.locality` per-call optional label, along with
the infrastructure that lets LB policies add optional labels to per-call
metrics. The optional label can be enabled in the gRPC/OpenTelemetry integration
API added in [gRFC A79][].

Similar to how locality metrics are useful for analyzing _where_ traffic is
being routed, the xDS cluster is useful for knowing _to whom_ it is being
routed. `grpc.target` is generally all that's necessary to know which service is
receiving traffic, but non-deterministic routing in xDS, such as weighted
clusters, aggregate clusters, and cluster specifier plugins, means that
different clusters (and thus potentially different services or service versions)
would commingle metrics unless the selected cluster is added as a label. Knowing
the selected cluster also helps confirm that deterministic routing, like path
matching, is behaving as expected.

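To make the commingling concrete, here is a small illustrative sketch in plain Python (not gRPC code). The target name, the cluster names `cluster_v1` and `cluster_v2`, and the latency samples are all made up for illustration; it only shows how grouping by `grpc.target` alone hides a difference that grouping by target and cluster reveals.

```python
from collections import defaultdict
from statistics import mean

# Made-up attempt records: (grpc.target, selected xDS cluster, duration in ms).
# Every name and number here is an illustrative assumption.
ATTEMPTS = [
    ("xds:///checkout", "cluster_v1", 10.0),
    ("xds:///checkout", "cluster_v1", 12.0),
    ("xds:///checkout", "cluster_v2", 90.0),
    ("xds:///checkout", "cluster_v2", 110.0),
]


def mean_duration_by(attempts, key):
    """Group durations by the label tuple produced by `key` and average them."""
    groups = defaultdict(list)
    for target, cluster, duration_ms in attempts:
        groups[key(target, cluster)].append(duration_ms)
    return {labels: mean(values) for labels, values in groups.items()}


# Without the cluster label, both clusters collapse into one series.
by_target = mean_duration_by(ATTEMPTS, lambda target, cluster: (target,))

# With the cluster label, each cluster gets its own series.
by_target_and_cluster = mean_duration_by(
    ATTEMPTS, lambda target, cluster: (target, cluster)
)
```

With these samples, the target-only series averages the fast and slow clusters together, while the per-cluster series keeps them apart.
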
### Related Proposals:
* [gRFC A78: gRPC OTel Metrics for WRR, Pick First, and XdsClient][gRFC A78]
* [gRFC A79: Non-per-call Metrics Architecture][gRFC A79]

[gRFC A78]: A78-grpc-metrics-wrr-pf-xds.md#per-call-metrics
[gRFC A79]: A79-non-per-call-metrics-architecture.md

## Proposal

On each pick, the `xds_cluster_impl` policy will add the optional label
`grpc.xds.cluster` to the call attempt tracer. The value will be copied from the
`cluster` key of `xds_cluster_impl`'s service config. This is done regardless of
the pick's result. Later picks for the same RPC may have a different value. This
is the case for locality as well, and the last pick's value should be used.

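The pick behavior described above can be sketched roughly as follows. This is a simplified Python model, not the gRPC implementation: `CallAttemptTracer`, `ClusterImplPicker`, `add_optional_label`, and the `"QUEUE"`/`"COMPLETE"` result strings are hypothetical stand-ins chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

CLUSTER_LABEL = "grpc.xds.cluster"


@dataclass
class CallAttemptTracer:
    """Hypothetical stand-in for a per-attempt tracer that collects labels."""

    optional_labels: Dict[str, str] = field(default_factory=dict)

    def add_optional_label(self, key: str, value: str) -> None:
        # A later write for the same key replaces the earlier one,
        # so the last pick's value is the one that survives.
        self.optional_labels[key] = value


@dataclass
class ClusterImplPicker:
    """Hypothetical model of the picker owned by xds_cluster_impl."""

    cluster: str  # copied from the policy's service config `cluster` key
    child_pick: Callable[[], str]  # hypothetical child picker returning a result name

    def pick(self, tracer: CallAttemptTracer) -> str:
        # Record the label before delegating, so it is set for every pick
        # result (queued, failed, or complete).
        tracer.add_optional_label(CLUSTER_LABEL, self.cluster)
        return self.child_pick()


# Two picks for the same attempt: the first is queued while one cluster is
# selected, the second completes after a different cluster is selected.
tracer = CallAttemptTracer()
first = ClusterImplPicker("cluster_a", lambda: "QUEUE").pick(tracer)
second = ClusterImplPicker("cluster_b", lambda: "COMPLETE").pick(tracer)
```

After both picks, the tracer holds only `cluster_b`, matching the "last pick's value" rule.
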
The `grpc.xds.cluster` label will be available on the following per-call
metrics:
- `grpc.client.attempt.duration`
- `grpc.client.attempt.sent_total_compressed_message_size`
- `grpc.client.attempt.rcvd_total_compressed_message_size`

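As a rough illustration of how the label might be attached only to the metrics listed above, and only when the application has opted in, consider the sketch below. The function `labels_for` and its parameters are hypothetical, and falling back to an empty string when no value was recorded is an assumption of this sketch rather than something this proposal specifies.

```python
from typing import Dict, FrozenSet, Set

CLUSTER_LABEL = "grpc.xds.cluster"

# The per-call metrics this proposal lists as carrying the label.
METRICS_WITH_CLUSTER_LABEL: FrozenSet[str] = frozenset(
    {
        "grpc.client.attempt.duration",
        "grpc.client.attempt.sent_total_compressed_message_size",
        "grpc.client.attempt.rcvd_total_compressed_message_size",
    }
)


def labels_for(
    metric: str,
    base_labels: Dict[str, str],
    recorded_optional: Dict[str, str],
    enabled_optional: Set[str],
) -> Dict[str, str]:
    """Build the label set for one recording of `metric` (hypothetical helper)."""
    labels = dict(base_labels)
    if metric in METRICS_WITH_CLUSTER_LABEL and CLUSTER_LABEL in enabled_optional:
        # Assumption: use "" when no pick recorded a value (e.g. a non-xDS channel).
        labels[CLUSTER_LABEL] = recorded_optional.get(CLUSTER_LABEL, "")
    return labels


base = {"grpc.method": "pkg.Service/Method", "grpc.target": "xds:///checkout"}
recorded = {CLUSTER_LABEL: "cluster_a"}

# Opted in, listed metric: the label is attached.
with_label = labels_for("grpc.client.attempt.duration", base, recorded, {CLUSTER_LABEL})
# Opted in, but the metric is not in the list: no label.
unlisted = labels_for("grpc.client.attempt.started", base, recorded, {CLUSTER_LABEL})
# Listed metric, but the application did not opt in: no label.
not_enabled = labels_for("grpc.client.attempt.duration", base, recorded, set())
```
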
### Temporary environment variable protection

The new optional label requires calling an API to activate, so environment
variable protection is unnecessary.

## Rationale

gRFC A78 added `grpc.lb.locality` to both per-call and WRR metrics, while this
proposal adds the new label only to per-call metrics. Cluster is an xDS-specific
concept, so it is more awkward to add to WRR, and that is left as potential
future work.

Which "cluster" was used for a request is ambiguous when using aggregate
clusters, as multiple clusters are involved. For placing in a label, there are
two potential choices: the top-level aggregate cluster and the leaf cluster.
Using the leaf cluster seems to provide the most insight when using aggregate
clusters, as failing over to a different priority would be significant. If the
top-level cluster is needed in the future, it can be added as well.

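The difference between the two choices can be illustrated with a deliberately simplified model of an aggregate cluster. The cluster names, the `select_leaf` helper, and the notion of a `reachable` set are all assumptions made for this sketch; real failover is handled by the LB policy tree and is more nuanced than a first-reachable scan.

```python
from typing import Dict, List, Optional, Set

# Hypothetical aggregate cluster config: name -> child clusters in priority order.
AGGREGATES: Dict[str, List[str]] = {
    "checkout_aggregate": ["checkout_primary", "checkout_backup"],
}


def select_leaf(
    cluster: str, aggregates: Dict[str, List[str]], reachable: Set[str]
) -> Optional[str]:
    """Return the first reachable leaf cluster in priority order, if any."""
    children = aggregates.get(cluster)
    if children is None:
        # Not an aggregate: this is a leaf cluster.
        return cluster if cluster in reachable else None
    for child in children:
        leaf = select_leaf(child, aggregates, reachable)
        if leaf is not None:
            return leaf
    return None


# Normal operation: the highest-priority leaf is used.
normal = select_leaf(
    "checkout_aggregate", AGGREGATES, {"checkout_primary", "checkout_backup"}
)
# Primary unreachable: traffic fails over to the next priority.
failover = select_leaf("checkout_aggregate", AGGREGATES, {"checkout_backup"})
```

In this model a label holding the top-level name would read `checkout_aggregate` in both cases, while a label holding the leaf name changes from `checkout_primary` to `checkout_backup`, which makes the failover visible in the metrics.
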
## Implementation

@ejona86 will immediately implement this in gRPC Java. Other languages will
follow as able. The implementation is very quick.