Thanos-Querier: component=endpointset msg="update of node failed" err="getting metadata: fallback fetching info from test-component-metrics-grpc-consumer.test:10901: rpc error: code = DeadlineExceeded desc = context deadline exceeded" address=test-component-metrics-grpc-consumer.test:10901 #7971
Comments
Hey, @inboxamitraj! I'm here to help you with any bugs, questions, or contributions. Let's work together to resolve the issue you're facing. I found a similar open issue that might be relevant to your situation: Thanos-Querier error "rpc error: code = DeadlineExceeded desc = context deadline exceeded" with istio cross-cluster communication. This issue is still open, and no solution has been found in the comments [1]. Regarding your problem, the 'DeadlineExceeded' error typically indicates that the gRPC call is taking longer than the allowed timeout. Here are a few suggestions to address this issue:
These steps might help in resolving the 'DeadlineExceeded' errors and restoring the expected functionality of your Thanos-Querier setup.
You're welcome! If you encounter any further issues or need additional assistance, feel free to reach out. Good luck with the troubleshooting!
@dosu
Since increasing the timeout settings didn't resolve the issue, consider the following additional steps:
These steps should help further diagnose and potentially resolve the gRPC communication issues you're experiencing with Thanos-Querier in your multi-cluster setup.
Thanos, Prometheus and Golang version used:
thanos, version 0.33.0 (branch: HEAD, revision: 0f025f1)
build user: root@9fb4d3c2f7ec
build date: 20231217-16:49:45
go version: go1.21.5
platform: linux/amd64
tags: netgo
AWS EKS version: 1.29
Istio version: 1.20.2
Note: the same error also occurs with the older Thanos v0.26.0 and the latest Thanos v0.37.0.
Object Storage Provider:
What happened:
We have a Thanos-Querier running on our centralised Cluster A, which queries the thanos-store and thanos-sidecar instances running on other EKS clusters.
We recently upgraded Istio from 1.19 to 1.20. Since then, Grafana no longer shows the latest Prometheus metrics from the other clusters, although metrics older than 2 hours are still visible. This means Thanos-Querier can talk to Thanos-Store but cannot connect to the live Thanos-Sidecar, which holds the latest Prometheus metrics.
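One way to narrow this down is to check whether the sidecar's gRPC port is reachable at the TCP level from the querier's network, separately from the gRPC call itself. The following is a minimal sketch, not part of the original report: `probe_tcp` is a hypothetical helper, and the host and port in the comment are taken from the error message in the title. A successful connect only shows that the network path is open; it does not prove that the gRPC Info call succeeds through the Istio proxies.

```python
import socket
import time


def probe_tcp(host, port, timeout=3.0):
    """Try to open a TCP connection; return (ok, detail, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects within `timeout`.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected", time.monotonic() - start
    except socket.timeout:
        # No answer within the timeout: packets are likely dropped or blackholed.
        return False, "timed out", time.monotonic() - start
    except OSError as exc:
        # Covers DNS failures, connection refused, unreachable network, etc.
        return False, f"failed: {exc}", time.monotonic() - start


# Example call (address from the querier error; adjust for your setup):
#   probe_tcp("test-component-metrics-grpc-consumer.test", 10901)
```

If the probe connects quickly while the querier still reports `DeadlineExceeded`, the problem is more likely above TCP (for example TLS or protocol handling in the mesh) than basic reachability.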
What you expected to happen:
Centralised Grafana should show latest prometheus metrics from other clusters.
How to reproduce it (as minimally and precisely as possible):
Two EKS 1.29 Clusters
Istio 1.20
Prometheus Operator running on both clusters, with one of the clusters acting as the centralised cluster.
Full logs to relevant components:
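The only log line available here is the one quoted in the issue title. As a rough sketch (the helper names `parse_logfmt` and `failing_endpoint` are hypothetical, and the regex is a simplification of the logfmt format), the failing address and gRPC status code can be pulled out of such querier lines to see which endpoints are affected:

```python
import re

# key=value pairs where the value is a double-quoted string or a bare token.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')
# gRPC status code as it appears in "rpc error: code = X desc = ...".
_GRPC_CODE = re.compile(r"code = (\w+)")


def parse_logfmt(line):
    """Return a dict of key/value pairs from one logfmt-style line."""
    fields = {}
    for key, raw in _PAIR.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]
        fields[key] = raw
    return fields


def failing_endpoint(line):
    """Return (address, grpc_code) for an endpointset error line, else None."""
    fields = parse_logfmt(line)
    if fields.get("component") != "endpointset" or "err" not in fields:
        return None
    match = _GRPC_CODE.search(fields["err"])
    return (fields.get("address"), match.group(1) if match else None)
```

Applied to the line from the title, this yields the sidecar address together with `DeadlineExceeded`; running it over the full querier log would show whether only sidecar endpoints fail.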
Anything else we need to know:
What we have done so far:
Thanos Querier Deployment in Cluster A
Service Entry in Cluster A
Destination Rule in Cluster A
Attached: Grafana screenshot. We cannot see the latest metrics, but metrics older than 2 hours are visible.
