No metrics displayed with Thanos engine ranged query (go_routines[1d]) in query distributes installation (Prometheus engine returns correct output) #8078

Open
s0rl0v opened this issue Jan 29, 2025 · 4 comments

s0rl0v commented Jan 29, 2025

Thanos, Prometheus and Golang version used:
Thanos - v0.37.2
Prometheus - v2.55.1
Golang - v1.23.4

Object Storage Provider:
Azure
Huawei OBS

What happened:
I've set up Thanos Query in distributed mode with the following stores (aka local queries) in config:

  • query-az.int.zone
  • query-hw.int.zone

extraArgs:

  • --query.timeout=5m
  • --query.mode=distributed

Each local query is configured with the hostnames of its Thanos Gateways (Prometheus installations with Thanos sidecars).

When I run a ranged query (e.g. go_threads[1d]) against the distributed query with the Thanos engine, it returns nothing:
Image

With the Prometheus engine, the same query returns the correct output:
Image

Querying a local query directly also yields correct results with both the Thanos and Prometheus engines.
Image

What you expected to happen:
The distributed query returns the same output with the Prometheus and Thanos engines.

How to reproduce it (as minimally and precisely as possible):

  1. Have a Thanos installation with Thanos Gateways.
  2. Set up a distributed Query with two local queries in the --store extra args.
  3. Query any ranged metric (e.g. go_threads[1d]) with the Thanos engine.
  4. See nothing.
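
The same comparison can be scripted against the HTTP API instead of the UI. This is only a rough sketch: it assumes the querier accepts an `engine` request parameter with the values `thanos` and `prometheus` on `/api/v1/query` (mirroring the engine selector in the UI), and the `QUERY_URL` address and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical address of the distributed (global) querier; replace with your own.
QUERY_URL = "http://thanos-query.example:9090"


def build_query_url(base_url, expr, engine):
    """Build an instant-query URL for one engine.

    Assumption: the querier reads an `engine` parameter ("thanos" or "prometheus").
    """
    params = urllib.parse.urlencode({"query": expr, "engine": engine})
    return f"{base_url}/api/v1/query?{params}"


def count_series(base_url, expr, engine, timeout=60):
    """Run the query and return the number of series in the response."""
    url = build_query_url(base_url, expr, engine)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    return len(body["data"]["result"])


def compare_engines(base_url, expr):
    """Return the series count per engine for the same expression."""
    return {
        engine: count_series(base_url, expr, engine)
        for engine in ("thanos", "prometheus")
    }
```

Calling `compare_engines(QUERY_URL, "go_threads[1d]")` against the distributed query and then against a local query should show whether the Thanos engine count drops to zero only in distributed mode.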

Full logs to relevant components:
no errors whatsoever

Anything else we need to know:


dosubot bot commented Jan 29, 2025

Hey there, @s0rl0v! I'm here to assist you with any bugs, questions, or contributions you might have. Let's tackle this together!

The issue you're experiencing with Thanos not displaying metrics for range queries in distributed mode could be related to several factors specific to the Thanos engine's handling of distributed queries. Here are some potential areas to investigate:

  1. Query Mode Configuration: Ensure that the --query.mode=distributed flag is correctly set for all relevant components. This mode is essential for enabling distributed query execution across multiple stores [1].

  2. Thanos Gateway Configuration: Verify that the Thanos Gateways are correctly configured and accessible from the Thanos Query component. Any misconfiguration here could lead to issues in fetching data from the stores.

  3. Replica Labels and Deduplication: Check if the --query.replica-label is set correctly to handle deduplication across your stores. Incorrect deduplication settings might result in missing data in the query results [2].

  4. Partial Response Handling: Consider enabling partial responses using the --query.partial-response flag. This can help in scenarios where some stores might not be responding, allowing the query to return partial results instead of nothing [2].

  5. Logs and Debugging: Although you mentioned there are no errors, enabling more verbose logging might provide additional insights into what might be going wrong during the query execution.

  6. Version-Specific Issues: While there are no known issues specifically documented for Thanos v0.37.2 related to distributed queries, it's worth checking the release notes or any recent issues in the Thanos repository for any updates or patches that might address this behavior.

By reviewing these configurations and settings, you might be able to identify the root cause of the issue with the Thanos engine in distributed mode.



s0rl0v commented Jan 29, 2025

  1. Query Mode Configuration: Ensure that the --query.mode=distributed flag is correctly set for all relevant components. This mode is essential for enabling distributed query execution across multiple stores [1].

The flag is set only on the global query; the local ones run without it.

  2. Thanos Gateway Configuration: Verify that the Thanos Gateways are correctly configured and accessible from the Thanos Query component. Any misconfiguration here could lead to issues in fetching data from the stores.

The gateways are listed in the Stores menu, and there are no errors in the query logs associated with store unavailability.

  3. Replica Labels and Deduplication: Check if the --query.replica-label is set correctly to handle deduplication across your stores. Incorrect deduplication settings might result in missing data in the query results [2].

The replica label is set to "site"; Azure metrics have the value "azure" and Huawei ones have the value "huawei".

  4. Partial Response Handling: Consider enabling partial responses using the --query.partial-response flag. This can help in scenarios where some stores might not be responding, allowing the query to return partial results instead of nothing [2].

Enabled.

  5. Logs and Debugging: Although you mentioned there are no errors, enabling more verbose logging might provide additional insights into what might be going wrong during the query execution.

I'll check this out.

  6. Version-Specific Issues: While there are no known issues specifically documented for Thanos v0.37.2 related to distributed queries, it's worth checking the release notes or any recent issues in the Thanos repository for any updates or patches that might address this behavior.

There are no known issues; I searched before creating this issue.
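
For context on the replica-label answer above: with deduplication enabled, series that differ only in the replica label are treated as copies of one series and merged. The sketch below only illustrates that grouping and is not Thanos's implementation; the function name is hypothetical, and the sample series are made up apart from the "site" values mentioned above.

```python
def group_by_non_replica_labels(series, replica_label):
    """Group label sets that are identical once the replica label is dropped.

    Each group represents one series after deduplication; the list stored
    under it holds the replica values that were merged together.
    """
    groups = {}
    for labels in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        groups.setdefault(key, []).append(labels.get(replica_label))
    return groups


# Made-up label sets; only the "site" values come from the setup described above.
series = [
    {"__name__": "go_threads", "instance": "a", "site": "azure"},
    {"__name__": "go_threads", "instance": "a", "site": "huawei"},
    {"__name__": "go_threads", "instance": "b", "site": "azure"},
]

for key, replicas in group_by_non_replica_labels(series, "site").items():
    print(dict(key), "merged from", replicas)
```

With "site" as the replica label, the two `instance="a"` series collapse into one, so the sites are only told apart by that label.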

@s0rl0v s0rl0v changed the title No metrics displayed via Thanos engine ranged query (go_routines[1d]) in query distributes installation No metrics displayed with Thanos engine ranged query (go_routines[1d]) in query distributes installation (Prometheus engine returns correct output) Jan 29, 2025

fpetkovski (Contributor) commented

Could you click on the Endpoints tab and share a screenshot?


s0rl0v commented Jan 29, 2025

@fpetkovski Sure!
This is what the distributed query's endpoints look like:
Image

These are the endpoints from a local query (non-distributed):
Image
