Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: update query builder to use 5min/30min aggregation tables #5679

Merged
merged 11 commits into from
Oct 21, 2024

Conversation

srikanthccv
Copy link
Member

@srikanthccv srikanthccv commented Aug 12, 2024

Summary

Uses changes from SigNoz/signoz-otel-collector#361

  1. Table Selection and Aggregation Strategy

    We have two pre-aggregated tables:
    a) 5-minute aggregation
    b) 30-minute aggregation

    The choice of table depends on the query's time range:

    • Range < 1 day: No pre-aggregation table used (step interval < 5 minutes)
    • 1 day < Range < 1 week: Use 5-minute aggregation table (5 minutes < step interval < 33 minutes)
    • Range > 1 week: Use 30-minute aggregation table (step interval > 30 minutes)
  2. Column Selection

    The pre-aggregated tables contain columns for min, max, sum, and count of values from raw samples.

    This PR changes the column selection logic:

    • Instead of always using the value column
    • We now select the appropriate column (min, max, sum, or count) based on the aggregation needs

These changes optimize query performance by using pre-aggregated data when possible and selecting the most appropriate columns for aggregation.

@github-actions github-actions bot added the chore label Aug 12, 2024
@ankitnayan
Copy link
Collaborator

TTL also needs to be updated?

@srikanthccv
Copy link
Member Author

srikanthccv commented Aug 13, 2024

For those customers with a different TTL than default, it will be updated.

Edit:

TTL has been updated. This feature is enabled by default for everyone. Will disable it for tenants who don't have enough data backfilled.

@srikanthccv srikanthccv marked this pull request as ready for review October 19, 2024 09:06
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to ac79b1c in 1 minute and 2 seconds

More details
  • Looked at 1125 lines of code in 10 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. pkg/query-service/app/metrics/v4/cumulative/timeseries.go:114
  • Draft comment:
    The op variable is redundantly set in each case of the switch statement but is overwritten by helpers.AggregationColumnForSamplesTable. Remove the redundant assignments to clean up the code.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. pkg/query-service/app/metrics/v4/delta/timeseries.go:26
  • Draft comment:
    The op variable is redundantly set in each case of the switch statement but is overwritten by helpers.AggregationColumnForSamplesTable. Remove the redundant assignments to clean up the code.
  • Reason this comment was not posted:
    Marked as duplicate.
3. pkg/query-service/app/metrics/v4/delta/timeseries.go:89
  • Draft comment:
    The op variable is redundantly set in each case of the switch statement but is overwritten by helpers.AggregationColumnForSamplesTable. Remove the redundant assignments to clean up the code.
  • Reason this comment was not posted:
    Marked as duplicate.
4. pkg/query-service/app/metrics/v4/query_builder_pre_agg_test.go:1
  • Draft comment:
    Avoid using the component/index.tsx file structure approach, as it makes it difficult to debug and find components using global search tools like VS Code. This is applicable in other files as well.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_Cv8GQvlgcyiIxF6O


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@ankitnayan ankitnayan self-requested a review October 21, 2024 08:05
@shivanshuraj1333
Copy link
Member

shivanshuraj1333 commented Oct 21, 2024

I have one concern wrt user experience. Some edge cases are missing with this.
Say there are memory spikes for certain requests which are visible in 5 min window, but they are Averaged out in 30 min window. AFAIU the logic, users won't be able to see such spikes for data older data.

@srikanthccv
Copy link
Member Author

AFAIU the logic, users won't be able to see such spikes for data older data.

@shivanshuraj1333 we maintain mix, max, and count other than average. They can choose to use max aggregation if they are interested in seeing the spike.

@srikanthccv
Copy link
Member Author

The default time aggregation for gauge metrics is avergage, but we also allow them to change.

@srikanthccv srikanthccv merged commit 28818fb into develop Oct 21, 2024
16 checks passed
@srikanthccv srikanthccv deleted the agg-tables branch October 21, 2024 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants