-
Notifications
You must be signed in to change notification settings - Fork 59
oximeter: refactor field query to improve performance. #9262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
As described in #9256, looking up field names and values can take much longer than looking up measurements, making queries surprisingly slow overall. Rather than scanning field tables multiple times and running potentially many joins to combine the results, this patch refactors the field lookup query to reduce the number of joins by aggregating all fields of a given type in a single subquery, then joining the results.
5532a72
to
886e8ad
Compare
Note: the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is mostly fine, but I'm having a bit of trouble convincing myself the use of anyIf()
is safe. Also, could we write a "test" that simply prints the output of some of these queries? That can be #[ignore]
d so it doesn't do anything most of the time, but it would help see the entire query structure.
oximeter/db/src/client/oxql.rs
Outdated
format!("filter_on_{}", field_schema.name), | ||
select_map | ||
.entry(field_schema.field_type) | ||
.or_insert(vec![String::from("timeseries_key")]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance nit, but you could do or_insert_with
here to avoid always creating the string and vector.
oximeter/db/src/client/oxql.rs
Outdated
.entry(field_schema.field_type) | ||
.or_insert(vec![String::from("timeseries_key")]) | ||
.push(format!( | ||
"anyIf(field_value, field_name = '{}') AS `{}`", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm vaguely concerned about anyIf
. It seems like that would select the first field value with the given field name, which may or may not come from the same actual timeseries, right? As in, there's no guarantee that the rows selected by anyIf()
are the right ones.
I might be missing something about the logic here, but I think it'd be good to add a test for this specific case:
- Two timeseries with the same schema
- Both have a field with the same name, but with different values
- Ensure that using this query, we get the right association between samples and this field.
For that test, we should be sure to put the rows in the "wrong" order intentionally, so that anyIf()
encounters the field from the wrong timeseries first. Let me know if that all makes sense, or if I'm missing something about how this all works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's how I understand this to work:
- We filter each query on timeseries_name, so we don't have to worry about confusing schemas across timeseries.
- We aggregate with anyIf while grouping by timeseries_key, so we're only aggregating over groups of rows that should have identical field names and values (other than last_updated_at). If I understand correctly, a given timeseries_key always corresponds to the same set of (name, value) fields. In a way, anyIf is doing the same thing as the distinct in the original query.
Here's an example:
SELECT
timeseries_key,
field_name,
field_value,
last_updated_at
FROM oximeter.fields_string
WHERE (timeseries_name = 'virtual_machine:check') AND (timeseries_key = 4559832181464040913)
Query id: 245521ab-5764-4370-831f-92642c882e4a
┌──────timeseries_key─┬─field_name─┬─field_value─┬─────last_updated_at─┐
│ 4559832181464040913 │ reason │ success │ 2025-10-21 23:00:01 │
│ 4559832181464040913 │ state │ running │ 2025-10-21 23:00:01 │
└─────────────────────┴────────────┴─────────────┴─────────────────────┘
┌──────timeseries_key─┬─field_name─┬─field_value─┬─────last_updated_at─┐
│ 4559832181464040913 │ reason │ success │ 2025-10-19 05:28:14 │
└─────────────────────┴────────────┴─────────────┴─────────────────────┘
┌──────timeseries_key─┬─field_name─┬─field_value─┬─────last_updated_at─┐
│ 4559832181464040913 │ state │ running │ 2025-10-19 05:28:14 │
└─────────────────────┴────────────┴─────────────┴─────────────────────┘
4 rows in set. Elapsed: 0.013 sec. Processed 61.25 thousand rows, 1.68 MB (4.58 million rows/s., 126.07 MB/s.)
Peak memory usage: 1.80 MiB.
If we aggregate these rows with anyIf and group by timeseries_key, we get the expected set of fields, and it's safe to use anyIf because all rows with the same (timeseries_key, field_name) must be duplicates:
SELECT
timeseries_key,
anyIf(field_value, field_name = 'reason') AS reason,
anyIf(field_value, field_name = 'state') AS state
FROM oximeter.fields_string
WHERE (timeseries_name = 'virtual_machine:check') AND (timeseries_key = 4559832181464040913)
GROUP BY timeseries_key
Query id: aceca8a9-7f20-4430-a481-6df45e5e2321
┌──────timeseries_key─┬─reason──┬─state───┐
│ 4559832181464040913 │ success │ running │
└─────────────────────┴─────────┴─────────┘
1 row in set. Elapsed: 0.030 sec. Processed 65.54 thousand rows, 1.31 MB (2.21 million rows/s., 44.33 MB/s.)
Peak memory usage: 1.55 MiB.
Does that make sense? Let me know if I'm missing something, and I can convince you that the query is correct, I'll add the tests you mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we aggregate these rows with anyIf and group by timeseries_key
This is the part I was missing, I think, thanks. We're aggregating with anyIf()
within groups defined by GROUP BY timeseries_key
. We don't have the filter on timeseries_key
that you've got in your example, but I think that's OK.
I would still like to see the full query in a skipped test, and also either write a new test to check this behavior or make sure that one of the tests we already have covers this.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I added a test to assert that we generated the expected query rather than printing it out, so we can look at the expected query in the test. Happy to change that to a skipped test that prints if we prefer.
9311271
to
bc91f6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this looks great. Thanks for adding the tests!
bc91f6c
to
1909a9f
Compare
Thanks @bnaecker! Once this lands on dogfood, I'm hoping to see a change in collection latencies in http://172.20.26.132:9090/query?g0.expr=sum+by+%28request_name%29+%28oxide_receiver_api_request_duration_seconds%29&g0.show_tree=0&g0.tab=graph&g0.range_input=1h&g0.res_type=auto&g0.res_density=medium&g0.display_mode=lines&g0.show_exemplars=0, hopefully in a good way. |
As described in #9256, looking up field names and values can take much longer than looking up measurements, making queries surprisingly slow overall. Rather than scanning field tables multiple times and running potentially many joins to combine the results, this patch refactors the field lookup query to scan multiple tables in parallel with the
merge
table function, then pivot the results to a wide table using maps. As described inline, there's a bug in merge that prevents us from merging all field tables at once, so we split tables into groups that merge cleanly, then join the results.