Conversation

@pulpdrew (Contributor) commented Oct 2, 2025

Summary

Closes HDX-2310

This PR implements chunking of chart queries to improve performance of charts on large data sets and long time ranges. Recent data is loaded first, then older data is loaded one chunk at a time until the full chart date range has been queried.

Screen.Recording.2025-10-03.at.1.11.09.PM.mov

Performance Impacts

Expectations

This change is intended to improve performance in a few ways:

  1. Queries over long time ranges are now much less likely to time out, since the range is chunked into several smaller queries
  2. Average memory usage should decrease, since the total result size and number of rows being read are smaller
  3. Perceived latency of queries over long date ranges is likely to decrease, because charts begin rendering the most recent data as soon as the first chunk returns, rather than only after the entire date range has been queried. However, total latency to display results for the entire date range is likely to increase, since each additional chunk adds round-trip network latency.

Measured Results

Overall, the results match the expectations outlined above.

  • Total latency changed by between ~-4% and ~+25%
  • Average memory usage decreased by between 18% and 80%

Scenarios and data

In each of the following tests:

  1. Queries were run 5 times before starting to measure, to ensure the data was in the filesystem cache.
  2. Queries were then run 3 times. The results shown are the median result from the 3 runs.

Scenario: Log Search Histogram in Staging V2, 2 Day Range, No Filter

|          | Total Latency | Memory Usage (Avg) | Memory Usage (Max) | Chunk Count |
|----------|---------------|--------------------|--------------------|-------------|
| Original | 5.36          | 409.23 MiB         | 409.23 MiB         | 1           |
| Chunked  | 5.14          | 83.06 MiB          | 232.69 MiB         | 4           |

Scenario: Log Search Histogram in Staging V2, 14 Day Range, No Filter

|          | Total Latency | Memory Usage (Avg) | Memory Usage (Max) | Chunk Count |
|----------|---------------|--------------------|--------------------|-------------|
| Original | 26.56         | 383.63 MiB         | 383.63 MiB         | 1           |
| Chunked  | 33.08         | 130.00 MiB         | 241.21 MiB         | 16          |

Scenario: Chart Explorer Line Chart with p90 and p99 trace durations, Staging V2 Traces, Filtering for "GET" spans, 7 Day range

|          | Total Latency | Memory Usage (Avg) | Memory Usage (Max) | Chunk Count |
|----------|---------------|--------------------|--------------------|-------------|
| Original | 2.79          | 346.12 MiB         | 346.12 MiB         | 1           |
| Chunked  | 3.26          | 283.00 MiB         | 401.38 MiB         | 9           |

Implementation Notes

When is chunking used? Chunking is used when all of the following are true:
  1. granularity and timestampValueExpression are defined in the config. This ensures that the query is already being bucketed. Without bucketing, chunking would break aggregation queries, since groups can span multiple chunks.
  2. dateRange is defined in the config. Without a date range, we'd need an unbounded set of chunks or the start and end chunks would have to be unbounded at their start and end, respectively.
  3. The config is not a metrics query. Metrics queries have complex logic which we want to avoid breaking with the initial delivery of this feature.
  4. The consumer of useQueriedChartConfig does not pass the disableQueryChunking: true option. This option is provided to disable chunking when necessary.
How are time windows chosen?
  1. First, the windows are generated as in the existing search chunking feature (e.g., 6 hours back, 6 hours back, 12 hours back, 24 hours back, ...)
  2. Then, the start and end of each window is aligned to the start of a time bucket that depends on the "granularity" of the chart.
  3. The first and last windows are shortened or extended so that the combined date range of all of the windows matches the start and end of the original config.
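The three steps above can be sketched roughly as follows. This is a hypothetical illustration, not the actual implementation: the function and parameter names, the base window size, and the exact doubling progression are all assumptions.

```typescript
// Sketch only: names and the doubling progression are assumptions.
type Window = { start: number; end: number }; // ms since epoch

const alignDown = (t: number, granularityMs: number): number =>
  Math.floor(t / granularityMs) * granularityMs;

function makeChunkWindows(
  start: number,
  end: number,
  granularityMs: number,
  baseMs: number, // e.g. 6 hours
): Window[] {
  const windows: Window[] = [];
  let cursor = end; // walk backwards from the most recent edge
  let size = baseMs;
  while (cursor > start) {
    // Align each boundary down to a granularity bucket so that no
    // aggregation bucket is split across two chunks.
    let winStart = alignDown(cursor - size, granularityMs);
    if (winStart >= cursor) winStart = cursor - granularityMs; // guarantee progress
    // Clamp the oldest window to the original config's start.
    windows.push({ start: Math.max(winStart, start), end: cursor });
    cursor = Math.max(winStart, start);
    if (windows.length >= 2) size *= 2; // 6h, 6h, 12h, 24h, ...
  }
  return windows;
}
```

With a 2-day range, hourly granularity, and a 6-hour base, this sketch produces 4 contiguous windows (6h + 6h + 12h + 24h), consistent with the chunk count in the 2-day scenario above.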
Which order are the chunks queried in?

Chunks are queried sequentially, most-recent first, due to the expectation that more recent data is typically more important to the user. Unlike with useOffsetPaginatedSearch, we are not paginating the data beyond the chunks, and all data is typically displayed together, so there is no need to support "ascending" order.
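The sequential, most-recent-first behavior can be sketched as an async generator (hypothetical names; not the actual implementation): each chunk is fetched only after the previous one resolves, and a snapshot of the accumulated data is surfaced after every chunk so the chart can render progressively.

```typescript
// Sketch only: sequential most-recent-first chunk execution.
async function* streamChunks<R, T>(
  windows: R[], // ordered most-recent first
  fetchChunk: (range: R) => Promise<T[]>,
): AsyncGenerator<{ data: T[]; isComplete: boolean }> {
  let acc: T[] = [];
  for (let i = 0; i < windows.length; i++) {
    const rows = await fetchChunk(windows[i]); // one round trip per chunk
    acc = [...rows, ...acc]; // older chunks are prepended, keeping chronological order
    yield { data: acc, isComplete: i === windows.length - 1 };
  }
}
```

Prepending each (older) chunk mirrors the reducer shape in the review below: since chunks arrive most-recent first, the accumulated array stays in chronological order.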

Does this improve client-side caching behavior?

One theoretical way in which query chunking could improve performance is by enabling client-side caching of individual chunks, which could then be re-used if the same query is run over a longer time range.

Unfortunately, when using streamedQuery, react-query stores the entire time range as one item in the cache, so individual chunks or "pages" are not re-used by other queries.

We could accomplish this improvement by using useQueries instead of streamedQuery or useInfiniteQuery. In that case, we'd treat each chunk as its own query. This would require a number of changes:

  1. Our query key would have to include the chunk's window duration
  2. We'd need some hacky way of making the useQueries requests fire in sequence. This can be done using enabled but requires some additional state to figure out whether the previous query is done.
  3. We'd need to emulate the return value of a useQuery using the useQueries result, or update consumers.
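Point 2 could be sketched as a small pure helper (hypothetical; not part of this PR): derive each query's `enabled` flag from whether every earlier chunk has settled, so the useQueries requests fire one at a time.

```typescript
// Hypothetical helper for sequencing useQueries requests: chunk i is enabled
// only once every earlier chunk has settled (success or error).
function chunkEnabledFlags(settled: boolean[]): boolean[] {
  return settled.map((_, i) => settled.slice(0, i).every(Boolean));
}
```

Each flag would then be passed as `enabled` in the corresponding useQueries entry, with the extra "settled" state tracked alongside the query results.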


changeset-bot bot commented Oct 2, 2025

🦋 Changeset detected

Latest commit: d854c1f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages:

| Name          | Type  |
|---------------|-------|
| @hyperdx/app  | Patch |
| @hyperdx/api  | Patch |



vercel bot commented Oct 2, 2025

The latest updates on your projects.

| Project            | Deployment | Preview | Comments | Updated (UTC)        |
|--------------------|------------|---------|----------|----------------------|
| hyperdx-v2-oss-app | Ready      | Preview | Comment  | Oct 10, 2025 7:33pm  |


github-actions bot commented Oct 2, 2025

E2E Test Results

All tests passed • 25 passed • 3 skipped • 221s

| Status     | Count |
|------------|-------|
| ✅ Passed  | 25    |
| ❌ Failed  | 0     |
| ⚠️ Flaky   | 0     |
| ⏭️ Skipped | 3     |


"@microsoft/fetch-event-source": "^2.0.1",
"@tabler/icons-react": "^3.5.0",
"@tanstack/react-query": "^5.56.2",
"@tanstack/react-query": "^5.90.2",
@pulpdrew (Contributor, Author) commented:

Upgraded to pull in the new experimental streamedQuery function.

If we want to avoid relying on the experimental feature, this could be implemented more verbosely with useInfiniteQuery.

@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from e077e57 to 5620ca8 Compare October 3, 2025 15:09
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from a3b22e9 to 2d0e0b7 Compare October 3, 2025 20:01
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from 2d0e0b7 to 6229d52 Compare October 6, 2025 18:40
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from 6229d52 to 3e73289 Compare October 6, 2025 18:49
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from 943ddb8 to 3e899c8 Compare October 6, 2025 19:17
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from 3e899c8 to 76d2753 Compare October 6, 2025 20:48
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from 76d2753 to ec976e5 Compare October 7, 2025 14:45
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from ec976e5 to ca69e95 Compare October 7, 2025 16:23
@pulpdrew pulpdrew changed the title Drew/paginated chart queries feat: Implement query chunking for charts Oct 7, 2025
@pulpdrew pulpdrew marked this pull request as ready for review October 7, 2025 17:03
@pulpdrew pulpdrew requested review from a team and dhable and removed request for a team October 7, 2025 17:06
@pulpdrew pulpdrew changed the title feat: Implement query chunking for charts perf: Implement query chunking for charts Oct 7, 2025
@pulpdrew pulpdrew force-pushed the drew/paginated-chart-queries branch from ca69e95 to 61afdb4 Compare October 9, 2025 15:34

claude bot commented Oct 9, 2025

PR Review: Query Chunking for Charts

Overview

This PR implements progressive query chunking for chart queries to improve performance on large datasets and long time ranges. The implementation is well-thought-out with comprehensive testing and clear documentation.

✅ Strengths

1. Excellent Test Coverage

  • 940 lines of comprehensive unit tests covering edge cases
  • Tests for alignment, window generation, chunking behavior, loading states
  • Good use of mocking and async testing patterns

2. Smart Performance Trade-offs

  • Clear documentation of performance expectations vs measured results
  • Recent data loads first (most relevant to users)
  • Prevents timeouts on large queries
  • Reduces memory usage significantly (18-80% reduction)

3. Safe Defaults

  • Only enables chunking when safe (has granularity, dateRange, timestampValueExpression)
  • Explicitly disables for metrics queries (avoiding complex edge cases)
  • Provides disableQueryChunking escape hatch

4. Good State Management

  • isComplete flag clearly indicates when all chunks are loaded
  • Proper distinction between isPending, isFetching, and isLoading
  • Uses TanStack Query's streamedQuery appropriately

🔍 Issues & Suggestions

1. Critical: Package Version Consistency ⚠️

// packages/app/package.json
"@tanstack/react-query": "^5.90.2",  // Updated
"@tanstack/react-query-devtools": "^5.56.2",  // Not updated

Issue: Mismatched versions between react-query and react-query-devtools could cause runtime issues.
Recommendation: Update devtools to match: "@tanstack/react-query-devtools": "^5.90.2"

2. Potential Bug: Infinite Loop Risk ⚠️

In DBTimeChart.tsx:223, the onError callback is called on every render when there's an error:

if (query.isError && options?.onError) {
  options.onError(query.error);
}

Issue: This runs on every render, not just when error state changes. Could cause performance issues or infinite loops if onError triggers a re-render.
Recommendation: Wrap the callback in a useEffect keyed on the error state. (Note: the per-query onError option was removed in TanStack Query v5, so the callback cannot simply be moved into the query options.)

useEffect(() => {
  if (query.isError && options?.onError) {
    options.onError(query.error);
  }
}, [query.isError, query.error, options]);

3. Edge Case: Empty Window Array

In getGranularityAlignedTimeWindows (line 105-106):

if (!windows.length || alignedStart < windows[windows.length - 1].dateRange[0]) {
  windows.push({...});
}

Potential Issue: The !windows.length check should ensure the first window is always added, but the logic could be clearer.
Recommendation: Consider simplifying:

const shouldAddWindow = windows.length === 0 || 
  alignedStart < windows[windows.length - 1].dateRange[0];
if (shouldAddWindow) {
  windows.push({...});
}

4. Type Safety: Union Discriminator

In useChartConfig.tsx:55-71, the shouldUseChunking type guard is good, but could be more explicit:

const shouldUseChunking = (
  config: ChartConfigWithOptDateRange,
): config is ChartConfigWithDateRange & {
  granularity: string;
} => {
  if (!isUsingGranularity(config)) return false;
  if (!config.dateRange) return false;
  if (isMetricChartConfig(config)) return false;
  return true;
};

Suggestion: Consider adding explicit type assertions where shouldUseChunking is used to help TypeScript narrow types more effectively.

5. Memory Consideration: Data Accumulation

The reducer in useQueriedChartConfig accumulates all chunks:

reducer: (acc, { chunk, isComplete }) => {
  return {
    data: [...(chunk.data || []), ...(acc?.data || [])],
    meta: chunk.meta,
    rows: (acc?.rows || 0) + (chunk.rows || 0),
    isComplete,
  };
},

Consideration: For very long time ranges with many chunks, this could still accumulate significant data. Consider documenting maximum expected chunks or adding a safeguard.
Recommendation: Add a comment about expected max chunks, or consider a warning if chunk count exceeds expectations.

6. Code Duplication in PatternTable.tsx

const isLoading = isTotalCountLoading || !isTotalCountComplete || isGroupedPatternsLoading;

Minor: The isTotalCountComplete check is added but the relationship between isTotalCountLoading and isTotalCountComplete could be clearer in documentation.

7. Missing Error Boundary

The chunking implementation doesn't have explicit error recovery if a chunk fails mid-stream.
Question: What happens if chunk 2 of 5 fails? Does the user see partial data or an error state?
Recommendation: Document the error behavior and consider adding retry logic per chunk or a "partial data available" indicator.

🎯 Recommendations

High Priority

  1. Fix package version mismatch for @tanstack/react-query-devtools
  2. Fix onError callback to prevent potential infinite loops
  3. Document error handling for failed chunks

Medium Priority

  1. Add JSDoc comments to getGranularityAlignedTimeWindows explaining the alignment logic
  2. Consider adding telemetry to track actual chunking performance in production
  3. Add integration test with failing chunk to verify error handling

Low Priority

  1. Consider extracting window generation constants to config
  2. Add performance budget documentation (max chunks, max total rows)

📊 Performance Analysis

The measured results align well with expectations:

  • ✅ Total latency impact acceptable (-4% to +25%)
  • ✅ Memory reduction significant (18-80%)
  • ✅ User experience improved (progressive loading)

The trade-off of slightly increased total latency for much better perceived performance is appropriate for this use case.

🏗️ Architecture

The implementation follows HyperDX patterns well:

  • ✅ Proper use of TanStack Query patterns
  • ✅ Separation of concerns (window generation, query execution, state management)
  • ✅ TypeScript type safety with proper guards
  • ✅ Follows existing hook patterns

📝 Documentation

  • ✅ Excellent PR description with detailed rationale
  • ✅ Implementation notes cover key decisions
  • ✅ Performance data included
  • ⚠️ Could benefit from inline JSDoc for complex functions

Overall Assessment

This is a well-designed feature with excellent testing. The main concerns are the package version mismatch and the onError callback pattern. Once those are addressed, this should be safe to merge.

Recommendation: Request changes for the two high-priority issues, then approve after fixes.


Review generated by Claude Code 🤖
