
Conversation

sgup432 (Contributor) commented Aug 26, 2025

Description

Related issue - #15097

Instead of using the default of 1024 bytes for query size, we use RamUsageEstimator to calculate an approximate size. As seen in the issue above, using 1024 as a default can cause problems due to underestimation (like heap exhaustion), and it can also overestimate, leading to fewer queries getting cached.

RamUsageEstimator.sizeOf() is usually pretty fast. I made a sample BooleanQuery with 15 clauses and a size of around 3 KB; it took around 14959 ns to calculate its size via RamUsageEstimator.sizeOf().
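To make the change concrete, here is a minimal, self-contained sketch of the sizing fallback this PR touches. The `Accountable` interface, the 1024-byte default, and the per-entry hashtable overhead mirror the shape of Lucene's LRUQueryCache, but the types, the 80-byte overhead constant, and the estimator below are stand-ins of my own, not the real Lucene classes:

```java
import java.util.function.ToLongFunction;

public class QuerySizingSketch {
    // Stand-in for org.apache.lucene.util.Accountable.
    interface Accountable {
        long ramBytesUsed();
    }

    static final long QUERY_DEFAULT_RAM_BYTES_USED = 1024;       // old fixed default
    static final long LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY = 80; // illustrative overhead

    // Before: every non-accountable query is charged a flat 1024 bytes.
    static long ramBytesUsedBefore(Object query) {
        long size = (query instanceof Accountable a)
                ? a.ramBytesUsed()
                : QUERY_DEFAULT_RAM_BYTES_USED;
        return LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + size;
    }

    // After: non-accountable queries fall back to a real estimate
    // (the estimator parameter stands in for RamUsageEstimator.sizeOf).
    static long ramBytesUsedAfter(Object query, ToLongFunction<Object> estimator) {
        long size = (query instanceof Accountable a)
                ? a.ramBytesUsed()
                : estimator.applyAsLong(query);
        return LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + size;
    }

    public static void main(String[] args) {
        Object query = new Object(); // pretend this is a ~3 KB BooleanQuery
        // Old path charges 80 + 1024 bytes no matter the real footprint.
        System.out.println(ramBytesUsedBefore(query));           // 1104
        // New path charges 80 + the estimate (here pretending 3072 bytes).
        System.out.println(ramBytesUsedAfter(query, q -> 3072)); // 3152
    }
}
```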


This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@sgup432 sgup432 changed the title Use RamUsageEstimator to estimate query size instead of defaulting to 1024 bytes for non-accountable queries. [QueryCache] Use RamUsageEstimator instead of defaulting to 1024 bytes for non-accountable query size Aug 26, 2025
sgup432 (Contributor, Author) commented Aug 26, 2025

@msfroh Would you mind taking a look at this?


Comment on lines +428 to +429
long queryRamBytesUsed = RamUsageEstimator.sizeOf(query, 0);
return LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY + queryRamBytesUsed;
Member

I realize you have a small benchmark that you did, but real-world performance is still a concern of mine here.

I am not sure how to benchmark this, but my concern is that non-accountable queries just got more expensive: we calculate the memory used for a query on both caching and eviction.

sgup432 (Contributor, Author) commented Aug 27, 2025

Yeah, I just timed one 3 KB boolean query.

I am not sure how to benchmark this, but my concern is that non-accountable queries just got more expensive

The only way I see is to micro-benchmark RamUsageEstimator with different query types and sizes, which might give more insight. I had tried to do the same, but only with term and boolean queries.

but my concern is that non-accountable queries just got more expensive

I am not sure whether it would be that expensive (at least in absolute terms). For TermQuery and similarly cheap queries, RamUsageEstimator should be pretty fast, considering we already cache ramBytesUsed for the underlying terms etc. In my opinion, complex BooleanQueries with many nested clauses might be more expensive, as we need to visit all the underlying children for the size calculation. To handle that, we already have a limit of 16 clauses beyond which we do not cache, and it can be refined further if needed.

I don't see any other good way of calculating query size until we implement Accountable for all queries, but that would be too intrusive for just the caching use case.
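For reference, a per-query Accountable implementation would look roughly like this. This is a hypothetical sketch, not Lucene's actual TermQuery; the nested Accountable interface, the 32-byte shallow size, and the 2-bytes-per-char string cost are all assumptions of mine:

```java
public class AccountableQuerySketch {
    // Stand-in for org.apache.lucene.util.Accountable.
    interface Accountable {
        long ramBytesUsed();
    }

    // A hypothetical TermQuery-like class that reports its own size, which
    // would let the cache skip reflective estimation entirely.
    static final class SketchTermQuery implements Accountable {
        private static final long SHALLOW_SIZE = 32; // assumed header + field slots
        private final String field;
        private final String text;

        SketchTermQuery(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public long ramBytesUsed() {
            // Shallow size plus ~2 bytes per UTF-16 char for the two strings.
            return SHALLOW_SIZE + 2L * (field.length() + text.length());
        }
    }

    public static void main(String[] args) {
        SketchTermQuery q = new SketchTermQuery("foo", "bar");
        System.out.println(q.ramBytesUsed()); // 32 + 2*(3+3) = 44
    }
}
```

The obvious downside, as noted above, is that every cacheable query class would need such an override, which is a lot of surface area for one caching use case.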

sgup432 (Contributor, Author) commented

@benwtrent So I wrote some simple code to micro-benchmark the putIfAbsent method, which is the one that calls getRamBytesUsed(Query query) during caching and eviction.
I created a cache with MAX_SIZE = 10000 and MAX_SIZE_IN_BYTES = 1048576, created N sample boolean queries, and the logic looks something like this:

for (int i = 0; i < MAX_SIZE; i++) {
  TermQuery must = new TermQuery(new Term("foo", "bar" + i));
  TermQuery filter = new TermQuery(new Term("foo", "quux" + i));
  TermQuery mustNot = new TermQuery(new Term("foo", "foo" + i));
  BooleanQuery.Builder bq = new BooleanQuery.Builder();
  bq.add(must, BooleanClause.Occur.FILTER);
  bq.add(filter, BooleanClause.Occur.FILTER);
  bq.add(mustNot, BooleanClause.Occur.MUST_NOT);
  queries[i] = bq.build();
}

JMH method to test a 100% write workload with 10 threads:

@Benchmark
@Group("concurrentPutOnly")
@GroupThreads(10)
public void testConcurrentPuts() {
  int random = ThreadLocalRandom.current().nextInt(MAX_SIZE);
  queryCache.putIfAbsent(
      queries[random], this.sampleCacheAndCount, cacheHelpers[random & (SEGMENTS - 1)]);
}

Baseline numbers

Benchmark                               Mode  Cnt        Score       Error  Units
QueryCacheBenchmark.concurrentPutOnly  thrpt   15  4102080.220 ± 80816.546  ops/s

My changes

Benchmark                               Mode  Cnt       Score       Error  Units
QueryCacheBenchmark.concurrentPutOnly  thrpt   15  901925.345 ± 38059.230  ops/s

So it became ~4.5x slower. There were a lot of evictions as well. This is probably one of the worst-case scenarios, though, with a write-only workload and boolean queries; for a mixed read/write workload or simple filter queries, the overhead might be much lower.
And this is the method profile (taken from JFR). Kind of expected.

[Screenshot: JFR method profile, 2025-09-17]

I don't know if there is a way out or we can further improve query visitor logic itself.

sgup432 (Contributor, Author) commented Sep 18, 2025

It seems the problem was that I was calling

long queryRamBytesUsed = RamUsageEstimator.sizeOf(query, 0);

The signature is long sizeOf(Query q, long defSize), where defSize represents the shallow size of the query; if 0 is passed, the shallow size is calculated at runtime, which was doing a lot of object allocations (adding to the higher CPU%) and therefore adding to latency.

I changed it to RamUsageEstimator.sizeOf(query, 32);, where 32 is a rough shallow size in bytes for a term/boolean query, which I calculated manually by calling that method.

After running with the above change, performance improved, but it is still ~2.5x slower than the baseline.

Benchmark                               Mode  Cnt        Score        Error  Units
QueryCacheBenchmark.concurrentPutOnly  thrpt   15  1608680.127 ± 229218.026  ops/s
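One further option along these lines (my own speculation, not something this PR does) would be to compute the shallow size once per concrete query class and reuse it, rather than hard-coding 32. A minimal sketch using ClassValue; shallowSizeOf here is a crude stand-in for a reflective shallow-size computation like RamUsageEstimator.shallowSizeOfInstance, and its 16-byte header / 8-bytes-per-field costs are assumptions:

```java
public class ShallowSizeCacheSketch {
    // Compute-once-per-class cache; ClassValue handles concurrency and
    // class unloading for us.
    private static final ClassValue<Long> SHALLOW_SIZES = new ClassValue<>() {
        @Override
        protected Long computeValue(Class<?> type) {
            return shallowSizeOf(type);
        }
    };

    // Crude stand-in: a 16-byte object header plus 8 bytes per declared
    // field. A real implementation would account for field types and
    // alignment, but that cost is now paid once per class, not per call.
    static long shallowSizeOf(Class<?> type) {
        return 16L + 8L * type.getDeclaredFields().length;
    }

    static long cachedShallowSize(Object o) {
        return SHALLOW_SIZES.get(o.getClass());
    }

    // Hypothetical query shape with two reference fields, like field/term.
    static final class FakeTermQuery {
        Object field;
        Object term;
    }

    public static void main(String[] args) {
        // First call computes the shallow size; later calls hit the cache.
        System.out.println(cachedShallowSize(new FakeTermQuery())); // 16 + 2*8 = 32
    }
}
```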


This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added Stale and removed Stale labels Sep 11, 2025