[server] Add negative cache for non-existent partition IDs in metadata requests#3511
Open
swuferhong wants to merge 1 commit into
Open
[server] Add negative cache for non-existent partition IDs in metadata requests#3511swuferhong wants to merge 1 commit into
swuferhong wants to merge 1 commit into
Conversation
loserwang1024
left a comment
Contributor
There was a problem hiding this comment.
I have left some comment
| */ | ||
| private static final long CLEANUP_INTERVAL_MS = 60_000; | ||
|
|
||
| private final ConcurrentHashMap<Long, Long> cache; |
Contributor
There was a problem hiding this comment.
Why not useUse Guava Cache but implementation by yourself?
- Unreliable expiration cleanup — maybeCleanup() is only called inside markNonExistent(). If no new partitions are marked as non-existent for a long time (in practice, markNonExistent is only triggered after a ZK query confirms non-existence), expired entries will linger in memory indefinitely. While isKnownNonExistent() lazily removes individual expired entries, it never performs a full sweep.
- The project already has shaded Guava, so you can use it directly:
import org.apache.fluss.shaded.guava32.com.google.common.cache.Cache;
import org.apache.fluss.shaded.guava32.com.google.common.cache.CacheBuilder;
private final Cache<Long, Boolean> negativeCache = CacheBuilder.newBuilder()
.expireAfterAccess(10, TimeUnit.MINUTES)
.maximumSize(10000) // prevent OOM in extreme cases
.build();- Advantages of Guava Cache:
- Automatic eviction: access-time-based TTL without manual maintenance
- Bounded size: maximumSize prevents unbounded growth (the current implementation has no size limit)
- Battle-tested thread safety: no need to roll your own CAS logic
@wuchong ,WDYT?
| long[] partitionIds = request.getPartitionsIds(); | ||
| List<Long> partitionIdsNotExistsInCache = new ArrayList<>(); | ||
| for (long partitionId : partitionIds) { | ||
| // Fast-path: throw immediately for partition IDs known to not exist, |
Contributor
There was a problem hiding this comment.
I think negative cache check should NOT be placed before metadata cache lookup.
Problem scenario — a race condition exists:
- A new partition is being created; its ID has been assigned.
- A client sends a metadata request, but the server's metadata cache hasn't synced yet, and ZK may not have the entry written yet either.
- Server queries ZK, finds nothing → calls markNonExistent(partitionId).
- Partition creation completes; metadata cache is updated via ZK watch.
- Subsequent requests are blocked by the negative cache and can never retrieve the now-existing partition.
Suggested fix: Move the negative cache check to after the metadata cache lookup but before the ZK query:
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Purpose
Linked issue: close #3510
During hourly partition rotation, clients holding stale partition IDs repeatedly trigger
ZooKeeper lookups that always return "not found". This adds unnecessary pressure on ZK.
The negative cache eliminates these redundant queries by remembering the "not found" result
for a configurable TTL period.
Brief change log
Tests
API and Format
Documentation