Added cache for COMMAND #2839
base: unstable
Signed-off-by: Evgeny Barskiy <[email protected]>
8123861 to 6b6f4f3
JimB123
left a comment
What evidence is there to suggest that COMMAND "is expensive to process (too much string operations)"?
How meaningful is the benefit of caching this information? I can't tell whether this represents premature optimization (adding complexity without sufficient value).
Signed-off-by: Evgeny Barskiy <[email protected]>
@PingXie can you share the internal latency of COMMAND?
hpatro
left a comment
I have a similar question along the lines of what @JimB123 asked.
Why do we want to optimize this particular command? It is not a heavily used command AFAIK, apart from clients invoking it upfront to build certain metadata. Does it lead to tail latency on your end for datapath commands?
roshkhatri
left a comment
I agree with the other reviews; overall this looks good.
A few things:
- Can we also add some tests to test invalidation?
- Can we also add some benchmark numbers to see the improvements that we are seeing?
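An invalidation test could take roughly the following shape. This is a minimal sketch, not the PR's code: `response_cache`, `generateResponse`, `getResponse`, `invalidateResponseCache`, and `addCommand` are hypothetical stand-ins, and plain C strings are used in place of `sds`. It assumes the cache is dropped whenever the command table changes, which is the behavior such a test would check.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins: one cache slot per RESP protocol version (2, 3). */
#define CACHE_SLOTS 2
static char *response_cache[CACHE_SLOTS];
static int generation_count;  /* how many times a reply was rebuilt */
static int command_count = 3; /* pretend size of the command table */

/* Expensive path: build the reply from scratch and return it. */
char *generateResponse(int resp) {
    assert(resp == 2 || resp == 3);
    char buf[64];
    snprintf(buf, sizeof(buf), "resp%d:%d commands", resp, command_count);
    char *out = malloc(strlen(buf) + 1);
    if (out == NULL) abort();
    strcpy(out, buf);
    generation_count++;
    return out;
}

/* Return the cached reply, generating and storing it on a miss. */
const char *getResponse(int resp) {
    int idx = resp - 2;
    if (response_cache[idx] == NULL) {
        response_cache[idx] = generateResponse(resp);
    }
    return response_cache[idx];
}

/* Drop every cached reply. */
void invalidateResponseCache(void) {
    for (int i = 0; i < CACHE_SLOTS; i++) {
        free(response_cache[i]);
        response_cache[i] = NULL;
    }
}

/* Simulates a module registering an extra command: the table changes,
 * so the cached replies must be invalidated. */
void addCommand(void) {
    command_count++;
    invalidateResponseCache();
}
```

A test built on this would call `getResponse` twice and check that the reply was generated only once, then call `addCommand` and check that the next reply reflects the new command count.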
Signed-off-by: Evgeny Barskiy <[email protected]>
Regardless of the performance: When it's important:
In general, we have identified 3 similar issues responsible for the tail latency:
While COMMAND has the lowest impact of these three, it's the easiest to address.
Signed-off-by: Evgeny Barskiy <[email protected]>
zuiderkwast
left a comment
> Internally, with default valkey it takes 2ms (little to no variance) to regenerate a result.
> Add bunch of modules (with extra commands) and it shoots up to 5ms.
> For comparison, processing cached result takes 50usec internally.
These numbers look convincing. Thank you.
It looks solid. I have only a few minor comments.
}

/* Forward declaration */
void addReplyCommandInfo(client *c, struct serverCommand *cmd);
There is a block of forward declarations near the top of this file where we could move this declaration, under
/*============================ Internal prototypes ========================== */
sds cache = server.command_response_cache[cache_idx];

if (cache == NULL) {
    cacheCommandResponse(c->resp);
    cache = server.command_response_cache[cache_idx];
}
Here we rely on cacheCommandResponse populating a specific global and then we fetch it afterwards. It looks like an implicit dependency that we can avoid.
Consider returning the response from cacheCommandResponse, as we do in generateClusterSlotResponse, where the corresponding call looks like this:
sds cached_reply = server.cached_cluster_slot_info[conn_type];
if (!cached_reply) {
    cached_reply = generateClusterSlotResponse(c->resp);
    server.cached_cluster_slot_info[conn_type] = cached_reply;
}

if (cache == NULL) {
    cacheCommandInfo(cmd, c->resp);
    cache = cmd->info_cache[cache_idx];
Similar here. We can let the other function just return the response instead of caching it to avoid some dependency on side-effects between these two functions:
-    if (cache == NULL) {
-        cacheCommandInfo(cmd, c->resp);
-        cache = cmd->info_cache[cache_idx];
+    if (cache == NULL) {
+        cache = generateCommandInfoResponse(cmd, c->resp);
+        cmd->info_cache[cache_idx] = cache;
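The pattern suggested in the two comments above (the generator returns the reply, and the caller stores it) can be sketched in isolation as follows. This is only an illustration under stated assumptions: `demoCommand`, `generateDemoInfoResponse`, and `getDemoInfo` are hypothetical names, the struct is a simplified stand-in for `struct serverCommand`, and plain C strings replace `sds`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a command table entry. */
typedef struct demoCommand {
    const char *name;
    int arity;
    char *info_cache[2]; /* one slot per RESP version (2 and 3) */
} demoCommand;

/* Pure generator: builds and returns the reply, touches no cache. */
char *generateDemoInfoResponse(const demoCommand *cmd, int resp) {
    assert(resp == 2 || resp == 3);
    char buf[128];
    snprintf(buf, sizeof(buf), "%s arity=%d (RESP%d)", cmd->name, cmd->arity, resp);
    char *out = malloc(strlen(buf) + 1);
    if (out == NULL) abort();
    strcpy(out, buf);
    return out;
}

/* The caller owns the caching decision: look up, generate on a miss, store. */
const char *getDemoInfo(demoCommand *cmd, int resp) {
    int cache_idx = resp - 2;
    char *cache = cmd->info_cache[cache_idx];
    if (cache == NULL) {
        cache = generateDemoInfoResponse(cmd, resp);
        cmd->info_cache[cache_idx] = cache;
    }
    return cache;
}
```

With this split, the lookup, the miss handling, and the store are all visible at the call site, and the generator can be called or tested without any cache being involved.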
The COMMAND command is expensive to process (too many string operations),
but its result rarely changes, so it makes sense to add caching.