Lookahead pre-fetching #413
Codecov Report
@@ Coverage Diff @@
## unstable #413 +/- ##
============================================
+ Coverage 76.12% 76.28% +0.16%
============================================
Files 131 131
Lines 76635 76959 +324
============================================
+ Hits 58335 58708 +373
+ Misses 18300 18251 -49
oranagra left a comment
maybe you can provide a short list of the things that exist in the ROF code that you didn't take here, and why. it'll be easier to review and discuss.
e.g. i see you did take some portions in multi.c but not all, and it seems you didn't take the ones in cluster.c
src/server.h
Outdated
#define CLIENT_REEXECUTING_COMMAND (1ULL<<50) /* The client is re-executing the command. */
#define CLIENT_REPL_RDB_CHANNEL (1ULL<<51) /* Client which is used for rdb delivery as part of rdb channel replication */
#define CLIENT_INTERNAL (1ULL<<52) /* Internal client connection */
#define CLIENT_IN_PREFETCH (1ULL<<53) /* The client is in the prefetching batch. */
did we already "burn" the term "prefetch" into our code base for this (cpu cache warmup)? maybe we can rename to avoid terminology collision with ROF
ohh, i see we did (memory_prefetch.c). so maybe we can just try to avoid using the term "prefetch" without "memory" 🤷
anyway, i'm not sure what this flag is used for. and if it's about "prefetch" or "preprocess"..
This flag is only used for IO threads.
When a client moves from the IO thread to the main thread, memory prefetch has not yet been performed; the main thread then performs the memory prefetch for each client one by one (in processClientsFromIOThread()). But when the pipeline is larger than the lookahead, querybuf still has data, so we will enter processInputBuffer() again.
Because there may be multiple clients queuing up for prefetch at the top level, we need to avoid starting a new prefetch inside it.
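The guard described above can be sketched as follows. This is an illustrative, self-contained model, not the actual PR code: the struct, the hypothetical startBatchPrefetch()/processInputBuffer() helpers, and the prefetch_started counter are all stand-ins; only the CLIENT_IN_PREFETCH flag name comes from the diff.

```c
#include <assert.h>

#define CLIENT_IN_PREFETCH (1ULL << 53)

typedef struct client { unsigned long long flags; } client;

static int prefetch_started = 0; /* counts batches actually started */

static void startBatchPrefetch(client *c);

/* Nested path: when querybuf still has data, parsing runs again while
 * a prefetch batch is already in flight. */
static void processInputBuffer(client *c) {
    startBatchPrefetch(c); /* no-op while CLIENT_IN_PREFETCH is set */
}

static void startBatchPrefetch(client *c) {
    if (c->flags & CLIENT_IN_PREFETCH) return; /* guard against reentry */
    c->flags |= CLIENT_IN_PREFETCH;
    prefetch_started++;
    /* ... issue memory prefetches, then execute the batch, which can
     * re-enter processInputBuffer() when querybuf still has data ... */
    processInputBuffer(c);
    c->flags &= ~CLIENT_IN_PREFETCH;
}
```

Without the flag check, the nested processInputBuffer() call would start a second prefetch batch for the same client.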
did we already "burn" the term "prefetch" into our code base for this (cpu cache warmup)? maybe we can rename to avoid terminology collision with ROF
we can change it to memory_prefetch
ok, so consider renaming or commenting on the purpose or use of that flag.
done with e0ff33a
src/networking.c
Outdated
__thread int thread_reusable_qb_used = 0; /* Avoid multiple clients using reusable query
                                           * buffer due to nested command execution. */

/* COMMAND_QUEUE_MIN_CAPACITY no longer needed with linked list implementation */
outdated?
done with 4de9dd9
src/networking.c
Outdated
/* Search for end of line */
newline = strchr(c->querybuf+c->qb_pos,'\n');
newline = memchr(c->querybuf+c->qb_pos,'\n',sdslen(c->querybuf) - c->qb_pos);
is that a bug in ROF or any other version?
we do know sds is always null terminated.
these are from VK, I will revert them.
maybe they had a reason.. i'm just wondering what it was
this change was from valkey-io/valkey#1485:
"Changed parsing code to use memchr instead of strchr: during command parsing, ASAN got stuck for an unknown reason when strchr was called to look for the next \r. Adding an assert for a null-terminated querybuf didn't resolve the issue. Switched to memchr as it's more defensive and resolves the issue."
It feels like it was caused by some race issue, so we don't need it.
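To illustrate the difference being discussed: strchr relies on the buffer's NUL terminator to stop scanning, while memchr takes an explicit byte count and never reads past it. Since sds strings are NUL terminated, both are normally equivalent here; memchr only makes the bound explicit. A minimal sketch (the find_newline() helper is hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bounded newline search over the unread part of a query buffer,
 * mirroring the memchr form of the diff above. */
static const char *find_newline(const char *buf, size_t pos, size_t len) {
    /* memchr scans at most len - pos bytes; strchr would instead scan
     * until the first '\0'. */
    return memchr(buf + pos, '\n', len - pos);
}
```

The explicit length also means a search over a prefix of the buffer cannot run into bytes beyond that prefix, which is what makes it friendlier to tools like ASAN.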
src/networking.c
Outdated
if (c->multibulklen == 0) {
    /* The client should have been reset */
    serverAssertWithInfo(c,NULL,c->argc == 0);
    /* TODO: The client should have been reset */
why is that a TODO now?
the comment is outdated, i'll update it.
updated in 4de9dd9
 * 2) When the requested size is less than the current size, because
 *    we always allocate argv gradually with a maximum size of 1024,
 *    therefore, if argv_len exceeds this limit, we always reallocate. */
if (unlikely(c->multibulklen > c->argv_len || c->argv_len > 1024)) {
i wonder if we're losing some efficiency for clients without pipeline here?
I'll verify it. I think prefetch fills this gap.
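The reallocation rule in the hunk above can be modeled like this. Everything here is an illustrative stand-in (the struct and ensureArgvCapacity() are hypothetical, with a plain realloc in place of the server's allocator); it only demonstrates the condition: reallocate when more slots are needed than allocated, or when a previous command left an allocation above the 1024-slot cap.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { void **argv; int argv_len; } client;

static void ensureArgvCapacity(client *c, int multibulklen) {
    /* Reallocate only when more slots are needed, or when a previous
     * command left an oversized (>1024 slots) allocation behind. */
    if (multibulklen > c->argv_len || c->argv_len > 1024) {
        c->argv_len = multibulklen;
        c->argv = realloc(c->argv, sizeof(void *) * (size_t)c->argv_len);
    }
}
```

For a non-pipelined client sending small commands, successive calls with a smaller multibulklen reuse the existing allocation instead of shrinking it, which is where the question about efficiency trade-offs comes in.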
src/networking.c
Outdated
    }
}

void parseInputBuffer(client *c) {
so you extracted a portion of processInputBuffer to a function.
it'll be harder to review, and also to merge to ROF.
maybe you can provide a list of bullets explaining the differences?
That was just to avoid making the function too big. I deleted it in 742cb79, for consistency with ROF.
src/networking.c
Outdated
/* Parse up to lookahead commands */
while (c->pending_cmds.ready_len < lookahead && c->querybuf && c->qb_pos < sdslen(c->querybuf)) {
this is a key difference from ROF, right?
you always parse a full batch and then execute a full batch, whereas in ROF we're greedy and parse more commands on every one we execute.
for ease of merges, maybe it's a good idea to refactor the code in a way that it can serve both purposes.
yes, will do it.
i added a variable parse_more to determine whether we need to parse more commands.
in ROF we can set it to 1 to parse more commands all the time.
/* Determine if we need to parse more commands from the query buffer.
* Only parse when there are no ready commands waiting to be processed. */
const int parse_more = !c->pending_cmds.ready_len;
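The two modes can be sketched side by side. This is a simplified stand-in model (the struct fields and parseBatch() are hypothetical, with an integer counter standing in for the real querybuf parsing): batch mode only parses when no ready commands are queued, while the greedy ROF variant would set parse_more to 1 unconditionally.

```c
#include <assert.h>

typedef struct client {
    int ready_len; /* fully parsed commands waiting to execute */
    int unparsed;  /* stand-in for commands still in querybuf */
} client;

/* Parse up to lookahead commands; returns the number parsed. */
static int parseBatch(client *c, int lookahead) {
    const int parse_more = !c->ready_len; /* ROF variant: always 1 */
    int parsed = 0;
    while (parse_more && c->ready_len < lookahead && c->unparsed > 0) {
        c->unparsed--;
        c->ready_len++;
        parsed++;
    }
    return parsed;
}
```

With the batch policy, a second call while ready commands are still pending parses nothing; flipping parse_more to a constant 1 restores the greedy behaviour, which is the refactoring hook for serving both purposes.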
pendingCommand mc;
pendingCommand *mcp = &mc;
i don't see that we're using the pre-calculated slot number.
done with 8cfae3f
src/server.c
Outdated
if (server.cluster_enabled) {
    getKeysResult result = (getKeysResult)GETKEYS_RESULT_INIT;
    int numkeys = getKeysFromCommand(pcmd->cmd, pcmd->argv, pcmd->argc, &result);
since this is conditional here, it means ACL can't re-use it, and if both cluster mode and ACL are in use, this part is done twice.
Yes, this is for the command filter. I will try to optimize it at the end while keeping the command filter working.
my thinking is that this command filter feature isn't used anywhere AFAIK, and if we have a possibility for optimizations, we can consider dropping it, or making one of these aspects break in the presence of the other.
#define GETKEYS_RESULT_INIT { 0, MAX_KEYS_BUFFER, {{0}}, NULL }

/* Parser state and parse result of a command from a client's input buffer. */
struct pendingCommand {
i'd like to understand why we don't have to store the client* here, unlike OSS.
probably it's because the prefetch is synchronous and in ROF it's async.
right?
that's the same reason why we don't need PENDING_CMD_FLAG_MULTI (in ROF it was used only for statistics tracking, right?)
i'd like to understand why we don't have to store the client* here, unlike OSS. probably it's because the prefetch is synchronous and in ROF it's async. right?
yes, these two are both for BigRedis; not needed for OSS.
Co-authored-by: oranagra <[email protected]>
src/blocked.c
Outdated
/* Free the current pending command to prevent it from being executed again
 * when the client is unblocked from shutdown state. */
freeClientPendingCommands(c, 1);
@oranagra please take a look this fix.
when we unblock a shutdown-blocked client, before this PR, we would re-enter processInputBuffer to obtain the next command. But now the first command in the pending command list still exists; we forgot to remove it, which results in the shutdownCommand being executed again.
I'm not sure if this fix is enough. If there are other types of blocking that don't require reprocessing, we would need to manually call freeClientPendingCommand() for all of them.
/* This function will execute any fully parsed commands pending on
* the client. Returns C_ERR if the client is no longer valid after executing
* the command, and C_OK for all other cases. */
int processPendingCommandAndInputBuffer(client *c) {
/* Notice, this code is also called from 'processUnblockedClients'.
* But in case of a module blocked client (see RM_Call 'K' flag) we do not reach this code path.
* So whenever we change the code here we need to consider if we need this change on module
* blocked client as well */
if (c->flags & CLIENT_PENDING_COMMAND) {
c->flags &= ~CLIENT_PENDING_COMMAND;
if (processCommandAndResetClient(c) == C_ERR) {
return C_ERR;
}
}
/* Now process client if it has more data in it's buffer.
*
* Note: when a master client steps into this function,
* it can always satisfy this condition, because its querybuf
* contains data not applied. */
if ((c->querybuf && sdslen(c->querybuf) > 0) || c->pending_cmds.ready_len > 0) {
return processInputBuffer(c);
}
return C_OK;
}
make noopt REDIS_CFLAGS='-Werror -DLOG_REQ_RES'
./runtest --log-req-res --no-latency --dont-clean --force-resp3 --tags -slow --verbose --dump-logs --single integration/shutdown --only "Shutting down master waits for replica then fails"
then we can see two "-ERR Errors trying to SHUTDOWN. Check logs." replies in the stdout.reqres file.
other types of blockages that don't require reprocess
sorry for my lack of focus. isn't this bug exactly because this else skips the actions of the if, which calls prepareForNextCommand, which calls resetClientInternal, which handles that?
i.e. we don't have to worry about other types of blocked commands?
i.e. we don't have to worry about other types of blocked commands?
you're right, i'm wrong.
if a client doesn't have the CLIENT_PENDING_COMMAND flag, we should reset the client and remove the first command from the pending command list.
This issue also exists in unstable, so i made a PR to fix it.
redis#14420
src/blocked.c
Outdated
 * which calls reqresAppendResponse) */
reqresAppendResponse(c);
resetClient(c);
prepareForNextCommand(c);
@oranagra follow #413 (comment)
The reason I don't add clusterSlotStatsAddNetworkBytesInForUserClient() into prepareForNextCommand() is that the original code doesn't call it there.
The previous commands of this client have already been accounted once through commandProcessed().
Wouldn't this statistic be duplicated if it were also calculated here?
just to be sure i understand.
so you're saying there's a bug in the per-slot metrics in the ROF branch, right? specifically for blocked commands.
we can mirror this change there ASAP, or wait till this one is merged and we'll handle the conflict (ROF doesn't currently run in cluster enabled).
maybe it's a good idea to add an argument to prepareForNextCommand and let it do that from there conditionally?
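The suggestion above could look roughly like this. All names and fields here are illustrative assumptions, not the merged code (only prepareForNextCommand and the slot-stats function name come from the conversation); a global counter stands in for the per-slot metrics.

```c
#include <assert.h>

static long long slot_bytes_in = 0; /* stand-in for per-slot stats */

typedef struct client { int argc; long long net_input_bytes; } client;

static void clusterSlotStatsAddNetworkBytesInForUserClient(client *c) {
    slot_bytes_in += c->net_input_bytes;
}

/* One "prepare" function with an argument, so callers that need the
 * per-slot accounting opt in instead of duplicating it at call sites. */
static void prepareForNextCommand(client *c, int update_slot_stats) {
    if (update_slot_stats)
        clusterSlotStatsAddNetworkBytesInForUserClient(c);
    c->argc = 0;            /* stand-in for resetting parsed state */
    c->net_input_bytes = 0;
}
```

Unifying the reset logic in one function is what keeps merges safe: a new caller either passes the flag or doesn't, but can't forget an entire block of bookkeeping.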
done with ed08a67
FYI, the reason i created this prepareForNextCommand was exactly that: i was afraid that if i added code in all the random places that do this (resetClient, reqresAppendResponse), and some day upstream changed all these places, i might not even get a merge conflict.
so i moved these into a dedicated "prepare" function, unifying all these places, which reduces the chances something will be overlooked by a merge.
i.e. it was a change aimed at being more explicit, and also at getting merge conflicts rather than silent breakage 😄
in that regard, the last commit you added really serves that purpose.
size_t argv_len_sum; /* Sum of lengths of objects in argv list. */
unsigned long long input_bytes;
struct redisCommand *cmd;
getKeysResult keys_result;
i see you added it, but don't see it being set or used
Yes, I'm still trying to add it without introducing a breaking change to the command filter.
/* Periodic maintenance for the pending command pool.
 * This function should be called from serverCron to manage pool size based on utilization patterns. */
void pendingCommandPoolCron(void) {
    /* Only shrink pool when IO threads are not active */
    if (server.io_threads_active) return;

    /* Calculate utilization rate based on minimum pool size reached */
    if (server.cmd_pool.capacity > PENDING_COMMAND_POOL_SIZE) {
        /* If utilization is below threshold, shrink the pool */
        double utilization_ratio = 1.0 - (double)server.cmd_pool.min_size / server.cmd_pool.capacity;
        if (utilization_ratio < 0.5)
            shrinkPendingCommandPool();
    }

    /* Reset tracking for next interval */
    server.cmd_pool.min_size = server.cmd_pool.size; /* Reset to current size */
}
@oranagra now the initial size of the pool is 16, with a maximum size of 1024 to prevent indefinite growth.
As commands are claimed, the minimum size the pool reaches during the interval is recorded.
The pool's utilization is checked every 2 seconds; if it stays below 50%, the pool is shrunk to half its size, down to the minimum size of 16.
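A worked sketch of this shrink policy, with illustrative names (the struct, maybeShrinkPool(), and POOL_MIN_CAPACITY are stand-ins for the server fields in pendingCommandPoolCron() above): if the lowest pool size seen in the interval implies utilization below 50%, halve the capacity, never going below the initial size of 16.

```c
#include <assert.h>

#define POOL_MIN_CAPACITY 16

typedef struct {
    int capacity; /* currently allocated pool slots */
    int min_size; /* lowest free-object count seen this interval */
} cmdPool;

static void maybeShrinkPool(cmdPool *p) {
    if (p->capacity <= POOL_MIN_CAPACITY) return;
    /* min_size objects were never claimed, so at most
     * capacity - min_size were in use at the peak. */
    double utilization = 1.0 - (double)p->min_size / p->capacity;
    if (utilization < 0.5) {
        p->capacity /= 2;
        if (p->capacity < POOL_MIN_CAPACITY) p->capacity = POOL_MIN_CAPACITY;
    }
}
```

For example, a pool of 128 that never dropped below 100 free objects peaked at 28 in use (about 22% utilization) and gets halved, while one that dropped to 32 free (75% utilization) stays as is.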
Minimum size means the objects in the pool that were not used, and thus unneeded, right? Sounds good.
I'm trying to decide if there's also a need to defrag that pool, if it remains large but occupies slabs that become sparse.
Currently, I feel we can skip it.
Minimum size means the objects in the pool that were not used, and thus unneeded, right? Sounds good.
yes
Co-authored-by: moticless <[email protected]>
Co-authored-by: Yuan Wang <[email protected]>