-
Notifications
You must be signed in to change notification settings - Fork 955
Optimize skiplist query efficiency by embedding the skiplist header #2867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #2867 +/- ##
============================================
- Coverage 72.44% 72.42% -0.03%
============================================
Files 128 128
Lines 70415 70491 +76
============================================
+ Hits 51011 51050 +39
- Misses 19404 19441 +37
🚀 New features to boost your workflow:
|
Signed-off-by: chzhoo <[email protected]> Signed-off-by: GitHub <[email protected]>
Signed-off-by: chzhoo <[email protected]> Signed-off-by: GitHub <[email protected]>
Signed-off-by: GitHub <[email protected]>
00c9a4b to
c2c63b9
Compare
ranshid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
I will do a deeper review, I just want to first eliminate concerns about potential future extensions of the skiplist metadata.
src/server.h
Outdated
| union { | ||
| double score; /* Sorting score for node ordering */ | ||
| unsigned long length; /* Number of elements in the skiplist */ | ||
| }; | ||
| union { | ||
| struct zskiplistNode *backward; /* Pointer to previous node for reverse traversal */ | ||
| struct zskiplistNode *tail; /* Tail element of the skiplist */ | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fear about this blocking us from changing/extending the skiplist metadata because it might later impact the size of every skiplist node.
Maybe an alternative would be to keep the zskiplistNode as is and embed a node as part of it like this:
typedef struct zskiplist {
union {
struct {
unsigned long length; /* Number of elements in the skiplist */
struct zskiplistNode *tail; /* Tail element of the skiplist */
};
struct zskiplistNode header;
};
} zskiplist;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point Ran.
I'm not sure your suggestion works though, because it's not allowed to embed a struct with a "flexible array" member (the level[]) into another struct. I get a pendantic warning for this:
server.h:1450:26: warning: invalid use of structure with flexible array member [-Wpedantic]
1450 | struct zskiplistNode header;
| ^~~~~~
To work around this warning, we could use a trick that was common before "flexible arrays" existed. That is, to declare an array of size 1 instead (level[1]) and still use it as a flexible array. We only need to take it into account when we do sizeof(zskiplistNode) but other than that, all should work the same and now we can include a zskiplistNode inside the zskiplist struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeh. and another option is to keep the suggested typedef struct zskiplistNode zskiplist; for now and just make sure to provide zsl getter functions like:
zskiplistNode* zslGetHeader(zskiplist *zsl) {
return (zskiplistNode*)(zsl);
}
zslGetLen(zskiplist *zsl) {
return (unsigned long)(zslGetHeader(zsl)->score);
}
zskiplistNode* zslGetTail {
return zslGetHeader(zsl)->backward;
}
etc...
in the future, in case we wish to extend the skiplist metadata we could just allocate the zskiplistNode with the zskiplist, eg:
typedef struct zskiplist {
int something;
} zskiplist;
zskiplist *zslCreate(void) {
int j;
zskiplist *zsl;
zsl = malloc(sizeof(*zsl) + sizeof(zskiplistNode) + ZSKIPLIST_MAXLEVEL * sizeof(struct zskiplistLevel));
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks all for the discussion. I understand the code needs the following changes:
- Embed
zskiplistNodeinzskiplist. The struct will be defined as follows:
typedef struct zskiplistLevel {
struct zskiplistNode *forward;
unsigned long span;
} zkiplistLevel;
typedef struct zskiplistNode {
double score;
struct zskiplistNode *backward;
zskiplistLevel level[1];
} zskiplistNode;
typedef struct zskiplist {
union {
struct zskiplistNode header;
struct {
unsigned long length;
struct zskiplistNode *tail;
};
};
} zskiplist;
For future extensions to the skiplist metadata, the following definition can be used:
/* The 'level' array is placed at the head of the struct (rather than the tail)
* to ensure all metadata(length/tail/somemeta) is stored contiguously. */
typedef struct zskiplistNode {
zskiplistLevel level[1];
double score;
struct zskiplistNode *backward;
} zskiplistNode;
typedef struct zskiplist {
union {
struct zskiplistNode header;
struct {
char padding[sizeof(zskiplisLevel)];
unsigned long length;
struct zskiplistNode *tail;
};
int somemeta;
} zskiplist;
- Provide zsl-related getter/setter functions, such as zslGetTail, zslGetLength, and zslGetHeader.
What do you think? @ranshid @zuiderkwast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The level[] or level[1] needs to be the last member in the node struct, so that it can be variable size. If the node is embedded in another struct (zskiplist) it needs to be the last member there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original suggestion where zskiplist is a typedef of zskiplistNode is also fine by me, but yes with getters/setters it's better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
level[]orlevel[1]needs to be the last member in the node struct, so that it can be variable size. If the node is embedded in another struct (zskiplist) it needs to be the last member there.
The following layout allows placing level array at the head of the node struct, though it requires more extensive changes and can be considered later.
+---------+-----+---------+-------+------------------+-------------+
| level-n | ... | level-0 | score | backward-pointer | element-sds |
+---------+-----+---------+-------+------------------+-------------+
|
|
`-> zskiplistNode-pointer returned to the user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I just meant if we want to access fields in the struct simply like node->score, then the variable size stuff needs to be in the end.
This is a balance between simplicity and performance. There is a limit. At some point, we should not add very much complexity for only a small performance improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
75% of the nodes have only one level. 75% of the remaining nodes have two levels, and so on. Only 6% of the nodes have > 2 levels. Only 1% have > 3 levels. The high-level nodes are very few, so it seems me that we don't need to focus on them.
Signed-off-by: chzhoo <[email protected]>
Signed-off-by: chzhoo <[email protected]>
Description
Embedding the skiplist header reduces memory jumps, thus optimizing sorted set query speed.
Benchmark
Step 1
Generate the test data using the following lua script and cli command
lua script
cli command
valkey-cli eval "$(cat load.lua)" 0 $ZSET_COUNTStep 2
Run benchmark 5 times through the following command and select the peak value as the result
valkey-benchmark -n 5000000 -r $ZSET_COUNT --threads 2 zrange zset:__rand_int__ 0 0Benchmark result
Benchmark Env
CPU: AMD EPYC 9K65 192-Core Processor * 8
OS: Ubuntu Server 24.04 LTS 64bit
Memory: 32GB
VM: Tencent Cloud | SA9.2XLARGE32