Skip to content

RSCBC-32: Reduce allocations on KV hot paths#473

Open
chvck wants to merge 1 commit intomainfrom
rscbc32
Open

RSCBC-32: Reduce allocations on KV hot paths#473
chvck wants to merge 1 commit intomainfrom
rscbc32

Conversation

@chvck
Copy link
Copy Markdown
Contributor

@chvck chvck commented Mar 25, 2026

Changes:

  • Get result values use Bytes instead of Vec, eliminating a full document-body memcpy on every get/get_and_lock/get_and_touch/get_meta by propagating zero-copy Bytes from the codec through to the SDK layer
  • RetryRequest.retry_reasons is now Option, only allocating on the first retry attempt instead of every operation
  • Compressor.compress returns Cow<[u8]> instead of borrowing from self, removing heap state from StdCompressor entirely
  • MutationToken bucket_name uses Arc instead of String, replacing a per-mutation String::clone with an atomic ref-count bump
  • Add unit tests for compression manager

@github-actions
Copy link
Copy Markdown

🐰 Bencher Report

Branchrscbc32-23550438882-154-1
Testbedubuntu-latest
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
microseconds (µs)
(Result Δ%)
Upper Boundary
microseconds (µs)
(Limit %)
query📈 view plot
🚷 view threshold
449.26 µs
(+6.64%)Baseline: 421.28 µs
484.47 µs
(92.73%)
upsert_and_get📈 view plot
🚷 view threshold
445.11 µs
(+11.22%)Baseline: 400.19 µs
460.22 µs
(96.72%)
🐰 View full continuous benchmarking report in Bencher

@github-actions
Copy link
Copy Markdown

🐰 Bencher Report

Branchrscbc32-23551112822-155-1
Testbedubuntu-latest
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
microseconds (µs)
(Result Δ%)
Upper Boundary
microseconds (µs)
(Limit %)
query📈 view plot
🚷 view threshold
417.05 µs
(-1.00%)Baseline: 421.28 µs
484.47 µs
(86.08%)
upsert_and_get📈 view plot
🚷 view threshold
407.96 µs
(+1.94%)Baseline: 400.19 µs
460.22 µs
(88.64%)
🐰 View full continuous benchmarking report in Bencher

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces allocations on key KV hot paths by pushing zero-copy buffers (bytes::Bytes) through the core → SDK results boundary, avoiding eager allocations for retry bookkeeping, and simplifying compression buffering behavior.

Changes:

  • Replace Vec<u8> document bodies in KV results with bytes::Bytes to avoid memcpy/clones on gets.
  • Make retry reason tracking lazy (Option<HashSet<_>>) to avoid allocating for non-retried operations.
  • Update compression to return Cow<[u8]> and add unit tests around compression behavior/manager.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
sdk/couchbase/src/results/kv_results.rs Switch SDK GetResult content storage to Bytes and propagate from core results.
sdk/couchbase/src/mutation_state.rs Change bucket name storage to Arc<str> to reduce cloning; update (de)serialization.
sdk/couchbase/src/clients/couchbase_core_kv_client.rs Store bucket name as Arc<str> and return projection content as Bytes.
sdk/couchbase-core/tests/allocations.rs Adjust expected allocation counts and tighten allocation assertion.
sdk/couchbase-core/tests/agent_ops.rs Update one assertion to work with Bytes-backed get results.
sdk/couchbase-core/src/retry.rs Lazily allocate retry reasons and adjust accessor/Display formatting.
sdk/couchbase-core/src/results/kv.rs Replace KV result value buffers from Vec<u8> to Bytes.
sdk/couchbase-core/src/crudcomponent.rs Avoid extra copies by propagating Bytes from responses; pass compressed data as borrowed slice.
sdk/couchbase-core/src/compressionmanager.rs Change compressor API to return Cow<[u8]>, remove internal heap buffer, and add unit tests.
Comments suppressed due to low confidence (1)

sdk/couchbase-core/tests/allocations.rs:239

  • This assertion switched to assert_eq! (exact match) but the failure message still says "Expected max {} allocations". Either update the message to reflect exact equality, or revert to the previous <= expected_allocs semantics if some variance is expected across environments/toolchains.
    dhat::assert_eq!(
        total_allocs,
        expected_allocs,
        "Expected max {} allocations, was {}",
        expected_allocs,
        total_allocs
    );

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 254 to 257
for (vbid, (seqno, vbuuid)) in vbuckets {
let key = MutationStateKey {
bucket_name: bucket_name.clone(),
bucket_name: Arc::from(bucket_name.as_str()),
vbid: vbid.parse().map_err(de::Error::custom)?,
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In MutationState deserialization, bucket_name: Arc::from(bucket_name.as_str()) is executed once per vBucket entry, which allocates and copies the bucket name repeatedly (defeating the goal of using Arc<str> to avoid clones). Consider converting the bucket_name: String to a single Arc<str> once per outer loop iteration (e.g., let bucket = Arc::<str>::from(bucket_name);) and then cloning the Arc for each MutationStateKey.

Copilot uses AI. Check for mistakes.
datatype: DataTypeFlag,
input: &'a [u8],
) -> error::Result<(&'a [u8], u8)> {
) -> error::Result<(Cow<'a, [u8]>, u8)> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're considering leveraging Bytes elsewhere, why not use it here as well? One issue with using Cow when you use Bytes elsewhere is that you do not get the ability to maintain the same Bytes object through a function like this without resorting to some hackery.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for Cow here is that we return actually either a borrow or an owned value. In the case of Bytes i think we'd need to allocate in the borrow case.

#[derive(Clone, Debug, Eq, PartialEq, Hash)]
pub struct GetResult {
pub value: Vec<u8>,
pub value: Bytes,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a serious breaking change. I'm not sure you can change this without a v2 release?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in core, which is basically labelled as "do not use" so should be very low risk.

#[derive(Clone, Debug, Eq, PartialEq, Ord, PartialOrd, Hash)]
struct MutationStateKey {
bucket_name: String,
bucket_name: Arc<str>,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be Arc still.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is Arc

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It hid what I wrote >_<. Arc of String***

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes you say it should he string?

let mut buckets: HashMap<String, SparseScanVectors> = HashMap::default();
for (key, token) in value.tokens {
let bucket = buckets.entry(key.bucket_name.clone()).or_default();
let bucket = buckets.entry(key.bucket_name.to_string()).or_default();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using to_string() here means you need to convert the key.bucket_name value from a &str to a String (which involves copying the data out). I think that the intention here is probably that we want to access the bucket using &str comparisons.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have to use String as we need to take ownership of the value

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the HashMap itself has to be a String for storage purposes, when looking up values in the map you shouldn't need to construct a String just to do that, you should be able to pass a &str to avoid allocating.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it's calling entry it does require string. It needs ownership in case an insert is called against the entry (which it in fact is here) .

Changes:
- Get result values use Bytes instead of Vec<u8>, eliminating a full
  document-body memcpy on every get/get_and_lock/get_and_touch/get_meta
  by propagating zero-copy Bytes from the codec through to the SDK layer
- RetryRequest.retry_reasons is now Option<HashSet>, only allocating on
  the first retry attempt instead of every operation
- Compressor.compress returns Cow<[u8]> instead of borrowing from self,
  removing heap state from StdCompressor entirely
- MutationToken bucket_name uses Arc<str> instead of String, replacing a
  per-mutation String::clone with an atomic ref-count bump
- Add unit tests for compression manager
@github-actions
Copy link
Copy Markdown

🐰 Bencher Report

Branchrscbc32-23590194426-156-1
Testbedubuntu-latest
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
microseconds (µs)
(Result Δ%)
Upper Boundary
microseconds (µs)
(Limit %)
query📈 view plot
🚷 view threshold
439.87 µs
(+4.40%)Baseline: 421.33 µs
484.53 µs
(90.78%)
upsert_and_get📈 view plot
🚷 view threshold
409.02 µs
(+1.94%)Baseline: 401.23 µs
461.41 µs
(88.65%)
🐰 View full continuous benchmarking report in Bencher

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants