feat: make SyncNotes return multiple blocks #1843

Merged
bobbinth merged 34 commits into next from tomasarrachea-sync-notes-rework
Mar 31, 2026

Conversation

Collaborator

@TomasArrachea TomasArrachea commented Mar 25, 2026

Closes #1809

This PR updates the RPC component to batch multiple blocks into a single SyncNotes response.

Changes

  • RPC sync_notes handler now loops over store calls, accumulating blocks until the 4MB budget is exceeded or the range is exhausted
  • Proto: changed SyncNotesResponse proto from single-block (header + mmr_path + notes) to multi-block (repeated NoteSyncBlock blocks + BlockRange). Added StoreSyncNotesResponse for the store-internal single-block response, using BlockRange instead of PaginationInfo.
  • Store's sync_notes uses open_at(block_num, checkpoint) instead of open(block_num) so MMR proofs are relative to block_to

Follow ups:

  • Instead of batching the sync note responses at the RPC component, batch them on the store and minimize the gRPC requests.
  • Deduplicate the authentication nodes between returned note updates.

@TomasArrachea TomasArrachea force-pushed the tomasarrachea-sync-notes-rework branch from 6961007 to 6e0dd69 on March 26, 2026 15:26
@TomasArrachea TomasArrachea marked this pull request as ready for review March 26, 2026 15:34
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! I left a few small comments inline.

Comment on lines +20 to +24
/// Estimated byte size of a single [`NoteSyncRecord`].
///
/// Note ID (~38 bytes) + index + metadata (~26 bytes) + sparse merkle path with 16
/// siblings (~608 bytes).
const NOTE_RECORD_BYTES: usize = 700;
Contributor

This is fine for now, but it will most likely be a significant overestimate, because sparse Merkle paths get compressed and in most cases shouldn't take more than a couple hundred bytes. The compression depends on how many paths there are (the more paths, the worse the compression), so taking the worst case is fine for now.
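
For a sense of scale, the worst-case estimate can be turned into a quick calculation. This is a minimal sketch, not code from the PR: `estimated_block_bytes` and `max_records_in_budget` are hypothetical helpers, the `block_overhead` parameter stands in for header and MMR path sizes that the thread does not state, and the budget is assumed to be 4 MiB (the description only says "4MB"). The listed components add up to 38 + 26 + 608 = 672 bytes before the index, which is consistent with rounding up to 700.

```rust
/// Worst-case size of one note record, as in the quoted constant.
const NOTE_RECORD_BYTES: usize = 700;

/// Assumed response budget of 4 MiB. The PR description says "4MB";
/// the exact value of the real constant is an assumption here.
const MAX_RESPONSE_PAYLOAD_BYTES: usize = 4 * 1024 * 1024;

/// Hypothetical helper: estimated size of one block carrying `num_notes` notes.
/// `block_overhead` stands in for the block header and MMR path sizes.
fn estimated_block_bytes(num_notes: usize, block_overhead: usize) -> usize {
    block_overhead + num_notes * NOTE_RECORD_BYTES
}

/// Upper bound on the number of note records that fit in the budget,
/// ignoring any per-block overhead.
fn max_records_in_budget() -> usize {
    MAX_RESPONSE_PAYLOAD_BYTES / NOTE_RECORD_BYTES
}
```

Under these assumptions roughly 5,991 worst-case records fit in one response. If compressed paths are closer to a couple hundred bytes, the real capacity would be several times higher.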

Collaborator

@igamigo igamigo left a comment

Looks great! I think there are a couple of edge cases that might need addressing though.

Comment on lines +495 to +496
// `block_to` matches the request's `block_range.block_to`, or the chain tip if it was
// not specified.
Collaborator

nit: This is not always true, right? The text below mentions that the block_to may be smaller than what the user requested (and the chain tip as well)

Collaborator Author

I removed the text below; as mentioned in this discussion, the returned block_to will always be either the requested block or the chain tip.

/// - `note_tags`: The tags the client is interested in, resulting notes are restricted to the
/// first block containing a matching note.
/// - `block_range`: The range of blocks from which to synchronize notes.
/// Returns as many blocks with matching notes as fit within the response payload
Collaborator

nit: You can reference the max payload size constant here

  block_range: RangeInclusive<BlockNumber>,
- ) -> Result<(NoteSyncUpdate, MmrProof, BlockNumber), NoteSyncError> {
+ ) -> Result<Vec<(NoteSyncUpdate, MmrProof)>, NoteSyncError> {
  let inner = self.inner.read().await;
Collaborator

I think we want to take the lock in the loop to avoid keeping it for the whole duration

Collaborator

(and it can be moved right next to the open_at() call)
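
The locking suggestion in the two comments above can be illustrated with a small sketch. It is not the store's code: `State`, `Inner`, `chain_len`, and `collect` are hypothetical stand-ins, and it uses `std::sync::RwLock` so that it compiles without dependencies, whereas the quoted code awaits an async lock. The point is only the scoping: the guard is acquired next to the call that needs it and dropped before the next iteration.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the in-memory state guarded by the lock.
struct Inner {
    chain_len: u32,
}

/// Hypothetical stand-in for the store.
struct State {
    inner: RwLock<Inner>,
}

impl State {
    /// Pairs each block number with a value read under the lock,
    /// holding the read guard only for the duration of that read.
    fn collect(&self, blocks: &[u32]) -> Vec<(u32, u32)> {
        let mut results = Vec::new();
        for &block_num in blocks {
            // Work that does not need the lock (for example a database
            // query) would run here, outside the guarded scope.

            // Acquire the lock right next to the access that needs it.
            let forest = {
                let inner = self.inner.read().expect("lock poisoned");
                inner.chain_len
            }; // The guard is dropped here, before the next iteration.

            results.push((block_num, forest));
        }
        results
    }
}
```

One trade-off of this shape is that the guarded state may change between iterations, so it fits best when each iteration's result is pinned to a fixed checkpoint rather than to whatever the latest state happens to be.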

Comment on lines +480 to +485
// Merkle path to verify the block's inclusion in the MMR at the returned
// `block_range.block_to`.
//
// An MMR proof can be constructed for the leaf of index `block_header.block_num` of
// an MMR of forest `block_range.block_to` with this path.
primitives.MerklePath mmr_path = 2;
Contributor

Is the comment on lines 480-481 correct here? I thought the MMR path is computed against the requested block_range.block_to rather than the returned one.

Comment on lines +136 to +139
let block_from = block_range.start();
// Clamp block_to to the chain tip to avoid erroring when opening the MMR proof
let block_to = block_range.end().min(&chain_tip);
let clamped_block_range = *block_from..=*block_to;
Contributor

I think things like this should be enforced in BlockRange::into_inclusive_range(). But also, I'm not sure we want to clamp here. Maybe returning an error is a better approach?

Collaborator Author

Changed this to return an error instead: 889fc04
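
The difference between clamping and rejecting can be sketched as follows. This is an illustration only: `RangeError`, its variants, and `validate_block_range` are hypothetical names, plain `u32` stands in for `BlockNumber`, and the actual change in 889fc04 may be structured differently.

```rust
use std::ops::RangeInclusive;

/// Hypothetical error type for an invalid requested block range.
#[derive(Debug, PartialEq)]
enum RangeError {
    /// The requested end lies beyond the current chain tip.
    EndPastChainTip { block_to: u32, chain_tip: u32 },
    /// The requested start is greater than the requested end.
    InvertedRange { block_from: u32, block_to: u32 },
}

/// Rejects a range that cannot be served instead of silently clamping it.
fn validate_block_range(
    range: &RangeInclusive<u32>,
    chain_tip: u32,
) -> Result<(), RangeError> {
    let (block_from, block_to) = (*range.start(), *range.end());
    if block_from > block_to {
        return Err(RangeError::InvertedRange { block_from, block_to });
    }
    if block_to > chain_tip {
        return Err(RangeError::EndPastChainTip { block_to, chain_tip });
    }
    Ok(())
}
```

Returning an error makes the mismatch visible to the caller, who can then decide whether to retry with the chain tip; clamping would hide the fact that the proofs were built against a different block than the one requested.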

@TomasArrachea TomasArrachea requested a review from igamigo March 30, 2026 17:26
block_num: last_block_included.as_u32(),
block_range: Some(proto::rpc::BlockRange {
block_from: block_from.as_u32(),
block_to: Some(block_to.as_u32()),
Collaborator

block_to here is probably not what we want, as it is just the same number that the user might have requested. Instead, what we need (AFAIK) is to return the largest block num in the results (or have it returned separately on sync_notes). Or, if no valid sync notes response was found, then we reached the user's block_to

Collaborator Author

@TomasArrachea TomasArrachea Mar 30, 2026

I changed this so the returned block_to marks the last block included in the response. This makes sense for the proposed flow where the client calls SyncChainMmr and then uses that chain tip as a parameter for the sync notes request. But if a user made a sync notes request without specifying the chain tip, they would have no way of finding out which chain tip was used for the MMR paths included in the response. It might not be the last included block if the response is truncated.

Collaborator Author

@TomasArrachea TomasArrachea Mar 30, 2026

A simple solution would be to add a chain_tip field to the response

Edit: actually, using PaginationInfo would be a better fit.

Collaborator

Agree, that should work well

Comment on lines +90 to +91
// Use block_end + 1 as the MMR checkpoint so that block_end itself can be proven.
let mmr_checkpoint = block_end + 1;
Collaborator

nit: I'd move this down closer to the open_at call
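
The `+ 1` in the quoted snippet follows from how leaf indices relate to the size of the MMR. The sketch below assumes, as the quoted comment implies, that a checkpoint of `n` covers leaves `0..n`, so the last provable block is `n - 1`. Both function names are hypothetical and plain `u32` stands in for `BlockNumber`.

```rust
/// Assumption: an MMR at checkpoint `forest` contains leaves `0..forest`,
/// so a block can be proven only if its number is strictly below `forest`.
fn can_prove(block_num: u32, forest: u32) -> bool {
    block_num < forest
}

/// Checkpoint needed so that `block_end` itself is included as a leaf.
fn checkpoint_for(block_end: u32) -> u32 {
    block_end + 1
}
```

With this model, using `block_end` directly as the checkpoint would leave `block_end` just outside the provable range, which is the off-by-one the code comment guards against.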

Comment on lines +89 to +90
// If `response.block_range.block_to` is less than the requested range end, make another
// request starting from `response.block_range.block_to + 1` to continue syncing.
Collaborator

I think the underlying query is committed_at > block_from, so I think prompting for + 1 is not correct here.
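
If the lower bound is indeed exclusive, as the reviewer suspects, resuming from `block_to + 1` would skip one block. The sketch below models that with a hypothetical `blocks_with_notes` filter over an in-memory list; the exclusive lower bound comes from the comment above, while the inclusive upper bound is an assumption made for the example.

```rust
/// Hypothetical model of the query: returns the blocks with matching notes
/// where `block_from < committed_at <= block_to`.
fn blocks_with_notes(committed_at: &[u32], block_from: u32, block_to: u32) -> Vec<u32> {
    committed_at
        .iter()
        .copied()
        .filter(|&block| block > block_from && block <= block_to)
        .collect()
}
```

Suppose notes were committed in blocks 10, 11, and 12 and a truncated response ended at block 10. Resuming with `block_from = 10` yields blocks 11 and 12, while resuming with `block_from = 11` yields only block 12 and silently drops block 11.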

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you. I left some more comments inline - mostly about making sure we treat block ranges consistently.

Comment on lines 139 to +144
  pub(crate) fn select_notes_since_block_by_tag_and_sender(
  conn: &mut SqliteConnection,
  account_ids: &[AccountId],
  note_tags: &[u32],
  block_range: RangeInclusive<BlockNumber>,
- ) -> Result<(Vec<NoteSyncRecord>, BlockNumber), DatabaseError> {
+ ) -> Result<Vec<NoteSyncRecord>, DatabaseError> {
Contributor

Not for this PR, but it seems like this function is never invoked with account_ids supplied. If so, we should remove the account_ids parameter and simplify the associated SQL. Let's create an issue for this.

Comment on lines +108 to +110
if !results.is_empty() && accumulated_size > MAX_RESPONSE_PAYLOAD_BYTES {
break;
}
Contributor

Why do we have !results.is_empty() as a condition here? Basically, if the result is empty, why do we need to fall through to getting the MMR proof?

Collaborator Author

@TomasArrachea TomasArrachea Mar 31, 2026

This was added after this discussion #1843 (comment) to guarantee that at least one update is included in the response. If db.get_note_sync returns None, we do break.

Contributor

I see. Basically, the idea here is that if somehow accumulated_size is over the limit on the first iteration (i.e., when results.is_empty()), we still want to send the response. Practically, this shouldn't happen (unless we are off in the accumulated_size computations), but it's better to be on the safe side.
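
The effect of the `!results.is_empty()` guard can be seen in a reduced version of the loop. This is a sketch under assumptions, not the handler itself: `BlockUpdate` and `accumulate` are hypothetical, sizes are passed in rather than estimated, and the MMR proof step is omitted. Only the break condition mirrors the quoted diff.

```rust
/// Hypothetical per-block update: a block number and its estimated size.
#[derive(Debug, Clone, PartialEq)]
struct BlockUpdate {
    block_num: u32,
    size_bytes: usize,
}

/// Accumulates updates until adding another one would exceed `budget`,
/// while always keeping the first update even if it alone is over budget.
fn accumulate(
    updates: impl IntoIterator<Item = BlockUpdate>,
    budget: usize,
) -> Vec<BlockUpdate> {
    let mut results = Vec::new();
    let mut accumulated_size = 0usize;
    for update in updates {
        accumulated_size += update.size_bytes;
        // Same shape as the condition in the diff: only stop once at least
        // one update has already been collected.
        if !results.is_empty() && accumulated_size > budget {
            break;
        }
        results.push(update);
    }
    results
}
```

With a budget of 100, updates of sizes 150 and 10 produce a response containing only the first update, so the client still makes progress. Without the guard the same input would produce an empty response.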

Collaborator

@SantiagoPittella SantiagoPittella left a comment

Looks good overall

Collaborator

@igamigo igamigo left a comment

LGTM! Left some minor, non-blocking comments

Comment on lines 86 to +91
  #[instrument(level = "debug", target = COMPONENT, skip_all, ret(level = "debug"), err)]
  pub async fn sync_notes(
  &self,
  note_tags: Vec<u32>,
  block_range: RangeInclusive<BlockNumber>,
- ) -> Result<(NoteSyncUpdate, MmrProof, BlockNumber), NoteSyncError> {
- let inner = self.inner.read().await;
-
- let (note_sync, last_included_block) =
- self.db.get_note_sync(block_range, note_tags).await?;
+ ) -> Result<(Vec<(NoteSyncUpdate, MmrProof)>, BlockNumber), NoteSyncError> {
Collaborator

With some of these changes we might need to update the store's README.md

/// -- filter the block's notes and return only the ones matching the requested tags or senders
/// (tag IN (?1) OR sender IN (?2))
/// ```
pub(crate) fn select_notes_since_block_by_tag_and_sender(
Collaborator

Could this be modified to return multiple responses at once? I think this was the original suggestion (although we should do it in a separate issue/PR)

Comment on lines +125 to +127
// if results is empty, return `block_end` since the sync is complete.
let last_block_checked =
results.last().map_or(block_end, |(update, _)| update.block_header.block_num());
Collaborator

There is a small optimization here where we return the block number we dropped, minus one. For example, if we match against blocks 1, 5, 10 and 15 but block 15 does not make it into the response, instead of returning 10 as the last included block, we return 14 (so the next query avoids blocks 10-14 because the user will start at 15 already). I doubt this is significant due to how the tables are probably indexed, so I would not bother doing it unless it's really trivial, but maybe it can be done in a follow up PR or issue

Collaborator Author

Created #1870
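
The optimization discussed in this thread can be sketched by comparing the two ways of computing the returned block number. Both functions are hypothetical and use plain `u32` values: `last_block_included` mirrors the `map_or` expression in the quoted diff, and `last_block_checked` is one possible reading of the suggested follow-up, not what #1870 necessarily implements.

```rust
/// Mirrors the quoted expression: the last included block, or `block_end`
/// when no block matched.
fn last_block_included(included: &[u32], block_end: u32) -> u32 {
    included.last().copied().unwrap_or(block_end)
}

/// Suggested variant: when a matching block was dropped for size reasons,
/// report the block just before it, since every block up to that point has
/// already been checked.
fn last_block_checked(included: &[u32], first_dropped: Option<u32>, block_end: u32) -> u32 {
    match first_dropped {
        Some(dropped) => dropped.saturating_sub(1),
        None => last_block_included(included, block_end),
    }
}
```

For the example in the comment, with matches in blocks 1, 5, 10, and 15 and block 15 dropped, the first function returns 10 and the second returns 14.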

Contributor

@bobbinth bobbinth left a comment

All looks good! Thank you!

@bobbinth bobbinth merged commit a7abf73 into next Mar 31, 2026
18 checks passed
@bobbinth bobbinth deleted the tomasarrachea-sync-notes-rework branch March 31, 2026 20:29
Development

Successfully merging this pull request may close these issues.

Return multiple blocks for SyncNotes

4 participants