
Conversation


@zhangchiqing commented Oct 3, 2025

Work towards #7912

  • Added key-existence checks to the db operations that save execution results.
  • Renamed the lock IDs; see comments for motivation.
  • Combined some storage APIs, e.g. merged BatchInsert and BatchIndex into BatchInsertAndIndexXXX.
  • Refactored the lock policies into groups; the reasoning is explained in the comments.

@zhangchiqing changed the title [Storage] Refactor index result → [Storage] Refactor index execution result Oct 3, 2025
@zhangchiqing marked this pull request as ready for review October 9, 2025 17:04
@zhangchiqing requested a review from a team as a code owner October 9, 2025 17:04
@zhangchiqing marked this pull request as draft October 15, 2025 20:52
@zhangchiqing marked this pull request as ready for review October 16, 2025 00:29
storage.LockIndexStateCommitment,
}
// Acquire locks to ensure it's concurrent safe when inserting the execution results and chunk data packs.
return storage.WithLocks(s.lockManager, locks, func(lctx lockctx.Context) error {
zhangchiqing (author):

Does it make sense to change the argument from lockctx.Context to lockctx.Proof?

Suggested change:

```diff
- return storage.WithLocks(s.lockManager, locks, func(lctx lockctx.Context) error {
+ return storage.WithLocks(s.lockManager, locks, func(lctx lockctx.Proof) error {
```
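For illustration, here is a minimal self-contained sketch of the distinction this question is getting at. The interfaces are toy stand-ins, not the real lockctx API: the assumption is that a Proof only attests which locks are held, while a Context can additionally acquire locks, so narrowing the callback parameter to Proof prevents the callback from acquiring more locks mid-batch.

```go
package main

import "fmt"

// Toy stand-ins for the lockctx types (hypothetical, for illustration only):
// a Proof is read-only evidence of held locks; a Context can also acquire.
type Proof interface {
	HoldsLock(lockID string) bool
}

type Context interface {
	Proof
	AcquireLock(lockID string) error
}

// ctx is a toy Context implementation backed by a set of held locks.
type ctx struct{ held map[string]bool }

func (c *ctx) HoldsLock(id string) bool    { return c.held[id] }
func (c *ctx) AcquireLock(id string) error { c.held[id] = true; return nil }

// indexResult only needs evidence that the lock is held, so Proof suffices.
func indexResult(p Proof, lockID string) error {
	if !p.HoldsLock(lockID) {
		return fmt.Errorf("required lock %q not held", lockID)
	}
	return nil
}

func main() {
	c := &ctx{held: map[string]bool{}}
	c.AcquireLock("lock_index_state_commitment")
	// Because Context embeds Proof, existing callers can pass lctx unchanged.
	fmt.Println(indexResult(c, "lock_index_state_commitment") == nil) // prints true
}
```

Since Context would embed Proof, the narrowing is backward compatible for callers that already pass a full context.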

Build()
builder := lockctx.NewDAGPolicyBuilder()

addLocks(builder, LockGroupAccessFinalizingBlock)
zhangchiqing (author):

I created lock groups to make it easy to track where each lock is used. It is fine that duplicated lock policies get added: lockctx ensures the combined policy contains no cycle that could cause a deadlock.
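As a toy illustration of why duplicated policy edges are harmless, here is a sketch under the assumption (names are illustrative, not the real lockctx implementation) that the policy is a directed graph of allowed lock orderings: duplicate edges are simply deduplicated, and deadlock-freedom reduces to the combined graph staying acyclic.

```go
package main

import "fmt"

// policy records which lock may be held while acquiring another,
// as edges of a directed graph.
type policy struct{ edges map[string][]string }

func newPolicy() *policy { return &policy{edges: map[string][]string{}} }

// Add records that `from` may be held while acquiring `to`.
// Duplicate edges (e.g. contributed by several lock groups) are deduplicated.
func (p *policy) Add(from, to string) *policy {
	for _, t := range p.edges[from] {
		if t == to {
			return p
		}
	}
	p.edges[from] = append(p.edges[from], to)
	return p
}

// HasCycle reports whether the edge set contains a cycle, which would
// permit a circular wait, i.e. a potential deadlock.
func (p *policy) HasCycle() bool {
	const (white, grey, black = 0, 1, 2)
	color := map[string]int{}
	var visit func(n string) bool
	visit = func(n string) bool {
		color[n] = grey
		for _, m := range p.edges[n] {
			if color[m] == grey || (color[m] == white && visit(m)) {
				return true
			}
		}
		color[n] = black
		return false
	}
	for n := range p.edges {
		if color[n] == white && visit(n) {
			return true
		}
	}
	return false
}

func main() {
	p := newPolicy()
	// Two "groups" contribute the same edge; the duplicate is harmless.
	p.Add("LockInsertBlock", "LockFinalizeBlock")
	p.Add("LockInsertBlock", "LockFinalizeBlock")
	fmt.Println(p.HasCycle()) // false: still a DAG
	p.Add("LockFinalizeBlock", "LockInsertBlock")
	fmt.Println(p.HasCycle()) // true: a cycle could deadlock
}
```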

Reviewer:

👍 Good idea. Splitting out the DAG definition into groups makes sense to me.

@jordanschalm left a comment:

Nice work

return fmt.Errorf("could not index initial genesis execution block: %w", err)
}

err = operation.IndexOwnOrSealedExecutionResult(lctx, rw, rootSeal.BlockID, rootSeal.ResultID)
Reviewer:

Suggested change:

```diff
- err = operation.IndexOwnOrSealedExecutionResult(lctx, rw, rootSeal.BlockID, rootSeal.ResultID)
+ err = operation.IndexSealedOrMyExecutionResult(lctx, rw, rootSeal.BlockID, rootSeal.ResultID)
```

I think using "My" rather than "Own" would be clearer, since "own" is also a common verb.

}

// SaveExecutionResults saves all data related to the execution of a block.
// It is concurrent safe because chunk data packs store is conflict-free (storing data by hash), and protocol data requires a lock to store, which will be synchronized.
Reviewer:

Suggested change:

```diff
- // It is concurrent safe because chunk data packs store is conflict-free (storing data by hash), and protocol data requires a lock to store, which will be synchronized.
+ // Calling this function multiple times with the same input is a no-op and will not return an error.
```

(since we swallow ErrAlreadyExists)

storage/locks.go Outdated
Comment on lines 28 to 30
// LockInsertOwnReceipt is intended for Execution Nodes to ensure that they never publish different receipts for the same block.
// Specifically, with this lock we prevent accidental overwrites of the index `executed block ID` ➜ `Receipt ID`.
LockInsertOwnReceipt = "lock_insert_own_receipt"
Reviewer:

Suggested change:

```diff
- // LockInsertOwnReceipt is intended for Execution Nodes to ensure that they never publish different receipts for the same block.
- // Specifically, with this lock we prevent accidental overwrites of the index `executed block ID` ➜ `Receipt ID`.
- LockInsertOwnReceipt = "lock_insert_own_receipt"
+ // LockInsertMyReceipt is intended for Execution Nodes to ensure that they never publish different receipts for the same block.
+ // Specifically, with this lock we prevent accidental overwrites of the index `executed block ID` ➜ `Receipt ID`.
+ LockInsertMyReceipt = "lock_insert_my_receipt"
```

Similar suggestion again. I think we should align on the "my receipt" terminology for increased clarity. We already use it in some places, for example the store module:

type MyExecutionReceipts struct {

storage/locks.go Outdated
LockBootstrapping = "lock_bootstrapping"
// LockInsertInstanceParams protects data that is *exclusively* written during bootstrapping.
LockInsertInstanceParams = "lock_insert_instance_params"
LockIndexCollectionsByBlock = "lock_index_collections_by_block"
Reviewer:

Suggested change:

```diff
- LockIndexCollectionsByBlock = "lock_index_collections_by_block"
+ LockIndexBlockByPayloadGuarantees = "lock_index_block_by_payload_guarantees"
```

Collection guarantees are distinct from collections (and have different IDs), so I suggest being explicit here (and in the implementation) that we are referring specifically to collection guarantees.

Also, it feels like the index name is inverted currently. The underlying storage operation is inserting entries like guaranteeID -> blockID. Usually in these cases we say we are indexing the block (thing in the value) by guarantee (thing in the key).
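To make the naming direction concrete, here is a toy sketch (the map stands in for the real key-value store; function names are illustrative): the operation writes entries keyed by guarantee ID with the block ID as the value, so following the "index the value by the key" convention, it indexes the block by guarantee.

```go
package main

import "fmt"

// kv is a toy stand-in for the key-value database.
type kv map[string]string

// indexBlockByGuarantee writes guaranteeID -> blockID: the guarantee is the
// key, the block is the value, hence "index block BY guarantee".
func indexBlockByGuarantee(db kv, guaranteeID, blockID string) {
	db["collection_block/"+guaranteeID] = blockID
}

// lookupBlockContainingGuarantee reads the entry back by its key.
func lookupBlockContainingGuarantee(db kv, guaranteeID string) (string, bool) {
	blockID, ok := db["collection_block/"+guaranteeID]
	return blockID, ok
}

func main() {
	db := kv{}
	indexBlockByGuarantee(db, "guarantee-1", "block-9")
	fmt.Println(lookupBlockContainingGuarantee(db, "guarantee-1")) // prints block-9 true
}
```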


Comment on lines 222 to 223
func (b *Blocks) BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
return operation.BatchIndexBlockContainingCollectionGuarantees(lctx, rw, blockID, collIDs)
Reviewer:

Suggested change:

```diff
- func (b *Blocks) BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
- 	return operation.BatchIndexBlockContainingCollectionGuarantees(lctx, rw, blockID, collIDs)
+ func (b *Blocks) BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, guaranteeIDs []flow.Identifier) error {
+ 	return operation.BatchIndexBlockContainingCollectionGuarantees(lctx, rw, blockID, guaranteeIDs)
```

func IndexBlockContainingCollectionGuarantee(w storage.Writer, collID flow.Identifier, blockID flow.Identifier) error {
return UpsertByKey(w, MakePrefix(codeCollectionBlock, collID), blockID)
// - [storage.ErrAlreadyExists] if any collection guarantee is already indexed
func BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
Reviewer:

Suggested change:

```diff
- func BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
+ func BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, guaranteeIDs []flow.Identifier) error {
```

Reviewer:

Could you also please update the variable naming in LookupBlockContainingCollectionGuarantee below 🙏
