[Storage] Refactor index execution result #8005
base: master
Conversation
engine/execution/state/state.go
Outdated
		storage.LockIndexStateCommitment,
	}
	// Acquire locks to ensure it's concurrent safe when inserting the execution results and chunk data packs.
	return storage.WithLocks(s.lockManager, locks, func(lctx lockctx.Context) error {
Does it make sense to change the argument from lockctx.Context to lockctx.Proof?
Suggested change:
- return storage.WithLocks(s.lockManager, locks, func(lctx lockctx.Context) error {
+ return storage.WithLocks(s.lockManager, locks, func(lctx lockctx.Proof) error {
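For illustration, a minimal sketch of what the Proof-based signature could look like (the helper body and the lockctx method names NewContext, AcquireLock, and Release are assumptions, not the PR's actual code). The point is only that the callback needs to attest which locks are held, so the narrower Proof interface suffices:

	// WithLocks acquires the given locks in order and invokes fn with a read-only proof of the
	// held locks; through a Proof, fn can check which locks are held but cannot acquire or release any.
	func WithLocks(manager lockctx.Manager, lockIDs []string, fn func(lockctx.Proof) error) error {
		lctx := manager.NewContext() // assumed lockctx API
		defer lctx.Release()
		for _, id := range lockIDs {
			if err := lctx.AcquireLock(id); err != nil {
				return fmt.Errorf("could not acquire lock %q: %w", id, err)
			}
		}
		return fn(lctx) // a Context satisfies Proof, so it can be passed as the narrower type
	}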
	Build()
	builder := lockctx.NewDAGPolicyBuilder()

	addLocks(builder, LockGroupAccessFinalizingBlock)
I created lock groups so that it's easy to track where they are used. It's fine that duplicated lock policies are added; lockctx ensures there is no cycle that would cause a deadlock.
👍 Good idea. Splitting out the DAG definition into groups makes sense to me.
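To make the lock-group idea concrete, here is a rough sketch (only the names addLocks and LockGroupAccessFinalizingBlock come from the diff; the group contents, the builder type, and its Add method are illustrative assumptions). Each group lists, in acquisition order, the locks one code path holds together; registering a group adds one DAG edge per adjacent pair. Registering the same edge from several groups is harmless, and lockctx rejects any policy whose edges form a cycle, which is what rules out deadlock:

	// Illustrative only: the locks one code path acquires together, in acquisition order.
	var LockGroupAccessFinalizingBlock = []string{
		LockInsertBlock,          // assumed example lock IDs
		LockIndexStateCommitment,
	}

	// addLocks registers a group with the policy builder by adding an edge for each adjacent
	// pair, encoding the order in which that code path acquires its locks.
	func addLocks(builder lockctx.DAGPolicyBuilder, group []string) lockctx.DAGPolicyBuilder {
		for i := 0; i+1 < len(group); i++ {
			builder = builder.Add(group[i], group[i+1])
		}
		return builder
	}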
Nice work
		return fmt.Errorf("could not index initial genesis execution block: %w", err)
	}

	err = operation.IndexOwnOrSealedExecutionResult(lctx, rw, rootSeal.BlockID, rootSeal.ResultID)
Suggested change:
- err = operation.IndexOwnOrSealedExecutionResult(lctx, rw, rootSeal.BlockID, rootSeal.ResultID)
+ err = operation.IndexSealedOrMyExecutionResult(lctx, rw, rootSeal.BlockID, rootSeal.ResultID)
I think using "My" rather than "Own" would be clearer, just because "own" is also a common verb.
}

// SaveExecutionResults saves all data related to the execution of a block.
// It is concurrent safe because chunk data packs store is conflict-free (storing data by hash), and protocol data requires a lock to store, which will be synchronized.
Suggested change:
- // It is concurrent safe because chunk data packs store is conflict-free (storing data by hash), and protocol data requires a lock to store, which will be synchronized.
+ // Calling this function multiple times with the same input is a no-op and will not return an error.
(since we swallow ErrAlreadyExists)
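In other words, a repeat call hits the already-written index, and the resulting storage.ErrAlreadyExists is treated as success. A minimal sketch of that pattern (saveResultOnce is a hypothetical wrapper for illustration; imports omitted):

	func saveResultOnce(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID, resultID flow.Identifier) error {
		err := operation.IndexOwnOrSealedExecutionResult(lctx, rw, blockID, resultID)
		if errors.Is(err, storage.ErrAlreadyExists) {
			return nil // the same result was stored before: repeating the call is a no-op, not an error
		}
		return err
	}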
storage/locks.go
Outdated
// LockInsertOwnReceipt is intended for Execution Nodes to ensure that they never publish different receipts for the same block.
// Specifically, with this lock we prevent accidental overwrites of the index `executed block ID` ➜ `Receipt ID`.
LockInsertOwnReceipt = "lock_insert_own_receipt"
Suggested change:
- // LockInsertOwnReceipt is intended for Execution Nodes to ensure that they never publish different receipts for the same block.
- // Specifically, with this lock we prevent accidental overwrites of the index `executed block ID` ➜ `Receipt ID`.
- LockInsertOwnReceipt = "lock_insert_own_receipt"
+ // LockInsertMyReceipt is intended for Execution Nodes to ensure that they never publish different receipts for the same block.
+ // Specifically, with this lock we prevent accidental overwrites of the index `executed block ID` ➜ `Receipt ID`.
+ LockInsertMyReceipt = "lock_insert_my_receipt"
Similar suggestion again. I think we should align on using the "my receipt" terminology for increased clarity. We already use it in some places, for example the store module:
flow-go/storage/store/my_receipts.go, line 19 at 9dafecc:
type MyExecutionReceipts struct {
storage/locks.go
Outdated
LockBootstrapping = "lock_bootstrapping"
// LockInsertInstanceParams protects data that is *exclusively* written during bootstrapping.
LockInsertInstanceParams = "lock_insert_instance_params"
LockIndexCollectionsByBlock = "lock_index_collections_by_block"
Suggested change:
- LockIndexCollectionsByBlock = "lock_index_collections_by_block"
+ LockIndexBlockByPayloadGuarantees = "lock_index_block_by_payload_guarantees"
Collection guarantees are distinct from collections (and have different IDs), so I suggest making it clear here (and in the implementation) that we are referring specifically to collection guarantees.
Also, it feels like the index name is currently inverted. The underlying storage operation inserts entries like guaranteeID -> blockID. Usually in these cases we say we are indexing the block (the thing in the value) by guarantee (the thing in the key).
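To illustrate the direction the name should describe, here is a sketch mirroring the existing operation quoted further down (the function name IndexBlockByCollectionGuarantee is a hypothetical rename, not the PR's code). The guarantee ID is the key and the block ID is the value, so the index answers "which block contains this guarantee", i.e. it indexes the block by guarantee:

	func IndexBlockByCollectionGuarantee(w storage.Writer, guaranteeID flow.Identifier, blockID flow.Identifier) error {
		// key = prefix || guaranteeID, value = blockID  =>  look up the block *by* guarantee
		return UpsertByKey(w, MakePrefix(codeCollectionBlock, guaranteeID), blockID)
	}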
storage/store/blocks.go
Outdated
func (b *Blocks) BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
	return operation.BatchIndexBlockContainingCollectionGuarantees(lctx, rw, blockID, collIDs)
Suggested change:
- func (b *Blocks) BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
- 	return operation.BatchIndexBlockContainingCollectionGuarantees(lctx, rw, blockID, collIDs)
+ func (b *Blocks) BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, guaranteeIDs []flow.Identifier) error {
+ 	return operation.BatchIndexBlockContainingCollectionGuarantees(lctx, rw, blockID, guaranteeIDs)
storage/operation/headers.go
Outdated
func IndexBlockContainingCollectionGuarantee(w storage.Writer, collID flow.Identifier, blockID flow.Identifier) error {
	return UpsertByKey(w, MakePrefix(codeCollectionBlock, collID), blockID)
// - [storage.ErrAlreadyExists] if any collection guarantee is already indexed
func BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
Suggested change:
- func BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, collIDs []flow.Identifier) error {
+ func BatchIndexBlockContainingCollectionGuarantees(lctx lockctx.Proof, rw storage.ReaderBatchWriter, blockID flow.Identifier, guaranteeIDs []flow.Identifier) error {
Could you also please update the variable naming in LookupBlockContainingCollectionGuarantee below 🙏
Nice work
Co-authored-by: Jordan Schalm <[email protected]>
Work towards #7912