Skip to content

Conversation

@m-Peter
Copy link
Collaborator

@m-Peter m-Peter commented Jan 21, 2025

Closes: #727

Description


For contributor use:

  • Targeted PR against master branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

Summary by CodeRabbit

Release Notes

  • New Features

    • Added --experimental-soft-finality-enabled CLI flag to enable experimental soft finality feature (disabled by default; may produce incorrect results—use with caution).
    • Added --experimental-sealing-verification-enabled CLI flag for experimental sealing verification (enabled by default; may cause indexing halts on mismatches—use with caution).
  • Chores

    • Enhanced build tooling with improved dependency management.

@m-Peter m-Peter self-assigned this Jan 21, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 21, 2025

Walkthrough

This PR implements soft finality indexing by adding unsealed finalized block event streaming with verification. Key additions include a new RPCBlockTrackingSubscriber for subscribing to finalized blocks, a SealingVerifier component to validate events, EventsHash storage for tracking event hashes, and CrossSporkClient enhancements for block header subscription and event retrieval.

Changes

Cohort / File(s) Summary
Configuration & CLI
config/config.go, cmd/run/cmd.go
Add two new experimental feature flags and config fields: ExperimentalSoftFinalityEnabled and ExperimentalSealingVerificationEnabled, with CLI wiring and default values.
Bootstrap & Orchestration
bootstrap/bootstrap.go
Integrate new EventsHash storage, conditionally instantiate SealingVerifier and RPCBlockTrackingSubscriber based on experimental flags, and add height reset logic via EventsHash when ForceStartCadenceHeight is set.
Event Hash Storage
storage/pebble/events_hash.go, storage/pebble/keys.go, storage/pebble/db.go
New EventsHash type with store/retrieve operations for event hashes by height, tracking of processed sealed height, batch support utilities, and new storage keys.
Sealing Verification
services/ingestion/sealing_verifier.go
New SealingVerifier engine to verify sealed and unsealed event hashes, with backfill logic, caching of unsealed/sealed hashes, and synchronized event processing.
Block Tracking & Subscription
services/ingestion/block_tracking_subscriber.go
New RPCBlockTrackingSubscriber for subscribing to finalized block headers and retrieving associated EVM events, with spork backfill support and optional sealing verification integration.
Cross-Spork Client Enhancement
services/requester/cross-spork_client.go
Add methods to retrieve events for block headers, subscribe to block headers from start height, expose spork root height helpers, and validate continuous height ranges across past sporks.
Error Handling & Testing
services/ingestion/engine.go, models/events.go, services/testutils/mock_client.go, services/requester/cross-spork_client_test.go
Improve trace error messages with transaction context, add BlockEvents accessor to CadenceEvents, update mock client to return both root heights, and expand test coverage for spork boundary validation.
Build & Utilities
Makefile, storage/pebble/storage.go
Add verbose mode to go mod tidy commands and introduce private batch delete utility.

Sequence Diagram(s)

sequenceDiagram
    actor Bootstrap
    participant BTS as RPCBlockTrackingSubscriber
    participant SV as SealingVerifier
    participant Cross as CrossSporkClient
    participant EH as EventsHash
    
    Bootstrap->>BTS: Subscribe(ctx)
    BTS->>BTS: Backfill past sporks
    BTS->>SV: Start after backfill
    SV->>EH: Load processed sealed height
    SV->>Cross: Subscribe to sealed blocks (via backfill)
    SV->>SV: Run event loop
    
    loop Live Streaming
        BTS->>Cross: SubscribeBlockHeadersFromStartHeight
        Cross-->>BTS: Block headers stream
        BTS->>Cross: GetEventsForBlockHeader (finalized, unsealed)
        Cross-->>BTS: EVM events
        BTS->>SV: AddFinalizedBlock (unsealed events)
        SV->>SV: onUnsealedEvents: hash & cache
        
        par Parallel Verification
            SV->>Cross: Subscribe to sealed block events
            Cross-->>SV: Sealed events
            SV->>SV: onSealedEvents: hash & compare
            SV->>EH: verifyBlock if hashes match
            EH->>EH: SetProcessedSealedHeight
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

  • SealingVerifier (services/ingestion/sealing_verifier.go): Dense event synchronization logic with dual-cache state management (unsealed/sealed), backfill logic, and concurrent handling—verify cache consistency, error propagation, and edge cases around height advancement
  • RPCBlockTrackingSubscriber (services/ingestion/block_tracking_subscriber.go): Cross-spork backfill coordination with sealing verifier startup timing and live subscription logic—ensure correct event ordering and error recovery paths
  • CrossSporkClient enhancements (services/requester/cross-spork_client.go): New height boundary validation and spork root height tracking—confirm continuous range validation logic and backward compatibility with existing client selection
  • Bootstrap wiring (bootstrap/bootstrap.go): Conditional instantiation of new components based on experimental flags and integration with EventsHash reset logic—verify all code paths and proper cleanup
  • EventsHash storage (storage/pebble/events_hash.go): Batch operations and height-based removal logic—validate storage consistency and atomicity guarantees

Possibly related PRs

Suggested labels

Feature, EVM, Improvement

Suggested reviewers

  • zhangchiqing
  • peterargue
  • janezpodhostnik

Poem

🐰 Soft finality blooms so bright,
Block tracking through the day and night,
Events hashed and sealed with care,
Unsealed results now everywhere!
With verification's gentle hand,
The gateway works across the land. 🌸

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The PR title states "Implement the RPCBlockHeaderSubscriber for indexing finalized results," but the actual implementation introduces a component named RPCBlockTrackingSubscriber, not RPCBlockHeaderSubscriber. While the general concept of subscribing to block headers for indexing finalized results is correctly captured, the specific component name referenced in the title does not match what is actually implemented in the changeset, making the title misleading about the actual code changes. Update the PR title to accurately reflect the actual implementation name, such as "Implement the RPCBlockTrackingSubscriber for indexing finalized results" or similar language that correctly identifies the component being introduced. This ensures the title clearly and accurately communicates the primary change in the changeset.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues Check ✅ Passed The implementation comprehensively addresses all primary coding objectives from issue #727. The PR introduces the necessary infrastructure to subscribe to finalized blocks using SubscribeBlockHeadersFromStartHeight [#727], retrieve EVM events via the newly added GetEventsForBlockHeader method on CrossSporkClient [#727], and handle event indexing with optional sealing verification through the SealingVerifier component. Event hash tracking is implemented via the new EventsHash storage layer to support verification and manual rollback mechanisms. Configuration flags enable operators to control the soft finality feature and sealing verification behavior as intended in the PoC.
Out of Scope Changes Check ✅ Passed The vast majority of changes directly support the feature implementation described in issue #727. The core additions—RPCBlockTrackingSubscriber, SealingVerifier, event hash storage, and CrossSporkClient extensions—are all essential to the soft finality PoC. Supporting changes like the WithBatch utility, error message improvements in engine.go, and mock client updates are reasonable enablers. The Makefile change to add verbose logging to go mod tidy is a minor process improvement that does not detract from the main objectives.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch mpeter/poc-index-finalized-block-results

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c97d88 and 2862171.

📒 Files selected for processing (4)
  • Makefile (1 hunks)
  • bootstrap/bootstrap.go (5 hunks)
  • config/config.go (1 hunks)
  • models/events.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • models/events.go
  • Makefile
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-06T10:14:49.706Z
Learnt from: m-Peter
PR: onflow/flow-evm-gateway#890
File: services/ingestion/event_subscriber.go:235-239
Timestamp: 2025-10-06T10:14:49.706Z
Learning: In services/ingestion/event_subscriber.go, when reconnecting after disconnect errors (DeadlineExceeded, Internal, Unavailable), the subscription should reconnect at lastReceivedHeight rather than lastReceivedHeight+1. This avoids errors when the next height doesn't exist yet, and duplicate event processing is safe because the ingestion engine is explicitly designed to be idempotent (storage uses batch.Set() which overwrites existing entries).

Applied to files:

  • bootstrap/bootstrap.go
🧬 Code graph analysis (1)
bootstrap/bootstrap.go (5)
storage/pebble/events_hash.go (2)
  • EventsHash (14-16)
  • NewEventsHash (18-22)
services/ingestion/event_subscriber.go (2)
  • EventSubscriber (27-33)
  • NewRPCEventSubscriber (49-65)
services/ingestion/sealing_verifier.go (2)
  • SealingVerifier (30-54)
  • NewSealingVerifier (57-82)
services/ingestion/block_tracking_subscriber.go (1)
  • NewRPCBlockTrackingSubscriber (48-66)
models/errors/errors.go (1)
  • ErrStorageNotInitialized (37-37)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (6)
config/config.go (1)

123-131: LGTM! Experimental feature flags are well-documented.

The new configuration fields are clearly named and include appropriate warnings about their experimental nature and potential risks. The comments adequately explain what each feature does and the cautions operators should be aware of.

bootstrap/bootstrap.go (5)

60-60: LGTM! Storages struct enhancement follows existing patterns.

The EventsHash field addition is consistent with other storage components and properly typed.


613-613: LGTM! EventsHash initialization follows established patterns.

The initialization is consistent with other storage components in the same function.


706-706: LGTM! EventsHash properly wired into returned Storages.

The EventsHash component is correctly included in the returned Storages struct, completing the integration of the events hash storage layer.


625-642: Rollback logic is correct; no changes needed.

Verification confirms BatchRemoveAboveHeight handles uninitialized storage safely. When storage is uninitialized, the first call to GetByHeight(height + 1) returns ErrEntityNotFound, triggering an immediate nil return. The method operates as an idempotent no-op on empty storage, so calling it unconditionally is safe.

The force start cadence height logic properly implements rollback semantics with correct error handling throughout.


163-193: No changes required — nil verifier is correctly handled.

Verification confirms that RPCBlockTrackingSubscriber properly guards all verifier accesses with nil checks. At lines 96, 118, and 203 of services/ingestion/block_tracking_subscriber.go, verifier methods are only invoked after confirming r.verifier != nil. The optional verifier design correctly maps to the conditional initialization logic in bootstrap.go.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@peterargue peterargue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a couple small comments, but otherwise this looks good. let's get it running against one of the live networks.

@m-Peter
Copy link
Collaborator Author

m-Peter commented Jan 23, 2025

@peterargue Thanks for the review 🙌 I've addressed all the comments.
Did you perhaps have the chance to take a look at the behavior described in #727 (comment) ? This is puzzling to me 🤔

@m-Peter m-Peter marked this pull request as ready for review January 23, 2025 15:29
coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

@m-Peter m-Peter force-pushed the mpeter/poc-index-finalized-block-results branch from 314e7cc to bfe6188 Compare January 27, 2025 09:43
coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
services/ingestion/block_tracking_subscriber.go (1)

343-359: Add maximum retry attempts to prevent infinite loops.

The retry loop could run indefinitely if NotFound or ResourceExhausted errors persist. This was previously flagged but not addressed.

+const maxRetries = 10
+retryCount := 0
+retryDelay := 200 * time.Millisecond
+
 // retry until we get the block from an execution node that has the events
 for {
     evts, err = r.client.GetEventsForBlockHeader(
         ctx,
         eventType,
         blockHeader,
     )
     if err != nil {
         // retry after a short pause
         if status.Code(err) == codes.NotFound || status.Code(err) == codes.ResourceExhausted {
-            time.Sleep(200 * time.Millisecond)
+            retryCount++
+            if retryCount > maxRetries {
+                return flow.BlockEvents{}, fmt.Errorf("max retries exceeded after %d attempts: %w", maxRetries, err)
+            }
+            time.Sleep(retryDelay)
+            retryDelay = retryDelay * 2 // exponential backoff
+            if retryDelay > 5*time.Second {
+                retryDelay = 5*time.Second // cap at 5 seconds
+            }
             continue
         }
 
         return flow.BlockEvents{}, fmt.Errorf("failed to get events from access node: %w", err)
     }
     break
 }
🧹 Nitpick comments (2)
services/ingestion/block_tracking_subscriber.go (2)

110-118: Consider using Error logging instead of Fatal for verifier failures.

Using Fatal logging will terminate the entire application if the verifier encounters an error. This might be too severe for a verification component that's optional and experimental.

 go func() {
     r.verifier.SetStartHeight(r.height)
     if err := r.verifier.Run(ctx); err != nil {
-        r.logger.Fatal().Err(err).Msg("failure running sealing verifier")
+        r.logger.Error().Err(err).Msg("failure running sealing verifier")
         return
     }
 }()

235-240: Consider making the block notification delay configurable.

The hardcoded value of 15 blocks for the notification delay might need adjustment based on network conditions or AN sync performance.

+const blockNotificationDelay = 15
+
 // ...
-if len(blockHeadersQueue) > 15 {
+if len(blockHeadersQueue) > blockNotificationDelay {
     earliestBlockHeader := blockHeadersQueue[0]
     r.keyLock.NotifyBlock(earliestBlockHeader)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7575806 and 6f5b3c9.

📒 Files selected for processing (1)
  • services/ingestion/block_tracking_subscriber.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test
  • GitHub Check: Lint
🔇 Additional comments (1)
services/ingestion/block_tracking_subscriber.go (1)

28-45: Well-documented experimental feature with appropriate warnings.

The extensive documentation clearly outlines the limitations and risks of using unsealed finalized blocks. The warning about manual tracking requirements and Proof of Concept usage is appropriately prominent.

coderabbitai[bot]

This comment was marked as outdated.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
go.mod (1)

3-3: Invalid go directive: use major.minor (no patch).

The go version in go.mod must be like 1.25, not 1.25.0. This breaks tooling.

Apply:

-go 1.25.0
+go 1.25
tests/go.mod (1)

3-3: Invalid go directive in tests module.

Same as root: use 1.25, not 1.25.0.

Apply:

-go 1.25.0
+go 1.25
🧹 Nitpick comments (4)
tests/go.mod (4)

14-14: Version drift: zerolog differs between root (v1.33.0) and tests (v1.34.0).

Not harmful, but avoid dual versions unless needed. Consider aligning to one version across modules.

Apply (either direction):

-	github.com/rs/zerolog v1.34.0
+	github.com/rs/zerolog v1.33.0

181-184: Prometheus stack minor bumps only in tests: consider aligning with root.

To reduce duplicate versions and metric wiring surprises, align client_golang, common, and procfs with the root module.


233-233: Duplicate YAML modules in graph.

Both go.yaml.in/yaml/v2 (v2.4.2) and gopkg.in/yaml.v2 (v2.4.0 at Line 252) appear. Prefer a single path if possible; if transitive, consider bumping deps to converge, then go mod tidy.


249-250: gRPC/protobuf patch drift vs root.

Tests use grpc v1.75.1 and protobuf v1.36.8; root has v1.75.0 and v1.36.7. Harmless, but aligning reduces duplicate loads and avoids subtle ABI issues.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8f7437 and efd12b0.

⛔ Files ignored due to path filters (2)
  • go.sum is excluded by !**/*.sum
  • tests/go.sum is excluded by !**/*.sum
📒 Files selected for processing (2)
  • go.mod (2 hunks)
  • tests/go.mod (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Lint
  • GitHub Check: Test
🔇 Additional comments (4)
go.mod (2)

14-14: flow-go bump to v0.43.0: confirm API compatibility.

v0.43.0 likely changes Access API/proto shapes. Please confirm RPCBlockHeaderSubscriber/CrossSporkClient compile and run against this version in CI, and that protobuf-gen outputs were refreshed.


148-156: Contract/template/protobuf version sync looks good; verify event types.

Upgrading core-contracts, templates, NFT, and flow protobuf to newer versions is consistent with finalized-header indexing. Please verify emitted EVM event type identifiers and payload fields used by the indexer didn’t change, and run go mod tidy -compat=1.25 to lock a coherent graph.

tests/go.mod (2)

10-10: Emulator pseudo‑version: does it expose finalized‑header subscription?

You noted E2E is blocked because the Emulator lacks this endpoint. This update suggests a very recent pseudo‑version; please confirm whether it now implements finalized‑header subscription. If not, gate/skip tests that depend on it to keep CI green.


12-12: Align flow-go version with root module (OK).

Tests now use v0.43.0, matching the root. Good for API parity.

Run go mod tidy -compat=1.25 in both modules to ensure the graph is minimal and consistent.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
bootstrap/bootstrap.go (1)

618-634: API verification confirms the atomicity concern is valid: SetProcessedSealedHeight creates its own batch and cannot participate in atomic writes, while SetLatestCadenceHeight accepts an external batch.

The issue is real. SetProcessedSealedHeight(height uint64) only accepts the height parameter and internally calls WithBatch(e.store, func(batch *pebbleDB.Batch) error {...}), creating and committing its own batch immediately. In contrast, SetLatestCadenceHeight(height uint64, batch *pebble.Batch) accepts an external batch parameter, allowing it to participate in the deferred atomic commit. If batch.Commit() fails after SetProcessedSealedHeight has already persisted, the two stores will be left inconsistent. The suggested deferred approach is the correct solution given the API constraints.

🧹 Nitpick comments (2)
bootstrap/bootstrap.go (2)

119-137: Use the provided ctx instead of context.Background

Propagate cancellation/timeouts from StartEventIngestion.

Apply:

-	latestCadenceBlock, err := b.client.GetLatestBlock(context.Background(), true)
+	latestCadenceBlock, err := b.client.GetLatestBlock(ctx, true)
@@
-	_, err = b.client.GetBlockHeaderByHeight(context.Background(), latestCadenceHeight)
+	_, err = b.client.GetBlockHeaderByHeight(ctx, latestCadenceHeight)

157-187: Nil-verifier safety is already properly handled; logging suggestion remains optional

The existing code already guards all verifier dereferences with nil-checks (block_tracking_subscriber.go lines 110 and 195), and StartEventIngestion is correctly called after StartAPIServer. The suggested logging additions from the diff have not been applied—these remain optional for improved observability.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a404eab and 6552a69.

📒 Files selected for processing (1)
  • bootstrap/bootstrap.go (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
bootstrap/bootstrap.go (5)
storage/pebble/events_hash.go (2)
  • EventsHash (14-16)
  • NewEventsHash (18-22)
services/ingestion/event_subscriber.go (2)
  • EventSubscriber (27-33)
  • NewRPCEventSubscriber (49-65)
services/ingestion/sealing_verifier.go (2)
  • SealingVerifier (30-54)
  • NewSealingVerifier (57-82)
services/ingestion/block_tracking_subscriber.go (1)
  • NewRPCBlockTrackingSubscriber (47-65)
models/errors/errors.go (1)
  • ErrStorageNotInitialized (36-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test
  • GitHub Check: Lint
🔇 Additional comments (3)
bootstrap/bootstrap.go (3)

60-61: Add EventsHash to Storages — LGTM

Field is appropriately scoped and enables verifier wiring.


607-608: Create EventsHash — LGTM

Initialization matches the Storage instance.


697-698: Wire EventsHash into Storages — LGTM

Storage bundle exposes EventsHash for verifier/consumers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
services/ingestion/block_tracking_subscriber.go (2)

143-294: Consider adding exponential backoff for reconnection attempts.

The reconnection logic on Line 279 attempts to reconnect on NotFound errors without any backoff delay. Consider adding exponential backoff to avoid tight reconnection loops during transient failures.

Example enhancement:

+	reconnectDelay := 200 * time.Millisecond
+	maxReconnectDelay := 5 * time.Second
+
 				if err := connect(lastReceivedHeight + 1); err != nil {
+					time.Sleep(reconnectDelay)
+					reconnectDelay = reconnectDelay * 2
+					if reconnectDelay > maxReconnectDelay {
+						reconnectDelay = maxReconnectDelay
+					}
 					eventsChan <- models.NewBlockEventsError(

353-369: Add maximum retry limit to prevent infinite loops.

The retry loop on Lines 353-369 can run indefinitely on persistent NotFound or ResourceExhausted errors. Consider adding a maximum retry count with exponential backoff to prevent infinite loops during prolonged outages.

Example enhancement:

+	const maxRetries = 30
+	retryCount := 0
+	retryDelay := 200 * time.Millisecond
+
 	// retry until we get the block from an execution node that has the events
 	for {
 		evts, err = r.client.GetEventsForBlockHeader(
 			ctx,
 			eventType,
 			blockHeader,
 		)
 		if err != nil {
 			// retry after a short pause
 			if status.Code(err) == codes.NotFound || status.Code(err) == codes.ResourceExhausted {
+				retryCount++
+				if retryCount > maxRetries {
+					return flow.BlockEvents{}, fmt.Errorf("max retries exceeded after %d attempts: %w", maxRetries, err)
+				}
-				time.Sleep(200 * time.Millisecond)
+				time.Sleep(retryDelay)
+				retryDelay = retryDelay * 2
+				if retryDelay > 5*time.Second {
+					retryDelay = 5*time.Second
+				}
 				continue
 			}
storage/pebble/events_hash.go (1)

47-62: Document the sequential insertion assumption.

The BatchRemoveAboveHeight method assumes event hashes are inserted sequentially with no gaps (line 52-53 comment). Consider adding a top-level comment on the EventsHash struct documenting this invariant, as it's critical for the correctness of this deletion logic.

Example enhancement:

+// EventsHash provides storage for per-height event hashes.
+//
+// INVARIANT: Event hashes are stored sequentially with no gaps in height.
+// This invariant is relied upon by BatchRemoveAboveHeight for efficient cleanup.
 type EventsHash struct {
 	store *Storage
 }
services/ingestion/sealing_verifier.go (1)

40-48: Consider monitoring cache sizes for production deployment.

The unsealedBlocksToVerify and sealedBlocksToVerify maps have no explicit size bounds. While normal operation should keep these small (cleaned up in verifyBlock at lines 415-416), a sustained divergence between sealed and unsealed streams could cause unbounded growth.

For production use, consider adding metrics to monitor cache sizes and alerts for divergence thresholds.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e302a12 and 5c97d88.

📒 Files selected for processing (9)
  • bootstrap/bootstrap.go (5 hunks)
  • models/events.go (1 hunks)
  • services/ingestion/block_tracking_subscriber.go (1 hunks)
  • services/ingestion/sealing_verifier.go (1 hunks)
  • services/requester/cross-spork_client.go (7 hunks)
  • services/requester/cross-spork_client_test.go (2 hunks)
  • services/testutils/mock_client.go (1 hunks)
  • storage/pebble/events_hash.go (1 hunks)
  • storage/pebble/storage.go (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-06T10:14:49.706Z
Learnt from: m-Peter
PR: onflow/flow-evm-gateway#890
File: services/ingestion/event_subscriber.go:235-239
Timestamp: 2025-10-06T10:14:49.706Z
Learning: In services/ingestion/event_subscriber.go, when reconnecting after disconnect errors (DeadlineExceeded, Internal, Unavailable), the subscription should reconnect at lastReceivedHeight rather than lastReceivedHeight+1. This avoids errors when the next height doesn't exist yet, and duplicate event processing is safe because the ingestion engine is explicitly designed to be idempotent (storage uses batch.Set() which overwrites existing entries).

Applied to files:

  • services/ingestion/block_tracking_subscriber.go
📚 Learning: 2025-03-07T01:35:09.751Z
Learnt from: peterargue
PR: onflow/flow-evm-gateway#772
File: services/requester/keystore/key_store.go:50-62
Timestamp: 2025-03-07T01:35:09.751Z
Learning: In the flow-evm-gateway codebase, panics are acceptable in scenarios where immediate detection of critical bugs is desired during development and testing, particularly for invariant violations that should never occur in a correctly functioning system (e.g., when a key is available but locked in the keystore implementation).

Applied to files:

  • services/ingestion/block_tracking_subscriber.go
🧬 Code graph analysis (6)
storage/pebble/events_hash.go (3)
storage/pebble/storage.go (1)
  • Storage (13-16)
storage/pebble/db.go (1)
  • WithBatch (65-84)
models/errors/errors.go (2)
  • ErrEntityNotFound (38-38)
  • ErrStorageNotInitialized (36-36)
services/ingestion/sealing_verifier.go (5)
services/ingestion/engine.go (1)
  • Engine (40-57)
models/engine.go (2)
  • EngineStatus (19-23)
  • NewEngineStatus (25-31)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (111-116)
storage/pebble/events_hash.go (1)
  • EventsHash (14-16)
models/errors/errors.go (3)
  • ErrStorageNotInitialized (36-36)
  • ErrDisconnected (24-24)
  • ErrEntityNotFound (38-38)
services/ingestion/block_tracking_subscriber.go (6)
services/ingestion/event_subscriber.go (2)
  • RPCEventSubscriber (37-47)
  • NewRPCEventSubscriber (49-65)
services/ingestion/sealing_verifier.go (1)
  • SealingVerifier (30-54)
services/requester/cross-spork_client.go (1)
  • CrossSporkClient (111-116)
services/requester/keystore/key_store.go (1)
  • KeyLock (21-37)
models/events.go (3)
  • BlockEvents (213-216)
  • NewBlockEventsError (257-261)
  • NewSingleBlockEvents (240-255)
models/errors/errors.go (1)
  • ErrDisconnected (24-24)
bootstrap/bootstrap.go (5)
storage/pebble/events_hash.go (2)
  • EventsHash (14-16)
  • NewEventsHash (18-22)
services/ingestion/event_subscriber.go (2)
  • EventSubscriber (27-33)
  • NewRPCEventSubscriber (49-65)
services/ingestion/sealing_verifier.go (2)
  • SealingVerifier (30-54)
  • NewSealingVerifier (57-82)
services/ingestion/block_tracking_subscriber.go (1)
  • NewRPCBlockTrackingSubscriber (48-66)
models/errors/errors.go (1)
  • ErrStorageNotInitialized (36-36)
services/requester/cross-spork_client_test.go (2)
services/testutils/mock_client.go (1)
  • SetupClientForRange (64-77)
services/requester/cross-spork_client.go (1)
  • NewCrossSporkClient (120-178)
services/requester/cross-spork_client.go (1)
models/events.go (1)
  • BlockEvents (213-216)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (33)
services/testutils/mock_client.go (1)

100-101: LGTM! Test mock aligned with spork root height changes.

Setting both NodeRootBlockHeight and SporkRootBlockHeight to startHeight provides consistent test behavior and aligns with the broader PR changes that introduce spork root height handling in the CrossSporkClient.

storage/pebble/storage.go (1)

61-65: LGTM! Batch delete helper follows established patterns.

The new delete method mirrors the existing set method's signature and batch-oriented design, enabling atomic deletion operations used by the EventsHash storage cleanup logic.

models/events.go (1)

155-158: LGTM! Clean accessor for underlying Flow events.

The new BlockEvents() getter provides a straightforward way to access the internal flow.BlockEvents, supporting the integration with the SealingVerifier workflow.

services/requester/cross-spork_client_test.go (1)

76-172: LGTM! Enhanced test coverage for spork boundary scenarios.

The test restructuring into explicit subtests improves clarity and adds valuable coverage for:

  • Continuous range validation between past and current sporks
  • Gap detection between spork root and node root heights
  • Gap detection between past spork end and current spork root

These scenarios align well with the updated CrossSporkClient validation logic.

bootstrap/bootstrap.go (5)

60-60: LGTM! EventsHash field added to Storages.

The new EventsHash field integrates the events hash storage mechanism into the Storages struct, enabling verification workflows.


157-187: LGTM! Subscriber selection logic properly gated by experimental flags.

The conditional logic appropriately selects between RPCBlockTrackingSubscriber (with optional SealingVerifier) and the original RPCEventSubscriber based on the experimental flags. The verifier is correctly instantiated only when both soft finality and sealing verification are enabled.


607-607: LGTM! EventsHash properly initialized.

The EventsHash is correctly constructed using NewEventsHash(store) during storage setup.


621-636: LGTM! Proper cleanup of EventsHash when forcing start height.

The logic correctly:

  • Retrieves the processed sealed height
  • Updates it to ForceStartCadenceHeight if it's higher
  • Removes event hashes above the forced height

This ensures consistency when reindexing from a specific height.


700-700: LGTM! EventsHash included in returned Storages.

The EventsHash is properly included in the returned Storages struct, completing the wiring.

services/ingestion/block_tracking_subscriber.go (6)

24-25: LGTM! Well-defined error types for specific failure scenarios.

The error definitions clearly distinguish between system transaction failures and spork root block edge cases.


42-46: LGTM! Clear struct composition with optional verifier.

The struct properly extends RPCEventSubscriber and includes an optional SealingVerifier for experimental verification workflows.


48-66: LGTM! Constructor properly initialized.

The constructor correctly composes the base RPCEventSubscriber and includes the optional verifier.


73-137: LGTM! Subscribe method handles backfill and verifier lifecycle.

The implementation correctly:

  • Handles backfill for past sporks
  • Feeds backfilled events to the verifier
  • Starts the verifier after backfilling (since backfilled data is sealed)
  • Transitions to live subscription

The verifier lifecycle management is appropriate for the PoC workflow.


296-342: LGTM! Event fetching logic properly handles block and transaction events.

The implementation correctly:

  • Fetches block events and checks for empty transaction hash
  • Handles ErrSystemTransactionFailed gracefully by continuing to fetch transaction events
  • Combines block and transaction events for processing

The error handling for system transaction failures is appropriate.


380-410: LGTM! Proper validation of block executed events.

The logic correctly:

  • Validates the expected number of block events
  • Handles the spork root block edge case
  • Verifies system transaction status when events are missing
  • Returns appropriate errors for unexpected scenarios

The defensive checks ensure data integrity.

storage/pebble/events_hash.go (5)

14-22: LGTM! Clean constructor for EventsHash storage.

The struct and constructor follow established pebble storage patterns.


24-35: LGTM! Store method uses batch operations for atomicity.

The method correctly uses WithBatch to ensure atomic storage of event hashes with proper error wrapping.


37-44: LGTM! GetByHeight properly retrieves and converts hashes.

The method correctly retrieves the stored bytes and converts them to flow.Identifier with appropriate error handling.


64-74: LGTM! Proper handling of uninitialized storage.

The ProcessedSealedHeight method correctly distinguishes between uninitialized storage (ErrStorageNotInitialized) and other errors.


76-87: LGTM! Batch variants support atomic operations.

Both SetProcessedSealedHeight and BatchSetProcessedSealedHeight are properly implemented, with the former delegating to the batch variant via WithBatch.

services/requester/cross-spork_client.go (6)

112-115: LGTM! Struct updated with spork root height field.

The currentSporkRootHeight field properly captures the spork root block height for boundary checks.


126-149: LGTM! Proper distinction between spork root and node root heights.

The initialization logic correctly:

  • Captures both SporkRootBlockHeight and NodeRootBlockHeight
  • Logs a warning when they differ (indicating missing blocks in the gap)
  • Documents the rationale for using SporkRootBlockHeight for checks

This handles the edge case where nodes are bootstrapped after the spork root block.


163-170: LGTM! Validation ensures continuous height ranges.

The validation correctly checks that:

  • Past spork clients form a continuous range
  • The last past spork ends at the node root block height (the first available height on the node)

This ensures no gaps in the queryable height range.


182-193: LGTM! New public methods expose spork root height information.

The new methods IsSporkRootBlockHeight and CurrentSporkRootHeight provide clean APIs for querying spork boundary information used by the block tracking subscriber.


288-298: LGTM! GetEventsForBlockHeader enables finalized block queries.

The method correctly uses GetEventsForBlockIDs (rather than the height range endpoint) to allow querying finalized but unsealed blocks, which is essential for the soft-finality feature.


300-315: LGTM! Block header subscription properly exposed.

The SubscribeBlockHeadersFromStartHeight method correctly:

  • Delegates to the appropriate client for the start height
  • Casts to grpc.Client to access the subscription API
  • Returns appropriate errors if the cast fails

This enables the block tracking subscriber to follow finalized blocks.

services/ingestion/sealing_verifier.go (7)

56-82: Constructor initialization looks correct.

The logic to compute lastProcessedUnsealedHeight from startHeight is appropriate, correctly handling the edge case when startHeight is 0.


139-164: Connection retry logic is reasonable for a PoC.

The string-based error matching (lines 153-154) is fragile but may be the only option if the Access API doesn't provide structured error types. The infinite retry with logging should make issues observable during operation.

Consider verifying whether the Access API provides structured error types that could replace the string matching for more robust error handling.


228-295: Backfill logic correctly handles missing block events.

The length check at line 263 correctly compares the number of BlockEvents entries (one per height) rather than individual events. The accumulation logic (lines 267-288) appropriately handles the edge case where EVM block events are missing by grouping transactions into the next available block.


297-347: Sequential ordering and caching logic is sound.

The CompareAndSwap check (lines 304-308) ensures sealed events are processed in strict sequential order, which is appropriate for this verification use case. The dual condition at line 324 correctly handles both missing unsealed data and rollback scenarios.

The logger.Fatal usage (line 334) was previously flagged in past reviews.


349-393: Unsealed event processing mirrors sealed event logic correctly.

The operation ordering (lines 362-371) is correct: store the hash before updating the height counter to maintain consistency. The symmetric structure with onSealedEvents makes the verification logic easier to understand.


429-477: Hash calculation and canonical ordering implemented correctly.

The canonical sorting (lines 464-469) by transaction index and event index ensures deterministic hash computation. The conversion logic properly handles all required fields including transaction IDs and indices.


102-123: Storage initialization logic handles edge cases correctly.

The initialization at lines 115-117 properly handles the case when startHeight is 0, ensuring lastVerifiedHeight is set appropriately. The fallback to initialize storage (lines 118-120) when ErrStorageNotInitialized is encountered is a good defensive pattern.

Comment on lines +187 to +193
case sealedEvents, ok := <-eventsChan:
if !ok {
if ctx.Err() != nil {
return nil
}
return fmt.Errorf("failed to receive block events: %w", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Undefined variable err at line 192.

When eventsChan closes unexpectedly (line 188), the error message at line 192 attempts to wrap an undefined err variable. The err is only in scope within the errChan case (line 201).

Apply this diff to fix the undefined variable:

 		case sealedEvents, ok := <-eventsChan:
 			if !ok {
 				if ctx.Err() != nil {
 					return nil
 				}
-				return fmt.Errorf("failed to receive block events: %w", err)
+				return fmt.Errorf("sealed events channel closed unexpectedly")
 			}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case sealedEvents, ok := <-eventsChan:
if !ok {
if ctx.Err() != nil {
return nil
}
return fmt.Errorf("failed to receive block events: %w", err)
}
case sealedEvents, ok := <-eventsChan:
if !ok {
if ctx.Err() != nil {
return nil
}
return fmt.Errorf("sealed events channel closed unexpectedly")
}
🤖 Prompt for AI Agents
In services/ingestion/sealing_verifier.go around lines 187 to 193, the error
return wraps an undefined variable `err` when eventsChan is closed; replace the
invalid fmt.Errorf(..., err) with a concrete error (for example
fmt.Errorf("failed to receive block events: channel closed") or
errors.New("events channel closed")) so the function returns a valid error value
when the events channel closes unexpectedly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PoC to allow indexing unsealed finalized execution results

5 participants