Skip to content

fix(txm): nonce gap#561

Merged
GabrielMartinezRodriguez merged 5 commits intomasterfrom
gabriel/fix-randomnness
Apr 10, 2025
Merged

fix(txm): nonce gap#561
GabrielMartinezRodriguez merged 5 commits intomasterfrom
gabriel/fix-randomnness

Conversation

@GabrielMartinezRodriguez
Copy link
Contributor

@GabrielMartinezRodriguez GabrielMartinezRodriguez commented Mar 26, 2025

Description

The randomness service got stuck due to a nonce gap. This happened because, in the case where the server crashes for some reason, we might leave a transaction unsaved, while at the same time saving another transaction with a higher nonce. When the nonce manager starts and calculates the nonce gaps, we were incorrectly assuming that the nonce obtained from the blockchain's transaction count represents the last executed transaction for that account. However, that value actually represents the next available nonce.

For example, suppose we are processing two new transactions with nonces 9 and 10. If the server crashes and only the transaction with nonce 10 is saved, when the service restarts, it will calculate the nonce gaps assuming that nonce 9 was already used—because the transaction count returned from the blockchain indicates the next available nonce, not the last one executed. This leads to the gap and causes the service to get stuck

  • Include all relevant context (but no need to repeat the issue's content).
  • Draw attention to new, noteworthy & unintuitive elements.
Toggle Checklist

Checklist

Basics

  • B1. I have applied the proper label & proper branch name (e.g. norswap/build-system-caching).
  • B2. This PR is not so big that it should be split & addresses only one concern.
  • B3. The PR targets the lowest branch it can (ideally master).

Reminder: PR review guidelines

Correctness

  • C1. Builds and passes tests.
  • C2. The code is properly parameterized & compatible with different environments (e.g. local,
    testnet, mainnet, standalone wallet, ...).
  • C3. I have manually tested my changes & connected features.

< INDICATE BROWSER, DEMO APP & OTHER ENV DETAILS USED FOR TESTING HERE >

< INDICATE TESTED SCENARIOS (USER INTERFACE INTERACTION, CODE FLOWS) HERE >

  • C4. I have performed a thorough self-review of my code after submitting the PR,
    and have updated the code & comments accordingly.

Architecture & Documentation

  • D1. I made it easy to reason locally about the code, by (1) using proper abstraction boundaries,
    (2) commenting these boundaries correctly, (3) adding inline comments for context when needed.
  • D2. All public-facing APIs & meaningful (non-local) internal APIs are properly documented in code
    comments.
  • D3. If appropriate, the general architecture of the code is documented in a code comment or
    in a Markdown document.
  • D4. An appropriate Changeset has been generated (and committed) for changes that touch npm published packages (currently packages/core and packages/react), see here for more info.

@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Mar 26, 2025

Deploying happychain with  Cloudflare Pages  Cloudflare Pages

Latest commit: de3af6c
Status: ✅  Deploy successful!
Preview URL: https://d2fae4a2.happychain.pages.dev
Branch Preview URL: https://gabriel-fix-randomnness.happychain.pages.dev

View logs

This was referenced Mar 26, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez added the reviewing-1 Ready for, or undergoing first-line review label Mar 26, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez marked this pull request as ready for review March 26, 2025 14:58
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/fix-randomnness branch 2 times, most recently from aa82f76 to 5b83c84 Compare March 27, 2025 10:32
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/fix-randomnness branch 3 times, most recently from 1beb802 to 3bfc7d8 Compare March 27, 2025 15:39
})

this.maxExecutedNonce = blockchainNonce
this.maxExecutedNonce = blockchainNonce - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably need to clamp this to not go below zero?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, edge case for when nonce is 0 is unlikely but technically possible for a new txm instance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s actually the expected behavior for it to be -1. If the user has an account that has never been used, the blockchain nonce will be 0, and therefore the max executed nonce will become -1. We use the max executed nonce to check if there is any pending transaction with a nonce lower than the maximum executed nonce, and if so, we move it to Interrupted.

if (nonce <= this.transactionManager.nonceManager.maxExecutedNonce) {
    transaction.changeStatus(TransactionStatus.Interrupted)
    return
}

So I think it’s fine for it to remain at -1 because it serves our purpose.

If we change it to undefined, I think the code is going to get messier, because we would have to handle the undefined case in several places, whereas -1 is fully functional

@aodhgan aodhgan added reviewing-2 Ready for, or undergoing final review and removed reviewing-1 Ready for, or undergoing first-line review labels Apr 1, 2025
@aodhgan aodhgan added question Something has to be cleared up after review and removed reviewing-2 Ready for, or undergoing final review labels Apr 1, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez added reviewing-2 Ready for, or undergoing final review and removed question Something has to be cleared up after review labels Apr 3, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/fix-randomnness branch 2 times, most recently from 840cd07 to 51d9734 Compare April 7, 2025 11:45
Base automatically changed from gabriel/txm-telemetry to master April 7, 2025 11:58
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to revert — if it causes an issue for you, you'll need to foundryup to get the 1.0.0 release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@norswap norswap added merge-after-changes Can be merged after requested changes are made and removed reviewing-2 Ready for, or undergoing final review labels Apr 8, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez merged commit 4e5cb84 into master Apr 10, 2025
4 checks passed
@GabrielMartinezRodriguez GabrielMartinezRodriguez deleted the gabriel/fix-randomnness branch April 10, 2025 09:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-after-changes Can be merged after requested changes are made

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants