Skip to content

Conversation

@TheBlueMatt
Copy link
Collaborator

No description provided.

If while a splice is pending, the channel happens to not have any
commitment updates, but did prior to the splice being negotiated, it's
possible that we end up with bogus holder HTLC data for the previous
commitment. After the splice becomes locked, we've successfully
transitioned to the new funding transaction, but that funding
transaction never had a commitment transaction negotiated for the
previous state.
@TheBlueMatt TheBlueMatt added this to the 0.2 milestone Oct 21, 2025
@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Oct 21, 2025

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@codecov
Copy link

codecov bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 96.77419% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.80%. Comparing base (05f2848) to head (fb7ce22).
⚠️ Report is 21 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channel.rs 87.23% 5 Missing and 1 partial ⚠️
lightning/src/util/test_channel_signer.rs 96.42% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4167      +/-   ##
==========================================
+ Coverage   88.78%   88.80%   +0.01%     
==========================================
  Files         180      180              
  Lines      137004   137262     +258     
  Branches   137004   137262     +258     
==========================================
+ Hits       121642   121896     +254     
- Misses      12538    12561      +23     
+ Partials     2824     2805      -19     
Flag Coverage Δ
fuzzing 21.49% <17.00%> (+0.53%) ⬆️
tests 88.64% <96.77%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@TheBlueMatt TheBlueMatt force-pushed the 2025-10-upgrade-scid-timeout branch 3 times, most recently from 54881f5 to 545854f Compare October 21, 2025 22:53
@TheBlueMatt TheBlueMatt self-assigned this Oct 22, 2025
@ldk-reviews-bot
Copy link

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Comment on lines 13249 to 13251
// `position` above will also return `None` if we have historical scids but they all
// need to be removed, so `end` should point to the last index in such cases.
.unwrap_or(self.context.historical_scids.len());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be safe, could you move this up after position and change that and_then to a map? I think it should currently work as written, but arguably that change is more correct. Otherwise, we're assuming historical_scids is empty when the current funding doesn't have an SCID.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote the whole thing with filter+count.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, that requires traversing the entire list where the previous approach only needs to traverse the ones to drop.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do that many on-chain splice transactions, I think you can afford the CPU :)

@ldk-reviews-bot
Copy link

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

wpaulino and others added 4 commits October 24, 2025 21:10
We relied on `position` giving us the last index we need to prune, but
this may return `None` when all known legacy SCIDs need to be pruned. In
such cases, we ended up not pruning any of the legacy SCIDs at all.

Rewritten by: Matt Corallo <[email protected]>
When splicing, we're required by protocol to retain all the
existing keys material except the funding key which we're allowed
to rotate. In the original implementation we acknowledged that but
figured we'd stick with a single `pubkey` method in the
`ChannelSigner` anyway cause adding a specific method for it is
annoying.

Sadly, this was ultimately broken - in `FundingScope::for_splice`,
we called the signer's `new_pubkeys` method (renamed from `pubkeys`
after splicing initially landed), replacing all of the public keys
the `Channel` would use rather than just the funding key. This can
result in commitment signature mismatches if the signer changes any
keys aside from the funding one.

`InMemorySigner` did not do so, however, so we didn't notice the
bug. Luckily-ish, in 189b8ac we
started generating a fresh `remote_key` when splicing (at least
when upgrading from 0.1 to 0.2 or when setting `KeysManager` to use
v1 `remote_key` derivation). This breaks splicing cause we can't
communicate the new `remote_key` to the counterparty during the
splicing handshake.

Ultimately this bug is because the API we had didn't communicate to
the signer that we weren't allowed to change anything except the
funding key, and allowed returning a `ChannelPublicKeys` which
would break the channel.

Here we fix this by renaming `new_pubkeys` `pubkeys` again
(partially reverting 9d291e0 but
keeping the changed requirements that `pubkeys` only be called
once) and adding a new `ChannelSigner:new_funding_pubkey` method
specifically for splicing.

We also update `channel.rs` to correctly fetch the new funding
pubkey before sending `splice_init`, storing it in the
`PendingFunding` untl we build a `FundingScope`.
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-upgrade-scid-timeout branch from 545854f to cf745d9 Compare October 24, 2025 21:38
@TheBlueMatt TheBlueMatt requested a review from jkczyz October 24, 2025 21:38
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-upgrade-scid-timeout branch from cf745d9 to 3160be1 Compare October 24, 2025 21:47
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-upgrade-scid-timeout branch from 3160be1 to 41d95a6 Compare October 24, 2025 21:50
Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI failing :'(

Comment on lines 13249 to 13251
// `position` above will also return `None` if we have historical scids but they all
// need to be removed, so `end` should point to the last index in such cases.
.unwrap_or(self.context.historical_scids.len());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, that requires traversing the entire list where the previous approach only needs to traverse the ones to drop.

In the previous commit we partially reverted
9d291e0 renaming
`ChannelSigner::new_pubkeys` to `pubkeys` again, but we still don't
want to go back to requiring that `pubkeys` return the same
contents on each call. Thus, here, we add test logic to check that
`pubkeys` isn't called more than once.
`build_commitment_transaction`'s fifth argument is supposed to be
whether we're the one generating the commitment (i.e. because we're
signing rather than validating the commitment). During splicing,
this doesn't matter because there should be no async HTLC
addition/removal happening so the commitment generated wil be the
same in either case, but its still good to pass the correct bool.
If an HTLC was forwarded in 0.1, but waiting to be failed back, it
will ultimately be failed by adding it to the
`ChannelManager::pending_forwards` map with the channel's original
SCID. If that channel is spliced between when the HTLC was
forwarded (on 0.1) and when the HTLC is failed back (on 0.2), that
SCID may no longer exist, causing the HTLC fail-back to be lost.

Luckily, delaying when an SCID is expired is cheap - its just
storing an extra `u64` or two and generating one requires an
on-chain splice, so we simply delay removal of SCIDs for two months
at which point any incoming HTLCs should have been expired for six
weeks and the counterparty should have force-closed anyway.
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-upgrade-scid-timeout branch from 41d95a6 to fb7ce22 Compare October 25, 2025 01:04
@TheBlueMatt
Copy link
Collaborator Author

CI failing :'(

Oops, fixed one issue, but most of it is an MSRV issue on syn.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SCIDs in forward_htlcs (and decode_update_add_htlcs) can change out from under us

4 participants