Batch commitment_signed messages for splicing #3651

Open
wants to merge 5 commits into
base: main
from

jkczyz
Contributor

@jkczyz jkczyz commented Mar 6, 2025

Once a channel is funded, it may be spliced to add or remove funds. The new funding transaction is pending until confirmed on chain and thus needs to be tracked. Additionally, it may be replaced by another transaction using RBF with a higher fee. Hence, there may be more than one pending FundingScope to track for a splice.

This PR adds support for tracking pending funding scopes and accounting for any pending scopes where applicable (e.g., when handling and sending commitment_signed messages).

@ldk-reviews-bot

ldk-reviews-bot commented Mar 6, 2025

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@jkczyz jkczyz requested a review from wpaulino March 6, 2025 22:59
@jkczyz
Contributor Author

jkczyz commented Mar 6, 2025

@wpaulino Just looking for a quick concept ACK. Still needed:

  • serialization of pending_funding, presumably?
  • tests for commitment_signed
  • update get_available_balances

Contributor

@wpaulino wpaulino left a comment

Yeah this makes sense. We'll need to support sending a commitment_signed for each scope as well.

    claimed_htlcs: ref mut update_claimed_htlcs, ..
} = &mut update {
    debug_assert!(update_claimed_htlcs.is_empty());
    *update_claimed_htlcs = claimed_htlcs.clone();
Contributor

Hm, it'd be nice to not have this duplicated data, but I guess it's pretty small anyway.

Somewhat related, in #3606 we're introducing a new update variant (for the counterparty commitment only, but we'll need to do the same for the holder commitment as well) that only tracks the commitment transaction. I wonder if we can get away with using it for the additional funding scopes as a way to simplify the transition to the new variant, as you wouldn't be allowed to downgrade with a pending spliced channel anyway.

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@jkczyz
Contributor Author

jkczyz commented Mar 7, 2025

> Yeah this makes sense. We'll need to support sending a commitment_signed for each scope as well.

By this do you mean we'll need msgs::CommitmentUpdate to contain a Vec of msgs::CommitmentSigned messages instead of a single one?

@jkczyz
Contributor Author

jkczyz commented Mar 7, 2025

Pushed another commit for get_available_balances.

@wpaulino
Contributor

> By this do you mean we'll need msgs::CommitmentUpdate to contain a Vec of msgs::CommitmentSigned messages instead of a single one?

Yeah we'll need to go through each case where we owe the counterparty a commitment_signed (except for the initial one sent in dual funding/splicing) and make sure we always consider all scopes.
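
To illustrate what "consider all scopes" could look like on the sending side, here is a minimal sketch. It assumes one commitment_signed is built per scope and that batch information is attached only when at least one pending scope exists. The `FundingScope`, `CommitmentSigned`, and `CommitmentSignedBatch` types below are simplified stand-ins rather than LDK's actual definitions, and `build_commitment_signed_messages` is a hypothetical helper.

```rust
// Hypothetical stand-ins for LDK types; names and fields are illustrative only.
type Txid = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct FundingScope {
    funding_txid: Txid,
}

#[derive(Clone, Debug, PartialEq)]
struct CommitmentSignedBatch {
    batch_size: u16,
    funding_txid: Txid,
}

#[derive(Clone, Debug, PartialEq)]
struct CommitmentSigned {
    // Signatures and HTLC data are omitted for brevity.
    batch: Option<CommitmentSignedBatch>,
}

/// Builds one message per scope: the current funding first, then each pending scope.
/// Assumption: batch info is only included when there is at least one pending scope.
fn build_commitment_signed_messages(
    funding: &FundingScope, pending_funding: &[FundingScope],
) -> Vec<CommitmentSigned> {
    let batch_size = (1 + pending_funding.len()) as u16;
    core::iter::once(funding)
        .chain(pending_funding.iter())
        .map(|scope| CommitmentSigned {
            batch: if pending_funding.is_empty() {
                None
            } else {
                Some(CommitmentSignedBatch { batch_size, funding_txid: scope.funding_txid })
            },
        })
        .collect()
}
```

With two pending scopes this yields three messages, each carrying a batch_size of 3 and its own funding_txid. With no pending scopes it yields a single message without batch data.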

@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch from eac7be9 to 8b4e46a March 11, 2025 22:44
@jkczyz jkczyz changed the title from "Add pending funding scopes to FundedChannel" to "Batch commitment_signed messages for splicing" Mar 11, 2025
// May or may not have a pending splice
Some(batch) => {
    self.commitment_signed_batch.push(msg.clone());
    if self.commitment_signed_batch.len() < batch.batch_size as usize {
Contributor

We should also consider the number of scopes available. We shouldn't receive a batch with anything other than that number.

Contributor Author

There may be an edge case to consider. I started a discussion on the spec: https://github.com/lightning/bolts/pull/1160/files/8c907f6b8d26fad8ec79ad1fe3078eb92e5285a6#r1990528673

Though if pending_funding is empty, the spec states we should ignore any batched commitment_signed messages that don't match the new funding_txid.

  • If batch is set:
    ...
    • Otherwise (no pending splice transactions):
    • MUST ignore commitment_signed where funding_txid does not match
      the current funding transaction.
    • If commitment_signed is missing for the current funding transaction:
      • MUST send an error and fail the channel.
    • Otherwise:
      • MUST respond with a revoke_and_ack message.

@jkczyz
Contributor Author

jkczyz commented Mar 12, 2025

> Yeah we'll need to go through each case where we owe the counterparty a commitment_signed (except for the initial one sent in dual funding/splicing) and make sure we always consider all scopes.

Pushed a couple commits that I think accomplish this. Though I'm not sure about the following line from build_commitment_no_status_check:

let (mut htlcs_ref, counterparty_commitment_tx) =
    self.build_commitment_no_state_update(&self.funding, logger);

It is called from methods like send_htlc_and_commit but for producing a ChannelMonitorUpdate. IIUC, I'll need to do this for all funding scopes?

@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch from c66e554 to 2db5c60 March 12, 2025 22:34
@jkczyz jkczyz marked this pull request as ready for review March 12, 2025 22:35
@jkczyz jkczyz requested a review from wpaulino March 12, 2025 22:35
@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch 2 times, most recently from 3872586 to e371143 March 12, 2025 22:46
@jkczyz jkczyz added the weekly goal Someone wants to land this this week label Mar 12, 2025
@jkczyz
Contributor Author

jkczyz commented Mar 12, 2025

Rebased on main.

@jkczyz jkczyz requested a review from dunxen March 17, 2025 15:45
@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch from e371143 to 0362159 March 18, 2025 23:01
@jkczyz jkczyz requested a review from wpaulino March 18, 2025 23:02
@ldk-reviews-bot

🔔 1st Reminder

Hey @dunxen @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch from 0362159 to 7021e01 March 19, 2025 21:45
@jkczyz
Contributor Author

jkczyz commented Mar 19, 2025

Responded and addressed a couple lingering comments.

@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch 3 times, most recently from 48bdd52 to 6db1c42 March 21, 2025 14:42
@jkczyz
Contributor Author

jkczyz commented Mar 21, 2025

Recent pushes should have fixed CI.

@ldk-reviews-bot

🔔 2nd Reminder

Hey @dunxen @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot

🔔 3rd Reminder

Hey @dunxen @wpaulino! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch from 6db1c42 to 43cd9b5 March 24, 2025 21:55
@jkczyz
Contributor Author

jkczyz commented Mar 24, 2025

Squashed per @wpaulino's request.

#[cfg(any(test, fuzzing))]
self.next_local_commitment_tx_fee_info_cached.write(writer)?;
#[cfg(any(test, fuzzing))]
self.next_remote_commitment_tx_fee_info_cached.write(writer)?;
Contributor

Why do we need to store these?

Contributor Author

Probably don't need to actually. Figured if a test reloaded the node in some scenario it might fail? Doesn't seem to affect any current tests. Can drop if you prefer.

Contributor

They weren't stored before so let's drop it?

)?
},
// May or may not have a pending splice
Some(batch) => {
Contributor

Error if batch_size == 1?

Contributor Author

Hmm... the spec puts the requirement on the sender but doesn't specify that N pending splices can't be zero. I suppose if there are zero and the sender uses 1, we potentially would use LatestHolderCommitmentTX (once added; see #3651 (comment)) rather than LatestHolderCommitmentTXInfo. Would that be a problem for downgrades?

Contributor

That would be a problem for downgrades indeed. Maybe we should clarify the spec then, that wording does make it seem like a batch of 1 is allowed.

@jkczyz
Contributor Author

jkczyz commented Mar 25, 2025

Pushed some squashed fixups addressing feedback. Going to rebase to resolve merge conflicts.

jkczyz added 5 commits March 25, 2025 12:23
Once a channel is funded, it may be spliced to add or remove funds. The
new funding transaction is pending until confirmed on chain and thus
needs to be tracked. Additionally, it may be replaced by another
transaction using RBF with a higher fee. Hence, there may be more than
one pending FundingScope to track for a splice.

This commit adds support for tracking pending funding scopes. The
following commits will account for any pending scopes where applicable
(e.g., when handling commitment_signed).

A FundedChannel may have more than one pending FundingScope during
splicing, one for the splice attempt and one or more for any RBF
attempts. The counterparty will send a commitment_signed message for
each pending splice transaction and the current funding transaction.

Defer handling these commitment_signed messages until the entire batch
has arrived. Then validate them individually, also checking if all the
pending splice transactions and the current funding transaction have a
corresponding commitment_signed in the batch.

A FundedChannel may have more than one pending FundingScope during
splicing, one for the splice attempt and one or more for any RBF
attempts. When this is the case, send a commitment_signed message for
each FundingScope and include the necessary batch information (i.e.,
batch_size and funding_txid) to the counterparty.

A FundedChannel may have more than one pending FundingScope during
splicing, one for the splice attempt and one or more for any RBF
attempts. When calling get_available_balances, consider all funding
scopes and take the minimum by next_outbound_htlc_limit_msat. This is
used both informationally and to determine which channel to use to
forward an HTLC.

The choice of next_outbound_htlc_limit_msat is somewhat arbitrary but
matches the field used when determining which channel is used to forward
an HTLC. Any field should do since each field should be adjusted by the
same amount relative to another FundingScope given the nature of the
fields (i.e., inbound/outbound capacity, min/max HTLC limit).

The minimum was chosen since, in order for an HTLC to be sent over
the channel, it must be possible for each funding scope -- both the
confirmed one and any pending scopes, one of which may eventually
confirm.
@jkczyz jkczyz force-pushed the 2025-03-multiple-funding-scopes branch from 135278e to 84ea044 March 25, 2025 18:56
@jkczyz
Contributor Author

jkczyz commented Mar 25, 2025

Rebased on main.

@jkczyz jkczyz requested a review from TheBlueMatt March 25, 2025 18:58
@@ -4927,6 +4930,7 @@ pub(super) struct DualFundingChannelContext {
pub(super) struct FundedChannel<SP: Deref> where SP::Target: SignerProvider {
    pub funding: FundingScope,
    pending_funding: Vec<FundingScope>,
    commitment_signed_batch: BTreeMap<Txid, msgs::CommitmentSigned>,
Collaborator

Dear god WHAT? Who thought this was a good idea? 🤦

Maybe we should un-fuck the protocol at the PeerManager level rather than doing it here? It's kinda insane that we have to keep a pending list of messages in a queue just to process them later. Obviously the protocol should have sent them as one message, but absent that it seems like something the PeerManager should do: it's logically one message with 5 parts on the wire, which seems like something we shouldn't have to care about in channel.rs. Rather, our message de-framing logic should properly de-frame the N CommitmentSigneds into a CommitmentSignedBatch.

Contributor Author

FWIW, the reason given for this was the 65K message size limitation.

lightning/bolts#1160 (comment)

I can look into moving the logic to PeerManager.

Collaborator

Then we should add the ability to frame larger messages at the wire. The 64K limit is supposed to be a nice anti-DoS limit by ensuring you never really need a large buffer to store pending messages, but if we're gonna work around it by sending multiple messages which get stored in a pending message queue then the whole point is kinda moot.

If we don't, the spec still needs to treat them as a logical message: no other messages should be allowed to come in between, and we need some kind of init/complete message before/after so that the PeerManager can handle it without protocol-specific logic.

Contributor Author

So some concerns:

  • Currently, ChannelManager also checks the message's channel_id before delegating to Channel, so PeerManager would need to track batches by channel, too. Should we batch using PeerState in ChannelManager instead of in PeerManager then?
  • Should we have a "raw" CommitmentSigned type with the optional batch data used for parsing (as it currently is written) and one without the batch data where Channel is given either a single one or BTreeMap for the entire batch? (i.e., never expose the optional batch TLV to Channel -- only infer it from when the method taking BTreeMap is called).
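
One way to picture the two concerns above together is a collector that sits in front of the channel, groups batched messages by channel_id, and hands the channel either a single message or a complete map. This is only a sketch under those assumptions: `RawCommitmentSigned`, `Delivery`, and `BatchCollector` are hypothetical names, and the sketch does not handle duplicate funding_txids, inconsistent batch_size values, or unrelated messages arriving mid-batch.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins; not LDK's actual types.
type ChannelId = [u8; 32];
type Txid = [u8; 32];

/// "Raw" wire message that still carries the optional batch data.
#[derive(Clone, Debug, PartialEq)]
struct RawCommitmentSigned {
    channel_id: ChannelId,
    batch: Option<(u16, Txid)>, // (batch_size, funding_txid)
}

/// What the channel would see: the batch data is never exposed.
#[derive(Clone, Debug, PartialEq)]
struct CommitmentSigned {
    channel_id: ChannelId,
}

#[derive(Debug, PartialEq)]
enum Delivery {
    Single(CommitmentSigned),
    Batch(BTreeMap<Txid, CommitmentSigned>),
}

#[derive(Default)]
struct BatchCollector {
    // Partially received batches, tracked per channel.
    pending: BTreeMap<ChannelId, BTreeMap<Txid, CommitmentSigned>>,
}

impl BatchCollector {
    /// Returns Some once a message (or a whole batch) is ready for the channel.
    fn receive(&mut self, raw: RawCommitmentSigned) -> Option<Delivery> {
        let msg = CommitmentSigned { channel_id: raw.channel_id };
        let (batch_size, funding_txid) = match raw.batch {
            None => return Some(Delivery::Single(msg)),
            Some(batch) => batch,
        };
        let entries = self.pending.entry(raw.channel_id).or_default();
        entries.insert(funding_txid, msg);
        if entries.len() < batch_size as usize {
            return None; // Still waiting for the rest of the batch.
        }
        self.pending.remove(&raw.channel_id).map(Delivery::Batch)
    }
}
```

Whether such a collector would live in PeerManager or alongside PeerState in ChannelManager is exactly the open question. The sketch is agnostic about that.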

Contributor Author

FYI, wrote the last comment before seeing your latest comment.

core::iter::once(funding)
    .chain(pending_funding.iter())
    .map(|funding| self.get_available_balances_for_scope(funding, fee_estimator))
    .min_by_key(|balances| balances.next_outbound_htlc_limit_msat)
Collaborator

I don't think this is correct in the case where our counterparty spliced-out some funds - our next_outbound_htlc_limit_msat might be the same across two splices but inbound_capacity_msat is lower on one.

Contributor Author

Hmmm.. are we able to pick one as the AvailableBalance? Or do we need to merge them somehow?

Collaborator

ISTM we should always report the lowest available balance for each type of balance (maybe we can do that more cleverly, though?)
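
A minimal sketch of "lowest available balance for each type of balance" follows, assuming the per-scope balances have already been computed. The `AvailableBalances` struct here is a hypothetical three-field subset, not LDK's actual definition, and `min_balances` is an illustrative helper. Fields where a larger value is the more restrictive one (such as a minimum HTLC amount) would need the maximum instead and are left out.

```rust
/// Hypothetical subset of balance fields, named after those mentioned in this PR.
#[derive(Clone, Copy, Debug, PartialEq)]
struct AvailableBalances {
    inbound_capacity_msat: u64,
    outbound_capacity_msat: u64,
    next_outbound_htlc_limit_msat: u64,
}

/// Combines per-scope balances by taking the minimum of each field independently,
/// rather than picking one scope's balances wholesale.
fn min_balances(
    per_scope: impl IntoIterator<Item = AvailableBalances>,
) -> Option<AvailableBalances> {
    per_scope.into_iter().reduce(|acc, b| AvailableBalances {
        inbound_capacity_msat: acc.inbound_capacity_msat.min(b.inbound_capacity_msat),
        outbound_capacity_msat: acc.outbound_capacity_msat.min(b.outbound_capacity_msat),
        next_outbound_htlc_limit_msat: acc
            .next_outbound_htlc_limit_msat
            .min(b.next_outbound_htlc_limit_msat),
    })
}
```

Unlike min_by_key on a single field, the result may not equal any one scope's balances. That covers the case raised above, where next_outbound_htlc_limit_msat ties across scopes but inbound_capacity_msat differs.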

Labels
weekly goal Someone wants to land this this week
5 participants