
Commit 93e7272

Generate new ReleasePaymentComplete monitor updates
`MonitorEvent`s aren't delivered to the `ChannelManager` in a durable fashion - if the `ChannelManager` fetches the pending `MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due to a block update), and the node crashes before the `ChannelManager` is persisted again, the `MonitorEvent` and its effects on the `ChannelManager` will be lost. This isn't likely in a sync persist environment, but in an async one it could be an issue.

Note that this is only an issue for closed channels - `MonitorEvent`s only inform the `ChannelManager` that a channel is closed (which the `ChannelManager` will learn on startup or when it next tries to advance the channel state), that `ChannelMonitorUpdate` writes completed (which the `ChannelManager` will detect on startup), or that HTLCs resolved on-chain post-closure. Of the three, only the last is problematic to lose prior to a reload.

In previous commits we ensured that HTLC resolutions which came to the `ChannelManager` via a `MonitorEvent` are replayed on startup if the `MonitorEvent` was lost. However, in cases where the `ChannelManager` was so stale that it didn't have the payment state for an HTLC at all, we only re-add it if `ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes it. Because constantly re-adding a payment state and then failing it would generate lots of noise for users on startup (not to mention the risk of confusing stale payment events for the latest state of a payment when the `PaymentId` has been reused to retry a payment), `get_pending_or_resolved_outbound_htlcs` does not include state for HTLCs which were resolved on chain with a preimage, or with a timeout after `ANTI_REORG_DELAY` confirmations.

These criteria match the criteria for generating a `MonitorEvent`, and work well under the assumption that `MonitorEvent`s are reliably delivered. However, if they are not, and our `ChannelManager` is lost or substantially old (or, in a future where we do not persist the `ChannelManager` at all), we will never see payment resolution events for an HTLC. Instead, we really want to tell our `ChannelMonitor`s when the resolution of an HTLC is complete.

Note that we don't particularly care about non-payment HTLCs, as there is no re-hydration of state to do there - `ChannelManager` load ignores forwarded HTLCs coming back from `get_pending_or_resolved_outbound_htlcs` as there's nothing to do - we always attempt to replay the success/failure and figure out whether it mattered based on whether there was still an HTLC to claim/fail.

Here we begin generating the new `ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, updating functional tests for the new `ChannelMonitorUpdate`s where required.
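To make the moving pieces concrete, here is a minimal sketch of the types this commit wires together, reconstructed from the diff below. The field types and the elided existing variant are assumptions, not the exact upstream definitions.

    // Sketch only - names are taken from the diff below; field types are inferred.
    pub(crate) struct PaymentCompleteUpdate {
        counterparty_node_id: PublicKey,
        channel_funding_outpoint: OutPoint,
        channel_id: ChannelId,
        htlc_id: SentHTLCId,
    }

    pub(crate) enum EventCompletionAction {
        // Existing variant, fields elided for brevity.
        ReleaseRAAChannelMonitorUpdate { /* ... */ },
        /// Note that this action will be dropped on downgrade to LDK prior to 0.2!
        ReleasePaymentCompleteChannelMonitorUpdate(PaymentCompleteUpdate),
    }

    // Once the user has processed the corresponding PaymentSent/PaymentFailed event,
    // handling the action builds a monitor update for the (closed) channel so the
    // ChannelMonitor stops resurfacing the HTLC's resolution:
    //
    //     ChannelMonitorUpdate {
    //         update_id,
    //         channel_id: Some(channel_id),
    //         updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete { htlc: htlc_id }],
    //     }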
1 parent f8a783a commit 93e7272

9 files changed: +276 −49 lines changed

lightning/src/chain/channelmonitor.rs

Lines changed: 1 addition & 0 deletions
@@ -4119,6 +4119,7 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
 		if updates.update_id == LEGACY_CLOSED_CHANNEL_UPDATE_ID || self.lockdown_from_offchain {
 			assert_eq!(updates.updates.len(), 1);
 			match updates.updates[0] {
+				ChannelMonitorUpdateStep::ReleasePaymentComplete { .. } => {},
 				ChannelMonitorUpdateStep::ChannelForceClosed { .. } => {},
 				// We should have already seen a `ChannelForceClosed` update if we're trying to
 				// provide a preimage at this point.

lightning/src/ln/chanmon_update_fail_tests.rs

Lines changed: 1 addition & 0 deletions
@@ -3870,6 +3870,7 @@ fn do_test_durable_preimages_on_closed_channel(
 	};
 	nodes[0].node.force_close_broadcasting_latest_txn(&chan_id_ab, &node_b_id, err_msg).unwrap();
 	check_closed_event(&nodes[0], 1, reason, false, &[node_b_id], 100000);
+	check_added_monitors(&nodes[0], 1);
 	let as_closing_tx = nodes[0].tx_broadcaster.txn_broadcasted.lock().unwrap().split_off(0);
 	assert_eq!(as_closing_tx.len(), 1);

lightning/src/ln/channelmanager.rs

Lines changed: 139 additions & 14 deletions
@@ -1267,6 +1267,9 @@ pub(crate) enum EventCompletionAction {
 		channel_funding_outpoint: Option<OutPoint>,
 		channel_id: ChannelId,
 	},
+
+	/// Note that this action will be dropped on downgrade to LDK prior to 0.2!
+	ReleasePaymentCompleteChannelMonitorUpdate(PaymentCompleteUpdate),
 }
 impl_writeable_tlv_based_enum!(EventCompletionAction,
 	(0, ReleaseRAAChannelMonitorUpdate) => {
@@ -1279,6 +1282,7 @@ impl_writeable_tlv_based_enum!(EventCompletionAction,
 			ChannelId::v1_from_funding_outpoint(channel_funding_outpoint.unwrap())
 		})),
 	}
+	{1, ReleasePaymentCompleteChannelMonitorUpdate} => (),
 );

 /// The source argument which is passed to [`ChannelManager::claim_mpp_part`].
@@ -8013,7 +8017,17 @@ where
 					// rare cases where a MonitorUpdate is replayed after restart because a
 					// ChannelMonitor wasn't persisted after it was applied (even though the
 					// ChannelManager was).
-					// TODO
+					// For such cases, we also check that there's no existing pending event to
+					// complete this action already, which we let finish instead.
+					let action =
+						EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update);
+					let have_action = {
+						let pending_events = self.pending_events.lock().unwrap();
+						pending_events.iter().any(|(_, act)| act.as_ref() == Some(&action))
+					};
+					if !have_action {
+						self.handle_post_event_actions([action]);
+					}
 				}
 			},
 			HTLCSource::PreviousHopData(HTLCPreviousHopData {
@@ -8642,19 +8656,34 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 		next_user_channel_id: Option<u128>, attribution_data: Option<AttributionData>,
 		send_timestamp: Option<Duration>,
 	) {
+		debug_assert_eq!(
+			startup_replay,
+			!self.background_events_processed_since_startup.load(Ordering::Acquire)
+		);
+		let htlc_id = SentHTLCId::from_source(&source);
 		match source {
 			HTLCSource::OutboundRoute {
 				session_priv, payment_id, path, bolt12_invoice, ..
 			} => {
-				debug_assert!(self.background_events_processed_since_startup.load(Ordering::Acquire),
+				debug_assert!(!startup_replay,
 					"We don't support claim_htlc claims during startup - monitors may not be available yet");
 				debug_assert_eq!(next_channel_counterparty_node_id, path.hops[0].pubkey);
-				let mut ev_completion_action =
+
+				let mut ev_completion_action = if from_onchain {
+					let release = PaymentCompleteUpdate {
+						counterparty_node_id: next_channel_counterparty_node_id,
+						channel_funding_outpoint: next_channel_outpoint,
+						channel_id: next_channel_id,
+						htlc_id,
+					};
+					Some(EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(release))
+				} else {
 					Some(EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
 						channel_funding_outpoint: Some(next_channel_outpoint),
 						channel_id: next_channel_id,
 						counterparty_node_id: path.hops[0].pubkey,
-					});
+					})
+				};
 				self.pending_outbound_payments.claim_htlc(
 					payment_id,
 					payment_preimage,
@@ -11372,12 +11401,18 @@ This indicates a bug inside LDK. Please report this error at https://github.com/
 					channel_id,
 				};
 				let reason = HTLCFailReason::from_failure_code(failure_reason);
+				let completion_update = Some(PaymentCompleteUpdate {
+					counterparty_node_id,
+					channel_funding_outpoint: funding_outpoint,
+					channel_id,
+					htlc_id: SentHTLCId::from_source(&htlc_update.source),
+				});
 				self.fail_htlc_backwards_internal(
 					&htlc_update.source,
 					&htlc_update.payment_hash,
 					&reason,
 					receiver,
-					None,
+					completion_update,
 				);
 			}
 		},
@@ -12864,8 +12899,62 @@ where
 				channel_id,
 				counterparty_node_id,
 			} => {
+				let startup_complete =
+					self.background_events_processed_since_startup.load(Ordering::Acquire);
+				debug_assert!(startup_complete);
 				self.handle_monitor_update_release(counterparty_node_id, channel_id, None);
 			},
+			EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(
+				PaymentCompleteUpdate {
+					counterparty_node_id,
+					channel_funding_outpoint,
+					channel_id,
+					htlc_id,
+				},
+			) => {
+				let per_peer_state = self.per_peer_state.read().unwrap();
+				let mut peer_state = per_peer_state
+					.get(&counterparty_node_id)
+					.map(|state| state.lock().unwrap())
+					.expect("Channels originating a preimage must have peer state");
+				let update_id = peer_state
+					.closed_channel_monitor_update_ids
+					.get_mut(&channel_id)
+					.expect("Channels originating a preimage must have a monitor");
+				*update_id += 1;
+
+				let update = ChannelMonitorUpdate {
+					update_id: *update_id,
+					channel_id: Some(channel_id),
+					updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+						htlc: htlc_id,
+					}],
+				};
+
+				let during_startup =
+					!self.background_events_processed_since_startup.load(Ordering::Acquire);
+				if during_startup {
+					let event = BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+						counterparty_node_id,
+						funding_txo: channel_funding_outpoint,
+						channel_id,
+						update,
+					};
+					self.pending_background_events.lock().unwrap().push(event);
+				} else {
+					handle_new_monitor_update!(
+						self,
+						channel_funding_outpoint,
+						update,
+						peer_state,
+						peer_state,
+						per_peer_state,
+						counterparty_node_id,
+						channel_id,
+						POST_CHANNEL_CLOSE
+					);
+				}
+			},
 		}
 	}
 }
@@ -16557,6 +16646,7 @@ where
 					monitor,
 					Some(htlc.payment_hash),
 				);
+				let htlc_id = SentHTLCId::from_source(&htlc_source);
 				match htlc_source {
 					HTLCSource::PreviousHopData(prev_hop_data) => {
 						let pending_forward_matches_htlc = |info: &PendingAddHTLCInfo| {
@@ -16614,6 +16704,15 @@
 					} => {
 						if let Some(preimage) = preimage_opt {
 							let pending_events = Mutex::new(pending_events_read);
+							let update = PaymentCompleteUpdate {
+								counterparty_node_id: monitor.get_counterparty_node_id(),
+								channel_funding_outpoint: monitor.get_funding_txo(),
+								channel_id: monitor.channel_id(),
+								htlc_id,
+							};
+							let mut compl_action = Some(
+								EventCompletionAction::ReleasePaymentCompleteChannelMonitorUpdate(update)
+							);
 							// Note that we set `from_onchain` to "false" here,
 							// deliberately keeping the pending payment around forever.
 							// Given it should only occur when we have a channel we're
@@ -16622,15 +16721,6 @@
 							// generating a `PaymentPathSuccessful` event but regenerating
 							// it and the `PaymentSent` on every restart until the
 							// `ChannelMonitor` is removed.
-							let mut compl_action = Some(
-								EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
-									channel_funding_outpoint: Some(
-										monitor.get_funding_txo(),
-									),
-									channel_id: monitor.channel_id(),
-									counterparty_node_id: path.hops[0].pubkey,
-								},
-							);
 							pending_outbounds.claim_htlc(
 								payment_id,
 								preimage,
@@ -16642,6 +16732,41 @@
 								&pending_events,
 								&&logger,
 							);
+							// If the completion action was not consumed, then there was no
+							// payment to claim, and we need to tell the `ChannelMonitor`
+							// we don't need to hear about the HTLC again, at least as long
+							// as the PaymentSent event isn't still sitting around in our
+							// event queue.
+							let have_action = if compl_action.is_some() {
+								let pending_events = pending_events.lock().unwrap();
+								pending_events.iter().any(|(_, act)| *act == compl_action)
+							} else {
+								false
+							};
+							if !have_action && compl_action.is_some() {
+								let mut peer_state = per_peer_state
+									.get(&counterparty_node_id)
+									.map(|state| state.lock().unwrap())
+									.expect("Channels originating a preimage must have peer state");
+								let update_id = peer_state
+									.closed_channel_monitor_update_ids
+									.get_mut(channel_id)
+									.expect("Channels originating a preimage must have a monitor");
+								*update_id += 1;
+
+								pending_background_events.push(BackgroundEvent::MonitorUpdateRegeneratedOnStartup {
+									counterparty_node_id: monitor.get_counterparty_node_id(),
+									funding_txo: monitor.get_funding_txo(),
+									channel_id: monitor.channel_id(),
+									update: ChannelMonitorUpdate {
+										update_id: *update_id,
+										channel_id: Some(monitor.channel_id()),
+										updates: vec![ChannelMonitorUpdateStep::ReleasePaymentComplete {
+											htlc: htlc_id,
+										}],
+									},
+								});
+							}
 							pending_events_read = pending_events.into_inner().unwrap();
 						}
 					},

lightning/src/ln/functional_test_utils.rs

Lines changed: 15 additions & 0 deletions
@@ -2803,6 +2803,9 @@ pub fn expect_payment_sent<CM: AChannelManager, H: NodeHolder<CM = CM>>(
 	expected_fee_msat_opt: Option<Option<u64>>, expect_per_path_claims: bool,
 	expect_post_ev_mon_update: bool,
 ) -> (Option<PaidBolt12Invoice>, Vec<Event>) {
+	if expect_post_ev_mon_update {
+		check_added_monitors(node, 0);
+	}
 	let events = node.node().get_and_clear_pending_events();
 	let expected_payment_hash = PaymentHash(
 		bitcoin::hashes::sha256::Hash::hash(&expected_payment_preimage.0).to_byte_array(),
@@ -3052,6 +3055,7 @@ pub struct PaymentFailedConditions<'a> {
 	pub(crate) expected_blamed_chan_closed: Option<bool>,
 	pub(crate) expected_mpp_parts_remain: bool,
 	pub(crate) retry_expected: bool,
+	pub(crate) from_mon_update: bool,
 }

 impl<'a> PaymentFailedConditions<'a> {
@@ -3062,6 +3066,7 @@ impl<'a> PaymentFailedConditions<'a> {
 			expected_blamed_chan_closed: None,
 			expected_mpp_parts_remain: false,
 			retry_expected: false,
+			from_mon_update: false,
 		}
 	}
 	pub fn mpp_parts_remain(mut self) -> Self {
@@ -3086,6 +3091,10 @@ impl<'a> PaymentFailedConditions<'a> {
 		self.retry_expected = true;
 		self
 	}
+	pub fn from_mon_update(mut self) -> Self {
+		self.from_mon_update = true;
+		self
+	}
 }

 #[cfg(any(test, feature = "_externalize_tests"))]
@@ -3195,7 +3204,13 @@ pub fn expect_payment_failed_conditions<'a, 'b, 'c, 'd, 'e>(
 	node: &'a Node<'b, 'c, 'd>, expected_payment_hash: PaymentHash,
 	expected_payment_failed_permanently: bool, conditions: PaymentFailedConditions<'e>,
 ) {
+	if conditions.from_mon_update {
+		check_added_monitors(node, 0);
+	}
 	let events = node.node.get_and_clear_pending_events();
+	if conditions.from_mon_update {
+		check_added_monitors(node, 1);
+	}
 	expect_payment_failed_conditions_event(
 		events,
 		expected_payment_hash,
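For illustration only, a test that expects a payment failure to arrive together with one of these regenerated monitor updates could opt into the new check roughly as follows; this is a hypothetical snippet, with `nodes[0]` and `payment_hash` assumed to come from the surrounding test setup.

    // Require that handling the PaymentFailed event also generates exactly one
    // new monitor update (the ReleasePaymentComplete update added by this commit).
    let conditions = PaymentFailedConditions::new().from_mon_update();
    expect_payment_failed_conditions(&nodes[0], payment_hash, false, conditions);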
