Add new custom channels itest liquidity edge case for safe HTLC failure #1031


Open

wants to merge 4 commits into base: group-key-support

Conversation

GeorgeTsagk
Member

Description

This new custom channels liquidity edge case tests that an asset HTLC with more units than the local balance may not get added to the channel.
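As a rough illustration of the invariant the test asserts, the link-level check boils down to rejecting any asset HTLC whose unit count exceeds the local asset balance. This is a hypothetical sketch with made-up names, not the actual litd/lnd code:

```go
package main

import (
	"errors"
	"fmt"
)

// errInsufficientBalance mirrors the kind of failure the itest expects
// when an HTLC asks for more asset units than are locally available.
var errInsufficientBalance = errors.New("insufficient local asset balance")

// addAssetHTLC is an illustrative stand-in for the link-level check: an
// asset HTLC carrying more units than the local balance must be failed
// back instead of being added to the channel.
func addAssetHTLC(localAssetBalance, htlcAssetUnits uint64) error {
	if htlcAssetUnits > localAssetBalance {
		return errInsufficientBalance
	}
	return nil
}

func main() {
	// More units than the local balance: must fail.
	fmt.Println(addAssetHTLC(100_000, 125_000))
	// Fits within the local balance: may be added.
	fmt.Println(addAssetHTLC(100_000, 50_000))
}
```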

Based on
lightninglabs/taproot-assets#1462
lightningnetwork/lnd#9687

@GeorgeTsagk GeorgeTsagk self-assigned this Apr 9, 2025
@GeorgeTsagk GeorgeTsagk requested review from guggero and ffranr April 14, 2025 10:01
@GeorgeTsagk
Member Author

Will update dependencies to the latest commits once they get merged.
The latest changes on the base PRs have mostly been cosmetic.

@guggero guggero left a comment

Don't think we're asserting what we really want to assert.

@guggero commented Apr 24, 2025

Probably needs to be rebased on top of #987 to fully work (but then the first commit can be dropped).

This is definitely closer to what I had in mind. But I think we can do even better (using this we can also get rid of the previous use of withPayErrSubStr("context deadline exceeded") that currently takes a full minute to time out):

diff --git a/itest/assets_test.go b/itest/assets_test.go
index 566e00aa..4e89d485 100644
--- a/itest/assets_test.go
+++ b/itest/assets_test.go
@@ -2567,7 +2567,7 @@ func assertMinNumHtlcs(t *testing.T, node *HarnessNode, expected int) {
                if numHtlcs < expected {
                        return fmt.Errorf("expected %v HTLCs, got %v, %v",
                                expected, numHtlcs,
-                               spew.Sdump(toProtoJSON(t, listChansResp)))
+                               toProtoJSON(t, listChansResp))
                }
 
                return nil
@@ -2575,6 +2575,112 @@ func assertMinNumHtlcs(t *testing.T, node *HarnessNode, expected int) {
        require.NoError(t, err)
 }
 
+type subscribeEventsClient = routerrpc.Router_SubscribeHtlcEventsClient
+
+type htlcEventConfig struct {
+       timeout           time.Duration
+       numEvents         int
+       withLinkFailure   bool
+       withFailureDetail routerrpc.FailureDetail
+}
+
+func defaultHtlcEventConfig() *htlcEventConfig {
+       return &htlcEventConfig{
+               timeout: defaultTimeout,
+       }
+}
+
+type htlcEventOpt func(*htlcEventConfig)
+
+func withTimeout(timeout time.Duration) htlcEventOpt {
+       return func(config *htlcEventConfig) {
+               config.timeout = timeout
+       }
+}
+
+func withNumEvents(numEvents int) htlcEventOpt {
+       return func(config *htlcEventConfig) {
+               config.numEvents = numEvents
+       }
+}
+
+func withLinkFailure(detail routerrpc.FailureDetail) htlcEventOpt {
+       return func(config *htlcEventConfig) {
+               config.withLinkFailure = true
+               config.withFailureDetail = detail
+       }
+}
+
+func assertHtlcEvents(t *testing.T, c subscribeEventsClient,
+       opts ...htlcEventOpt) {
+
+       t.Helper()
+
+       cfg := defaultHtlcEventConfig()
+       for _, opt := range opts {
+               opt(cfg)
+       }
+
+       timeout := time.After(cfg.timeout)
+       events := make(chan *routerrpc.HtlcEvent)
+
+       go func() {
+               defer close(events)
+
+               for {
+                       evt, err := c.Recv()
+                       if err != nil {
+                               t.Logf("Received HTLC event error: %v", err)
+                               return
+                       }
+
+                       select {
+                       case events <- evt:
+                       case <-timeout:
+                               t.Logf("Htlc event receive timeout")
+                               return
+                       }
+               }
+       }()
+
+       var numEvents int
+       for {
+               type linkFailEvent = *routerrpc.HtlcEvent_LinkFailEvent
+
+               select {
+               case evt, ok := <-events:
+                       if !ok {
+                               t.Fatalf("Htlc event stream closed")
+                               return
+                       }
+
+                       if cfg.withLinkFailure {
+                               linkEvent, ok := evt.Event.(linkFailEvent)
+                               if !ok {
+                                       // We only count link failure events.
+                                       continue
+                               }
+
+                               if linkEvent.LinkFailEvent.FailureDetail !=
+                                       cfg.withFailureDetail {
+
+                                       continue
+                               }
+                       }
+
+                       numEvents++
+
+                       if numEvents == cfg.numEvents {
+                               return
+                       }
+
+               case <-timeout:
+                       t.Fatalf("Htlc event receive timeout")
+                       return
+               }
+       }
+}
+
 func assertNumHtlcs(t *testing.T, node *HarnessNode, expected int) {
        t.Helper()
 
diff --git a/itest/litd_custom_channels_test.go b/itest/litd_custom_channels_test.go
index e1984a56..071a3064 100644
--- a/itest/litd_custom_channels_test.go
+++ b/itest/litd_custom_channels_test.go
@@ -30,6 +30,7 @@ import (
        "github.com/lightningnetwork/lnd/fn"
        "github.com/lightningnetwork/lnd/lnrpc"
        "github.com/lightningnetwork/lnd/lnrpc/invoicesrpc"
+       "github.com/lightningnetwork/lnd/lnrpc/routerrpc"
        "github.com/lightningnetwork/lnd/lnrpc/walletrpc"
        "github.com/lightningnetwork/lnd/lntest"
        "github.com/lightningnetwork/lnd/lntest/node"
@@ -2601,12 +2602,12 @@ func testCustomChannelsBreach(ctx context.Context, net *NetworkHarness,
        t.Logf("Charlie UTXOs after breach: %v", toProtoJSON(t.t, charlieUTXOs))
 }
 
-// testCustomChannelsLiquidtyEdgeCasesCore is the core logic of the liquidity
+// testCustomChannelsLiquidityEdgeCasesCore is the core logic of the liquidity
 // edge cases. This test goes through certain scenarios that expose edge cases
 // and behaviors that proved to be buggy in the past and have been directly
 // addressed. It accepts an extra parameter which dictates whether it should use
 // group keys or asset IDs.
-func testCustomChannelsLiquidtyEdgeCasesCore(ctx context.Context,
+func testCustomChannelsLiquidityEdgeCasesCore(ctx context.Context,
        net *NetworkHarness, t *harnessTest, groupMode bool) {
 
        lndArgs := slices.Clone(lndArgsTemplate)
@@ -3114,6 +3115,11 @@ func testCustomChannelsLiquidtyEdgeCasesCore(ctx context.Context,
        // We now create a hodl invoice on Fabia, for 125k assets.
        hodlInv = createAssetHodlInvoice(t.t, erin, fabia, 125_000, assetID)
 
+       htlcStream, err := erin.RouterClient.SubscribeHtlcEvents(
+               ctx, &routerrpc.SubscribeHtlcEventsRequest{},
+       )
+       require.NoError(t.t, err)
+
        // Charlie tries to pay, this is not meant to succeed, as Erin does not
        // have enough assets to forward to Fabia.
        payInvoiceWithAssets(
@@ -3126,6 +3132,13 @@ func testCustomChannelsLiquidtyEdgeCasesCore(ctx context.Context,
        // outgoing one. So we expect a minimum of 4 HTLCs present on Erin.
        assertMinNumHtlcs(t.t, erin, 4)
 
+       // We also want to make sure that at least one failure occurred that
+       // hinted at the problem (not enough assets to forward).
+       assertHtlcEvents(
+               t.t, htlcStream, withNumEvents(1),
+               withLinkFailure(routerrpc.FailureDetail_INSUFFICIENT_BALANCE),
+       )
+
        logBalance(t.t, nodes, assetID, "with min 4 present HTLCs")
 
        // Now Fabia cancels the invoice, this is meant to cancel back any
@@ -3164,7 +3177,7 @@ func testCustomChannelsLiquidityEdgeCases(ctx context.Context,
 
        // Run liquidity edge cases and only use single asset IDs for invoices
        // and payments.
-       testCustomChannelsLiquidtyEdgeCasesCore(ctx, net, t, false)
+       testCustomChannelsLiquidityEdgeCasesCore(ctx, net, t, false)
 }
 
 // testCustomChannelsLiquidityEdgeCasesGroup is a test that runs through some
@@ -3174,7 +3187,7 @@ func testCustomChannelsLiquidityEdgeCasesGroup(ctx context.Context,
 
        // Run liquidity edge cases and only use group keys for invoices and
        // payments.
-       testCustomChannelsLiquidtyEdgeCasesCore(ctx, net, t, true)
+       testCustomChannelsLiquidityEdgeCasesCore(ctx, net, t, true)
 }
 
 // testCustomChannelsStrictForwarding is a test that tests the strict forwarding

@GeorgeTsagk GeorgeTsagk changed the base branch from master to group-key-support April 24, 2025 12:16
@GeorgeTsagk
Member Author

Thanks for the extra assert recommendation @guggero; I've enhanced the itest with it.

Also changed the base to group-key-support. We can wait for the base branch to merge before merging this.

@guggero commented Apr 24, 2025

Looks good! I had a formatting error in my message above, but it would still be nice to shave two full minutes off the itest run time:

(using this we can also get rid of the previous use of withPayErrSubStr("context deadline exceeded") that currently takes a full minute to time out)

@guggero guggero force-pushed the group-key-support branch 3 times, most recently from 2ce063a to 23f300d Compare April 25, 2025 08:17
This new helper will be useful to check that a payment attempt produced
actual HTLCs that got added over a channel. This will be used in a
follow-up commit to verify that the attempt was truly carried out.

To further extend the check of whether an HTLC was added over a channel,
we add a helper that checks for HTLC events. This will also be used in
the follow-up commit.

We enhance the liquidity edge cases itest with a new case which makes a
payment attempt over a last hop that uses assets and does not have
sufficient asset liquidity. The healthy expectation is for the HTLCs to
be added, since partial liquidity exists, but they should eventually
fail. As a sanity check we then end things with a small payment that can
succeed, and assert its completion.
@GeorgeTsagk
Member Author

GeorgeTsagk commented Apr 25, 2025

I found an easier way to shave off those extra seconds: simply lowering the amounts of the quote/invoice, leading to a faster failure.

Also added a few extra HTLC-related asserts.

This change is isolated in the last commit.

@guggero guggero left a comment
Thanks for the update, LGTM 🎉

We lower the amounts for the manual rfq quote and the corresponding
invoice in order to make the payment fail faster. Previously we'd get
many layers of sharding over the payment, which would lead to a timeout
that we'd have to wait for. Now, with the lower amounts, we hit the min
shard amount faster, leading to a NoRoute failure.
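The halving behaviour described in that commit message can be sketched roughly as follows. This is illustrative only, with made-up numbers and a simplified loop; lnd's real payment lifecycle and MPP splitting logic are more involved:

```go
package main

import "fmt"

// attemptsUntilMinShard is a simplified model of why lower amounts fail
// faster: on each failed attempt, the payment is retried with a smaller
// shard, and path finding gives up once a candidate shard would drop
// below the minimum shard amount (yielding a NoRoute-style failure).
func attemptsUntilMinShard(amt, minShard uint64) int {
	attempts := 0
	for amt >= minShard {
		attempts++
		amt /= 2 // each failed attempt retries with half the amount
	}
	return attempts
}

func main() {
	// A larger invoice amount needs more halvings before hitting the
	// minimum shard size, so it takes longer to fail.
	fmt.Println(attemptsUntilMinShard(125_000, 1_000)) // → 7
	fmt.Println(attemptsUntilMinShard(10_000, 1_000))  // → 4
}
```

With the lowered amounts, far fewer shard attempts are needed before the minimum is hit, which is what shaves the extra seconds off the itest.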