lightning-net-tokio does not seem to properly handle TCP closure #4205

@ZmnSCPxj-jr

Referring to this:

https://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable

The "correct" general pattern, as described in that article, is:

  • shutdown(fd, SHUT_WR) -- in Tokio terms, use AsyncWriteExt::shutdown
  • With a timeout:
    • read() in a loop until you get EOF.
  • close() it.
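
The steps above can be sketched roughly as follows. This is only an illustration using blocking `std::net` sockets, not the lightning-net-tokio code; `graceful_close` and its `limit` parameter are hypothetical names introduced here. In Tokio the same shape would presumably use `AsyncWriteExt::shutdown` for the half-close and wrap the read loop in `tokio::time::timeout`.

```rust
use std::io::{self, Read};
use std::net::{Shutdown, TcpStream};
use std::time::{Duration, Instant};

/// Hypothetical helper: half-close, drain until EOF (bounded by `limit`),
/// then close. Returns how many bytes the peer sent before its EOF.
fn graceful_close(mut stream: TcpStream, limit: Duration) -> io::Result<usize> {
    // 1. shutdown(fd, SHUT_WR): a FIN is queued after any pending data,
    //    while the read side stays open.
    stream.shutdown(Shutdown::Write)?;

    // 2. With a timeout: read() in a loop until EOF.
    let deadline = Instant::now() + limit;
    let mut buf = [0u8; 4096];
    let mut drained = 0;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            break; // overall deadline reached, give up waiting for EOF
        }
        // A zero timeout is rejected by set_read_timeout, hence the check above.
        stream.set_read_timeout(Some(remaining))?;
        match stream.read(&mut buf) {
            Ok(0) => break,        // EOF: the peer has closed its write side
            Ok(n) => drained += n, // discard whatever the peer still sends
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e)
                if matches!(
                    e.kind(),
                    io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut
                ) =>
            {
                break // read timed out
            }
            Err(e) => return Err(e),
        }
    }

    // 3. close(): dropping the stream closes the file descriptor.
    drop(stream);
    Ok(drained)
}
```

The deadline covers the whole drain loop rather than each individual read, so a peer that keeps trickling data cannot hold the socket open indefinitely.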

While the BOLT8 protocol DOES provide the length of the expected data, that does not help if the length prefix AND the message it describes both end up in the same packet scheduled by the kernel. In that case, on a clean disconnection request, the kernel can discard both the length prefix and the message itself on close(), so the receiving end never sees either.

This can affect integration tests that trigger a clean disconnection as part of a clean shutdown. The resulting close() can drop the last message sent to the peer, and the test then fails because it is waiting for a message that never arrives, all because of the early close() without a preceding shutdown(). (This really happened over in c-lightning about half a decade ago.)
