Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

askrene getroutes produces a malformed response #7904

Open
Lagrang3 opened this issue Dec 5, 2024 · 3 comments · May be fixed by #7924
Open

askrene getroutes produces a malformed response #7904

Lagrang3 opened this issue Dec 5, 2024 · 3 comments · May be fixed by #7924
Labels

Comments

@Lagrang3
Copy link
Collaborator

Lagrang3 commented Dec 5, 2024

Issue and Steps to Reproduce

In regtest the command lightning-cli getroutes produces this output:

lightning-cli: Malformed response '63x17618/1 -> 19801029msat/220 15x485x17484/0 -> 19800000msat/120 122x485x35423/1 -> 19797031msat/40 122x323x35357/0 -> 19793073x3301x1646/0 -> 35694180msat/74 0x122x2044/1 -> 35679909msat/40 122x323x35357/0 -> 35672775msat/6"}}5x63x17618/1 -> 19801029msat/220 15x485x17484/0 -> 19800000msat/120 122x485x35423/1 -> 19797031msat/40 122x323x35357/0 -> 19793073msat/6"}}

{"jsonrpc":"2.0","method":"message","params":{"id":"cli:getroutes#170504","level":"info","message":"Flow 2/4: 35712991msat/218 0x3301x1646/0 -> 35694180msat/74 0x122x2044/1 -> 35679909msat/40 122x323x35357/0 -> 35672775msat/6"}}x90x29364/1 -> 19801029msat/364 15x63x17618/1 -> 19801029msat/220 15x485x17484/0 -> 19800000msat/120 122x485x35423/1 -> 19797031msat/40 122x323x35357/0 -> 19793073msat/6"}}

{"jsonrpc":"2.0","method":"message","params":{"id":"cli:getroutes#170504","level":"info","message":"Flow 2/4: 35712991msat/218 0x3301x1646/0 -> 35694180msat/74 0x122x2044/1 -> 35679909msat/40 122x323x35357/0 -> 35672775msat/6"}}'

To reproduce it is enough to run lightning on regtest using the compressed gossmap ./tests/data/gossip-store-2024-09-22.compressed
and call getroutes from node 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59
to node 032592ebeab65d2c20a60b6eec9bc1b80df9148d2f71d91a670fe7f4f3c25a5bb9, ie.

lightning-cli -k getroutes \
        source=022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 \
        destination=032592ebeab65d2c20a60b6eec9bc1b80df9148d2f71d91a670fe7f4f3c25a5bb9 \
        amount_msat=100000sat \
        final_cltv=6 \
        layers="[]" \
        maxfee_msat=500sat

My guess is that somehow plugin_notify_message is mixing with getroutes response, are they racing for the same pipe output?
Commenting the plugin_notify_messsage line in rq_log function in askrene.c file removes the problem.

@Lagrang3 Lagrang3 added the bug label Dec 5, 2024
@rustyrussell
Copy link
Contributor

I cannot reproduce this? I tried both manually (though source=022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59 doesn't exist in the compressed gossmap, that's l2's address) and as a Python call:

@pytest.mark.slow_test
def test_bad_response(node_factory, bitcoind):
    # From tests/data/gossip-store-2024-09-22-node-map.xz:
    # Me: 3301:024b9a1fa8e006f1e3937f65f66c408e6da8e1ca728ea43222a7381df1cc449605:BLUEIRON
    # So we make l2 node 3301.
    outfile = tempfile.NamedTemporaryFile(prefix='gossip-store-')
    nodeids = subprocess.check_output(['devtools/gossmap-compress',
                                       'decompress',
                                       'tests/data/gossip-store-2024-09-22.compressed',
                                       outfile.name]).decode('utf-8').splitlines()

    l1 = node_factory.get_node(gossip_store_file=outfile.name)
    subprocess.check_output(['cli/lightning-cli',
                             '--network={}'.format(TEST_NETWORK),
                             '--lightning-dir={}'
                             .format(l1.daemon.lightning_dir),
                             "getroutes",
                             f"source={nodeids[3301]}",
                             "destination=032592ebeab65d2c20a60b6eec9bc1b80df9148d2f71d91a670fe7f4f3c25a5bb9",
                             "amount_msat=100000sat",
                             "final_cltv=6",
                             "layers=[]",
                             "maxfee_msat=500sat"])

@Lagrang3
Copy link
Collaborator Author

Lagrang3 commented Dec 6, 2024

Yes. Sorry, in my tests I was mapping node 3301 to 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59.
I am quite not able to reproduce it on pytest.
I see it happen using a bash script:

#!/bin/sh

# environment taken from contrib/startup_regtest.sh

export GOSSMAP_COMPRESS=./devtools/gossmap-compress

# Do the Right Thing if we're currently in top of srcdir.
if [ -z "$LIGHTNING_BIN" ] && [ -x cli/lightning-cli ] && [ -x lightningd/lightningd ]; then
	LIGHTNING_BIN=$(pwd)
fi

if [ -z "$LIGHTNING_BIN" ]; then
	# Already installed maybe?  Prints
	if ! type lightning-cli >/dev/null 2>&1 ; then
		echo lightning-cli: not found
		return 1
	fi
	if ! type lightningd >/dev/null 2>&1 ; then
		echo lightningd: not found
		return 1
	fi
	LCLI=lightning-cli
	LIGHTNINGD=lightningd
else
	LCLI="$LIGHTNING_BIN"/cli/lightning-cli
	LIGHTNINGD="$LIGHTNING_BIN"/lightningd/lightningd
	# This mirrors "type" output above.
fi

if [ -z "$LIGHTNING_DIR" ]; then
    # Default is to use the /tmp directory
    LIGHTNING_DIR=/tmp/askrene_benchmark/lightning
fi

if [ -z "$BITCOIN_DIR" ]; then
    BITCOIN_DIR=/tmp/askrene_benchmark/bitcoin
fi

# shellcheck disable=SC2153
if [ -z "$BITCOIN_BIN" ]; then
	# Already installed maybe?  Prints
	if ! type bitcoin-cli >/dev/null 2>&1 ; then
		echo bitcoin-cli: not found
		return 1
	fi
	if ! type bitcoind >/dev/null 2>&1 ; then
		echo bitcoind: not found
		return 1
	fi
	BCLI=bitcoin-cli
	BITCOIND=bitcoind
else
    BCLI="$BITCOIN_BIN"/bitcoin-cli
    BITCOIND="$BITCOIN_BIN"/bitcoind
fi

if [ -z "$COMPRESSED_GOSSMAP" ]; then
	COMPRESSED_GOSSMAP="./tests/data/gossip-store-2024-09-22.compressed"
fi


echo lightning-cli is "$LCLI"
echo lightningd is "$LIGHTNINGD"
echo lightning-dir is "$LIGHTNING_DIR"
export LCLI="$LCLI"
export LIGHTNINGD="$LIGHTNINGD"
export LIGHTNING_DIR="$LIGHTNING_DIR"

echo bitcoin-cli is "$BCLI"
echo bitcoind is "$BITCOIND"
echo bitcoin-dir is "$BITCOIN_DIR"
export BCLI="$BCLI"
export BITCOIND="$BITCOIND"
export BITCOIN_DIR="$BITCOIN_DIR"

echo compressed gossmap is "$COMPRESSED_GOSSMAP"
export COMPRESSED_GOSSMAP="$COMPRESSED_GOSSMAP"

if test ! -f "$COMPRESSED_GOSSMAP"; then
	echo Compressed gossmap file not found, aborting
	return 1
fi

# generate gossmap file
# load a single lightning node using predefined gossmap

wait_for_lightningd() {
	echo awaiting lightningd...
	for i in $(seq "5"); do
		if $LCLI --lightning-dir="$LIGHTNING_DIR" getinfo > /dev/null 2>&1; then
			break
		else
			sleep 1
		fi
	done
}

start_lightningd() {
	socket=7070
	mkdir -p "$LIGHTNING_DIR"
	# Node config
	cat <<- EOF > "$LIGHTNING_DIR/config"
	network=regtest
	log-level=debug
	log-file=$LIGHTNING_DIR/log
	addr=localhost:$socket
	allow-deprecated-apis=false
	developer
	dev-fast-gossip
	dev-bitcoind-poll=5
	experimental-dual-fund
	experimental-splicing
	funder-policy=match
	funder-policy-mod=100
	funder-min-their-funding=10000
	funder-per-channel-max=100000
	funder-fuzz-percent=0
	funder-lease-requests-only=false
	lease-fee-base-sat=2sat
	lease-fee-basis=50
	invoices-onchain-fallback
	EOF

	# Start the lightning nodes
	test -f "$LIGHTNING_DIR/lightningd-regtest.pid" || \
		"$LIGHTNINGD" "--network=regtest" "--lightning-dir=$LIGHTNING_DIR" "--bitcoin-datadir=$BITCOIN_DIR" "--database-upgrade=true" &
}

ln_cli(){
	$LCLI --lightning-dir="$LIGHTNING_DIR" $@
}

start_btc(){
	# Start bitcoind in the background
	mkdir -p "$BITCOIN_DIR"
	test -f "$BITCOIN_DIR/regtest/bitcoind.pid" || \
		"$BITCOIND" -datadir="$BITCOIN_DIR" -regtest -txindex -fallbackfee=0.00000253 -daemon

	# Wait for it to start.
	while ! "$BCLI" -datadir="$BITCOIN_DIR" -regtest ping 2> /tmp/null; do echo "awaiting bitcoind..." && sleep 1; done

	# Check if default wallet exists
	if ! "$BCLI" -datadir="$BITCOIN_DIR" -regtest listwalletdir | jq -r '.wallets[] | .name' | grep -wqe 'default' ; then
		# wallet dir does not exist, create one
		echo "Making \"default\" bitcoind wallet."
		"$BCLI" -datadir="$BITCOIN_DIR" -regtest createwallet default >/dev/null 2>&1
	fi

	# Check if default wallet is loaded
	if ! "$BCLI" -datadir="$BITCOIN_DIR" -regtest listwallets | jq -r '.[]' | grep -wqe 'default' ; then
		echo "Loading \"default\" bitcoind wallet."
		"$BCLI" -datadir="$BITCOIN_DIR" -regtest loadwallet default >/dev/null 2>&1
	fi

	# Kick it out of initialblockdownload if necessary
	if "$BCLI" -datadir="$BITCOIN_DIR" -regtest getblockchaininfo | grep -q 'initialblockdownload.*true'; then
		"$BCLI" -datadir="$BITCOIN_DIR" -regtest generatetoaddress 1 "$($BCLI -datadir="$BITCOIN_DIR" -regtest getnewaddress)" > /dev/null
	fi

	echo "Commands: "
	echo "	bt_cli"
}
bt_cli(){
	$BCLI -datadir="$BITCOIN_DIR" -regtest $@
}

generate_gossmap(){
	mkdir -p "$LIGHTNING_DIR/regtest"
	"$GOSSMAP_COMPRESS" decompress  \
		"$COMPRESSED_GOSSMAP" "$LIGHTNING_DIR/regtest/gossip_store"
}

NODEIDS=""

start_cln(){
	GENIDS=`generate_gossmap`
	readarray -t NODEIDS <<< "$GENIDS"
	start_lightningd
	wait_for_lightningd

	echo "Commands: "
	echo "	ln_cli"
}

stop_cln() {
	ln_cli stop
	sleep 1
	test ! -f "$LIGHTNING_DIR/lightningd-regtest.pid" || \
		(kill "$(cat "$LIGHTNING_DIR/lightningd-regtest.pid")"; \
		rm "$LIGHTNING_DIR/lightningd-regtest.pid")
}

stop_btc(){
	bt_cli stop
	sleep 1
	test ! -f "$BITCOIN_DIR/regtest/bitcoind.pid" || \
		(kill "$(cat "$BITCOIN_DIR/regtest/bitcoind.pid")"; \
		rm "$BITCOIN_DIR/regtest/bitcoind.pid")
}

start_ln(){
	start_btc
	start_cln
}

stop_ln(){
	stop_cln
	stop_btc
}

destroy_ln(){
	rm -rf $LIGHTNING_DIR
	rm -rf $BITCOIN_DIR
}

cleanup(){
	stop_ln
	destroy_ln
}

trap cleanup EXIT

echo Useful commands:
echo "  stop_ln: shutdown"
echo "  destroy_ln: remove ln directories"

start_ln

source=${NODEIDS[3301]}

for idx in "${!NODEIDS[@]}"; do
	destination=${NODEIDS[$idx]}
	echo ID: $idx, source=$source, destination=$destination
	result=`ln_cli -k getroutes source=$source destination=$destination amount_msat=100000sat final_cltv=6 layers="[]" maxfee_msat=500sat`
	if [ $idx -ge 500 ]; then
		break
	fi
done

stop_ln

# time_list=`cat "$LIGHTNING_DIR/log" | grep "notify msg info: minflow elapsed time" | awk '{print $(NF-1)}'`
# echo $time_list | python3 plot-perf.py

destroy_ln

@Lagrang3
Copy link
Collaborator Author

Lagrang3 commented Dec 10, 2024

It's lightning-cli

==143570== Memcheck, a memory error detector
==143570== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==143570== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==143570== Command: /home/lagrange/BACKUP/l4-appdata/github/lagrang3/lightning/cli/lightning-cli --lightning-dir=/tmp/askrene_benchmark/lightning -k getroutes source=032ed0d87ba2bd68e3a386717cf2faaae4fa6d6da247986b1997113930e4f841d5 destination=03b2f16bf472dd03c55c2ce9910aab717321db4489cd87df5225adadb08031da4b amount_msat=100000sat final_cltv=6 layers=[] maxfee_msat=500sat
==143570== 
==143570== Invalid read of size 1
==143570==    at 0x484A430: memmove (vg_replace_strmem.c:1382)
==143570==    by 0x10C3D2: main (lightning-cli.c:871)
==143570==  Address 0x4a62f80 is 0 bytes after a block of size 1,040 alloc'd
==143570==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==143570==    by 0x11402E: allocate (tal.c:256)
==143570==    by 0x11471E: tal_alloc_ (tal.c:473)
==143570==    by 0x1147EA: tal_alloc_arr_ (tal.c:517)
==143570==    by 0x10C206: main (lightning-cli.c:816)
==143570== 
==143570== Invalid read of size 1
==143570==    at 0x484A43D: memmove (vg_replace_strmem.c:1382)
==143570==    by 0x10C3D2: main (lightning-cli.c:871)
==143570==  Address 0x4a62f81 is 1 bytes after a block of size 1,040 alloc'd
==143570==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==143570==    by 0x11402E: allocate (tal.c:256)
==143570==    by 0x11471E: tal_alloc_ (tal.c:473)
==143570==    by 0x1147EA: tal_alloc_arr_ (tal.c:517)
==143570==    by 0x10C206: main (lightning-cli.c:816)
==143570== 
==143570== Invalid write of size 1
==143570==    at 0x484A433: memmove (vg_replace_strmem.c:1382)
==143570==    by 0x10C3D2: main (lightning-cli.c:871)
==143570==  Address 0x4a62f80 is 0 bytes after a block of size 1,040 alloc'd
==143570==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==143570==    by 0x11402E: allocate (tal.c:256)
==143570==    by 0x11471E: tal_alloc_ (tal.c:473)
==143570==    by 0x1147EA: tal_alloc_arr_ (tal.c:517)
==143570==    by 0x10C206: main (lightning-cli.c:816)

@Lagrang3 Lagrang3 linked a pull request Dec 10, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants