Regression: WebAssembly backend does not generate return_call for musttail calls with multiple return values #133292

Open
camilstaps opened this issue Mar 27, 2025 · 11 comments

Comments

@camilstaps

camilstaps commented Mar 27, 2025

Given x.ll

%t = type { i64, i64 }

declare %t @f()

define private %t @g() {
entry:
  %0 = musttail call %t @f()
  ret %t %0
}

the WebAssembly backend does not generate a return_call but a plain call, thus growing the stack.

I'm using LLVM 20.1.2 with the invocation llc-20 -mattr=+multivalue,+tail-call -mtriple=wasm64-unknown-unknown --filetype=asm -o x.wat x.ll, but the same issue arises on wasm32 and without multivalue enabled.

When changing the type to %t = type i64, return_call is used as expected.

The (faulty) generated WebAssembly for the module above is:

	.file	"x.ll"
	.globaltype	__stack_pointer, i64
	.functype	f (i64) -> ()
	.functype	.Lg (i64) -> ()
	.section	.text..Lg,"",@
	.type	.Lg,@function                   # -- Begin function g
.Lg:                                    # @g
	.functype	.Lg (i64) -> ()
	.local  	i64, i64
# %bb.0:                                # %entry
	global.get	__stack_pointer
	i64.const	16
	i64.sub 
	local.tee	1
	global.set	__stack_pointer
	local.get	1
	call	f
	local.get	1
	i64.load	0
	local.set	2
	local.get	0
	local.get	1
	i64.load	8
	i64.store	8
	local.get	0
	local.get	2
	i64.store	0
	local.get	1
	i64.const	16
	i64.add 
	global.set	__stack_pointer
                                        # fallthrough-return
	end_function
                                        # -- End function
	.section	.custom_section.target_features,"",@
	.int8	10
	.int8	43
	.int8	11
	.ascii	"bulk-memory"
	.int8	43
	.int8	15
	.ascii	"bulk-memory-opt"
	.int8	43
	.int8	22
	.ascii	"call-indirect-overlong"
	.int8	43
	.int8	10
	.ascii	"multivalue"
	.int8	43
	.int8	15
	.ascii	"mutable-globals"
	.int8	43
	.int8	19
	.ascii	"nontrapping-fptoint"
	.int8	43
	.int8	15
	.ascii	"reference-types"
	.int8	43
	.int8	8
	.ascii	"sign-ext"
	.int8	43
	.int8	9
	.ascii	"tail-call"
	.int8	43
	.int8	8
	.ascii	"memory64"
	.section	.text..Lg,"",@

Note that this is a regression: LLVM 18 would generate a return_call:

.Lg:                                    # @g
	.functype	.Lg () -> (i64, i64)
# %bb.0:                                # %entry
	return_call	f
	end_function
@llvmbot
Member

llvmbot commented Mar 27, 2025

@llvm/issue-subscribers-backend-webassembly

@camilstaps changed the title from "WebAssembly backend does not generate return_call for musttail calls with multiple return values" to "Regression: WebAssembly backend does not generate return_call for musttail calls with multiple return values" on Mar 28, 2025
@camilstaps
Author

c921ac7 seems relevant (by which I don't mean to say it introduced the regression), as does the discussion here: #82714 (review)

> we will still emit multivalue into the target features section if -mmultivalue is passed, but that will no longer affect codegen at all unless the new temporary flag is also passed

Indeed if I pass the "new temporary flag" -target-abi experimental-mv, return_call is used. However, the commit message mentions

> 'experimental-mv' is just one multivalue ABI we currently have, and it is still experimental, meaning it is not very well optimized or tuned for performance.

Is there an overview of the currently supported ABIs, and which do/do not have this feature?

Regardless of this, it seems to me that any ABI should either use return_call or throw an error when musttail is used.

@aheejin any chance you can help me understand the current situation?

@sbc100
Collaborator

sbc100 commented Mar 28, 2025

This sounds like a bug regardless of multivalue, right? The multivalue question is not unimportant but can be separated from the bug (which occurs even without multivalue and without wasm64).

@camilstaps
Author

@sbc100 I agree. When I wrote the initial issue I did not realize that LLVM now adheres to a standardized ABI; without that constraint, the solution could be to use multiple return values (as in LLVM 18). However, if the ABI must be followed, the solution is (I suppose) either to throw an error if the tail call cannot be generated, or to generate code with __stack_pointer and return_call.

@dschuff
Member

dschuff commented Mar 28, 2025

Regarding the ABI, I would say that currently the only "truly stable" ABI (which I would definitely say we shouldn't break) is the default C ABI for wasm32 (i.e. where musttail isn't used since the C frontend doesn't generate musttail even when tailcall is enabled; and where the experimental multivalue ABI is disabled). I think it could make sense to document an "LLVM-facing ABI" (i.e. one that specifies the behavior of LLVM constructs like musttail and maybe other things that don't appear in C), in an effort to get more compatibility and/or stability for non-C languages. Then, how stable we consider this ABI to be would probably be a function of usage and regular cost/benefit (i.e. we might consider changing/breaking if the breakage would affect very few users and we think we'd get a large benefit, etc).

@sbc100
Collaborator

sbc100 commented Mar 31, 2025

> Regarding the ABI, I would say that currently the only "truly stable" ABI (which I would definitely say we shouldn't break) is the default C ABI for wasm32 (i.e. where musttail isn't used since the C frontend doesn't generate musttail even when tailcall is enabled; and where the experimental multivalue ABI is disabled). I think it could make sense to document an "LLVM-facing ABI" (i.e. one that specifies the behavior of LLVM constructs like musttail and maybe other things that don't appear in C), in an effort to get more compatibility and/or stability for non-C languages. Then, how stable we consider this ABI to be would probably be a function of usage and regular cost/benefit (i.e. we might consider changing/breaking if the breakage would affect very few users and we think we'd get a large benefit, etc).

Is there no way to express musttail in C/C++ today?

@dschuff
Member

dschuff commented Mar 31, 2025

There's no way in "standard" C. Clang has an attribute that lets you do it.
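
For illustration, a minimal sketch of how that Clang attribute is typically applied to a return statement. Nothing here is taken from the thread: `MUSTTAIL` and `sum_to` are hypothetical names, and the `__has_attribute` guard is an assumption added so the file still compiles with compilers that lack the attribute (in that case the call is an ordinary call with no tail-call guarantee). Clang requires the callee's signature to match the caller's, which self-recursion satisfies trivially.

```c
#include <assert.h>

/* MUSTTAIL is a hypothetical helper macro (not from the thread). It expands
   to the musttail statement attribute when the compiler reports support for
   it, and to nothing otherwise. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Hypothetical example function: adds 1 + 2 + ... + n to acc.
   The scalar return type keeps this away from the struct-return case
   discussed in this issue. */
long long sum_to(long long n, long long acc) {
    assert(n >= 0);
    if (n == 0)
        return acc;
    /* With the attribute active, the compiler must emit a tail call here
       or reject the program. */
    MUSTTAIL return sum_to(n - 1, acc + n);
}
```

For example, `sum_to(1000, 0)` evaluates to 500500. Whether a wasm `return_call` is emitted for such a function still depends on the target features (for instance `-mtail-call`) and on the backend behavior this issue is about.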

@camilstaps
Author

camilstaps commented Apr 1, 2025

> I think it could make sense to document an "LLVM-facing ABI" (i.e. one that specifies the behavior of LLVM constructs like musttail and maybe other things that don't appear in C), in an effort to get more compatibility and/or stability for non-C languages. Then, how stable we consider this ABI to be would probably be a function of usage and regular cost/benefit (i.e. we might consider changing/breaking if the breakage would affect very few users and we think we'd get a large benefit, etc).

Such an ABI would be helpful for us.

I have now also seen WebAssembly/tool-conventions#247, and it seems to make sense if this were specifiable on the function level, e.g. as a property of the calling convention – just like tail calls cannot be generated with all calling conventions for other targets. With such a setup ABI stability would also be less important (to us, at least): we could use the standard ABI for any FFI and the "LLVM-facing ABI" for anything internal, somewhat similar to how one might use ccc/fastcc. I don't know if this makes sense in the larger architecture; just a thought.

@sbc100
Collaborator

sbc100 commented Apr 1, 2025

> There's no way in "standard" C. Clang has an attribute that lets you do it.

So we could rewrite the failing snippet above in C to reproduce this failure from C?

@sbc100
Collaborator

sbc100 commented Apr 1, 2025

I assume this failure (i.e. the failure to generate a wasm tail call) only happens with struct returns?

@camilstaps
Author

> So we could rewrite the failing snippet above in C to reproduce this failure from C?

Not sure. The obvious thing to try

struct s { long long x; long long y; };

extern struct s f (void);

struct s g (void) {
	[[clang::musttail]] return f();
}

compiles to LLVM IR simulating the stack, avoiding the issue:

define hidden void @g(ptr dead_on_unwind noalias writable sret(%struct.s) align 8 %0) #0 {
  %2 = alloca %struct.s, align 8
  musttail call void @f(ptr dead_on_unwind writable sret(%struct.s) align 8 %0)
  ret void

3:                                                ; No predecessors!
  ret void
}

declare void @f(ptr dead_on_unwind writable sret(%struct.s) align 8) #1

This is with clang-20 --target=wasm32 -mmultivalue -mtail-call -c -emit-llvm x.c.

> I assume this failure (i.e. the failure to generate a wasm tail call) only happens with struct returns?

It seems so (and of course if the translation to LLVM IR ensures that there are no such returns, the failure cannot be reproduced from C).
