Description
Bugzilla Link | 52323 |
Version | unspecified |
OS | Linux |
Blocks | #4440 |
CC | @andyhhp,@chandlerc,@DougGregor,@efriedma-quic,@jyknight,@m-gupta,@nickdesaulniers,@pageexec,@phoebewang,@zygoloid,@rnk |
Extended Description
Hello
[FYI, this is being cross-requested of GCC too]
Linux and other kernel-level software make use of `-mindirect-branch=thunk-extern` to be able to alter the handling of indirect branches at boot. It turns out to be advantageous to inline the thunks when retpoline is not in use. https://lore.kernel.org/lkml/[email protected]/ is some infrastructure to make this work.
In some cases, we want to be able to inline an `lfence; jmp *%reg` thunk. This is fine for the low 8 registers, where the sequence fits in the 5 bytes of the original `call`, but not for %r{8..15}, where the REX prefix pushes the replacement size to 6 bytes.
It would be very useful to have a code-gen option to write out `call %cs:__x86_indirect_thunk_r{8..15}`, where the redundant %cs prefix increases the instruction length to 6 bytes, allowing the non-retpoline form to be inlined.
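The byte counts behind this request can be checked with a small sketch. The encodings are the standard x86-64 ones (`lfence` is 0F AE E8, `jmp *%reg` is FF /4 with a REX.B prefix for %r8..%r15, `call rel32` is E8 plus a 4-byte displacement, and the %cs override is 2E). The helper names are illustrative only and are not part of any compiler or kernel API.

```python
LFENCE = bytes([0x0F, 0xAE, 0xE8])
CS_PREFIX = bytes([0x2E])


def jmp_indirect(reg: int) -> bytes:
    """Encode `jmp *%reg` for 64-bit register number 0..15."""
    modrm = 0xE0 | (reg & 7)                   # mod=11, opcode extension /4, rm=low 3 bits
    rex = bytes([0x41]) if reg >= 8 else b""   # REX.B selects %r8..%r15
    return rex + bytes([0xFF, modrm])


def call_thunk(cs_prefix: bool = False) -> bytes:
    """Encode `call rel32` (displacement left as zero), optionally %cs-prefixed."""
    call = bytes([0xE8, 0, 0, 0, 0])
    return (CS_PREFIX + call) if cs_prefix else call


def fits(reg: int, cs_prefix: bool) -> bool:
    """True if `lfence; jmp *%reg` can overwrite the call site in place."""
    return len(LFENCE + jmp_indirect(reg)) <= len(call_thunk(cs_prefix))


for reg in (0, 8):
    print(reg, fits(reg, cs_prefix=False), fits(reg, cs_prefix=True))
```

Under these assumptions the plain 5-byte call only has room for the low-register form, while the %cs-prefixed 6-byte call has room for either.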
Relatedly, x86 straight-line speculation has been discussed before, but without any action taken. It would be helpful to have a code-gen option which would emit `int3` following any `ret` instruction and any indirect jump, as neither of these two cases has following architectural execution.
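As a rough illustration of the requested behaviour (not of how a compiler backend would implement it), the sketch below post-processes AT&T-syntax assembly lines and appends `int3` after each `ret` and each indirect `jmp`. The function name and the simple text matching are assumptions made for the example; a real implementation would operate on machine instructions rather than text.

```python
def add_sls_barriers(asm_lines):
    """Return a copy of asm_lines with `int3` after every ret / indirect jmp."""
    out = []
    for line in asm_lines:
        out.append(line)
        parts = line.split()
        if not parts:
            continue
        mnemonic = parts[0]
        operand = parts[1] if len(parts) > 1 else ""
        is_ret = mnemonic in ("ret", "retq")
        # In AT&T syntax an indirect branch target is written with a leading '*'.
        is_indirect_jmp = mnemonic in ("jmp", "jmpq") and operand.startswith("*")
        if is_ret or is_indirect_jmp:
            out.append("\tint3")
    return out


example = ["\tjmp *%rax", "\tjmp .L1", "\tcall *%rbx", "\tret"]
print("\n".join(add_sls_barriers(example)))
```

Direct jumps and calls are left alone, since execution architecturally continues at a known target or at the following instruction.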
The reason these two are related is that if both options are in use, we want an extra byte of replacement space to be able to inline `lfence; jmp *%reg; int3`.
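The extra byte can be seen by counting the worst case directly. The sketch below uses %r11 as an arbitrary example of a high register and assumes the standard encodings; the constant names are illustrative.

```python
LFENCE = b"\x0f\xae\xe8"      # lfence
JMP_R11 = b"\x41\xff\xe3"     # jmp *%r11 (REX.B + FF /4)
INT3 = b"\xcc"                # int3

CS_CALL_LEN = 1 + 5           # %cs prefix + call rel32

inline_seq = LFENCE + JMP_R11 + INT3
extra_bytes_needed = len(inline_seq) - CS_CALL_LEN

print(len(inline_seq), extra_bytes_needed)
```

With these encodings the inlined sequence is one byte longer than the %cs-prefixed call, which is the additional byte of padding asked for above.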
Third, Clang has been observed to emit conditional tail calls as `Jcc __x86_indirect_thunk_*`. This is a 6-byte source size, but needs up to 9 bytes of space for inlining, including an `int3` for straight-line speculation reasons (see https://lore.kernel.org/lkml/[email protected]/ for full details). It might be enough to simply prohibit an optimisation like this when trying to pad retpolines for inlineability.
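The 6-versus-9 figure can be reproduced under one plausible reading of the inlined form: invert the condition with a short `Jncc rel8` that skips over `lfence; jmp *%reg; int3`. That expansion is an assumption made for this sketch; the linked thread is the authority on the exact sequence. The function name and parameters are hypothetical.

```python
LFENCE = b"\x0f\xae\xe8"
INT3 = b"\xcc"
JCC_REL32_LEN = 2 + 4          # 0F 8x + rel32, as emitted for `Jcc thunk`


def inline_cond_tail_call(cc: int, reg: int, sls: bool = True) -> bytes:
    """Encode `Jncc 1f; lfence; jmp *%reg; [int3]; 1:` for condition code cc (0..15)."""
    rex = b"\x41" if reg >= 8 else b""
    jmp = rex + bytes([0xFF, 0xE0 | (reg & 7)])
    body = LFENCE + jmp + (INT3 if sls else b"")
    # Short-form Jcc is 0x70+cc; flipping the low bit of cc inverts the condition.
    jncc = bytes([0x70 | (cc ^ 1), len(body)])
    return jncc + body


worst = inline_cond_tail_call(cc=0x4, reg=11)   # je via %r11, with int3
print(len(worst), len(worst) - JCC_REL32_LEN)
```

Under this reading the worst case (high register plus `int3`) is 3 bytes larger than the original `Jcc rel32`, so prefix padding alone would need to be correspondingly larger than for the plain call case.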