Unnecessarily large constant created from reordering add and shift #123239
Comments
trunk:
define i1 @foo(i64 %x) {
entry:
  %0 = add i64 %x, 3940649673949184
  %1 = icmp ult i64 %0, 844424930131968
  ret i1 %1
}
vs 17.x:
define i1 @foo(i64 %x) {
entry:
  %shr = lshr i64 %x, 48
  %conv = trunc i64 %shr to i32
  %0 = add nsw i32 %conv, -65522
  %1 = icmp ult i32 %0, 3
  ret i1 %1
}
IIRC this has been reported before |
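To see why the two IR forms compute the same predicate, note that 3940649673949184 is 14 << 48 and 844424930131968 is 3 << 48, while -65522 is 14 - 65536. The C sketch below mirrors both forms and compares them over every 16-bit tag with a few low-bit patterns. The function names are made up for illustration; this is a sanity check, not a proof (Alive2 is the right tool for that).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* trunk form: a single 64-bit add and compare, with wide constants. */
static bool wide_form(uint64_t x) {
    return x + 3940649673949184ULL < 844424930131968ULL;
}

/* 17.x form: shift first, then a 32-bit add and compare. */
static bool narrow_form(uint64_t x) {
    uint32_t tag = (uint32_t)(x >> 48);
    /* Unsigned wraparound gives the same bits as `add i32 %conv, -65522`. */
    return tag - 65522u < 3u;
}

/* Counts inputs on which the two forms disagree (hypothetical helper). */
static unsigned count_mismatches(void) {
    static const uint64_t low_bits[4] = {
        0x0ULL, 0x1ULL, 0x0000FFFFFFFFFFFFULL, 0x0000123456789ABCULL
    };
    /* The wide constants are the narrow ones moved up by 48 bits. */
    assert((14ULL << 48) == 3940649673949184ULL);
    assert((3ULL << 48) == 844424930131968ULL);

    unsigned bad = 0;
    for (uint64_t tag = 0; tag < 65536; ++tag) {
        for (unsigned i = 0; i < 4; ++i) {
            uint64_t x = (tag << 48) | low_bits[i];
            if (wide_form(x) != narrow_form(x)) {
                ++bad;
            }
        }
    }
    return bad;
}
```

Calling `count_mismatches()` from a test harness should return 0 if the two forms agree on all sampled inputs.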
We lower to:
It should be possible to fold some
|
@llvm/issue-subscribers-backend-x86
Author: dzaima (dzaima)
https://godbolt.org/z/xoKf6bnTb
The code:
#include <stdint.h>
#include <stdbool.h>
bool foo(uint64_t x) {
  uint16_t tag = x >> 48;
  return tag >= 0b1111111111110010 && tag <= 0b1111111111110100;
}
with -O3 as of clang 19 (and still in trunk) compiles to:
foo:
  movabs rax, 3940649673949184
  add rax, rdi
  shr rax, 48
  cmp eax, 3
  setb al
  ret
whereas 18.0 did this, which is strictly better (i.e. is the exact same set of instructions, just in a different order and without movabs):
foo:
  shr rdi, 48
  add edi, -65522
  cmp edi, 3
  setb al
  ret |
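The extra `movabs` appears because x86-64 `add` takes at most a 32-bit sign-extended immediate, so a wider constant has to be loaded into a register first. A small sketch of that check, with `fits_simm32` as a made-up helper name rather than anything from LLVM:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if imm can be encoded as a sign-extended 32-bit immediate. */
static bool fits_simm32(int64_t imm) {
    return imm >= INT32_MIN && imm <= INT32_MAX;
}

/* Applies the check to the constants from the two listings above. */
static void check_issue_constants(void) {
    /* 14 << 48: too wide, so the clang 19 output needs movabs. */
    assert(!fits_simm32(3940649673949184LL));
    /* -65522 and 3 both fit, so the 18.0 output needs no extra load. */
    assert(fits_simm32(-65522));
    assert(fits_simm32(3));
}
```

Performing the shift before the add shrinks the constant by 48 bits, which is what lets it fit in the immediate field.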
Hi! This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below. |
Hey, can I pick this up? I'm new to the project and just looking for some beginner-friendly issues |
Sure. Are you familiar with Alive2? The first step is to determine the constraints on the truncation fold. |
Nope. Let me then start by getting familiar with Alive2. I'll spend today looking at that then, thanks! |
So as a first step, I will have to convert the C code into IR using the latest clang and version 18, and then compare the IR using Alive2? @RKSimon |
You can use the IR here: #123239 (comment) |
Cool thanks! |
$ALIVE2_HOME/alive2/build/alive-tv old.ll new.ll
Hey @RKSimon, these are the findings I got. Sorry for the delayed response. |
@souvik150 instcombine will fold to the canonical i64 IR - which is what alive2 has done for you above. https://alive2.llvm.org/ce/z/hU6D28 shows the reverse fold - but what we need to do is work out what bounds the addition and comparison constants need for us to perform the truncated add + comparison in a general case. |
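As a rough illustration of the kind of bound being asked about, here is one sufficient condition for the `ult` case only: if the low `shift` bits of both the add constant and the compare constant are zero, the add and compare can be done on the shifted value. This is an assumption for illustration, not necessarily the condition the eventual patch uses, and the other predicates need their own analysis. All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sufficient condition (ult only): both constants are multiples of 2^shift. */
static bool can_narrow_ult(uint64_t add_c, uint64_t cmp_c, unsigned shift) {
    if (shift == 0 || shift >= 64) {
        return false;
    }
    uint64_t low_mask = (UINT64_C(1) << shift) - 1;
    return (add_c & low_mask) == 0 && (cmp_c & low_mask) == 0;
}

/* Wide form: (x + add_c) <u cmp_c in 64 bits. */
static bool wide_ult(uint64_t x, uint64_t add_c, uint64_t cmp_c) {
    return x + add_c < cmp_c;
}

/* Narrow form: shift first, then add and compare in (64 - shift) bits. */
static bool narrow_ult(uint64_t x, uint64_t add_c, uint64_t cmp_c,
                       unsigned shift) {
    assert(can_narrow_ult(add_c, cmp_c, shift));
    uint64_t high_mask = UINT64_MAX >> shift;  /* keeps 64 - shift bits */
    uint64_t sum = ((x >> shift) + (add_c >> shift)) & high_mask;
    return sum < (cmp_c >> shift);
}
```

With the issue's constants, `narrow_ult(x, 14ULL << 48, 3ULL << 48, 48)` should match `wide_ult` for the same arguments. The intuition is that when the low bits of `add_c` are zero, the add never carries out of the low bits, and when the low bits of `cmp_c` are zero, the low bits of the sum cannot change the outcome of a strict less-than.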
Hi. Is it ok for me to pick this up, or is it still being worked on? @RKSimon @souvik150 |
ping @souvik150 |
…uncate (srl X, C2)), C1') (#126448) Addresses the poor codegen identified in #123239 and a few extra cases. This transformation is correct for `eq` (https://alive2.llvm.org/ce/z/qZhwtT), `ne` (https://alive2.llvm.org/ce/z/6gsmNz), `ult` (https://alive2.llvm.org/ce/z/xip_td) and `ugt` (https://alive2.llvm.org/ce/z/39XQkX). Fixes #123239
https://godbolt.org/z/xoKf6bnTb
The code:
with -O3 as of clang 19 (and still in trunk) compiles to:
whereas 18.0 did this, which is strictly better (i.e. is the exact same set of instructions, just in a different order and without movabs):