Open
Labels
A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
C-bug (Category: This is a bug.)
C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
Description
I compiled this code with -Copt-level=3 -Ctarget-cpu=raptorlake:
pub fn a0(x: [u8; 64]) -> bool {
    x == [0; 64]
}
pub fn b0(x: [u8; 64]) -> bool {
    x.iter().all(|&y| y == 0)
}
pub fn a1(x: [u8; 64]) -> bool {
    x == [255; 64]
}
pub fn b1(x: [u8; 64]) -> bool {
    x.iter().all(|&y| y == 255)
}
I expected all four functions to be vectorized. Instead, b0, a1, and b1 are vectorized, but a0 is not vectorized.
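For context, a0/b0 and a1/b1 are meant to be semantically equivalent pairs, so the difference reported here is purely one of code generation. A minimal sketch that checks this on a handful of inputs follows; the helper name pairs_agree is hypothetical and not part of the original report, and the sample inputs are an arbitrary choice rather than an exhaustive test.

```rust
// The four functions from the report, reproduced unchanged.
pub fn a0(x: [u8; 64]) -> bool {
    x == [0; 64]
}
pub fn b0(x: [u8; 64]) -> bool {
    x.iter().all(|&y| y == 0)
}
pub fn a1(x: [u8; 64]) -> bool {
    x == [255; 64]
}
pub fn b1(x: [u8; 64]) -> bool {
    x.iter().all(|&y| y == 255)
}

/// Hypothetical helper (not from the report): returns true if a0/b0 and
/// a1/b1 give the same answer on every sample input.
pub fn pairs_agree() -> bool {
    // Start with the two uniform arrays the functions compare against.
    let mut samples: Vec<[u8; 64]> = vec![[0; 64], [255; 64]];
    // Add arrays that differ from the uniform ones in exactly one byte.
    for i in 0..64 {
        let mut almost_zero = [0u8; 64];
        almost_zero[i] = 1;
        samples.push(almost_zero);

        let mut almost_ones = [255u8; 64];
        almost_ones[i] = 254;
        samples.push(almost_ones);
    }
    samples.iter().all(|&s| a0(s) == b0(s) && a1(s) == b1(s))
}
```

Calling pairs_agree() from a test or from main should return true, which would suggest that only the generated code differs between a0 and b0, not their results.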
Generated assembly
example::a0::h0c5a938daa1dd8cf:
mov rax, qword ptr [rdi + 24]
mov rcx, qword ptr [rdi]
mov rdx, qword ptr [rdi + 16]
or rdx, qword ptr [rdi + 48]
mov rsi, qword ptr [rdi + 8]
or rcx, qword ptr [rdi + 32]
or rcx, rdx
or rax, qword ptr [rdi + 56]
or rsi, qword ptr [rdi + 40]
or rsi, rax
or rsi, rcx
sete al
ret
example::b0::he89db7fcb6c0c0f1:
vmovdqu ymm0, ymmword ptr [rdi + 32]
vpor ymm0, ymm0, ymmword ptr [rdi]
vptest ymm0, ymm0
sete al
vzeroupper
ret
example::a1::h00de82c55f406ff7:
vmovdqu ymm0, ymmword ptr [rdi]
vpand ymm0, ymm0, ymmword ptr [rdi + 32]
vpcmpeqd ymm1, ymm1, ymm1
vptest ymm0, ymm1
setb al
vzeroupper
ret
example::b1::h06b133df359bb172:
vmovdqu ymm0, ymmword ptr [rdi + 32]
vpand ymm0, ymm0, ymmword ptr [rdi]
vpcmpeqd ymm1, ymm1, ymm1
vptest ymm0, ymm1
setb al
vzeroupper
ret
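To make the contrast easier to read, the scalar sequence emitted for a0 appears to OR together the eight 64-bit words of the array and test the result against zero, whereas b0 does the same with two 256-bit loads and a vptest. The sketch below is a rough Rust rendering of the scalar strategy only; the name a0_scalar_sketch is hypothetical, and the code illustrates what the assembly computes, not how rustc or LLVM arrives at it.

```rust
/// Hypothetical helper mirroring the scalar code shown for a0:
/// OR all eight 64-bit words of the array and compare the result to zero.
pub fn a0_scalar_sketch(x: [u8; 64]) -> bool {
    let mut acc = 0u64;
    for chunk in x.chunks_exact(8) {
        // Each chunk is exactly 8 bytes long, so copy_from_slice cannot panic.
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        // Byte order does not matter here: only "all bits zero" is tested.
        acc |= u64::from_ne_bytes(word);
    }
    acc == 0
}
```

This returns the same result as x == [0; 64]. The issue is that the same reduction could be done in two 256-bit operations on this target, as the b0 output shows.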
Meta
Rust 1.87.0 on Godbolt.
@rustbot labels +C-optimization