Inefficient codegen for loop with known small trip count #80298

Open
dzaima opened this issue Feb 1, 2024 · 3 comments
dzaima commented Feb 1, 2024

The code

#include<stdint.h>
void f(char* a, uint8_t l) {
  for (int i = 0; i < l; i++) a[i]++;
}

compiled with -O3 -march=haswell (or -O3 -mavx2 or similar; compiler explorer), generates unrolled xmm operations, even though ymm ones can trivially replace them.

Same thing happens with an assume of length:

void f(char* a, int l) {
  __builtin_assume(l < 256);
  for (int i = 0; i < l; i++) a[i]++;
}
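
For illustration, here is a minimal sketch of the 32-bytes-per-step form that the "ymm" remark above seems to have in mind. It is not the compiler's output and not a proposed fix. It uses the GCC/Clang `vector_size` extension (not standard C), and the names `v32u8` and `f_wide` are hypothetical. With -mavx2 each step would be expected to map onto 256-bit operations, though that is an assumption and is not verified here.

```c
#include <stdint.h>
#include <string.h>

/* 32 one-byte lanes; a GCC/Clang vector extension, not standard C. */
typedef uint8_t v32u8 __attribute__((vector_size(32)));

/* Reference: the loop from the report. */
void f(char *a, uint8_t l) {
  for (int i = 0; i < l; i++) a[i]++;
}

/* Hypothetical hand-widened variant: 32 bytes per step, then a scalar tail. */
void f_wide(char *a, uint8_t l) {
  int i = 0;
  for (; i + 32 <= l; i += 32) {
    v32u8 v;
    memcpy(&v, a + i, sizeof v); /* unaligned load of 32 bytes */
    v += 1;                      /* lane-wise increment, wraps modulo 256 */
    memcpy(a + i, &v, sizeof v); /* unaligned store of 32 bytes */
  }
  for (; i < l; i++) a[i]++;     /* remaining 0..31 bytes */
}
```

Since `l` is at most 255, the wide loop runs at most seven times, which is why a short, fully unrolled 256-bit sequence looks feasible for this function.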
@llvmbot
Member

llvmbot commented Feb 1, 2024

@llvm/issue-subscribers-backend-x86

Author: dzaima (dzaima)

The code
#include<stdint.h>
void f(char* a, uint8_t l) {
  for (int i = 0; i < l; i++) a[i]++;
}

compiled with -O3 -march=haswell (or -O3 -mavx2 or similar; compiler explorer), generates unrolled xmm operations, even though ymm ones can trivially replace them.

Same thing happens with an assume of length:

void f(char* a, int l) {
  __builtin_assume(l < 256);
  for (int i = 0; i < l; i++) a[i]++;
}

@miguelraz
Contributor

@RKSimon I'd like to be assigned this issue and try my hand at it.

@RKSimon
Collaborator

RKSimon commented Apr 5, 2024

> @RKSimon I'd like to be assigned this issue and try my hand at it.

Go for it! @fhahn might be able to give some tips regarding any loop vectorizer related issue.
