Selecting batch type to work with given a register size constraint? #1085

Closed
Andersama opened this issue Feb 6, 2025 · 3 comments

@Andersama

Andersama commented Feb 6, 2025

I'm seeing a lot of code examples that rely on the architecture being detected automatically or set at compile time. Is there a way to hint or specify the register size I expect the API to use instead?

This mainly concerns the swizzle operations. I have an algorithm that only needs 128-bit-wide registers.

//so it seems a bit strange to do:
xsimd::batch<uint8_t, xsimd::ssse3> mask;
//to potentially make use of
template <class A>
XSIMD_INLINE batch<uint8_t, A> swizzle(batch<uint8_t, A> const& self, batch<uint8_t, A> mask, requires_arch<ssse3>) noexcept
{
    return _mm_shuffle_epi8(self, mask);
}
//when I'd also want to make sure that this algorithm works on other architectures

I'm imagining something like this would make sense?

xsimd::batch<uint8_t, xsimd::register_width_128> mask;

Let me know if I'm missing something in the library that'd let me do this.

@serge-sans-paille
Contributor

Looks like you'd be interested in xsimd::make_sized_batch (rtd: https://xsimd.readthedocs.io/en/latest/api/xsimd_batch.html#_CPPv4I0_NSt6size_tEEN5xsimd16make_sized_batchE )
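
One detail worth spelling out: the size parameter of make_sized_batch counts elements, not bits, so a 128-bit register of uint8_t corresponds to N = 16, i.e. xsimd::make_sized_batch_t<uint8_t, 16>. Per the linked documentation, the resulting type is void when no available architecture has a matching native vector. The sketch below is not xsimd code. It is a self-contained illustration of how a size-based lookup of this kind can be written, and every name in it (arch128, arch256, arch512, batch, sized_batch, sized_batch_t, byte_batch_128) is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical architecture tags: each one only records a register width in bits.
struct arch128 { static constexpr std::size_t bits = 128; };
struct arch256 { static constexpr std::size_t bits = 256; };
struct arch512 { static constexpr std::size_t bits = 512; };

// Hypothetical stand-in for a SIMD batch of T on architecture A.
// `size` is the number of T elements that fit in one register.
template <class T, class A>
struct batch {
    using value_type = T;
    using arch_type = A;
    static constexpr std::size_t size = A::bits / (8 * sizeof(T));
};

// Base case: no architecture left to try, so the result is void.
template <class T, std::size_t N, class... Archs>
struct sized_batch { using type = void; };

// Recursive case: take A if batch<T, A> holds exactly N elements,
// otherwise keep searching through the remaining architectures.
template <class T, std::size_t N, class A, class... Rest>
struct sized_batch<T, N, A, Rest...> {
    using type = std::conditional_t<
        batch<T, A>::size == N,
        batch<T, A>,
        typename sized_batch<T, N, Rest...>::type>;
};

// Convenience alias that searches a fixed (assumed) list of architectures.
template <class T, std::size_t N>
using sized_batch_t =
    typename sized_batch<T, N, arch512, arch256, arch128>::type;

// 16 bytes fill exactly one 128-bit register.
using byte_batch_128 = sized_batch_t<std::uint8_t, 16>;
```

With a lookup like this, the algorithm is written against the lane count it needs (16 bytes) rather than against a named instruction set, and a static_assert that the selected type is not void catches targets with no suitable register at compile time.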

@Andersama
Author

Thanks, that appears to be exactly what I needed.

@serge-sans-paille
Contributor

Cool, closing the issue then.
