Store new type components as SIMD vectors #136

tsoutsman · 2021-10-11T08:58:55Z

Edit: I've made a few edits to this issue that I will append to the end. They negate some of what I say, so it might be better to read them first.

Hi, I'd like to preface this issue by saying that this suggestion is pretty out there and probably won't be implemented. I'm writing this more just to put the idea out there. I'm not particularly familiar with SIMD, but I've been reading about it, and it kinda seems like "free" speed.

The idea is to give the option to store new types (i.e. structs with one field, e.g. Health(u32)) as Simd objects in the SparseMap. For SparseMaps with this option enabled, the data array would have the type signature: Vec<Simd<[T; N]>> where T is the type of the inner field of the new type (e.g. u32) and N is configurable. Alternatively, the data could be stored as a Vec<T> and then, when needed, we could transmute it into a Vec<Simd<[T; N]>> (I think). With this approach, there should be some new type around Vec that ensures that its length is always a multiple of N. Although this requires unsafe transmutes, I think it is the better option.

The option could be made an attribute of the Component derive macro. For example,

#[derive(Component)]
#[simd(u32x8)] // <- Specifies which SIMD vector to use
struct Health(pub u32);

Adding the attribute would lead to some additional trait being derived. The additional trait would look something like this:

trait NewType {
	// We need to constrain the SIMD Vector somehow to contain only `Ts`, but its items are an
	// associated type is not generic, so I'm not sure how to. This could potentially be done in the
	// Derive macro itself. At the end of the day, if someone is trying to store a u32 in an f64x8, its
	// kinda on them.
	type Vector: packed_simd::SimdVector;
	type Inner;
	
	fn value(&self) -> Self::Inner;
	fn set(&mut self, v: Self::Inner);
	// You might also need a function to get a mutable reference to the inner value.
}

Health would be stored as a Vec<u32> (or Vec<Simd<[u32; 8]>>) and iterating over View<Health> would yield Simd<[u32; 8]>. The derived NewType trait would look like:

impl NewType<u32> for Health {
	type Vector: simd_packed::u32x8;
	type Inner: u32;
	
	fn value(&self) -> Self::Inner {
		self.0
	}
	
	fn set(&mut self, v: Self::Inner) {
		self.0 = v;
	}
}

Good Stuff

Fast.

Bad Stuff

This would obviously be a complete nightmare to implement.
Iterating over SIMD and non-SIMD components.
I think it would be best not to allow (if that is even possible. You would need some way to differentiate between SIMD and non-SIMD components by the View). There could be a method that you can call on a View that converts all the SIMDs into scalars. For example:
```
for (_, _) in (simd_component.scalar(), non_simd_component) {}
```
From<Self::Inner> would need to be implemented for the component, e.g. From<u32> for Health(u32) to do this.
The SparseMap stores a different type to the component it represents. This might not be that big an issue because AllStorages.storages is a hashmap where the key is the component type, so I don't think the SparseMap's type is that important.
Data alignment. If there are nine components that we are storing in a Simd<[u32; 8]>, then we would need two of them and just pad the second one with 7 bits. This could get quite confusing, especially when iterating over components, as the user wouldn't know if there is actually a zero in that location or if it's padding (Although I don't think the user should really care, as the changes to the padding would just be discarded). Also, we might not be able to just pad it with 0s, as what if the user wants to divide? That would cause a panic.
Methods on the new types. I guess the burden could be on the user to manually convert it to the new type, call the method and turn it back into a Simd.
Removing components. If the components are stored as Vec<Simd<[T; N]>>, this would probably be very expensive, but if they are stored as a Vec<T>, it should be fairly simple.

If this ever gets implemented, it could potentially be used to store new types with more than one field. The SparseMap would contain multiple data arrays. For example, the component:

struct Position(u32, u64);

could be stored as:

data: (
	[simd_packed::u32x2::new(3u32, 4u32)], // <- First field
	[simd_packed::u32x2::new(3u64, 4u64)], // <- Second field
)

The longer I write, the more I realise this is a terrible idea, so imma stop. There are probably a ton of things that I've missed that would further complicate the implementation.

I'd be happy to help in any way I can; however, this seems like the kind of thing which requires extensive knowledge of Rust generics, traits, and macros. I'm reasonably comfortable with these concepts, but I've only been using Rust for about seven months, so take that as you will.

Something to note is this won't affect the performance or ergonomics of the library if the user doesn't opt-in.

Edit:
According to this post (re 5) it could be possible to just store the new types themselves as long as they are #repr[(transparent)] and properly aligned.

Edit 2:
The following code works:

use packed_simd_2::u8x4;

#[repr(transparent)]
#[derive(Clone, PartialEq, Eq, Debug)]
struct NewType(pub u8);

fn main() {
    let x = vec![NewType(0); 12];
    let y = unsafe { x.align_to::<u8x4>() };
    assert_eq!(
        y,
        (&[][..], vec![u8x4::new(0, 0, 0, 0); 3].as_ref(), &[][..])
    );

    let x = vec![NewType(0); 11];
    let y = unsafe { x.align_to::<u8x4>() };
    assert_eq!(
        y,
        (
            &[][..],
            vec![u8x4::new(0, 0, 0, 0); 2].as_ref(),
            vec![NewType(0); 3].as_ref()
        )
    );
}

but from the slice docs:

The method may make the middle slice the greatest length possible for a given type and input slice, but only your algorithm’s performance should depend on that, not its correctness.

Potentially after aligning we could take the first and last slices and feed them into a slower function that manually creates the u8x4 (or whatever vector is being used). Order doesn't matter, (unless EntityIds are involved?).

The text was updated successfully, but these errors were encountered:

tsoutsman · 2021-10-12T00:15:26Z

I've kinda been thinking about this a bit more, and since the component can just be stored as normal in the data array, there could potentially be a simd method for iteration, like so:

#[derive(Component)]
#[simd(u32x8)]
struct Health(pub u32);

fn system(healths: ViewMut<Health>) {
    for health in healths {
        // Health is just a normal component.
    }
    for health in healths.simd() {
        // Health is a u8x4
    }
}

If the user does not call the simd method then the #[simd()] attribute has no effect.

However, there is another issue I didn't mention. Currently, iterators return a reference to the object, but if we were to pack it into a SIMD vector, it would have to be the actual object, not just a reference.

leudz · 2021-10-12T02:17:39Z

Hi, I'm no expert in SIMD but I was also tempted to use it in the past. So shipyard can already work with SIMD to some extent.

Before diving into the code, note that there is another way to store Position (for example): f32x4 ([x, y, z, 1.0]).
This way you don't need a different storage and basically the only bad thing is the wasted memory.

So as you noticed, trying to use SIMD + ECS is not that easy.
If you store the components as SIMD values you'll have many issues going from add/remove to taking a reference to a single element.
If you store them as regular types you need to think about alignment and find a way to get components that are next to each other in memory.
But it's (far) easier to solve the issues of the latter solution.

What shipyard can do: single storage

When you iterate a single storage it's guaranteed that the components are next to each other.
Then you need to be able to get some number of consecutive components. Good news, there is a method for that: into_chunk_exact.
This method will give you X components at a time instead of 1.
If you want them to be correctly aligned, you can process the first components one by one then switch to a chunk iterator when you reach the correct SIMD alignment.
There is a simple example of SIMD iteration here, V3.

What shipyard cannot do (anymore/yet): multiple storages

As you know with multiple SparseMap the components could be next to each other in memory but are very likely not.
You can technically still do SIMD stuff using gather-scatter but if you just want to do a few SIMD operations it will likely be (a lot) slower than not using SIMD.

To me the "best" solution here is what I call "packing". If you read Skypjack's blog it's called "grouping".
I named it packing at first because it was supposed to describe many ways to keep a storage organized.
In the earliest versions, component insertion and modification tracking was implemented by swapping the components inside the storage.

There was also "tight"/"loose"/"free" packing ("free" was never implemented).
All three are ways to keep multiple storages "sorted by entities" basically. Skypjack explains it better than I will, if you haven't read it yet, here's the link.

"tight" and "loose" were removed in version 0.5 (I think). But the issue was not the feature but the implementation.
These two packs had non negligible cost even on non packed storages.
Plus they were very restrictive. Once a storage was tightly packed you couldn't tightly pack it again.
For loose there was main and secondary storages. A main storage could not be packed twice.
As you imagine this can cause issues when you have a component used often (like Position) that you want to pack multiple times.

Later Skypjack talked about a better way, nested grouping. You could tightly pack as many storages as you want as long as the later pack contains the types of the pack before. Example: [A, B] and [A, B, C] and [A, B, C, D, E].
This is interesting but I never implemented it, sparsey did however. I don't think they have "loose" or "free" but their grouping can be nested.

So what's the future for shipyard? Implementing nested groups?
No, at least it's not what I want, there is an experiment that should work but might not. If it doesn't work then yes I guess nested groups.

This new packing implementation should have very low cost on non packed storages, no limit on what can be packed and better add/remove scaling.
I "just" need the time to implement it and see if it works. 😅

So back to SIMD, if you have this packing in place then you can do the same as with a single storage. You'll be able to create a chunk iterator and use that.
I believe it would be fairly easy to add simd4/simd8/... methods on iterators to remove the boilerplate.
It would be an extension trait but could be incorporated once SIMD finally stabilize.

With all that said the tiny example and probably the extension trait should be part of a recipe (one day ^^).

I hope this was at least a bit helpful.

tsoutsman · 2021-10-12T03:02:22Z

Thanks, that was very helpful. I somehow completely forgot that if you iterate over multiple SparseMaps, the components are not next to each other. And with the into_chunk_exact method it seems like all I wanted is already basically implemented.

tsoutsman closed this as completed Oct 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Store new type components as SIMD vectors #136

Store new type components as SIMD vectors #136

tsoutsman commented Oct 11, 2021 •

edited

Loading

tsoutsman commented Oct 12, 2021

leudz commented Oct 12, 2021

tsoutsman commented Oct 12, 2021

Store new type components as SIMD vectors #136

Store new type components as SIMD vectors #136

Comments

tsoutsman commented Oct 11, 2021 • edited Loading

Good Stuff

Bad Stuff

tsoutsman commented Oct 12, 2021

leudz commented Oct 12, 2021

What shipyard can do: single storage

What shipyard cannot do (anymore/yet): multiple storages

tsoutsman commented Oct 12, 2021

tsoutsman commented Oct 11, 2021 •

edited

Loading