Low-latency ComponentArrays #302

visr opened this issue Apr 11, 2025 · 4 comments

@visr

visr commented Apr 11, 2025

Hi, I really enjoy using ComponentArrays. I used it as the state vector in OrdinaryDiffEq. One issue we ran into is that the sizes of our components change for each simulation, so a lot of code has to be recompiled each time. The issue can be summarized as follows:

julia> a = ComponentVector(; a=[1], b=[2,3])
julia> b = ComponentVector(; a=[1,2], b=[3])
julia> typeof(a) == typeof(b)
false

This is because the Axes keys and values are both in the type domain. I was recently discussing this on Slack with @MasonProtter, @SouthEndMusic and @ChrisRackauckas.

I made a little prototype struct, CArray, as a possible replacement for the current ComponentArray, and I would like some early feedback: would folks be interested in this, and are there flaws in this design?

I haven't focused on matching the API yet, but I have already made sure that range, integer and nested components all work. The hope is that this can be mostly compatible, though probably still breaking. Some quick, possibly flawed benchmarks show similar performance.

struct CArray{T, N, A<:DenseArray{T,N}, NT} <: DenseArray{T, N}
    data::A
    axes::NT
end

The full prototype is here: https://gist.github.com/visr/dde7ab3999591637451341e1c1166533
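
To make the idea concrete, here is a minimal, self-contained sketch of how a struct like this could behave. It is not the code from the gist: the name `SketchCArray`, the explicit outer constructor, and the `component` methods are assumptions made for illustration only.

```julia
# Hypothetical sketch, not the gist: the component layout is stored in a field
# (a runtime value) instead of being encoded in a type parameter.
struct SketchCArray{T, N, A<:DenseArray{T,N}, NT} <: DenseArray{T, N}
    data::A
    axes::NT
end

# Explicit outer constructor, since T and N only appear in the bound of A.
SketchCArray(data::A, axes::NT) where {T, N, A<:DenseArray{T,N}, NT} =
    SketchCArray{T, N, A, NT}(data, axes)

# Minimal array interface, forwarding to the wrapped data.
# getfield is used throughout because getproperty is overloaded below.
Base.size(x::SketchCArray) = size(getfield(x, :data))
Base.IndexStyle(::Type{<:SketchCArray}) = IndexLinear()
Base.getindex(x::SketchCArray, i::Int) = getfield(x, :data)[i]
Base.setindex!(x::SketchCArray, v, i::Int) = setindex!(getfield(x, :data), v, i)

# Range locations give views into the data, integer locations give scalars.
component(data, loc::AbstractUnitRange) = view(data, loc)
component(data, loc::Integer) = data[loc]

function Base.getproperty(x::SketchCArray, name::Symbol)
    loc = getproperty(getfield(x, :axes), name)
    return component(getfield(x, :data), loc)
end

a = SketchCArray([1.0, 2.0, 3.0], (; a = 1:1, b = 2:3))
b = SketchCArray([1.0, 2.0, 3.0], (; a = 1:2, b = 3:3))

same_type = typeof(a) == typeof(b)  # the ranges are values, not type parameters
b_of_a = a.b                        # view of elements 2:3 of the data
```

With this layout, only the component names and the kind of each location (range or integer) end up in the type, so two simulations with the same component names but different sizes can share compiled code.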

@jonniedie
Collaborator

Yeah, I think this is a good idea. I prototyped something like this a while back when I was first writing ComponentArrays and decided against it because microbenchmarks were slightly slower and I wanted this to truly be a zero-cost abstraction. As I've come to use it more, though, I've realized that that was a mistake and it's prevented both resizable inner arrays and, maybe even more importantly, type-stable construction when there are array components. If this prototype solves those problems and is only a little slower, you'd have my vote.

@ChrisRackauckas
Member

I think there's a not-so-difficult way to implement this. The core type is struct Axis{IdxMap} <: AbstractAxis{IdxMap} end, basically IdxMap is all of the information for the indexing. What you want to do is create a:

abstract type AbstractRuntimeAxis <: AbstractAxis{Nothing} end # Maybe? Or Maybe make new highest level abstract type

struct RuntimeAxis{T} <: AbstractRuntimeAxis
  idxmap::T
end

and once you have that, you just adjust:

@inline indexmap(::AbstractAxis{IdxMap}) where IdxMap = IdxMap

@inline indexmap(x::RuntimeAxis) = x.idxmap

Now RuntimeAxis should be able to be used where Axis was, but it's using the idxmap as runtime information instead of as new type information. Then we just need to figure out what constructor we want, RTComponentArray, or ComponentArray{false} or something, to choose to use a RuntimeAxis instead.
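
As a rough, standalone illustration of that dispatch (a sketch only: `AbstractAxis` and `Axis` below are simplified stand-ins defined here so the snippet runs without the package, not the package's actual definitions):

```julia
# Stand-ins for the package types, assumed for this sketch only.
abstract type AbstractAxis{IdxMap} end
struct Axis{IdxMap} <: AbstractAxis{IdxMap} end

abstract type AbstractRuntimeAxis <: AbstractAxis{Nothing} end
struct RuntimeAxis{T} <: AbstractRuntimeAxis
    idxmap::T
end

# Static axes read the index map from the type parameter,
# runtime axes read it from the field.
@inline indexmap(::AbstractAxis{IdxMap}) where {IdxMap} = IdxMap
@inline indexmap(x::RuntimeAxis) = x.idxmap

nt1 = (a = 1:1, b = 2:3)
nt2 = (a = 1:2, b = 3:3)

static1, static2 = Axis{nt1}(), Axis{nt2}()
rt1, rt2 = RuntimeAxis(nt1), RuntimeAxis(nt2)

static_same = typeof(static1) == typeof(static2)  # layouts differ, so types differ
runtime_same = typeof(rt1) == typeof(rt2)         # only the NamedTuple type is in the type
same_map = indexmap(static1) == indexmap(rt1)     # both expose the same index map
```

Code that only goes through `indexmap` sees the same information from either axis kind, which is why the rest of the package could, in principle, carry over.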

You'll want to handle a few other axis types in the same way, like the views, and then make sure runtime stuff makes runtime stuff, and then the whole package should basically just automatically carry over to this form since it seems to use indexmap everywhere, so it's generic to the axis type that is used already (because it needs to for the view axis stuff to work).

You can even then make a version where that idxmap is vectors instead of tuples if you want the index map to be completely mutable and completely runtime. That would incur more overhead than the named tuple version. The NT version is probably the most useful case here, but it's something to consider as it would be the zero compile time specialization case.
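
A fully runtime index map along those lines might look roughly like the following. This is a sketch under assumptions: `VectorIndexMap` and `lookup` are hypothetical names, not part of ComponentArrays.

```julia
# Hypothetical fully-runtime index map: names and ranges live in vectors,
# so neither the number of components nor their sizes appear in the type.
struct VectorIndexMap
    names::Vector{Symbol}
    ranges::Vector{UnitRange{Int}}
end

# Linear search by name; this runtime lookup is the extra overhead
# compared with a NamedTuple-based map.
function lookup(m::VectorIndexMap, name::Symbol)
    i = findfirst(==(name), m.names)
    i === nothing && throw(KeyError(name))
    return m.ranges[i]
end

m1 = VectorIndexMap([:a, :b], [1:1, 2:3])
m2 = VectorIndexMap([:a, :b, :c], [1:2, 3:3, 4:6])

one_type = typeof(m1) == typeof(m2)  # a single concrete type for every layout

# The map can also grow in place.
push!(m1.names, :c)
push!(m1.ranges, 4:5)
c_range = lookup(m1, :c)
```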

@visr
Author

visr commented Apr 25, 2025

Yeah, that would probably be the easiest way to get this in as an additional opt-in feature.

If I suppress all output printing to avoid a StackOverflowError, I can construct a ComponentArray that doesn't include the axis values in the type:

ax = RuntimeAxis((; a = 1:1, b = 2:3))
ca = ComponentArray([1.0, 2.0, 3.0], ax);
typeof(ca)
# ComponentArrays.LazyArray{Float64, 1, Base.Generator{Vector{Float64}, ComponentArrays.var"#18#19"{Tuple{RuntimeAxis{@NamedTuple{a::UnitRange{Int64}, b::UnitRange{Int64}}}}}}}

I cannot really get any components out yet, everything gets wrapped in ComponentArrays.LazyArray.

My feeling is that if we switched to something like the gist, we could significantly simplify the ComponentArrays code, at the cost of some breaking changes. Though I don't know enough about the story behind ShapedAxis, PartitionedAxis, ViewAxis, and CombinedAxis to tell whether all of that is really needed.

Note by the way that in the prototype I don't restrict the axis to a particular type, so in this version anything that supports this would work:

loc = getproperty(axes, name)
component(data, loc)

@ChrisRackauckas
Member

> My feeling is that if we switched to something like the gist, we could significantly simplify the ComponentArrays code, at the cost of some breaking changes. Though I don't know enough about the story behind ShapedAxis, PartitionedAxis, ViewAxis, and CombinedAxis to tell whether all of that is really needed.

I think that stuff is pretty needed for all of the reshape and odd cases to work out. There's a lot of tests though so if you could get them to pass then good on you. But I think that's fighting a devil: it was added for a reason and it's complex for a reason so I'd just go with it.
