Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUDA support #64

Open
roflmaostc opened this issue Nov 22, 2022 · 2 comments
Open

CUDA support #64

roflmaostc opened this issue Nov 22, 2022 · 2 comments

Comments

@roflmaostc
Copy link
Contributor

Hi,

how can we add CUDA support for this?

julia> ShiftedArrays.fftshift(CUDA.rand(2,2))
2×2 CircShiftedArray{Float32, 2, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}:
┌ Warning: Performing scalar indexing on task Task (runnable) @0x00007f8167e4e6e0.
│ Invocation of getindex resulted in scalar indexing of a GPU array.
│ This is typically caused by calling an iterating implementation of a method.
│ Such implementations *do not* execute on the GPU, but very slowly on the CPU,
│ and therefore are only permitted from the REPL for prototyping purposes.
│ If you did intend to index this array, annotate the caller with @allowscalar.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:90
 0.332543  0.906493
 0.51592   0.40837

Best,

Felix

@piever
Copy link
Collaborator

piever commented Nov 30, 2022

I think it's just a problem with displaying things. The actual fftshift method shouldn't call getindex. Might be worth to inquire over at CUDA.jl as to whether there's a way for array wrappers to avoid this warning when displaying the array.

@RainerHeintzmann
Copy link
Contributor

Overloading the display can be done. However the real problem is that all broadcasting operations fall back on get_index() calls. This causes CUDA.jl, depending on the settings to either fail or be really slow.
A sensible solution would be to implement broadcasting rules for ShiftedArrays. Ideally these rules would be able to broadcast between arrays of the same shifts, of different shifts and between other AbstractArrays. For a circshifted array one probably needs to calculate intersection points and split the arrays into seperate parts. What do you think? Who knows how to implement proper broadcasting rules?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants