Conversation
RAJA::loop<threads_x>(ctx, RAJA::RangeSegment(0, 1), [&](int c) {
// __once_loop_start
// Use a single logical thread per team for shared initialization.
RAJA::loop<threads_x>(ctx, RAJA::once(), [&](int c) {
Does it make more sense to have a once policy?
and have loop not take an iteration space?
Sure, is that consistent with the use case?
Or have a non-loop function like loop?
RAJA::non_loop(ctx, [&]() {...});
I missed that you only want one thread doing this so you really do need the proper policy. Do you need index c?
The less to maintain, the better. With the range PR you could just have the range be range(1), which is pretty short.
What's holding me back is that range(1) is kind of a trick; something more explicit would be nice. @tomstitt, what do you think?
The range(1) on a worksharing loop would give you the equivalent of single, one thread runs it but all other threads wait until it's done. If that's what this is then that would be a reasonable way to implement it or have people do it, but if you want it not to wait, then I don't think that would get you there, at least not without an appropriate policy or something to request a non-waiting behavior. It's also only sort-of a loop right? Doing masked might be reasonable, have it default to a mask of 0 but allow for an argument that would let you either say how many or which threads should execute it?
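For reference, the waiting and non-waiting behaviors described above can be sketched in plain OpenMP. This is only an illustration of `single` with and without `nowait`, not RAJA code; `SingleResult` and `run_single_demo` are hypothetical names made up for the example.

```cpp
// Sketch only: compile with OpenMP enabled (e.g. -fopenmp). Without it the
// pragmas are ignored and the function runs on a single thread.

// Hypothetical result type for this example.
struct SingleResult {
  int init_calls;   // times the blocking `single` body ran
  int stale_reads;  // threads that did not see the initialized value
  int log_calls;    // times the `single nowait` body ran
};

SingleResult run_single_demo() {
  int shared_value = 0;
  int init_calls = 0;
  int stale_reads = 0;
  int log_calls = 0;

  #pragma omp parallel
  {
    // Blocking form: one thread runs the body, and every other thread waits
    // at the implicit barrier at the end of the construct.
    #pragma omp single
    {
      shared_value = 7;
      ++init_calls;
    }

    // After the barrier, every thread can safely read shared_value.
    #pragma omp atomic
    stale_reads += (shared_value != 7) ? 1 : 0;

    // Non-waiting form: one thread runs the body, the others skip ahead
    // without waiting, so nothing after it may depend on the body's results.
    #pragma omp single nowait
    {
      ++log_calls;
    }
  }

  return SingleResult{init_calls, stale_reads, log_calls};
}
```

Each body runs exactly once per parallel region in both forms; the difference is only whether the remaining threads wait for it to finish.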
I like having a wrapper around range(1), we do want something that we can range(1) for any (combination) of threads in x, y, z, not sure if that helps or hurts ideas. The main use cases I can think of are "once across x,y,z" and "once in z" where we nest some x & y work under that, including a "once across x,y"
I like the wrapper too, in reading the code it makes it easier to identify what is going on.
As a stylistic note, we do this in OpenMP with It might be worth using an alternate name to disambiguate it from that "once" behavior, possibly also to indicate the blocking or non-blocking behavior.
Summary
This PR adds the RAJA::once function, which simplifies masking out threads for an operation by returning a RAJA::RangeSegment(0, 1). This comes up when we want only one thread to perform a certain operation in a GPU kernel.
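The masking idea can be sketched without RAJA as follows. This is a minimal mock under stated assumptions, not the implementation in this PR: `Range`, `once`, `team_loop`, and `run_team` are hypothetical stand-ins, and the threads of a team are emulated with a serial loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for RAJA::RangeSegment: a half-open range [begin, end).
struct Range {
  int begin;
  int end;
};

// Hypothetical stand-in for the once helper: a range holding only index 0.
inline Range once() { return Range{0, 1}; }

// Emulates a thread-level loop over a team of `num_threads` threads.
// Each emulated thread checks whether its own index falls inside the range,
// so a one-element range masks out every thread except thread 0.
template <typename Body>
void team_loop(int num_threads, Range r, Body&& body) {
  for (int tid = 0; tid < num_threads; ++tid) {
    const int idx = r.begin + tid;
    if (idx < r.end) {
      body(idx);
    }
  }
}

// Runs shared initialization once, then per-thread work on every thread.
// Returns how many times the initialization body executed.
int run_team(int num_threads, std::vector<int>& out) {
  assert(static_cast<int>(out.size()) >= num_threads);

  int shared_value = 0;
  int init_calls = 0;

  // Only the thread with index 0 performs the shared initialization.
  team_loop(num_threads, once(), [&](int) {
    shared_value = 42;
    ++init_calls;
  });

  // Every thread then uses the initialized value.
  team_loop(num_threads, Range{0, num_threads}, [&](int i) {
    out[i] = shared_value + i;
  });

  return init_calls;
}
```

With a team of four emulated threads, the initialization body runs once and all four threads read the value it set. On a real GPU back end the threads run concurrently, so a team-level synchronization would be needed between the two loops; the serial emulation here hides that requirement.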