Put RNG in shared memory where beneficial #229
Draft
The shared memory (SM) hardware handles the RNG access pattern better than local memory (used when registers spill) or global memory. For most kernels, it is therefore beneficial to copy the RNG state into shared memory at the beginning of the kernel and write it back at the end if the track survives.
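The load/store pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ToyRng`, `Track`, `Step`, `TransportSketch`, `CountMismatches` and `kThreadsPerBlock` are hypothetical names, and the toy LCG stands in for the real engine, whose state is larger.

```cuda
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreadsPerBlock = 128; // assumed launch configuration

// Hypothetical toy generator (64-bit LCG); only a stand-in for the real RNG.
struct ToyRng {
  uint64_t state;
  __host__ __device__ double Uniform()
  {
    state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (state >> 11) * (1.0 / 9007199254740992.0); // 53 bits -> [0, 1)
  }
};

struct Track {
  ToyRng rng;
  bool alive;
};

// Placeholder for the physics: draws a few numbers, the last one decides survival.
__host__ __device__ inline bool Step(ToyRng &rng)
{
  double u = 0.0;
  for (int k = 0; k < 8; ++k) u = rng.Uniform();
  return u >= 0.25;
}

__global__ void TransportSketch(Track *tracks, int n)
{
  // One RNG slot per thread in shared memory; no synchronization is needed
  // because every thread only touches its own slot.
  __shared__ ToyRng rngShared[kThreadsPerBlock];
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= n) return;

  ToyRng &rng = rngShared[threadIdx.x];
  rng         = tracks[i].rng; // load once at the beginning of the kernel

  const bool alive = Step(rng); // all draws go through the shared-memory copy
  tracks[i].alive  = alive;
  if (alive) tracks[i].rng = rng; // store back only if the track survives
}

// Runs the kernel and compares against the same steps on the host.
// Returns the number of mismatching tracks, or -1 on a CUDA error.
int CountMismatches(int n)
{
  std::vector<Track> host(n), ref(n);
  for (int i = 0; i < n; ++i) {
    host[i].rng.state = 1234567ULL + 7919ULL * i;
    host[i].alive     = true;
    ref[i]            = host[i];
  }

  Track *dev = nullptr;
  const size_t bytes = n * sizeof(Track);
  if (cudaMalloc(&dev, bytes) != cudaSuccess) return -1;
  cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);
  const int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;
  TransportSketch<<<blocks, kThreadsPerBlock>>>(dev, n);
  const cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
  cudaFree(dev);
  if (err != cudaSuccess) return -1;

  int mismatches = 0;
  for (int i = 0; i < n; ++i) {
    ToyRng rng       = ref[i].rng;
    const bool alive = Step(rng);
    if (alive) ref[i].rng = rng; // dead tracks keep their original state
    if (alive != host[i].alive || ref[i].rng.state != host[i].rng.state) ++mismatches;
  }
  return mismatches;
}
```

Calling `CountMismatches(1000)` from a small `main` is one way to check that the shared-memory copy produces the same RNG states as the plain host loop. The static array costs `kThreadsPerBlock * sizeof(ToyRng)` bytes of shared memory per block, so a larger RNG state reduces what is left for other uses in the same kernel.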
Running
example19 -particles 10000 -batch 5000 -gdml_file ./examples/Example14/macros/testEm3.gdml
on the V100.
master (V100):
Mean: 4.00011
Stddev: 0.00030375
rng_sm (V100, only TransportElectrons):
Mean: 3.70078
Stddev: 0.00141616
rng_sm (V100, TransportElectrons and TransportGammas):
Mean: 3.94435
Stddev: 0.00170546
So putting the RNG in SM pays off only in TransportElectrons: also enabling it in TransportGammas raises the mean from 3.70078 back to 3.94435.
rng_sm (V100, TransportElectrons and electron interaction kernels):
Mean: 3.68766
Stddev: 0.000562445
rng_sm (V100, TransportElectrons and electron/gamma interaction kernels):
Mean: 3.65415
Stddev: 0.00167166
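The benefit in the interaction kernels is significantly smaller: relative to TransportElectrons alone (3.70078), the mean drops by only about 0.4% with the electron interaction kernels and about 1.3% with the electron and gamma interaction kernels, compared with about 7.5% from master to TransportElectrons alone. Furthermore, contention for shared memory increases, since SM is also needed to compact active tracks at the beginning of the interaction kernels.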
Opinions on the SM usage for the interaction kernels?
Depends on: