WIP: Fall back to Kokkos::sort #1207

Open: 13 commits to merge into base: develop
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@
- [[PR 1161]](https://github.com/parthenon-hpc-lab/parthenon/pull/1161) Make flux field Metadata accessible, add Metadata::CellMemAligned flag, small performance upgrades

### Changed (changing behavior/API/variables/...)
- [[PR 1206]](https://github.com/parthenon-hpc-lab/parthenon/pull/1206) Leapfrog fix
- [[PR1203]](https://github.com/parthenon-hpc-lab/parthenon/pull/1203) Pin Ubuntu CI image
- [[PR1177]](https://github.com/parthenon-hpc-lab/parthenon/pull/1177) Make mesh-level boundary conditions usable without the "user" flag
- [[PR 1187]](https://github.com/parthenon-hpc-lab/parthenon/pull/1187) Make DataCollection::Add safer and generalize MeshBlockData::Initialize
5 changes: 5 additions & 0 deletions src/interface/swarm_comms.cpp
@@ -269,6 +269,11 @@ void Swarm::LoadBuffers_() {

     // Remove particles that were loaded to send to another block from this block
     RemoveMarkedParticles();
+  } else {
+    for (int n = 0; n < pmb->neighbors.size(); n++) {
+      const int bufid = pmb->neighbors[n].bufid;
+      vbswarm->send_size[bufid] = 0;
+    }
   }
 }
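
Note on the hunk above: the new `else` branch appears intended to leave every neighbor buffer with a send size of zero whenever the first branch is skipped, so that no stale size from an earlier call survives. A minimal, self-contained sketch of that reset follows. `Neighbor`, `ClearSendSizes`, and the plain `std::vector` storage are hypothetical stand-ins for illustration, not Parthenon types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a neighbor entry; only the buffer id matters here.
struct Neighbor {
  int bufid;
};

// Zero the send size of every buffer that belongs to a neighbor.
// Entries of send_size that no neighbor refers to are left untouched.
void ClearSendSizes(const std::vector<Neighbor> &neighbors,
                    std::vector<int> &send_size) {
  for (std::size_t n = 0; n < neighbors.size(); n++) {
    const int bufid = neighbors[n].bufid;
    // Guard against a buffer id outside the send_size array.
    assert(bufid >= 0 && static_cast<std::size_t>(bufid) < send_size.size());
    send_size[static_cast<std::size_t>(bufid)] = 0;
  }
}
```

The sketch iterates with `std::size_t` so the loop counter and `neighbors.size()` have the same signedness; the loop in the hunk compares an `int` against an unsigned size.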

40 changes: 6 additions & 34 deletions src/utils/sort.hpp
@@ -61,55 +61,27 @@ void sort(ParArray1D<Key> data, KeyComparator comparator, size_t min_idx,
           size_t max_idx) {
   PARTHENON_DEBUG_REQUIRE(min_idx < data.extent(0), "Invalid minimum sort index!");
   PARTHENON_DEBUG_REQUIRE(max_idx < data.extent(0), "Invalid maximum sort index!");
-#ifdef KOKKOS_ENABLE_CUDA
-#ifdef __clang__
-  PARTHENON_FAIL("sort is using thrust and there exists an incompatibility with clang, "
-                 "see https://github.com/lanl/parthenon/issues/647 for more details. We "
-                 "won't fix it because eventually the Parthenon sort should make use of "
-                 "Kokkos::sort once a performant implementation is availabe. If you see "
-                 "this message and need sort on CUDA devices with clang compiler please "
-                 "get in touch by opening an issue on the Parthenon GitHub repo.");
-#else
+#if defined(KOKKOS_ENABLE_CUDA) && !defined(__clang__)
   thrust::device_ptr<Key> first_d = thrust::device_pointer_cast(data.data()) + min_idx;
   thrust::device_ptr<Key> last_d = thrust::device_pointer_cast(data.data()) + max_idx + 1;
   thrust::sort(first_d, last_d, comparator);
-#endif
 #else
-  if (std::is_same<DevExecSpace, HostExecSpace>::value) {
-    std::sort(data.data() + min_idx, data.data() + max_idx + 1, comparator);
-  } else {
-    PARTHENON_FAIL("sort is not supported outside of CPU or NVIDIA GPU. If you need sort "
-                   "support on other devices, e.g., AMD or Intel GPUs, please get in "
-                   "touch by opening an issue on the Parthenon GitHub.");
-  }
+  auto sub_data = Kokkos::subview(data, std::make_pair(min_idx, max_idx + 1));
+  Kokkos::sort(sub_data, comparator);
 #endif // KOKKOS_ENABLE_CUDA
 }
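
Note on the index convention: both branches treat `min_idx` and `max_idx` as inclusive bounds, which is why the Thrust end pointer and the `Kokkos::subview` range both use `max_idx + 1` as their exclusive end. The sketch below reproduces that convention on the host with `std::sort` only. `sort_inclusive` and the `std::vector` container are assumptions made for illustration and are not part of Parthenon or Kokkos.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Host-only stand-in for the sub-range sort in this hunk; not Parthenon code.
// Sorts data[min_idx..max_idx], both ends inclusive, and leaves the rest as is.
template <class Key, class KeyComparator>
void sort_inclusive(std::vector<Key> &data, KeyComparator comparator,
                    std::size_t min_idx, std::size_t max_idx) {
  // Mirror the two bounds checks performed before sorting.
  assert(min_idx < data.size() && "Invalid minimum sort index!");
  assert(max_idx < data.size() && "Invalid maximum sort index!");
  // Inclusive [min_idx, max_idx] maps to half-open [min_idx, max_idx + 1).
  const auto first = data.begin() + static_cast<std::ptrdiff_t>(min_idx);
  const auto last = data.begin() + static_cast<std::ptrdiff_t>(max_idx) + 1;
  std::sort(first, last, comparator);
}
```

For instance, calling `sort_inclusive(v, std::less<int>(), 1, 4)` on `{9, 5, 3, 4, 1, 0}` reorders only positions 1 through 4 and leaves the first and last elements in place.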

 template <class Key>
 void sort(ParArray1D<Key> data, size_t min_idx, size_t max_idx) {
   PARTHENON_DEBUG_REQUIRE(min_idx < data.extent(0), "Invalid minimum sort index!");
   PARTHENON_DEBUG_REQUIRE(max_idx < data.extent(0), "Invalid maximum sort index!");
-#ifdef KOKKOS_ENABLE_CUDA
-#ifdef __clang__
-  PARTHENON_FAIL("sort is using thrust and there exists an incompatibility with clang, "
-                 "see https://github.com/lanl/parthenon/issues/647 for more details. We "
-                 "won't fix it because eventually the Parthenon sort should make use of "
-                 "Kokkos::sort once a performant implementation is availabe. If you see "
-                 "this message and need sort on CUDA devices with clang compiler please "
-                 "get in touch by opening an issue on the Parthenon GitHub repo.");
-#else
+#if defined(KOKKOS_ENABLE_CUDA) && !defined(__clang__)
   thrust::device_ptr<Key> first_d = thrust::device_pointer_cast(data.data()) + min_idx;
   thrust::device_ptr<Key> last_d = thrust::device_pointer_cast(data.data()) + max_idx + 1;
   thrust::sort(first_d, last_d);
-#endif
 #else
-  if (std::is_same<DevExecSpace, HostExecSpace>::value) {
-    std::sort(data.data() + min_idx, data.data() + max_idx + 1);
-  } else {
-    PARTHENON_FAIL("sort is not supported outside of CPU or NVIDIA GPU. If you need sort "
-                   "support on other devices, e.g., AMD or Intel GPUs, please get in "
-                   "touch by opening an issue on the Parthenon GitHub.");
-  }
+  auto sub_data = Kokkos::subview(data, std::make_pair(min_idx, max_idx + 1));
+  Kokkos::sort(sub_data);
 #endif // KOKKOS_ENABLE_CUDA
 }
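
Note on the dispatch: the nested `#ifdef` blocks are collapsed into one condition, so Thrust is used only for CUDA builds with a non-clang compiler and every other configuration takes the `Kokkos::sort` path instead of calling `PARTHENON_FAIL`. The sketch below models that condition as a `constexpr` function so the four cases can be checked at compile time. `SortBackend` and `SelectBackend` are hypothetical names for illustration; the actual selection in the diff is done by the preprocessor.

```cpp
#include <cassert>

// Hypothetical labels for the two code paths left after this change.
enum class SortBackend { Thrust, KokkosSort };

// Models `#if defined(KOKKOS_ENABLE_CUDA) && !defined(__clang__)`:
// Thrust only for CUDA builds without clang, Kokkos::sort otherwise.
constexpr SortBackend SelectBackend(bool cuda_enabled, bool is_clang) {
  return (cuda_enabled && !is_clang) ? SortBackend::Thrust
                                     : SortBackend::KokkosSort;
}

// CUDA with a non-clang compiler keeps the Thrust path.
static_assert(SelectBackend(true, false) == SortBackend::Thrust, "");
// CUDA with clang, previously a hard failure, now falls back to Kokkos::sort.
static_assert(SelectBackend(true, true) == SortBackend::KokkosSort, "");
// Non-CUDA builds use Kokkos::sort regardless of compiler.
static_assert(SelectBackend(false, false) == SortBackend::KokkosSort, "");
static_assert(SelectBackend(false, true) == SortBackend::KokkosSort, "");
```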
