Coalesced Buffer Communication #1192

Open
wants to merge 117 commits into
base: develop
Changes from 105 commits
Commits
539ba5f
start on combined communication
lroberts36 Oct 17, 2024
d1c1274
just reuse BndInfo
lroberts36 Oct 17, 2024
e42ee36
partial
lroberts36 Oct 17, 2024
40a0a02
cleanup serialization, decouple
lroberts36 Oct 18, 2024
00ce27b
missed on last commit
lroberts36 Oct 18, 2024
64d655e
fix bug
lroberts36 Oct 19, 2024
c3ddf52
Actually set up to do communication
lroberts36 Oct 19, 2024
74f9c33
actually add the communication
lroberts36 Oct 19, 2024
ee04547
split into cpp file
lroberts36 Oct 19, 2024
cc14c89
format
lroberts36 Oct 19, 2024
295e8a3
working mpi communication
lroberts36 Oct 19, 2024
d8fbd65
pull out and store buffers
lroberts36 Oct 19, 2024
566f36d
fix serial builds
lroberts36 Oct 19, 2024
8a9a1bc
be a little more careful
lroberts36 Oct 21, 2024
75667ab
Set things up for communication
lroberts36 Oct 21, 2024
d56bc78
Make functions avilable on device
lroberts36 Oct 21, 2024
a35fccf
Add untested PackAndSend
lroberts36 Oct 21, 2024
6436fdc
Add receive and unpack
lroberts36 Oct 21, 2024
5fd8de5
Receive everything
lroberts36 Oct 21, 2024
95db032
compiles
lroberts36 Oct 21, 2024
9c4010f
small name change
lroberts36 Oct 21, 2024
b4efb3d
segfault
lroberts36 Oct 21, 2024
995913e
correctly point to send buffers
lroberts36 Oct 21, 2024
160c77f
allow explicit staling of send buffers
lroberts36 Oct 21, 2024
6355185
taking a few steps
lroberts36 Oct 22, 2024
12df3b6
switch to reference symantics
lroberts36 Oct 22, 2024
b55659c
remove print statements
lroberts36 Oct 22, 2024
1be047d
clear the combined buffers after remesh
lroberts36 Oct 22, 2024
1b405dc
some other debugging stuff
lroberts36 Oct 22, 2024
7f5b944
fix bug
lroberts36 Oct 22, 2024
b0dd208
format and lint
lroberts36 Oct 22, 2024
d8ae6e8
small
lroberts36 Oct 22, 2024
d0d0194
small part 2
lroberts36 Oct 22, 2024
6bafb0a
small part 3
lroberts36 Oct 22, 2024
07f62e2
format
lroberts36 Oct 22, 2024
809dcb1
Fix on vista
Oct 23, 2024
43c376b
format and lint
lroberts36 Oct 23, 2024
505426d
Update src/bvals/comms/combined_buffers.hpp
lroberts36 Oct 23, 2024
e60b82c
Update src/bvals/comms/combined_buffers.cpp
lroberts36 Oct 23, 2024
4d49ce5
use separate comm
lroberts36 Oct 23, 2024
ee58f4d
save other communication mechanism
lroberts36 Oct 24, 2024
59c8680
add a barrier at the end of ReceiveBoundBufs
lroberts36 Oct 24, 2024
dc54426
fix reallocation issue
lroberts36 Oct 24, 2024
b823b74
Make things work with AMR and flux correction
lroberts36 Oct 24, 2024
5649581
pre check small buffers for staleness
lroberts36 Oct 24, 2024
17f25e0
format and lint
lroberts36 Oct 24, 2024
cff847b
Add some debugging code
lroberts36 Oct 30, 2024
8cc982c
implement a number of different receive strategies and use issend
lroberts36 Oct 30, 2024
72ec3ac
format and lint
lroberts36 Oct 30, 2024
0ff61c0
remove extra iterations
lroberts36 Oct 30, 2024
f5f7bba
remove MPI_BARRIER
lroberts36 Oct 30, 2024
7c320cb
clear message buffer
lroberts36 Oct 31, 2024
20e9765
remove unused stuff
lroberts36 Oct 31, 2024
977b2a3
working side by side but not using new stuff
lroberts36 Nov 1, 2024
fe8a2af
working with new split
lroberts36 Nov 1, 2024
2e35467
removed extra junk
lroberts36 Nov 1, 2024
7c80de1
format and lint
lroberts36 Nov 1, 2024
ec10ab2
add line
lroberts36 Nov 1, 2024
675895e
compile w/o mpi
lroberts36 Nov 1, 2024
9bb4747
remov mesh passing
lroberts36 Nov 1, 2024
cef604f
format and lint
lroberts36 Nov 1, 2024
655dc39
fix non-mpi compilation
lroberts36 Nov 1, 2024
d786b86
start working to pass around var ids
lroberts36 Nov 1, 2024
ece20a3
pass MeshData
lroberts36 Nov 1, 2024
4116362
almost there...
lroberts36 Nov 1, 2024
25c2c51
use the MeshData uids
lroberts36 Nov 1, 2024
d7ba65d
Working with subsets, no cacheing
lroberts36 Nov 1, 2024
91771ad
format
lroberts36 Nov 1, 2024
5cddc0d
start on cacheing
lroberts36 Nov 4, 2024
207c2dc
include allocation status in output
lroberts36 Nov 4, 2024
ac01d92
sparse maybe working
lroberts36 Nov 4, 2024
fa53f72
add comm switch
lroberts36 Nov 4, 2024
24aa55f
fix logic
lroberts36 Nov 4, 2024
f72ee8f
format and lint
lroberts36 Nov 4, 2024
a194eba
Check that send buffers are completed before deleting
lroberts36 Nov 5, 2024
3b21bbc
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
lroberts36 Nov 12, 2024
8670107
don't start comms
lroberts36 Nov 13, 2024
f2d8c0c
some more stuff that doesn't work
lroberts36 Nov 13, 2024
deeccf4
small
lroberts36 Nov 13, 2024
c22cf96
regular Isend
lroberts36 Nov 13, 2024
d5d4328
don't require all received
lroberts36 Nov 13, 2024
255e85a
format
lroberts36 Nov 14, 2024
3941117
Add documentation
lroberts36 Nov 14, 2024
9159c93
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
lroberts36 Nov 14, 2024
52778c5
small doc
lroberts36 Nov 15, 2024
451b246
rename to coalesced
lroberts36 Nov 15, 2024
43c4efa
split things up
lroberts36 Nov 15, 2024
5ff9c60
comment
lroberts36 Nov 15, 2024
3adb853
some renaming
lroberts36 Nov 15, 2024
06231e1
rename
lroberts36 Nov 15, 2024
b14432f
reanme again
lroberts36 Nov 15, 2024
24a9733
one line to fix everything
lroberts36 Nov 18, 2024
1e5e4d4
format
lroberts36 Nov 18, 2024
1f1c5f3
default to coalesced comms
lroberts36 Nov 18, 2024
0c3131e
cache different var sets
lroberts36 Nov 18, 2024
08d9c13
allocate combined buffers only as needed
lroberts36 Nov 19, 2024
a694074
changelog
lroberts36 Nov 19, 2024
3353b52
small
lroberts36 Nov 19, 2024
408ecb3
copyright year
lroberts36 Nov 19, 2024
06bd5d3
fix a couple of things
lroberts36 Nov 19, 2024
0bd053e
remove unused
lroberts36 Nov 19, 2024
0183590
remove comment
lroberts36 Nov 19, 2024
975ce4a
oops
lroberts36 Nov 19, 2024
867af0a
skip non-communicated variables
lroberts36 Nov 21, 2024
b573337
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
lroberts36 Nov 25, 2024
5648a82
Update doc/sphinx/src/boundary_communication.rst
lroberts36 Nov 25, 2024
1bcfdba
Update doc/sphinx/src/boundary_communication.rst
lroberts36 Nov 25, 2024
7edde1e
address brryan comment
lroberts36 Nov 25, 2024
5fefb71
doc at brryan's suggestion
lroberts36 Nov 25, 2024
9ae28c8
remove commented outlines
lroberts36 Nov 25, 2024
e95f691
do view of views correctly for Kokkos 4.5.1
lroberts36 Nov 26, 2024
c67a5fb
act on a bunch of small comments
lroberts36 Nov 27, 2024
6aa7dcd
move functions and add max_iters to clear
lroberts36 Nov 27, 2024
0f3795d
fix buffer bugs?
lroberts36 Nov 28, 2024
22ecb2e
no need to check if not doing coalesced
lroberts36 Nov 28, 2024
822a859
Remove deprecated note
lroberts36 Nov 28, 2024
8739373
Merge branch 'develop' into lroberts36/add-combined-buffer-communication
Yurlungur Dec 2, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -3,6 +3,7 @@
## Current develop

### Added (new features/APIs/variables/...)
- [[PR 1192]](https://github.com/parthenon-hpc-lab/parthenon/pull/1192) Coalesced buffer communication
- [[PR 1103]](https://github.com/parthenon-hpc-lab/parthenon/pull/1103) Add sparsity to vector wave equation test
- [[PR 1185]](https://github.com/parthenon-hpc-lab/parthenon/pull/1185) Bugfix to particle defragmentation
- [[PR 1184]](https://github.com/parthenon-hpc-lab/parthenon/pull/1184) Fix swarm block neighbor indexing in 1D, 2D
90 changes: 90 additions & 0 deletions doc/sphinx/src/boundary_communication.rst
@@ -476,3 +476,93 @@ For backwards compatibility, we keep the aliases
- ``ReceiveFluxCorrections`` = ``ReceiveBoundBufs<BoundaryType::flxcor_recv>``
- ``SetFluxCorrections`` = ``SetBoundBufs<BoundaryType::flxcor_recv>``

Coalesced MPI Communication
---------------------------

As is described above, a one-dimensional buffer is packed and unpacked for each communicated
field on each pair of blocks that share a unique topological element (below we refer to this
as a variable-boundary buffer). For codes with larger numbers of variables and/or in
simulations run with smaller block sizes, this can result in a large total number of buffers
and importantly a large number of buffers that need to be communicated across MPI ranks. The
latter fact can have significant performance implications, as each ``CommBuffer<T>::Send()``
call for these non-local buffers corresponds to an ``MPI_Isend``. Generally, these messages
contain a small amount of data, which results in a low effective MPI bandwidth. Additionally,
MPI implementations seem to have a hard time dealing with the large number of messages
required. In some cases, this can result in poor scaling behavior for Parthenon.

To get around this, we introduce a second level of buffers for communicating across ranks.
For each ``MeshData`` object on a given MPI rank, coalesced buffers equal in size to all
MPI non-local variable-boundary buffers are created for each other MPI rank that ``MeshData``
communicates to. These coalesced buffers are then filled from the single variable-boundary
buffers, a *single* MPI send is called per MPI rank pair, and the receiving ranks unpack the
coalesced buffer into the single variable-boundary buffers. This can drastically reduce the
number of MPI sends and increase the total amount of data sent per message, thereby
increasing the effective bandwidth. Further, in cases where Parthenon is running on GPUs but
GPUDirect MPI is not available, this can also minimize the number of DtoH and HtoD copies
during communication.
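
The coalescing idea described above can be sketched on the host without MPI. This is a minimal illustration, not Parthenon's implementation: `SmallBuffer`, `ComputeOffsets`, `Pack`, and `Unpack` are hypothetical names, and a `std::vector<double>` stands in for a variable-boundary buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one variable-boundary buffer.
using SmallBuffer = std::vector<double>;

// Start offset of each small buffer inside the coalesced buffer.
std::vector<std::size_t> ComputeOffsets(const std::vector<SmallBuffer> &bufs) {
  std::vector<std::size_t> offsets;
  offsets.reserve(bufs.size());
  std::size_t total = 0;
  for (const auto &b : bufs) {
    offsets.push_back(total);
    total += b.size();
  }
  return offsets;
}

// Copy all small buffers into one contiguous buffer, so that a single
// message per rank pair could be sent instead of one per small buffer.
std::vector<double> Pack(const std::vector<SmallBuffer> &bufs) {
  std::vector<double> coalesced;
  for (const auto &b : bufs) coalesced.insert(coalesced.end(), b.begin(), b.end());
  return coalesced;
}

// Scatter the coalesced data back into small buffers that already have
// the sizes agreed on by sender and receiver.
void Unpack(const std::vector<double> &coalesced, std::vector<SmallBuffer> &bufs) {
  const auto offsets = ComputeOffsets(bufs);
  for (std::size_t i = 0; i < bufs.size(); ++i) {
    for (std::size_t j = 0; j < bufs[i].size(); ++j) {
      bufs[i][j] = coalesced[offsets[i] + j];
    }
  }
}
```

With three small buffers of sizes 2, 1, and 3, `Pack` produces one buffer of six values, and `Unpack` on the receiving side restores the three originals, provided both sides agree on the sizes and ordering.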

To use coalesced communication, your input must include:

.. code::

   parthenon/mesh/do_coalesced_comms = true

Currently, this is set to ``true`` by default.
Comment on lines +506 to +510
Collaborator

If we think this works for all downstreams including kharma, artemis and riot I am in favor of the default being true. If there's some doubt, we should maybe change the default to false.

Collaborator Author

Yeah, I tend to lean toward default false until there is more downstream testing. To make sure it is passing regression tests, it needs to be set to true for now though (or we would have to change all the parameter input). There is some discussion of this above though where @brryan suggested we keep true.

Collaborator

I am fine with it being default true. But I would also be fine modifying all the tests to set it to true manually.

Collaborator

I'm in principle also happy with default true (assuming that all downstream codes work/perform as expected as others already noted).


Implementation Details
~~~~~~~~~~~~~~~~~~~~~~

The coalesced send and receive buffers for each rank are stored in ``Mesh::pcoalesced_comms``,
which is a ``std::shared_ptr`` to a ``CoalescedComms`` object. To do coalesced communication
two pieces are required: 1) an initialization step telling all ranks what coalesced buffer
messages they can expect and 2) a mechanism for packing, sending and unpacking the coalesced
buffers during each boundary communication step.

For the first piece, after every remesh during ``BuildBoundaryBuffers``, each non-local
variable-boundary buffer is registered with ``pcoalesced_comms``. Once all these buffers are
registered, ``CoalescedComms::ResolveAndSendSendBuffers()`` is called, which determines all
the coalesced buffers that are going to be sent from a given rank to every other rank, packs
information about each of the coalesced buffers into MPI messages, and sends them to the other
ranks so that the receiving ranks know how to interpret the messages they receive from a given
rank. ``CoalescedComms::ReceiveBufferInfo()`` is then called to receive this information from
other ranks. This process basically just packs ``BndId`` objects, which contain the information
necessary to identify a variable-boundary communication channel and the amount of data that
is communicated across that channel, and then unpacks them on the receiving end and finds the
correct variable-boundary buffers. These routines are called once per rank (rather than per
``MeshData``).
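
The resolution step described above, in which registered buffers are grouped into coalesced buffers and given a position within them, might look roughly like the following. This is a sketch under stated assumptions: `Registration`, `Key`, and `AssignOffsets` are hypothetical and much simpler than `BndId` and `ResolveAndSendSendBuffers()`, and the grouping key (destination rank plus sender partition) is inferred from the description rather than taken from the implementation.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified registration record; not Parthenon's BndId.
struct Registration {
  int rank_recv;  // destination MPI rank
  int partition;  // MeshData partition id of the sender
  int size;       // number of values in the variable-boundary buffer
  int start_idx;  // filled in below: offset within the coalesced buffer
};

using Key = std::pair<int, int>;  // (rank_recv, partition)

// Assign each registered buffer an offset within the coalesced buffer it
// belongs to, and return the total size of every coalesced buffer.
std::map<Key, int> AssignOffsets(std::vector<Registration> &regs) {
  std::map<Key, int> total_size;
  for (auto &r : regs) {
    int &total = total_size[Key(r.rank_recv, r.partition)];  // starts at 0
    r.start_idx = total;
    total += r.size;
  }
  return total_size;
}
```

The per-buffer records, including `start_idx` and `size`, are what the receiving rank would need in order to interpret a coalesced message.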

For the second piece, variable-boundary buffers are first filled as normal in ``SendBoundBufs``
but the states of the ``CommBuffer`` objects are updated without actually calling the associated
``MPI_Isend``. Then ``CoalescedComms::PackAndSend(MeshData<Real> *pmd, BoundaryType b_type)``
is called, which for each rank pair associated with ``pmd`` packs the variable-boundary buffers
into the coalesced buffer, packs a second message containing the sparse allocation status of
each variable-boundary buffer, sends these two messages, and then stales the associated
variable-boundary buffers since their data is no longer required. On the receiving side,
``ReceiveBoundBufs`` receives these messages, sets the corresponding variable-boundary
buffers to the correct ``received`` or ``received_null`` state, and then unpacks the data
into the buffers. Note that the messages received here do not necessarily correspond to the
``MeshData`` that is passed to the associated ``ReceiveBoundBufs`` call, so all
variable-boundary buffers associated with a given receiving ``MeshData`` must still be checked
for being in a received state. Once they are all in a received state, setting of boundaries,
prolongation, etc. can proceed normally.
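
The receive side described above can be sketched as follows. This is an illustration only: `State`, `SmallBuffer`, `UnpackWithStatus`, and `AllReceived` are hypothetical names, and the sketch assumes that unallocated buffers still occupy their slot in the coalesced message, which the text does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical buffer states, loosely mirroring the received and
// received_null states named in the text.
enum class State { stale, received, received_null };

struct SmallBuffer {
  std::vector<double> data;
  State state = State::stale;
};

// Apply the two messages: allocated[i] says whether buffer i carries data,
// and coalesced holds the values for buffer i starting at start_idx[i].
void UnpackWithStatus(const std::vector<int> &allocated,
                      const std::vector<double> &coalesced,
                      const std::vector<std::size_t> &start_idx,
                      std::vector<SmallBuffer> &bufs) {
  for (std::size_t i = 0; i < bufs.size(); ++i) {
    if (!allocated[i]) {
      // Sparse variable was not allocated on the sender: nothing to copy.
      bufs[i].state = State::received_null;
      continue;
    }
    for (std::size_t j = 0; j < bufs[i].data.size(); ++j) {
      bufs[i].data[j] = coalesced[start_idx[i] + j];
    }
    bufs[i].state = State::received;
  }
}

// The owner of a set of buffers must still check that all of them have
// arrived, because a message may have been unpacked by a different call.
bool AllReceived(const std::vector<SmallBuffer> &bufs) {
  for (const auto &b : bufs) {
    if (b.state == State::stale) return false;
  }
  return true;
}
```

The final check matters because the unpacking call and the owning `MeshData` are decoupled: only when every buffer has left the stale state can boundary setting proceed.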

Some notes:
- Internally ``CoalescedComms`` contains maps from MPI rank and ``BoundaryType`` (e.g. regular
communication, flux correction) to ``CoalescedBuffersRank`` objects for sending and receiving
rank pairs. These ``CoalescedBuffersRank`` objects in turn contain maps from ``MeshData``
partition id of the sending ``MeshData`` (which also doubles as the MPI tag for the messages)
to ``CoalescedBuffer`` objects.
- ``CoalescedBuffersRank`` is where the post-remesh initialization routines are actually
implemented. This can either correspond to the send or receive side.
- ``CoalescedBuffer`` corresponds to each coalesced buffer and is where the
packing, sending, receiving, and unpacking details for coalesced boundary communication
are implemented. This object internally owns the ``CommunicationBuffer<BufArray1D<Real>>``
that is used for sending and receiving the coalesced data (as well as the communication buffer
used for communicating allocation status).
- Because Parthenon allows communication on ``MeshData`` objects that contain a subset of the
``MetaData::FillGhost`` fields in a simulation, we need to be able to interpret coalesced
messages that contain a subset of fields. Most of what is needed for this is implemented
in ``GetBndIdsOnDevice``.
- Currently, there is a ``Compare`` method in ``CoalescedBuffer`` that is just for
debugging. It should compare the received coalesced messages to the variable-boundary buffer
messages, but using it requires some hacks in the code to send both types of buffers.
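
The nesting of containers described in the first note can be pictured with plain standard-library maps. The types below are reduced, hypothetical stand-ins that share names with the real classes only for readability; the member names `by_partition`, `send`, `recv`, and `SendBuffer` are assumptions, and the enum lists just two of the boundary types.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, reduced stand-ins for the real types.
enum class BoundaryType { any, flxcor_send };

struct CoalescedBuffer {
  std::vector<double> data;
};

// One object per (other rank, boundary type); maps the partition id of the
// sending MeshData (also usable as an MPI tag) to a coalesced buffer.
struct CoalescedBuffersRank {
  std::map<int, CoalescedBuffer> by_partition;
};

struct CoalescedComms {
  using Key = std::pair<int, BoundaryType>;  // (other rank, boundary type)
  std::map<Key, CoalescedBuffersRank> send;
  std::map<Key, CoalescedBuffersRank> recv;

  // Look up (creating on first use) the send buffer for one channel.
  CoalescedBuffer &SendBuffer(int rank, BoundaryType b, int partition) {
    return send[Key(rank, b)].by_partition[partition];
  }
};
```

Keying the outer map on both rank and boundary type keeps regular boundary exchange and flux correction traffic to the same rank in separate coalesced buffers.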
10 changes: 3 additions & 7 deletions example/fine_advection/advection_driver.cpp
@@ -95,9 +95,6 @@ TaskCollection AdvectionDriver::MakeTaskCollection(BlockList_t &blocks, const in
auto &mc1 = pmesh->mesh_data.Add(stage_name[stage], mbase);
auto &mdudt = pmesh->mesh_data.Add("dUdt", mbase);

auto start_send = tl.AddTask(none, parthenon::StartReceiveBoundaryBuffers, mc1);
auto start_flxcor = tl.AddTask(none, parthenon::StartReceiveFluxCorrections, mc0);

// Make a sparse variable pack descriptors that can be used to build packs
// including some subset of the fields in this example. This will be passed
// to the Stokes update routines, so that they can internally create variable
@@ -146,9 +143,8 @@ TaskCollection AdvectionDriver::MakeTaskCollection(BlockList_t &blocks, const in
}
}

auto set_flx = parthenon::AddFluxCorrectionTasks(
start_flxcor | flx | flx_fine | vf_dep, tl, mc0, pmesh->multilevel);

auto set_flx = parthenon::AddFluxCorrectionTasks(flx | flx_fine | vf_dep, tl, mc0,
pmesh->multilevel);
auto update = set_flx;
if (do_regular_advection) {
update = AddUpdateTasks(set_flx, tl, parthenon::CellLevel::same, TT::Cell, beta, dt,
@@ -170,7 +166,7 @@ TaskCollection AdvectionDriver::MakeTaskCollection(BlockList_t &blocks, const in
}

auto boundaries = parthenon::AddBoundaryExchangeTasks(
update | update_vec | update_fine | start_send, tl, mc1, pmesh->multilevel);
update | update_vec | update_fine, tl, mc1, pmesh->multilevel);

auto fill_derived =
tl.AddTask(boundaries, parthenon::Update::FillDerived<MeshData<Real>>, mc1.get());
4 changes: 4 additions & 0 deletions src/CMakeLists.txt
@@ -101,9 +101,13 @@ add_library(parthenon
bvals/comms/bvals_in_one.hpp
bvals/comms/bvals_utils.hpp
bvals/comms/build_boundary_buffers.cpp
bvals/comms/bnd_id.cpp
bvals/comms/bnd_id.hpp
bvals/comms/bnd_info.cpp
bvals/comms/bnd_info.hpp
bvals/comms/boundary_communication.cpp
bvals/comms/coalesced_buffers.cpp
bvals/comms/coalesced_buffers.hpp
bvals/comms/tag_map.cpp
bvals/comms/tag_map.hpp

44 changes: 30 additions & 14 deletions src/basic_types.hpp
@@ -77,6 +77,36 @@ enum class BoundaryType : int {
gmg_prolongate_recv
};

inline constexpr bool IsSender(BoundaryType btype) {
  if (btype == BoundaryType::flxcor_recv) return false;
  if (btype == BoundaryType::gmg_restrict_recv) return false;
  if (btype == BoundaryType::gmg_prolongate_recv) return false;
  return true;
}

inline constexpr bool IsReceiver(BoundaryType btype) {
  if (btype == BoundaryType::flxcor_send) return false;
  if (btype == BoundaryType::gmg_restrict_send) return false;
  if (btype == BoundaryType::gmg_prolongate_send) return false;
  return true;
}

inline constexpr BoundaryType GetAssociatedReceiver(BoundaryType btype) {
  if (btype == BoundaryType::flxcor_send) return BoundaryType::flxcor_recv;
  if (btype == BoundaryType::gmg_restrict_send) return BoundaryType::gmg_restrict_recv;
  if (btype == BoundaryType::gmg_prolongate_send)
    return BoundaryType::gmg_prolongate_recv;
  return btype;
}

inline constexpr BoundaryType GetAssociatedSender(BoundaryType btype) {
  if (btype == BoundaryType::flxcor_recv) return BoundaryType::flxcor_send;
  if (btype == BoundaryType::gmg_restrict_recv) return BoundaryType::gmg_restrict_send;
  if (btype == BoundaryType::gmg_prolongate_recv)
    return BoundaryType::gmg_prolongate_send;
  return btype;
}

enum class GridType : int { none, leaf, two_level_composite, single_level_with_internal };
struct GridIdentifier {
GridType type = GridType::none;
@@ -102,20 +132,6 @@ inline bool operator<(const GridIdentifier &lhs, const GridIdentifier &rhs) {
return lhs.logical_level < rhs.logical_level;
}

constexpr bool IsSender(BoundaryType btype) {
if (btype == BoundaryType::flxcor_recv) return false;
if (btype == BoundaryType::gmg_restrict_recv) return false;
if (btype == BoundaryType::gmg_prolongate_recv) return false;
return true;
}

constexpr bool IsReceiver(BoundaryType btype) {
if (btype == BoundaryType::flxcor_send) return false;
if (btype == BoundaryType::gmg_restrict_send) return false;
if (btype == BoundaryType::gmg_prolongate_send) return false;
return true;
}

// Enumeration for accessing a field on different locations of the grid:
// CC = cell center of (i, j, k)
// F1 = x-normal face at (i - 1/2, j, k)
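
The behavior of the new `GetAssociatedReceiver` and `GetAssociatedSender` helpers added in this file can be checked at compile time. The sketch below copies the two functions onto a reduced enum so that it stands alone; the `any` member is a placeholder for the enum members that are not visible in this hunk (an assumption), and `RoundTrips` is a hypothetical helper.

```cpp
#include <cassert>

// Reduced copy of the enum for illustration; `any` stands in for the
// members not shown in the hunk above (an assumption).
enum class BoundaryType : int {
  any,
  flxcor_send,
  flxcor_recv,
  gmg_restrict_send,
  gmg_restrict_recv,
  gmg_prolongate_send,
  gmg_prolongate_recv
};

constexpr BoundaryType GetAssociatedReceiver(BoundaryType btype) {
  if (btype == BoundaryType::flxcor_send) return BoundaryType::flxcor_recv;
  if (btype == BoundaryType::gmg_restrict_send) return BoundaryType::gmg_restrict_recv;
  if (btype == BoundaryType::gmg_prolongate_send)
    return BoundaryType::gmg_prolongate_recv;
  return btype;
}

constexpr BoundaryType GetAssociatedSender(BoundaryType btype) {
  if (btype == BoundaryType::flxcor_recv) return BoundaryType::flxcor_send;
  if (btype == BoundaryType::gmg_restrict_recv) return BoundaryType::gmg_restrict_send;
  if (btype == BoundaryType::gmg_prolongate_recv)
    return BoundaryType::gmg_prolongate_send;
  return btype;
}

// Hypothetical helper: true if mapping to the receiver and back gives btype.
constexpr bool RoundTrips(BoundaryType btype) {
  return GetAssociatedSender(GetAssociatedReceiver(btype)) == btype;
}

static_assert(GetAssociatedReceiver(BoundaryType::flxcor_send) ==
                  BoundaryType::flxcor_recv,
              "send types map to their paired receive type");
static_assert(RoundTrips(BoundaryType::any),
              "types without a separate send/recv pair map to themselves");
```

Note that the round trip only holds when starting from a send-side or unpaired type: starting from `flxcor_recv`, the receiver mapping is the identity and the sender mapping then returns `flxcor_send`.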
69 changes: 69 additions & 0 deletions src/bvals/comms/bnd_id.cpp
@@ -0,0 +1,69 @@
//========================================================================================
// Parthenon performance portable AMR framework
// Copyright(C) 2024 The Parthenon collaboration
// Licensed under the 3-clause BSD License, see LICENSE file for details
//========================================================================================
// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved.
//
// This program was produced under U.S. Government contract 89233218CNA000001 for Los
// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC
// for the U.S. Department of Energy/National Nuclear Security Administration. All rights
// in the program are reserved by Triad National Security, LLC, and the U.S. Department
// of Energy/National Nuclear Security Administration. The Government is granted for
// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide
// license in this material to reproduce, prepare derivative works, distribute copies to
// the public, perform publicly and display publicly, and to permit others to do so.
//========================================================================================

#include <algorithm>
#include <cstdio>
#include <iostream> // debug
#include <memory>
#include <string>
#include <vector>

#include "basic_types.hpp"
#include "bvals/comms/bnd_id.hpp"
#include "bvals/comms/bvals_utils.hpp"
#include "bvals/neighbor_block.hpp"
#include "config.hpp"
#include "globals.hpp"
#include "interface/state_descriptor.hpp"
#include "interface/variable.hpp"
#include "kokkos_abstraction.hpp"
#include "mesh/domain.hpp"
#include "mesh/mesh.hpp"
#include "mesh/mesh_refinement.hpp"
#include "mesh/meshblock.hpp"
#include "prolong_restrict/prolong_restrict.hpp"
#include "utils/error_checking.hpp"

namespace parthenon {

BndId BndId::GetSend(MeshBlock *pmb, const NeighborBlock &nb,
                     std::shared_ptr<Variable<Real>> v, BoundaryType b_type,
                     int partition, int start_idx) {
  auto [send_gid, recv_gid, vlabel, loc, extra_id] = SendKey(pmb, nb, v, b_type);
  BndId out;
  out.send_gid() = send_gid;
  out.recv_gid() = recv_gid;
  out.loc_idx() = loc;
  out.var_id() = v->GetUniqueID();
  out.extra_id() = extra_id;
  out.rank_send() = Globals::my_rank;
  out.rank_recv() = nb.rank;
  out.partition() = partition;
  out.size() = BndInfo::GetSendBndInfo(pmb, nb, v, nullptr).size();
  out.start_idx() = start_idx;
  return out;
}

void BndId::PrintInfo(const std::string &start) {
  // Sizes are cast to int so that the arguments match the %i conversions.
  printf("%s var %s (%i -> %i) starting at %i with size %i (Total combined buffer size = "
         "%i, buffer size = %i, buf_allocated = %i) [rank = %i]\n",
         start.c_str(), Variable<Real>::GetLabel(var_id()).c_str(), send_gid(),
         recv_gid(), start_idx(), size(), static_cast<int>(coalesced_buf.size()),
         static_cast<int>(buf.size()), static_cast<int>(buf_allocated),
         Globals::my_rank);
}

} // namespace parthenon
111 changes: 111 additions & 0 deletions src/bvals/comms/bnd_id.hpp
@@ -0,0 +1,111 @@
//========================================================================================
// Parthenon performance portable AMR framework
// Copyright(C) 2024 The Parthenon collaboration
// Licensed under the 3-clause BSD License, see LICENSE file for details
//========================================================================================
// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved.
//
// This program was produced under U.S. Government contract 89233218CNA000001 for Los
// Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC
// for the U.S. Department of Energy/National Nuclear Security Administration. All rights
// in the program are reserved by Triad National Security, LLC, and the U.S. Department
// of Energy/National Nuclear Security Administration. The Government is granted for
// itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide
// license in this material to reproduce, prepare derivative works, distribute copies to
// the public, perform publicly and display publicly, and to permit others to do so.
//========================================================================================

#ifndef BVALS_COMMS_BND_ID_HPP_
#define BVALS_COMMS_BND_ID_HPP_

#include <memory>
#include <string>
#include <vector>

#include "basic_types.hpp"
#include "bvals/neighbor_block.hpp"
#include "coordinates/coordinates.hpp"
#include "interface/variable_state.hpp"
#include "mesh/domain.hpp"
#include "mesh/forest/logical_coordinate_transformation.hpp"
#include "utils/communication_buffer.hpp"
#include "utils/indexer.hpp"
#include "utils/object_pool.hpp"

namespace parthenon {

template <typename T>
class Variable;

// Provides the information necessary for identifying a unique variable-boundary
// buffer, identifying the coalesced buffer it is associated with, and its
// position within the coalesced buffer.
struct BndId {
  constexpr static std::size_t NDAT = 10;
  int data[NDAT];

  // Information for identifying the buffer with a communication
  // channel, variable, and the ranks it is communicated across
  KOKKOS_FORCEINLINE_FUNCTION
  int &send_gid() { return data[0]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &recv_gid() { return data[1]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &loc_idx() { return data[2]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &var_id() { return data[3]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &extra_id() { return data[4]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &rank_send() { return data[5]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &rank_recv() { return data[6]; }
  BoundaryType bound_type;

  // MeshData partition id of the *sender*
  // not set by constructors and only necessary for coalesced comms
  KOKKOS_FORCEINLINE_FUNCTION
  int &partition() { return data[7]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &size() { return data[8]; }
  KOKKOS_FORCEINLINE_FUNCTION
  int &start_idx() { return data[9]; }

  bool buf_allocated;
  buf_pool_t<Real>::weak_t buf;   // comm buffer from pool
  BufArray1D<Real> coalesced_buf; // Combined buffer

  void PrintInfo(const std::string &start);

  KOKKOS_DEFAULTED_FUNCTION
  BndId() = default;
  KOKKOS_DEFAULTED_FUNCTION
  BndId(const BndId &) = default;

  explicit BndId(const int *const data_in) {
    for (int i = 0; i < NDAT; ++i) {
      data[i] = data_in[i];
    }
  }

  void Serialize(int *data_out) {
    for (int i = 0; i < NDAT; ++i) {
      data_out[i] = data[i];
    }
  }

  bool SameBVChannel(const BndId &other) {
    // Don't want to compare start_idx, so -1
    for (int i = 0; i < NDAT - 1; ++i) {
      if (data[i] != other.data[i]) return false;
    }
    return true;
  }

  static BndId GetSend(MeshBlock *pmb, const NeighborBlock &nb,
                       std::shared_ptr<Variable<Real>> v, BoundaryType b_type,
                       int partition, int start_idx);
};
} // namespace parthenon

#endif // BVALS_COMMS_BND_ID_HPP_
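
The serialization pattern in `BndId` (a fixed block of ten integers per buffer, with `start_idx` excluded from channel comparison) can be exercised in isolation. The sketch below is a host-only reduction without Kokkos or Parthenon types: `SimpleBndId`, `Deserialize`, `SerializeAll`, and `DeserializeAll` are hypothetical names, and only the two accessors needed for the example are kept.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, host-only reduction of BndId: ten ints, with the size in
// slot 8 and start_idx in slot 9, as in the header above.
struct SimpleBndId {
  static constexpr int NDAT = 10;
  int data[NDAT] = {};

  int &size() { return data[8]; }
  int &start_idx() { return data[9]; }

  void Serialize(int *out) const {
    for (int i = 0; i < NDAT; ++i) out[i] = data[i];
  }

  static SimpleBndId Deserialize(const int *in) {
    SimpleBndId id;
    for (int i = 0; i < NDAT; ++i) id.data[i] = in[i];
    return id;
  }

  // Compare everything except start_idx (the last slot).
  bool SameBVChannel(const SimpleBndId &other) const {
    for (int i = 0; i < NDAT - 1; ++i) {
      if (data[i] != other.data[i]) return false;
    }
    return true;
  }
};

// Flatten a list of ids into one int array, as could be sent in one message.
std::vector<int> SerializeAll(const std::vector<SimpleBndId> &ids) {
  std::vector<int> msg(ids.size() * SimpleBndId::NDAT);
  for (std::size_t i = 0; i < ids.size(); ++i) {
    ids[i].Serialize(msg.data() + i * SimpleBndId::NDAT);
  }
  return msg;
}

// Rebuild the ids on the receiving side from the flat int array.
std::vector<SimpleBndId> DeserializeAll(const std::vector<int> &msg) {
  std::vector<SimpleBndId> ids;
  for (std::size_t i = 0; i + SimpleBndId::NDAT <= msg.size(); i += SimpleBndId::NDAT) {
    ids.push_back(SimpleBndId::Deserialize(msg.data() + i));
  }
  return ids;
}
```

Because the comparison loop stops one slot short of the end, the buffer size still takes part in the channel comparison while the offset within the coalesced buffer does not.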