Internal mutation of communicator or other MPI objects in relation to C++ const semantics #980
Comments
I disagree with all of this. MPI handles are not objects. They are object handles. The handles are const. The hidden state of the object itself is not relevant.
I don't want to derail the discussion, but at no point did I say that a handle is an object. If I had to characterize a handle, I would say, with caveats, that it is closer to a pointer to the (interesting) object. The hidden state of the object is relevant because it tells you what you can do with it. I understand that historically the "handle" is what is called "the communicator"; this alone creates confusion in this discussion. In the end these are all definitions. If there is a concrete effect on the proposed interface, that is what we should focus on.
I think what is missing from the write-up is a clear motivation about why we should care about whether the internal (non-observable) state of objects with user-managed handles changes or not. I have a vague idea about multi-threading semantics potentially playing a role here but I am neither sold on being overly restrictive nor sure that I fully grasp the problem you're getting at. From a user perspective, these handles are const (they won't change) and I don't care about implementation details as long as I get correct results based on correct usage of the API.
Explain to me your argument with immediate send changing the state of a communicator when that communicator is MPI_COMM_WORLD, which is a literal constant value in both the MPICH and MPI-5 ABIs. Tell me how a literal value that isn't pointing to anything has its state mutated by isend.
My assertion is that you incorrectly conflate internal state change in the MPI library in the global message queue (or whatever you want to call it) with state change in object handles. As no MPI implementation I know of has per-communicator message queues, it's likely that you are wrong in both practice and theory when it comes to isend mutating a communicator.
@devreal Fair enough, I will try to improve the motivation as we continue the conversation. Ultimately it boils down to what a C++ interface will look like, which is a very concrete product of this discussion. The answer to "why should we care?" seems to depend on the definition of "we". This discussion comes from a subgroup of this forum that seemed to care about this question. @jeffhammond Yes, it is possible that the isend example I gave is incorrect, in theory and in practice, for the reason you state (the queue is not contained in the communicator). (The good thing is that we are thinking about the true internal state of the communicator, which is what this discussion is really about.) At the end of the day, I care because this discussion will answer these very important questions. Even if I am incorrect in the analysis, there are very concrete questions here below in the code.
The nature of the internal state of MPI is invisible to you and the C++ bindings. It has no effect on const, any more than the firmware version running on the NIC. Pretend MPI is implemented in hardware. Every MPI handle is just a handle to a structure in an ASIC that lives outside of the host process address space. Design your MPI bindings for that and they will be correct.
Thank you, @jeffhammond, for the guideline about pretending MPI is implemented in hardware. Does this guideline answer the question of what should be For example, should ...or in more practical terms (independent of C++ ideas), can I call Given my knowledge level, I don't see yet how your guideline can answer this question.
The answer for your last question only depends on the MPI threading level. If you initialized with thread-multiple, you can make these calls concurrently. With a lower threading level, you can only call a small subset of MPI functions concurrently. I think I would start with a mental model of seeing MPI opaque handles as a const reference to an object with some members declared mutable. Based on my observation that use of const decorated functions modifying mutable members is common practice in C++ (e.g. having mutable mutex members), I don't understand the statement in your initial post about the meaning of const in C++. I tend to interpret const in C++ as the function has no caller-visible side effects to the object.
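The mental model described in this comment (a const member function that only touches members declared mutable, guarded by a mutable mutex) can be sketched as follows. This is a minimal illustration, not MPI code: `opaque_object`, `lookups_`, and `repeated_lookups` are hypothetical names invented for the example, and the class is only a stand-in for whatever a library keeps behind an opaque handle.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an object behind an opaque handle; not an MPI type.
class opaque_object {
    mutable std::mutex mtx_;    // synchronization state, not part of the logical value
    mutable long lookups_ = 0;  // internal bookkeeping, only touched while holding mtx_
    int value_;                 // the caller-visible state

public:
    explicit opaque_object(int v) : value_(v) {}

    // const: the caller-visible value never changes. The bookkeeping does
    // change, but it is declared mutable and is protected by the mutex.
    int value() const {
        std::lock_guard<std::mutex> lock(mtx_);
        ++lookups_;
        return value_;
    }

    // Also const: reads the bookkeeping under the same lock.
    long lookups() const {
        std::lock_guard<std::mutex> lock(mtx_);
        return lookups_;
    }
};

// Calls the const member n times through a const reference and
// returns how many lookups the object recorded internally.
long repeated_lookups(const opaque_object& obj, int n) {
    for (int i = 0; i < n; ++i) {
        (void)obj.value();
    }
    return obj.lookups();
}
```

Under this model the disagreement in the thread is about which internal members could honestly be treated this way; if the bookkeeping were mutable but not protected by a lock, concurrent calls to `value()` would race even though the function is declared const.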
I consulted a coworker who is very active in WG21, who said that const is almost always pointless and the only reason he adds it to API declarations is to avoid wasting time arguing with people who think it matters.
Yes, thank you. That makes sense, but there are two problems with that: The threading level is runtime, so we can only choose it at runtime.
Ok, but effectively what "members are declared mutable" depends a lot on the threading-level chosen (or obtained) during initialization.
I agree with the last observation; the only things that can be "honestly" mutable are mutexes and things locked by mutexes.
And that is correct. My statement is that it also extends to visible side effects from other threads, not just the caller (not just the caller in the same thread).
That is the most precise piece of advice to resolve this issue. Accuracy is a different matter, and I am afraid this person is pulling someone's leg or exaggerating. Even to please others and stop arguments, you must know where to put const.
I am opening this issue after a lengthy discussion at a C++ MPI weekly meeting organized by @tonyskjellum and encouraged by the participants @sg0, Tim Uhl, and @EvanDrakeSuggs.
What I can write in this first post is simply a layout of the problem.
I expect this discussion to be long and full of subtleties as we go deeper.
The main issue that I propose to discuss is to see to what degree const-correctness in C++ can reflect fundamental aspects of MPI communication, efficient implementations of MPI, and common practice.
Background: It is central to the idea of C++ that the language provides a way to communicate aspects and guarantees under mutation, mainly in the form of the const attribute (and its sibling mutable), that adds beneficial semantic information to a program.
Historically, the const aspect of a function or a variable has been interpreted simply as saying that a particular operation or part of the program would leave the relevant object the same before and after an operation.
This changed dramatically in C++11.
The current interpretation of const became more stringent because it was found helpful to interpret const-ness not only as meaning that an object is left in the same state before and after an operation, but also "during" the operation.
This modern interpretation of const is driven by efficiency and maximizing the utility of this language feature.
When writing a C++ wrapper to MPI (or any C-interface that is not const-aware), adding the keyword const requires deep knowledge of 1) the interface (and its semantics), 2) the implementation (internal mutation) and 3) (the most difficult) a fundamental understanding of what makes implementations efficient given the constraints, or the actual system, or even the underlying hardware that drives sound implementations.
Problem
The problem is that the MPI standard says little about the mutation of MPI "objects".
Objects can include: a) communicator objects themselves, b) request objects, and (perhaps less interesting for this discussion) c) data being communicated.
In most cases, the internal mutation is implied by common knowledge.
However, during discussions, these mutations turn out to be not well known, not agreed upon, or interpreted as quality-of-implementation issues.
This uncertainty implies that a C++ interface will have to be very "defensive," leaving performance on the table and not even being able to exploit idiomatic C++.
In other words, for every little doubt we have, we will force ourselves to remove const keywords from many places in a C++ interface.
It is generally agreed that a C++ interface to MPI will have a communicator object.
If it exists, this is not simply the handle of a C interface but what the handle "points" to.
In other words, we want to deal with an object that exists in this form:
Given that, we found three prevalent simple scenarios that illustrate this point.
(Please don't concentrate on the proposed syntax, for example whether they are member functions or free functions; it is the semantics that matter.)
1) Take the simple example of send.
Should send be declared as a const member?
My claim is that, in a runtime environment (where it is not known at compile time whether MPI is initialized or threaded), the send function shouldn't be const.
This surprised many because they said the communicator should be in the same state before and after sending a message.
My answer is that even if that is the case, it doesn't matter. If there is an internal change of the communicator during the send operation, even if it is only to a small cache (that is not guaranteed to be synchronized), the operation should not be const.
(This is without entering into the philosophical question of whether the communicator is the "same" before and after sending.)
2) immediate_send (assuming we want that in the interface, which is a separate question).
Here, my claim is that this member ::immediate_send shouldn't be const either because the communicator is in a different "state" after the immediate-send and it will have a pending operation.
3) The .duplicate() operation should not be const either.
The reason, and this is empirical, is that it seems that MPI_Comm_dup modifies (at least temporarily) the state of the source communicator.
Among other things, this prevents the implementation of a C++ interface that has a communicator copy-constructor taking a const source, which is surprising.
These cases are just the tip of the iceberg.
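The three scenarios above can be sketched as a small class, under loud assumptions: this is not the interface proposed in the issue and it does not call MPI. The integer members are hypothetical stand-ins for whatever internal state an implementation might touch, and the names `communicator`, `send`, `immediate_send`, `wait_all`, and `duplicate` are used only for illustration. The point is the consequence for the type: if duplication needs a mutable source, the copy constructor cannot take a const reference.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical sketch; a real wrapper would own an MPI_Comm handle instead
// of these integer stand-ins for internal library state.
class communicator {
    int context_id_;       // stand-in for the communicator's identity
    int pending_ = 0;      // stand-in for outstanding immediate operations
    int cached_last_ = 0;  // stand-in for a small unsynchronized internal cache
    int dup_count_ = 0;    // stand-in for state touched while duplicating

    static int fresh_context_id() { static int next = 0; return next++; }

public:
    communicator() : context_id_(fresh_context_id()) {}

    // Copying is duplication, assumed here to mutate the source,
    // so the source is taken by non-const reference.
    communicator(communicator& other) : context_id_(fresh_context_id()) {
        ++other.dup_count_;
    }

    // Needed to return by value before C++17; a real wrapper would also
    // invalidate the moved-from handle.
    communicator(communicator&&) = default;

    // Scenario 1: non-const because the call touches an internal cache.
    void send(int payload) { cached_last_ = payload; }

    // Scenario 2: non-const because a pending operation is left behind.
    void immediate_send(int payload) { cached_last_ = payload; ++pending_; }
    void wait_all() { pending_ = 0; }

    // Scenario 3: non-const because the source is assumed to be mutated.
    communicator duplicate() { return communicator(*this); }

    // Pure observers can honestly be const.
    int pending() const { return pending_; }
    int context_id() const { return context_id_; }
};

// The consequence noted in the text: no copy from a const source.
static_assert(std::is_constructible<communicator, communicator&>::value,
              "copy from a mutable source is allowed");
static_assert(!std::is_copy_constructible<communicator>::value,
              "copy from a const source is not available");
```

With this design, code that only holds a `const communicator&` can query the object but cannot send, start an immediate send, or duplicate it, which is the "defensive" outcome the issue describes.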
Proposal
These examples illustrate the surprising implications of the guarantees (or lack thereof) provided by the MPI standard.
Please note that, as C++ programmers, we are not "demanding" that implementations do one thing or the other so that we can use the const keyword everywhere.
The idea is for anyone developing or using C++ interfaces to be able to faithfully reflect the semantics and the implementation's mutations of MPI objects.
Changes to the Text
I will need a lot of help proposing changes to the text, and honestly, I prefer it if other people do it.
What I can say is that any clarification in this direction will need to go much beyond the ubiquitous:
The reason is that this only says that functions can be called from different threads, but it does not say anything about calls to the same (or different) function that share, for example, MPI comm handle arguments.
Impact on Implementations
Certain aspects of the implementation will have to be agreed upon and, if not, it will have to be explicitly stated whether mutation may happen internally.
In other words, internal (unsynchronized) mutation will become part of the documented interface (even if the language doesn't provide a mechanism for that, e.g. in Fortran or C).
My assumption is that implementations are already optimal in this aspect; if they need to mutate internal state to do operations then there are already good reasons for that.
Implementations can help by stating this mutation in their documentation/notes.
Impact on Users
People using C++ interfaces will be empowered, and programs will be safer because const (or the lack of it) will accurately reflect the nature of MPI communication and the fundamental algorithms and efficiency trade-offs.
References and Pull Requests
There is a lot of material on this topic; here is something to start with:
https://github.com/llnl/b-mpi3
https://github.com/llnl/b-mpi3?tab=readme-ov-file#thread-safety
https://github.com/llnl/b-mpi3?tab=readme-ov-file#duplication-of-communicator
https://web.archive.org/web/20170119232617/https://channel9.msdn.com/posts/C-and-Beyond-2012-Herb-Sutter-You-dont-know-blank-and-blank
https://web.archive.org/web/20160924183715/https://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2012-Herb-Sutter-Concurrency-and-Parallelism