-
-
Couldn't load subscription status.
- Fork 1.8k
Factor out serialization for BuildResult
#4588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Factor out serialization for BuildResult
#4588
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching my mistake before it gets too annoying :)
or 28 since the other conditions used decimal.
I have to say that having the conditions in decimal but the definition of the protocol number in hexa was causing way too much computations in my brain, which is why I wrote that one in hexa (but apparently that still wasn't enough to prevent me from messing-up the numbers… )
src/libstore/remote-store.cc
Outdated
|
|
||
| // BuildResult marshalliig is valid for daemon protocol >= 29 && legacy ssh >= 6 | ||
|
|
||
| BuildResult read(const Store & store, Source & from, Phantom<BuildResult> _) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These serialization methods seem a bit too rigid. What if we need to extend the BuildResult type? We would have to change this “instance”, but we can't because that would break backwards-compat with older stores.
Maybe we could start embedding some json (or bson or cbor or whatever) like I've done for Realisation so that we have more flexibility wrt backwards-compat?
Or maybe we could change these read functions to also take as parameter the version of the remote store so that they can adjust how they serialize accordingly (that would just be pushing the complexity of all the if (GET_PROTOCOL_MINOR()) into the worker_proto namespace, but that's still better than having these inlined I think)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could start embedding some json (or bson or cbor or whatever) like I've done for Realisation so that we have more flexibility wrt backwards-compat?
Eh, I don't like that as a solution for back compat. It just kicks the can down the road for whatever decodes that. (I do like it for reasons of having a standardized format, though :).)
Or maybe we could change these
readfunctions to also take as parameter the version of the remote store so that they can adjust how they serialize accordingly (that would just be pushing the complexity of all theif (GET_PROTOCOL_MINOR())into theworker_protonamespace, but that's still better than having these inlined I think)?
I made branch with that. The annoying thing is that the daemon protocol and serve protocol have separate versions, and it's unclear how to deal with that.
In #4588 I have Map<std::string, BuildResult>, so using the worker protocol stuff in particular, and not just factoring out into unrelated functions with extra version params, is needed for the Map generics to work.
I think my preference is just to do this for now. But then change the worker proto to allow for noncannonical serialization with like an optional extra parameter. Then the protocol version for each can be used to choose the serialization, without given the map set etc version parameters they don't use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I don't like that as a solution for back compat. It just kicks the can down the road for whatever decodes that
Kind-of, yeah. But 99% of the time “whatever decodes that” is just BuildResult read(...) (or even one layer down a BuildResult::fromJSON() static method), so it stays confined. And because we can introspect what we get, we don't have to pass the protocol number everywhere anymore (I'm no expert, but I think that's what protobuf does for example, and that seem to scale pretty well).
I do like it for reasons of having a standardized format, though :)
You're giving me second though here, because I wouldn't really like an hypothetical standardized format to be specified directly in the C++ code 🤔. But maybe I'm just over-thinking it.
The annoying thing is that the daemon protocol and serve protocol have separate versions, and it's unclear how to deal with that.
Oh yes 😒. The simpler I guess would be to align both versions, but it's not utterly pretty either…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, since buildPaths throws and exception unless it succeeds, maybe I should just be returning the realizations themselves, and then this problem goes away. I think maybe #4594 which I should soon finish is the first thing to consider, and then returning to this one and its follow-up "track".
|
👍 Applying this patch fixes |
081cb95 to
00a4d5e
Compare
BuildResultBuildResult
|
I marked this as stale due to inactivity. → More info |
00a4d5e to
15ffdeb
Compare
BuildResultBuildResult --- contains #6223
|
OK! Relying on the new #6223, this no longer constrains the evolution of the code. |
BuildResult --- contains #6223BuildResult
15ffdeb to
e07c16d
Compare
|
Rebased. |
This is generally good practice, and will become more important in a moment.
It's a bit of a sledgehammar, and the data types we want to serialize have their own headers.
This allows that the versioning parts for it to actually be correct. Version agnostic serializers are deduped crudely with `#include`, while version-specific ones written separately. For me, there are two main motivations for this: 1. No more perverse incentives. fe1f34f did some awkward things because the serializers did not store the version. I don't want anyone making changes to be pushed towards keeping the serialization logic with the core data types because it's easier. 2. I am confident with the incremental approach I tried out in NixOS#1165, we can get Hydra using `ssh-ng://` (and the `Store` abstraction) pretty quickly. After that, I hope we can deprecate `ssh://` for a few releases and then remove it. I don't want to worry about accidentally changing it / breaking it during that time, and having separated codes ensures it stays out of the way! When the `#include "gen-` is ugly, because `ssh://` is hopefully on its way out, I don't mind so much. Once it is gone, we can get rid of that since we just will have the worker versions of things.
These will make it easier to incorporate the versions next.
No more spooky included .cc file
The other weird included file is gone!!
This will allow us to factor out more serializers
It does not belong with the data type itself. This also materializes the fact that `copyPath` does not do any version negotiation just just hard-codes "16".
…er way In NixOS#6311 (comment), I realized since derivation goals' wanted outputs can "grow" due to overlapping dependencies (See `DerivationGoal::addWantedOutputs`, called by `Worker::makeDerivationGoalCommon`), the previous bug fix had an unfortunate side effect of causing more pointless rebuilds. In paticular, we have this situation: 1. Goal made from `DerivedPath::Built { foo, {a} }`. 2. Goal gives on on substituting, starts building. 3. Goal made from `DerivedPath::Built { foo, {b} }`, in fact is just modified original goal. 4. Though the goal had gotten as far as building, so all outputs were going to be produced, `addWantedOutputs` no longer knows that and so the goal is flagged to be restarted. This might sound far-fetched with input-addressed drvs, where we usually basically have all our goals "planned out" before we start doing anything, but with CA derivation goals and especially RFC 92, where *drv resolution* means goals are created after some building is completed, it is more likely to happen. So the first thing to do was restore the clearing of `wantedOutputs` we used to do, and then filter the outputs in `buildPathsWithResults` to only get the ones we care about. But fix also has its own side effect in that the `DerivedPath` in the `BuildResult` in `DerivationGoal` cannot be trusted; it is merely the *first* `DerivedPath` for which this goal was originally created. To remedy this, I made `BuildResult` be like it was before, and instead made `KeyedBuildResult` be a subclass wit the path. Only `buildPathsWithResults` returns `KeyedBuildResult`s, everything else just becomes like it was before, where the "key" is unambiguous from context. I think separating the "primary key" field(s) from the other fields is good practical in general anyways. (I would like to do the same thing for `ValidPathInfo`.) Among other things, it allows constructions like `std::map<Key, ThingWithKey>` where doesn't contain duplicate keys and just precludes the possibility of those duplicate keys being out of sync. We might leverage the above someday to overload `buildPathsWithResults` to take a *set* of return a *map* per the above.
With the differences between the two protocols reconciled, I also factor out a `worker_proto` for `BuildResult` to be used in the latest versions of each protocol.
e07c16d to
be1d63d
Compare
|
Because the other dependent PR is merged, #6223 now just contain this while being linear. This PR is no longer needed. |
With the differences between the two protocols reconciled, I also factor out canonical serializers for
BuildResultto be used in the latest versions of each protocol.Depends on #6223
Depends on #6312There is currently some bug, which is hard to debug because of the forking. Once of the dependent PRs is merged, I can rebase the other, and that will help with bisecting.