Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Bounds in output for Metadata::None variables #1188

Merged
merged 9 commits into from
Nov 13, 2024

Conversation

AstroBarker
Copy link
Collaborator

In updating phoebus to use up-to-date parthenon I ran into an issue where a Metadata::None variable was failing to output due to view bounds accesses. There seems to have been an off-by-one error in the relevant code block in output_utils.hpp:313. I've updated the bounds there to correctly output Metadata::None variables.

To test this, I modified the advection example to have an optional Metadata::None variable (enabled in the input deck <Advection>/test_metadata_none). I set the output_hdf5 test to use this variable.

PR Checklist

  • Code passes cpplint
  • New features are documented.
  • Adds a test for any bugs fixed. Adds tests for new features.
  • Code is formatted
  • Changes are summarized in CHANGELOG.md
  • Change is breaking (API, behavior, ...)
    • Change is additionally added to CHANGELOG.md in the breaking section
    • PR is marked as breaking
    • Short summary API changes at the top of the PR (plus optionally with an automated update/fix script)
  • CI has been triggered on Darwin for performance regression tests.
  • Docs build
  • (@lanl.gov employees) Update copyright on changed files

Copy link
Collaborator

@pdmullen pdmullen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! Output bugs are always tricky to track down.

@Yurlungur Yurlungur enabled auto-merge October 10, 2024 23:04
Copy link
Collaborator

@Yurlungur Yurlungur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! This breakage is definitely on me.

@Yurlungur
Copy link
Collaborator

Looks like tests are failing @AstroBarker ?

@AstroBarker
Copy link
Collaborator Author

Looks like tests are failing @AstroBarker ?

Looks like output_hdf5 so probably it's the gold files

@Yurlungur
Copy link
Collaborator

almost certainly. run locally and see what the error is. might need to update the gold files.

@AstroBarker
Copy link
Collaborator Author

almost certainly. run locally and see what the error is. might need to update the gold files.

Yeah they pass locally for me only if I update the goldfiles. I added a new output for the Metadata::None variable in the hdf5 tests. What's the process for updating the proper gold files?

@Yurlungur
Copy link
Collaborator

Run parthenon/tst/regression/gold_standard/make_tarball.sh. Then create a release from the generated tarball. You will also need to update the tarball version and hash in the CMake

auto-merge was automatically disabled October 15, 2024 02:16

Head branch was pushed to by a user without write access

Copy link
Collaborator

@pgrete pgrete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the comment on filling the content of the new var: Should we also add a proper check that this is working, i.e., read the data of the variable to confirm that the data written is looking what it's supposed to look like?

Comment on lines 322 to 338
if (test_metadata_none) {
IndexRange ib = pmb->cellbounds.GetBoundsI(IndexDomain::interior);
IndexRange jb = pmb->cellbounds.GetBoundsJ(IndexDomain::interior);
IndexRange kb = pmb->cellbounds.GetBoundsK(IndexDomain::interior);

// packing in principle unnecessary/convoluted here and just done for demonstration
std::vector<std::string> vars({"metadata_none_var"});
PackIndexMap imap;
const auto &v = rc->PackVariables(vars, imap);

const int ivar = imap.get("metadata_none_var").first;
pmb->par_for(
PARTHENON_AUTO_LABEL, 0, 2, kb.s, kb.e, jb.s, jb.e, ib.s, ib.e,
KOKKOS_LAMBDA(const int n, const int k, const int j, const int i) {
v(ivar, n, k, j, i) = n + k * j * i;
});
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what/why this is happening here.
If I read the the field correctly, then it's a vector with 3 components and extents larger than the mesh size by one.
In this par_for the block indices are used (without any global offset), so every block will have the same indices set to the same values and a large fraction of the index space will remain 0.
Is that intentional?

Copy link
Collaborator Author

@AstroBarker AstroBarker Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I didn't pay much mind to the contents of the fields, only that its shape reproduced the crash. This should be fixed to be filled now, I think.

@Yurlungur Yurlungur enabled auto-merge October 24, 2024 22:33
Comment on lines 337 to 339
v(ivar_lo, n, k, j, i) = n + k * j * i;
v(ivar_lo + 1, n, k, j, i) = 1 + n + k * j * i;
v(ivar_hi, n, k, j, i) = 2 + n + k * j * i;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just looked at the updated loop.
While I understand the fix going from mesh to meshbock, I now don't get the ivar_lo/ivar_hi indexing.
What's supposed to happen here? I could see this working if ivar_lo/ivar_hi replace the n` index.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any comment on this? ping @AstroBarker

I'm happy to merge this without adding test to read the data (which would be great), but I'd like to understand this loop before merging as from my current understanding it might seem like being an unintended bug.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the loop, removing the ivar_lo/hi stuff, which I agree was not necessary in the loop, and added a section to the hdf5 test that checks the contents of this variable. Sorry for the delay!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!

@pgrete pgrete disabled auto-merge November 12, 2024 15:53
@pgrete pgrete enabled auto-merge (squash) November 13, 2024 09:54
@pgrete
Copy link
Collaborator

pgrete commented Nov 13, 2024

Formatter does not work on forks. I'm force merging.

@pgrete pgrete disabled auto-merge November 13, 2024 15:02
@pgrete pgrete enabled auto-merge (squash) November 13, 2024 15:03
@pgrete
Copy link
Collaborator

pgrete commented Nov 13, 2024

I don't see the force merge button any more? Am I missing sth?

@Yurlungur
Copy link
Collaborator

I don't see the force merge button any more? Am I missing sth?

I already clicked auto merge. It seems like not all the "long" tests triggered. I will try to disable/re-enable.

@Yurlungur Yurlungur disabled auto-merge November 13, 2024 15:09
@Yurlungur Yurlungur enabled auto-merge November 13, 2024 15:09
@pgrete
Copy link
Collaborator

pgrete commented Nov 13, 2024

I don't see the force merge button any more? Am I missing sth?

I already clicked auto merge. It seems like not all the "long" tests triggered. I will try to disable/re-enable.

Yes, that's what I did earlier today and all tests passed (except for the formatting one that, as we know, does not report results for PRs from forks).
Previously, there was a force merge button but I don't see it any more (maybe because of the changed branched protection rules?)

@Yurlungur
Copy link
Collaborator

I don't see the force merge button any more? Am I missing sth?

I already clicked auto merge. It seems like not all the "long" tests triggered. I will try to disable/re-enable.

Yes, that's what I did earlier today and all tests passed (except for the formatting one that, as we know, does not report results for PRs from forks). Previously, there was a force merge button but I don't see it any more (maybe because of the changed branched protection rules?)

If you want to force merge, disable auto-merge. Then there should be an option.

@pgrete
Copy link
Collaborator

pgrete commented Nov 13, 2024

I don't see the force merge button any more? Am I missing sth?

I already clicked auto merge. It seems like not all the "long" tests triggered. I will try to disable/re-enable.

Yes, that's what I did earlier today and all tests passed (except for the formatting one that, as we know, does not report results for PRs from forks). Previously, there was a force merge button but I don't see it any more (maybe because of the changed branched protection rules?)

If you want to force merge, disable auto-merge. Then there should be an option.

That is basically my question. I don't see the button any more.
So if you still see it, feel free to go ahead and force merge (as the tests including extended ones passed earlier today).

@Yurlungur Yurlungur disabled auto-merge November 13, 2024 15:47
@Yurlungur Yurlungur enabled auto-merge November 13, 2024 15:48
@Yurlungur
Copy link
Collaborator

I don't see the force merge button any more? Am I missing sth?

I already clicked auto merge. It seems like not all the "long" tests triggered. I will try to disable/re-enable.

Yes, that's what I did earlier today and all tests passed (except for the formatting one that, as we know, does not report results for PRs from forks). Previously, there was a force merge button but I don't see it any more (maybe because of the changed branched protection rules?)

If you want to force merge, disable auto-merge. Then there should be an option.

That is basically my question. I don't see the button any more. So if you still see it, feel free to go ahead and force merge (as the tests including extended ones passed earlier today).

Shoot you're right. I don't see it either. Maybe we submit a new MR on a branch and see if that goes through? Should figure out how to make fork merges work though...

@Yurlungur Yurlungur disabled auto-merge November 13, 2024 15:51
@Yurlungur Yurlungur enabled auto-merge November 13, 2024 15:51
@pgrete pgrete disabled auto-merge November 13, 2024 15:53
@pgrete pgrete merged commit b3a2b8d into parthenon-hpc-lab:develop Nov 13, 2024
43 of 58 checks passed
@pgrete
Copy link
Collaborator

pgrete commented Nov 13, 2024

Alright, so it wasn't just me and it was related to the branch protection rules.
I recently enabled
image
because of my accidental push to develop.
So I temp disabled it, now force merged, and then re-enabled it as a quick workaround for this PR.

In the long run it'd definitely be nice to have the formatter also trigger for forks.
I looked into this in the past and was not able to figure out a solution within a reasonable time (and thus tabled it).
So from my point of view, I'm happy to continue working with a manual work around (unless more and more PRs are coming from forks) and spent the time debugging CI on other (CI) pieces ;)

@Yurlungur
Copy link
Collaborator

Aah I see. Ok sounds good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants