2023.03.09 Meeting Notes
Philipp Grete edited this page Mar 9, 2023
- Individual/group updates
- Intel update (phydro runs on ponte vecchio!)
- review non-WIP PRs
JM
- couple of PRs in flight
- found "bug" in prolong/restrict (PR with guard rails open) caused by inconsistent Metadata (explicitly set properties versus handling logic in the constructor) -> issue open to discuss a "fix" to Metadata
- cleaned up and generalized integrators (now as class), contains SSPRK4, PR open, need review
- would be nice to also support RKL(2) supertimestepping as part of the machinery, PM and PG will check
- presenting in monthly institutional computing group meeting for Parthenon + downstream codes
- if someone wants to highlight things there, send material to JM
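For readers unfamiliar with the SSPRK family mentioned above, here is a minimal sketch of one step of the classic three-stage, third-order SSP Runge-Kutta scheme in Shu-Osher form. It is not the SSPRK4 integrator from the open PR (its coefficients are not given in these notes) and it does not use Parthenon's integrator class; `State`, `Rhs`, `EulerBlend`, and `StepSSPRK3` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical types for illustration only; not Parthenon API.
using State = std::vector<double>;
// Right-hand side of du/dt = rhs(u).
using Rhs = std::function<State(const State &)>;

// Returns a*u + b*(v + dt*rhs(v)), i.e. a convex combination of u and a
// forward Euler step taken from v (convex when a + b = 1 and a, b >= 0).
State EulerBlend(double a, const State &u, double b, const State &v,
                 double dt, const Rhs &rhs) {
  assert(u.size() == v.size());
  const State dv = rhs(v);
  assert(dv.size() == v.size());
  State out(u.size());
  for (std::size_t i = 0; i < u.size(); ++i) {
    out[i] = a * u[i] + b * (v[i] + dt * dv[i]);
  }
  return out;
}

// One step of three-stage, third-order SSPRK (Shu-Osher form):
//   u1    = u + dt*L(u)
//   u2    = 3/4 u + 1/4 (u1 + dt*L(u1))
//   u_new = 1/3 u + 2/3 (u2 + dt*L(u2))
State StepSSPRK3(const State &u, double dt, const Rhs &rhs) {
  const State u1 = EulerBlend(0.0, u, 1.0, u, dt, rhs);
  const State u2 = EulerBlend(0.75, u, 0.25, u1, dt, rhs);
  return EulerBlend(1.0 / 3.0, u, 2.0 / 3.0, u2, dt, rhs);
}
```

Because every stage is a convex combination of forward Euler steps, any stability property that holds for forward Euler under a timestep restriction carries over to the full step, which is the motivation for the SSP form.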
BP
- new PR for deallocating MPI comms in reductions
- depending on the MPI library, not deallocating them resulted in the host running out of memory or even failing without an error message
LR
- running sims with 64 sparse vars (that indeed save memory)
- still ran OOM; tracked down to LoadBalancing, which didn't differentiate between (different) sparse and dense vars when rebuilding the tree. Implemented and tested
- reduced buffer size approach <- seems to be faster
- completely buffer free approach
- code currently lives in riot branch, so will eventually end up in develop
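As a rough illustration of why treating sparse and dense variables alike can inflate memory use, the sketch below sizes the data of one block by counting dense variables always and sparse variables only when they are allocated. This is one possible reading of the issue, not the code in the riot branch; `VarInfo` and `BlockBytes` are hypothetical names and not Parthenon types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-variable record for illustration; not a Parthenon type.
struct VarInfo {
  bool is_sparse;     // sparse variables may be unallocated on a given block
  bool is_allocated;  // only meaningful for sparse variables
  std::size_t bytes;  // size of the variable's data when allocated
};

// Bytes needed to hold (or communicate) one block's variable data.
// Dense variables always count; sparse variables count only if allocated.
std::size_t BlockBytes(const std::vector<VarInfo> &vars) {
  std::size_t total = 0;
  for (const auto &v : vars) {
    if (!v.is_sparse || v.is_allocated) {
      total += v.bytes;
    }
  }
  return total;
}
```

With 64 sparse variables of which only a few are allocated per block, a size estimate that ignores the `is_allocated` flag would overcount by roughly the ratio of total to allocated variables.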
PM
- fighting MPI issues (not related to downstream codes) like
GTL_DEBUG: [69] cudaEventQuery: uncorrectable NVLink error detected during the execution
MPICH ERROR [Rank 69] - Abort(373423874) (rank 69 in comm 0): Fatal error in PMPI_Test: Invalid count, error stack
FG
- figuring out copyright/open sourcing issues
JS
- having fun with Polaris and different Kokkos version
- observed perf. regression (2x) from Kokkos 3.7 to 4.0; need to investigate in Parthenon, too
BW
- would be interested in SDC (spectral deferred correction) integrator
- could be used to control error in operator splitting or arb. high order integration
PG
- catching up on PR review/issue backlog
- Ascent PR ready for merge
- an internal (to be released) version of OneAPI contains the fixes required to make parthenon(-hydro) compile in ahead-of-time (AOT) mode
- works with Kokkos 4.0
- PG will now work with Intel on getting performance numbers and profiling data for more detailed comparison
- https://github.com/parthenon-hpc-lab/parthenon/pull/841 -> LR/PM (review)
- https://github.com/parthenon-hpc-lab/parthenon/pull/810 -> FG (review)
- https://github.com/parthenon-hpc-lab/parthenon/pull/838 -> LR (update)
- https://github.com/parthenon-hpc-lab/parthenon/pull/840 -> PG, PM (review)