Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

restart fix to soil heterotrophic respiration #1109

Closed
wants to merge 2 commits into from

Conversation

rgknox
Copy link
Contributor

@rgknox rgknox commented Oct 27, 2023

Description:

FATES tracks a small set of diagnostics that rely on soil heterotrophic respiration, such as NEP. Previously, our history diagnostics had used the bc_in structure to fill these diagnostics. However, we have been having trouble with these variables on restarts, because when restarting the fates history, the bc_in field has not bee initialized yet by the host. The fix here is for FATES to remember soil heterotrophic respiration at the site level, include this in the restart, and use this value to fill the diagnostics as well.

Fixes #1106

Collaborators:

@glemieux

Expectation of Answer Changes:

B4B

This should have no noticeable impact on memory usage, and should be b4b (ctsm) and b4b with fix (e3sm). Note, this was not a problam with ctsm because the API is different, and we likely found a way to fill the bc_in variable earlier in the restart sequence.

Checklist

If this is your first time contributing, please read the CONTRIBUTING document.

All checklist items must be checked to enable merging this pull request:

Contributor

  • The documentation has been assessed to determine if updates are necessary (NOT NECESSARY)

Integrator

  • FATES PASS/FAIL regression tests were run
  • Evaluation of test results for answer changes was performed and results provided

Documentation

Test Results:

CTSM (or) E3SM (specify which) test hash-tag:

CTSM (or) E3SM (specify which) baseline hash-tag:

FATES baseline hash-tag:

Test Output:

@glemieux
Copy link
Contributor

Is it possible that this is actually an upstream issue? I'm looking through the e3sm_land_developer and e3sm_land_debug tests and there are no ERS non-fates tests that include HR. I'm wondering if we should run a non-fates ERS test with HR in the output.

@glemieux
Copy link
Contributor

Testing this with E3SM-Project/E3SM#5555 still showing COMPARE_base_rest failure for FATES_HET_RESP in the e3sm_land_developer suite with ERS_Ld20.f45_f45.IELMFATES.pm-cpu_gnu.elm-fates.

Perlmutter: /pscratch/sd/g/glemieux/e3sm-tests/pr5555-pr1109-restartfix.fates.pm-cpu.gnu.C9a524415e4-F97aa4ecd

@glemieux
Copy link
Contributor

Adding HR, LITHR, and SOMHR to the fates ERS test shows all three coming back with restart DIFFs.

Perlmutter location: /pscratch/sd/g/glemieux/e3sm-tests/pr5555-pr1109-restartfix-moreHR-fates.fates.pm-cpu.gnu.C9a524415e4-F97aa4ecd

@glemieux
Copy link
Contributor

glemieux commented Oct 28, 2023

Writing out some diagnostic statements, I'm seeing that after restart, the bc_in%tot_het_resp and site%soil_het_resp look correct in the assignments happening here:

sites(s)%soil_het_resp = bc_in(s)%tot_het_resp

and in wrap_update_hifreq_hist on the hlm-side here. This is happening at timestep 0001-01-12_00:30:00, at the end of which we get the FATES dynamics call. After that, I'm seeing that while the soil_het_resp is a match across the runs during the next time step, a few of the sites in tot_het_resp start to diverge compared to the base. This then seems to propogate into the soil_het_resp on the next timestep (before the fates dynamic run). As such, I think there might be an issue with the restart for HR.

Perlmutter location: /global/homes/g/glemieux/scratch/e3sm-tests/pr5555-pr1109-restartfix-moreHR-fates-diag1.fates.pm-cpu.gnu.C9a524415e4-F97aa4ecd

@glemieux
Copy link
Contributor

glemieux commented Oct 31, 2023

Tracing back the calculation of HR on the hlm-side and seeing that decomp_cascade_hr_vr is an input to this calculation, I'm wondering if the decomp_cpool_sourcesink update on the fates side could be contributing to the issue in some manner. Or given that ctsm seems to not have this issue, if the problem is with the order of operations in calling the UpdateLitterFluxes that contain the decomp value update based on the fates litter flux bc_out values.

@glemieux glemieux closed this Oct 31, 2023
@glemieux glemieux reopened this Oct 31, 2023
@glemieux
Copy link
Contributor

At @rgknox suggestion, I overrode the code on the elm-side in the UpdateLitterFluxes subroutine to avoid having the bc_out litter fluxes affect the decomp_cpools_sourcesink calculation, with the intent of determining if this affects the restart. The COMPARE_base_rest still failed, so we can likely close out this as a possible avenue of investigation.

@rgknox
Copy link
Contributor Author

rgknox commented Nov 3, 2023

This PR is superceeded by #1114

@rgknox rgknox closed this Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

E3SM api 27 restart issue
2 participants