Improve interpSigma support for regionally cropped WACCM files#12
Improve interpSigma support for regionally cropped WACCM files#12LeoLumos wants to merge 1 commit into
Conversation
|
@LeoLumos - Thanks for this addition! I do think using subset inputs will be very useful. I have a couple questions before running my own tests. Can you confirm this does change results with the full file? Did you run any of the tox testing? thanks, |
|
Thanks for reviewing this and for the helpful questions. I confirmed that the modified interpSigma implementation successfully processes regionally cropped WACCM subset files generated through the NCAR/UCAR subsetting workflow, which previously failed with broadcasting errors. However, I have not yet completed a strict regression comparison against the original full global WACCM files to verify whether the outputs remain numerically identical to the original implementation in the standard use case. My main testing so far has focused on ensuring compatibility with subset inputs and verifying that the generated ICON/BCON files appear structurally correct. I also have not run the tox test suite yet, since I am still relatively new to the project workflow and GitHub contribution process. I can look into running those tests next. One thing I did notice during testing is that the BCON files generated from the modified implementation are significantly smaller in file size compared to those produced by the original implementation, even though the dimensions and variables appear consistent in ncdump and ncview. That made me uncertain whether there may still be scientific or metadata-related differences that require further validation. |
|
Thanks for contributing and don't feel like you have to do the testing. I just wanted to know where I should start. That's a great heads up on the file sizes. I'll do some tests and then merge. Can you post your test input file for me to use in my checks? |
|
Sorry for not being able to reply in time, I tried to upload my files, but they were too big. I'm trying to transfer them to Google Drive, but it may also take a day. Or you can go to NCAR's website to download, the global CESM output file I use is from https://doi.org/10.5065/XS0R-QE86, the region is from https://www.acom.ucar.edu/cesm/subset.shtml, and the delineated range is in the picture. |
|
I'm happy to do the download. I'd like to also understand your concatenation method. I'm reading this as a concatenation (likely nco's ncrcat or cdo) of multiple subset outputs. I want to replicate the final file. |
|
I directly filled in the time and spatial ranges on the website https://www.acom.ucar.edu/cesm/subset.shtml. It looks like NCAR automatically uses NCO to process the global CESM output, and after processing, they send the resulting files to my email. You can try using the ranges I provided above (in the picture) to replicate the process. |
barronh
left a comment
There was a problem hiding this comment.
Thanks for these updates that definitely solve the issue.
With the original code, I was able to reproduce the failure and with the new code I saw that it worked. The new code also works with the old files, which is good. So, all of this works.
Steps to test:
- I downloaded a file as you described (45E,0N,160E,60N).
- I defined a horizontal domain over the East Asia. I started with a the 12US1 definition, and defined a 12CN1 that uses 97E (instead of -97E) as the central longitude to center the domain within the WACCM subset space.
- For the vertical coordinate, I used the default 35 layer configuration.
- I ran the waccm_example.py file to point to the downloaded file and use the 12CN1 domain.
- With the original code, I got an error during the PseudoNetCDF.from_ncvs command.
- With the new code, I got a successful simulation and reviewed output files for structure and figures for basic reasonability.
As part of my review, I tried to understand these updates to see where they may be appropriate for other models. Several of the updates seemed unnecessary, so I tested without them. Without those updates, the code was still fully functional and the outputs were identical. It is possible that my configuration or input file affected my testing. Perhaps they would be necessary under different testing conditions.
Below are a list of updates that seem unnecessary:
- The updates for P0, hyai, hybi, hyam, hybm did not change the operation or the output. P0 is available on the file as a non-dimensional variable, which works with the old code. The hybrid coordinate variables all had just one dimension (lev or ilev), which worked with the old code.
- The updates for divide-by-zero or nan did not change the operation or the output. Several of the divide-by-zero catches seem irrelevant given that we are working with sum of pressure type variables. More comments are within.
- The dimension check did not affect operation or output because no variables have (time, lev) without an additional dimension.
To try to understand if the issue is the testing conditions, I have attached a CDL header (ncdump -h) of my input and output. The input, which came directly from the processor email link, has the dimensionality of all variables that might be different than yours. The output contains the IOAPI parameters defining the horizontal and vertical coordinates, which may be different than yours.
Can you post similar text files for me to further understand the issues that these updates intend to address?
cesm-20260515140335763006.cdl.txt
cesm-20260515140335763006_12CN1_BCON.cdl.txt
| for key, var in self.variables.items() | ||
| if var.dimensions[:2] == ('time', 'lev') | ||
| if ( | ||
| len(var.dimensions) >= 2 |
There was a problem hiding this comment.
No problem here, but I am curious. Did this actually come up as a problem? When I run the code, this is an unnecessary check. In my test, there is no variable with dimensions (time, lev).
There was a problem hiding this comment.
To be perfectly honest, I added this out of habit as a defensive programming measure while I was debugging the broader issue, and I simply forgot to remove it once the code was working. You are completely right that standard Python slicing [:2] safely handles tuples of any length without throwing an IndexError, making this explicit length check redundant. I am totally fine with reverting this specific line back to the original if var.dimensions[:2] == ('time', 'lev') to keep the codebase clean.
| pnorm = pweight.sum(1) | ||
|
|
||
| # Prevent divide-by-zero during weighted averaging. | ||
| pnorm = np.where(pnorm == 0, 1e-10, pnorm) |
There was a problem hiding this comment.
Is there a reason to use the ternary np.where vs pnorm = np.maximum(pnorm, 1e-10)? I'm always a bit hesitant to use an exact value where statement.
There was a problem hiding this comment.
I initially used np.where as a quick patch to suppress a divide-by-zero warning during early testing, but you're absolutely right that checking exact floating-point equality (== 0) is risky. np.maximum is definitely the more robust approach.
| self.variables[key][:][:, :, None, ...] * pweight | ||
| ).sum(1) / pnorm | ||
|
|
||
| outf = pnc.PseudoNetCDFFile.from_ncvs(**outvars) |
There was a problem hiding this comment.
Thanks for identifying the issue and providing this fix. I'm making a rather long comment to document the issue for posterity. Any action item is on my side.
For from_ncvs to work, the elements of outvars should have had their own dimensions and attributes... but they don't in my run either. It turns out this is a result of variables from the subsetter having the _FillValue property. This creates masked variables, which have problems. This is related to a decision that I made long ago in PseudoNetCDF. @shoyer correctly cautioned me against this approach... I continue to pay the price for my choice to continue down this path!
PseudoNetCDFVariable has a subclass PseudoNetCDFMaskedVariable that handles variables whose results are masked. Both versions are intended to inherit properties from their parent objects when undergoing math. This does not work with applying methods like sum with PseudoNetCDFMaskedVariable. Internally, numpy.ma.MaskedArray sum method internally creates an unmasked array and views it as a PseudoNetCDFMaskedVariable. When this happens, the oth element of __array_finalize__ is a numpy.ndarray object and does not have PseudoNetCDF properties. As a result, the new PseudoNetCDFMaskedVariable is created without appropriate dimensions or ncattrs. As a result, the result of the math in outvars is not appropriate for from_ncvs.
Either arrays or PseudoNetCDFVariable should work fine with PseudoNetCDFFile.from_arrays, which does much the same as your explicit outf build. Similarly, it would also be very easy to just add the appropriate dimensions and properties within the existing loop that generates outvars. Your explicit build is nice because it has fewer external dependencies which can go wrong.
There was a problem hiding this comment.
Thanks for sharing the detailed background! I'm glad the explicit build approach serves as a solid and reliable workaround here.
| itershape = [newshape[0]] + newshape[3:] | ||
| sigma = (pedges - vgtop) / (psfc - vgtop) | ||
|
|
||
| # Protect against divide-by-zero or invalid sigma calculations. |
There was a problem hiding this comment.
pressure should always be positive, vgtop should always be at or below the minimum pressure, and psfc must always be greater than vgtop.
Can you help me understand when you are getting a divide by zero error? Or plus/minus infinity? I'm using the standard example.
I ask because if there is a case I am missing, it will have implications elsewhere. So, just trying to nail this down.
There was a problem hiding this comment.
You are absolutely right about the physics—psfc must always be greater than vgtop in any valid atmospheric column. The reason I added this was to suppress a Numpy RuntimeWarning (divide-by-zero/invalid value) that popped up during my runs. Given your previous comment about the subsetter introducing _FillValue and masked variables, I strongly suspect this warning was just triggered by those masked/invalid grid cells (where psfc might be missing or evaluated as 0), rather than valid data points. It was purely a defensive measure to silence the warning. If the _FillValue issue you mentioned will naturally resolve this, or if you prefer to keep the warnings active to catch genuine data errors, I am more than happy to remove this block and revert to the original calculation.
| hyam = self.variables['hyam'][:][slicer] | ||
| hybi = self.variables['hybi'][:][slicer] | ||
| hyai = self.variables['hyai'][:][slicer] | ||
|
|
There was a problem hiding this comment.
_get_1d would have the effect of taking a meta variable and, if more than 1-dimensional, assuming the first dimension was meaningless. In my tests, I used 2022-03-01 to 2022-03-05 from the download tool. In the results I got, hybm/hyam/hybi/hyai all had just dimension lev, so the tool works with or without these updates.
Again, no adverse outcome from including these changes. I'm highlighting this because I wonder if my testcase is not the same as yours. Perhaps it would be necessary if I was querying cesm more like you? Or the vertical coordinates would be more finnicky if I was using your destination grid.
Can you provide me a bit more details:
- date range
- destination vertical coordinates
- destination gdnam and associated GRIDDESC description.
- a the header of your input (
ncdump -h cesm.nc)
There was a problem hiding this comment.
Thanks for digging into this! Similar to the dimension check we discussed earlier, this _get_1d function was another defensive measure.
To help you understand the exact conditions of my run, I have attached the ncdump -h headers for both the input and output (cesm-20260507070159806849.txt and cesm-20260507070159806849_27CN_BCON.txt), along with the GRIDDESC file you requested.
cesm-20260507070159806849.txt
cesm-20260507070159806849_27CN_BCON.txt
GRIDDESC.txt
| ) | ||
| ] | ||
|
|
||
| # Manually construct the output file to avoid broadcast errors from from_ncvs |
There was a problem hiding this comment.
The flake8 pep enforcement fails because this comment line goes beyond the 80 character limit. Before merge, this needs to be either shortened or split into two lines. I'd propose simply deleting the word broadcast, because the underlying problem is before the broadcasting.
There was a problem hiding this comment.
Good catch on the flake8 warning! You are completely right, removing the word 'broadcast' is the perfect fix and makes the comment more accurate anyway.

This PR improves interpSigma support for regionally cropped WACCM files.
Main changes: