Changes for on disk processing with GEOS-Chem specific speciation mechanism #2

Open
roarkmd wants to merge 10 commits into barronh:main from roarkmd:main

Conversation

@roarkmd roarkmd commented Mar 27, 2026

  1. Created geoschem14.6.2 speciation mechanism handling. 1) Added mechrc file and mechanism imports. 2) Created parameters to pass speciation mechanism to utility functions.
  2. Implemented functions to read files from disk rather than AWS buckets.
  3. Added scipy as a requirement.
  4. Added parameter "mode" to discriminate between 2D and 3D files. Previously, the first layer was kept and all other layers were silently dropped.
  5. Added example smoke_geos_chem.py that uses argparse script parameters to process files on disk. Consider adding drop_zero_values to utils and internal functionality to write molecular weight files in a similar way to how the CB6 mechanism was handled.

…a custom mechanism, other than cb6.

Custom mechanisms can be used as long as there is an appropriate mechanism molecular weight file in the running directory.
Previously, gd_file only read the first layer of the file and silently dropped other layers if present. The mode parameter can be set to "3D" to keep all layers. gd2hemco_fast_3D is a modification of gd2hemco_fast that allows layers in 3D premerged files to undergo bilinear interpolation without dropping/ignoring layers above the first layer. For SMOKE to HEMCO usage, levels should be specified as the midpoints of the vertical layers. (SMOKE's default is to report VGLVLs as the endpoints (inclusive) of each model layer.)
…gine option to specify the backend used to open the netcdf file
…s in the p22hemco function. Added safety mechanism by removing default speciation mechanisms from unitconvert options.
…s with zero emissions. Added arguments to set variables in run script using argparse.
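
The midpoint convention mentioned in the commit notes above can be sketched as follows. This is a minimal illustration, assuming the layer edges are available as a plain sequence; the function name `layer_midpoints` and the sample edge values are hypothetical and not part of cmaq2hemco.

```python
def layer_midpoints(vglvls):
    """Return the midpoint of each layer given its N+1 edge values."""
    edges = [float(v) for v in vglvls]
    if len(edges) < 2:
        raise ValueError('need at least two edges to define one layer')
    # Pair each edge with the next one and average the pair.
    return [(lo + hi) / 2.0 for lo, hi in zip(edges[:-1], edges[1:])]


# Hypothetical sigma edges for a three-layer file (illustrative values only).
example_edges = [1.0, 0.5, 0.25, 0.0]
example_mids = layer_midpoints(example_edges)
```

Four edges describe three layers, so the result always has one fewer entry than the input.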
Owner

@barronh barronh left a comment

@roarkmd - Thanks for the great updates! I'm sorry that it took so long for me to review.

After looking thru the code, there are a couple updates I'd propose. Primarily, I'd propose having gd2hemco_fast_3D replace gd2hemco_fast. That would require a few other updates to make sure the flow all works.

I also have a larger question. Please take a look at my comment on the expected response of gd2hemco_fast_3D in the 3D case. I have questions that I think we need to at least talk thru before making the change.

Comment thread cmaq2hemco/mechrc/core.py
  break
  else:
- raise IOError('file not found')
+ raise IOError(f'File not found: {hcpath}')
Owner

Great descriptive addition.

__all__ = ['writeconfig']


cq2gc = {
Owner

This should probably be named gc2gc. I see that BC is split into BCPI and BCPO (and similar for OC). If I understand it correctly, everything else is just a pass thru.

'XYLE': [['XYLE', '']],
}
# ignore special species, inventory meta variables, HAP tracers, UNK/UNR
for key in ['NH3_FERT','HFLUX', 'VOC_INV','NMOG','CH4_INV','ACROLEIN','BUTADIENE13','ETHYLBENZ','UNK','UNR']:
Owner

For flake8 compatibility, we're going to need spaces after commas.
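
A flake8-friendly version of the list in the loop above might look like the sketch below. Only the key names are taken from the diff; the `example_map` dict and the `pop` call are hypothetical stand-ins, since the loop body is not shown in this thread.

```python
# Key names copied from the diff, with one space after each comma (flake8 E231).
ignore_keys = [
    'NH3_FERT', 'HFLUX', 'VOC_INV', 'NMOG', 'CH4_INV',
    'ACROLEIN', 'BUTADIENE13', 'ETHYLBENZ', 'UNK', 'UNR',
]

# Hypothetical stand-in for the real mapping and loop body.
example_map = {'XYLE': [['XYLE', '']], 'UNK': [['UNK', '']]}
for key in ignore_keys:
    example_map.pop(key, None)
```

Wrapping the list across lines also keeps it under the default flake8 line-length limit.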

Comment thread cmaq2hemco/utils.py

return outf

def gd2hemco_fast_3D(path, gf, elat, elon, lev=None, verbose=0, gc=None, nr=None):
Owner

I need to review this further. It looks like a near duplicate of gd2hemco_fast. If the 3d functions can be integrated without a new function, that would be preferable.

Owner

When I read gd2hemco_fast and gd2hemco_fast_3D, it looks like gd2hemco_fast_3D can handle both 3D and 2D. For 2D files, I'd expect results from gd2hemco_fast_3D to be identical but with an extra dimension. If so, it could easily replace gd2hemco_fast entirely.

I'm a little confused about what gd2hemco_fast_3D would do with 3D files. The code seems to anticipate two cases:

  • If lev is not provided, then it seems pretty clear.
    • The default GEOS-Chem levels from _deflevs are used. There is an assumption that the input vertical structure matches the GEOS-Chem structure.
    • When HEMCO receives the file, I believe it makes the same assumption.
  • If lev is provided, then I'm not sure it will work.
    • If lev is a subset of _deflevs starting at the surface, then it all makes sense to me.
    • If lev is a definition based on the input, then I think it will be wrong.
      • In this case, I am guessing that lev is some function of vglvls that is generally similar to the (A+B*Ps) used in GEOS-Chem.
      • However, the vglvls will be according to WRF configuration and will not match GEOS-Chem.
      • I believe that HEMCO will assume a vertical structure that matches GEOS-Chem.
        • Within HEMCO, I am not aware of any vertical interpolation capabilities.
        • That would mean that the first k levels of GEOS-Chem will be assumed to match the first k levels of the input... that most certainly could be wrong.
      • For example, our WRF configuration has a first layer depth of 20m compared to GEOS-Chem's 120m.
    • Can you confirm this understanding?
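
To make the concern in the list above concrete, here is a small sketch comparing index matching with an overlap-weighted remap. It assumes layers described by height edges in meters, which is a simplification of both the WRF and GEOS-Chem vertical coordinates. Only the 20 m and 120 m first-layer depths come from the comment; `remap_layers`, the remaining edges, and the emission values are hypothetical.

```python
def remap_layers(src_edges, dst_edges, src_vals):
    """Redistribute per-layer totals onto new edges by fractional overlap."""
    out = [0.0] * (len(dst_edges) - 1)
    for i, val in enumerate(src_vals):
        lo, hi = src_edges[i], src_edges[i + 1]
        for j in range(len(out)):
            # Thickness shared by source layer i and target layer j.
            overlap = min(hi, dst_edges[j + 1]) - max(lo, dst_edges[j])
            if overlap > 0:
                out[j] += val * overlap / (hi - lo)
    return out


# Hypothetical edges (m): 20 m first layer in the source, 120 m in the target.
src_edges = [0.0, 20.0, 60.0, 120.0, 250.0]
dst_edges = [0.0, 120.0, 250.0]
src_vals = [4.0, 2.0, 1.0, 1.0]  # illustrative per-layer emission totals

# What the target would see if layer k is simply assumed to equal layer k.
index_matched = src_vals[:len(dst_edges) - 1]
# What it would see if mass were placed by where the layers actually overlap.
remapped = remap_layers(src_edges, dst_edges, src_vals)
```

With index matching, the first target layer receives only the first source layer's emissions, while the overlap-weighted version places everything emitted below 120 m in it and conserves the column total.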

Comment thread cmaq2hemco/utils.py

return outf

def unitconvert(key, val, unit, area=None, inplace=True, gc=None, nr=None):
Owner

Good addition to allow the pass thru.

Comment thread cmaq2hemco/utils.py


- def gd_file(ef):
+ def gd_file(ef, mode=None):
Owner

This might be obviated if gd2hemco_fast can include 2d or 3d.

@@ -0,0 +1,176 @@
#!/home/mroark/miniforge3/envs/cmaq2hemco/bin/python
Owner

Great to include an example! Probably remove the shebang that is mroark specific.

Comment thread cmaq2hemco/utils.py
return dzbriggs

def open_gdfile(
date, tmpl, engine='scipy'
Owner

In particular, I typically would prefer to use the netcdf4 engine in most cases. I'm guessing that you introduced the engine keyword to support directly opening gz files (i.e., gzip, BytesIO). Instead of the engine keyword, I'd prefer to see more general support for kwargs that get passed thru. In the general kwargs case, you could "default" engine to scipy if opening from gz. The format support from the netcdf4 engine is broader and is preferable in many cases.
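
One way to read this suggestion is sketched below. It assumes the file is ultimately opened with `xarray.open_dataset`, whose `engine` argument accepts `'scipy'` and `'netcdf4'`. The helper name `resolve_open_kwargs` and the `.gz` suffix check are hypothetical, not the package's actual implementation.

```python
def resolve_open_kwargs(path, **kwargs):
    """Build keyword arguments for the open call without overriding the caller."""
    opts = dict(kwargs)
    if str(path).endswith('.gz'):
        # Only fill in an engine for gzipped input when none was requested.
        opts.setdefault('engine', 'scipy')
    return opts


# Hypothetical file names, used only to exercise the helper.
gz_opts = resolve_open_kwargs('emis.ncf.gz')
nc_opts = resolve_open_kwargs('emis.ncf', engine='netcdf4')
plain_opts = resolve_open_kwargs('emis.ncf')
# The result would then be forwarded, e.g. xr.open_dataset(path, **opts).
```

Leaving `engine` unset for non-gz paths lets the opener pick its own default, and an explicit `engine` from the caller always wins.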

Comment thread cmaq2hemco/utils.py
ef = ef.isel(
LAY=0, drop=True
).rename(TSTEP='time')
if mode == '3D':
Owner

If gd2hemco_fast_3D replaces gd2hemco_fast, I'd simply have this check whether there is more than one layer and proceed accordingly. If there is only 1, use isel and drop. If not, keep them.
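
The layer-count check described above could look roughly like this. The sketch uses a bare NumPy array in IOAPI order (TSTEP, LAY, ROW, COL) rather than the xarray object that gd_file works with, and `squeeze_single_layer` is a hypothetical name.

```python
import numpy as np


def squeeze_single_layer(arr, lay_axis=1):
    """Drop the layer axis only when the file holds a single layer."""
    if arr.shape[lay_axis] == 1:
        return np.squeeze(arr, axis=lay_axis)
    # More than one layer: keep the full vertical dimension.
    return arr


# Illustrative shapes: 24 time steps on a 3x4 grid.
flat = squeeze_single_layer(np.zeros((24, 1, 3, 4)))
full = squeeze_single_layer(np.zeros((24, 5, 3, 4)))
```

Deciding from the data itself removes the need for a separate mode argument, which is the simplification suggested in the comment.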

2 participants