Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RCAL-965 Provide conversion from TVAC/FPS models to ScienceRawModel #455

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

stscieisenhamer
Copy link
Collaborator

@stscieisenhamer stscieisenhamer commented Jan 30, 2025

Resolves RCAL-965

This PR addresses the issue that TVAC/FPS data cannot be run through the pipeline, mostly due to the fact that the TVAC-related datamodels are frozen. A new method, ScienceRawModel.from_tvac_raw is introduced to do the conversion.

Tasks

  • Update or add relevant roman_datamodels tests.
  • Update relevant docstrings and / or docs/ page.
  • Does this PR change any API used downstream? (If not, label with no-changelog-entry-needed.)
News fragment change types:
  • changes/<PR#>.feature.rst: new feature
  • changes/<PR#>.bugfix.rst: fixes an issue
  • changes/<PR#>.doc.rst: documentation change
  • changes/<PR#>.removal.rst: deprecation or removal of public API
  • changes/<PR#>.misc.rst: infrastructure or miscellaneous change

@stscieisenhamer stscieisenhamer requested a review from a team as a code owner January 30, 2025 20:02
Copy link

codecov bot commented Jan 30, 2025

Codecov Report

Attention: Patch coverage is 94.38202% with 5 lines in your changes missing coverage. Please review.

Project coverage is 97.19%. Comparing base (087a60d) to head (909c3cf).
Report is 102 commits behind head on main.

Files with missing lines Patch % Lines
tests/test_models.py 91.11% 4 Missing ⚠️
src/roman_datamodels/datamodels/_utils.py 96.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #455      +/-   ##
==========================================
- Coverage   97.56%   97.19%   -0.37%     
==========================================
  Files          30       37       +7     
  Lines        2788     3354     +566     
==========================================
+ Hits         2720     3260     +540     
- Misses         68       94      +26     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@stscieisenhamer stscieisenhamer requested review from WilliamJamieson and a team January 30, 2025 20:12
@schlafly
Copy link
Collaborator

schlafly commented Feb 3, 2025

Just trying to figure out how this works... if I do strun roman_elp tvac.asdf, when or where does from_tvac_raw get called?

Re schema differences, I think we should try to hit the things that romancal looks for but we don't need to copy everything over just because. The guide window xstart is an interesting case. romancal looks for it, but TVAC doesn't actually use guide windows and so if the code runs I would be fine with it without messing with guide window things.

@stscieisenhamer
Copy link
Collaborator Author

@schlafly This is all invoked in the dq_init_step. See romancal PR#1596

@stscieisenhamer
Copy link
Collaborator Author

initial regression run

Copy link
Collaborator

@PaulHuwe PaulHuwe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little surprised more changes aren't needed. I notice that there are regression test failures - are these expected?

# Define how to recursively copy all attributes.
def node_update(raw, other):
"""Implement update to directly access each value"""
for key in other.keys():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rather similar to:
https://github.com/spacetelescope/roman_datamodels/blob/main/src/roman_datamodels/datamodels/_datamodels.py#L177

and I seem to recall other spots in the code doing similar recursive updating. Maybe it is best to make a utility function that these methods can call?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. There is a subtle difference now, but will look for a way to handle more generically.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And methinks have accomplished it.

@schlafly
Copy link
Collaborator

schlafly commented Feb 6, 2025

Thanks, I think this looks reasonable. Can you add some documentation of what exactly we're trying to accomplish here, though? Probably as a docstring in from_tvac_raw, but potentially also in the RTD if there's a reasonable place.

i.e., something like:
romancal supports processing a selection of files which use an outdated schema. It supports these with a bespoke method that converts the files to the new format when they are read in dq_init. This conversion does not do a detailed mapping between all of the new and old metadata, but instead opportunistically looks for fields with common names and assigns them. Other metadata with non-matching names is simply copied in place. This allows processing to proceed and preserves the original metadata, but the resulting files have duplicates of many entries.
If that sounds right?

I think the biggest practical issue is that a lot of the guide window stuff changed name and we'll now have those bogus fields twice, which is ugly but practical. That said, since guide windows aren't actually being used for TVAC---I think? @tddesjardins ? --- it's hard for me to justify writing code to do the conversion of all of the unused fields.

Have you reviewed the schema to see if that's the biggest area where things got renamed? I went through Tyler's L2 schema doc to try to make sure, and I think that's the case, but it would be good to have another set of eyes on it.

@stscieisenhamer
Copy link
Collaborator Author

@schlafly I've updated the docstrings in both the roman_datamodel's api and the romancal high-level description of the dq_init step.

@stscieisenhamer
Copy link
Collaborator Author

For all reviewers: I believe all the comments have been addressed. Ready for full review

@@ -23,6 +24,78 @@ class FilenameMismatchWarning(UserWarning):
"""


def _node_update(to_node, from_node, extras=None, extras_key=None, ignore=None):
"""Copy node contents from an existing node to an existing node
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Copy node contents from an existing node to an existing node
"""Copy node contents from an existing node to another existing node


How the copy occurs depends on existence of keys in `to_node`

If key exists in `to_node`, contents is converted from `from_node` stnode type to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If key exists in `to_node`, contents is converted from `from_node` stnode type to
If key exists in `to_node`, contents are converted from `from_node` stnode type to

Copy link
Collaborator

@PaulHuwe PaulHuwe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM outside of the one question I posed

continue
if key in to_node:
if isinstance(to_node[key], Mapping):
_node_update(getattr(to_node, key), getattr(from_node, key), extras=extras, extras_key=extras_key)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this potentially cause a value to accidentally move things to extras if, say, a sub-keyword happens to have the same name as a root level keyword that is in extras? Since the extra block isn't examining the hierarchy feeding it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants