Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hackday mteodoro sky variance readnoise #1556

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
55fc57a
Rename ResampleData class weight type parameter.
mairanteodoro Dec 6, 2024
1159562
Add var_sky array to mosaic and enable usage in drizzle weight image.
mairanteodoro Dec 9, 2024
6d9f437
Add changelog entry.
mairanteodoro Dec 11, 2024
d7df709
Update docs and docstring.
mairanteodoro Dec 11, 2024
a3e8634
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 11, 2024
25e4a16
Remove unnecessary code.
mairanteodoro Dec 11, 2024
9f6df6f
Fix terminology in resample docs.
mairanteodoro Dec 12, 2024
d277a66
Update step spec.
mairanteodoro Dec 16, 2024
3ba72ff
Add default value for inverse of variance sky if var_sky array is not…
mairanteodoro Dec 16, 2024
5900cf0
Remove unnecessary check.
mairanteodoro Dec 16, 2024
bc62c2b
Prevent zero division issues.
mairanteodoro Dec 17, 2024
5717c3b
Add description of calculation of VAR_SKY.
mairanteodoro Dec 30, 2024
e8925c6
Fix creating base image.
mairanteodoro Jan 10, 2025
7173bf3
Fix missing stacklevel keyword in call to warnings.warn().
mairanteodoro Jan 10, 2025
f850013
Fix unit test by adding non-zero background level.
mairanteodoro Jan 16, 2025
fadc18d
Add var_sky to model error calculation.
mairanteodoro Jan 16, 2025
047350b
Merge branch 'main' into hackday_mteodoro_sky_variance_readnoise
mairanteodoro Jan 29, 2025
c448778
Address review comments.
mairanteodoro Jan 29, 2025
7c3a31c
Fix check style issue.
mairanteodoro Jan 29, 2025
fcb2c17
Improve docs and logs.
mairanteodoro Jan 29, 2025
6507bfd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 29, 2025
332943c
Remove escape characters.
mairanteodoro Jan 30, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changes/1556.resample.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Allow resample to use inverse sky variance when building the drizzle weight map.
15 changes: 13 additions & 2 deletions docs/roman/resample/arguments.rst
Original file line number Diff line number Diff line change
Expand Up @@ -77,10 +77,21 @@ image.
The weighting type for each input image.
If `weight_type=ivm` (the default), the scaling value
will be determined per-pixel using the inverse of the read noise
(VAR_RNOISE) array stored in each input image. If the VAR_RNOISE array does
not exist, the variance is set to 1 for all pixels (equal weighting).
(``VAR_RNOISE``) array stored in each input image. If the ``VAR_RNOISE`` array does
not exist, the weight is set to 1 for all pixels (equal weighting).
If `weight_type=exptime`, the scaling value will be set equal to the
exposure time found in the image header.
If `weight_type=ivsky`, the scaling value will be determined per-pixel
using the inverse of the sky variance (``VAR_SKY``) array calculated in the
resample step for each input image. ``VAR_SKY`` is given by the following equation:

.. math::

\text{VAR_SKY} = \text{VAR_RNOISE} + \text{VAR_POISSON} \, \frac{ med(\text{DATA}) }{ \text{DATA} },

where :math:`\text{DATA}` and :math:`med(\text{DATA})` correspond to the data array and its median, respectively.
If the ``VAR_SKY`` array does not exist (which implies missing ``VAR_RNOISE`` and/or ``VAR_POISSON``),
the weight is set to 1 for all pixels (equal weighting).

``--in_memory`` (bool, default=True)
If set to `False`, write output datamodel to disk.
Expand Down
12 changes: 9 additions & 3 deletions romancal/outlier_detection/tests/test_outlier_detection.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,9 +103,11 @@ def test_outlier_do_detection_write_files_to_custom_location(tmp_path, base_imag
img_1 = base_image()
img_1.meta.filename = "img1_cal.asdf"
img_1.meta.background.level = 0
img_1.data[:] = 0.01
img_2 = base_image()
img_2.meta.filename = "img2_cal.asdf"
img_2.meta.background.level = 0
img_2.data[:] = 0.01
input_models = ModelLibrary([img_1, img_2])

outlier_step = OutlierDetectionStep(
Expand Down Expand Up @@ -135,11 +137,13 @@ def test_find_outliers(tmp_path, base_image, on_disk):
cr_value = 100
source_value = 10
err_value = 10 # snr=1
sky_value = 0.5

imgs = []
for i in range(3):
img = base_image()
img.data[42, 72] = source_value
img.data[:] = sky_value
img.data[42, 72] += source_value
img.err[:] = err_value
img.meta.filename = str(tmp_path / f"img{i}_suffix.asdf")
img.meta.observation.observation_id = str(i)
Expand Down Expand Up @@ -200,14 +204,16 @@ def test_identical_images(tmp_path, base_image, caplog):
"""
Test that OutlierDetection does not flag any outliers in the DQ array if images are identical.
"""
background_level = 0.01
img_1 = base_image()
img_1.meta.filename = "img1_suffix.asdf"
img_1.meta.background.level = 0
img_1.meta.background.level = background_level
# add outliers
img_1_input_coords = np.array(
[(5, 45), (25, 25), (45, 85), (65, 65), (85, 5)], dtype=[("x", int), ("y", int)]
)
img_1.data[img_1_input_coords["x"], img_1_input_coords["y"]] = 100000
img_1.data[:] = background_level
img_1.data[img_1_input_coords["x"], img_1_input_coords["y"]] += 100

img_2 = img_1.copy()
img_2.meta.filename = "img2_suffix.asdf"
Expand Down
6 changes: 4 additions & 2 deletions romancal/regtest/test_resample.py
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@ def test_resample_single_file(rtdata, ignore_asdf_paths):
"var_poisson",
"var_rnoise",
"var_flat",
"var_sky",
]
)
}"""
Expand All @@ -76,6 +77,7 @@ def test_resample_single_file(rtdata, ignore_asdf_paths):
np.sum(~np.isnan(getattr(resample_out, x))) for x in [
"var_poisson",
"var_rnoise",
"var_sky",
]
)
}"""
Expand All @@ -94,14 +96,14 @@ def test_resample_single_file(rtdata, ignore_asdf_paths):
np.isnan(getattr(resample_out, x)),
np.equal(getattr(resample_out, x), 0)
)
) > 0 for x in ["var_poisson", "var_rnoise", "var_flat"]
) > 0 for x in ["var_poisson", "var_rnoise", "var_flat", "var_sky"]
)

}"""
)
assert all(
np.sum(np.isnan(getattr(resample_out, x)))
for x in ["var_poisson", "var_rnoise", "var_flat"]
for x in ["var_poisson", "var_rnoise", "var_flat", "var_sky"]
)

step.log.info(
Expand Down
12 changes: 7 additions & 5 deletions romancal/resample/resample.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ def __init__(
pixfrac=1.0,
kernel="square",
fillval="INDEF",
wht_type="ivm",
weight_type="ivm",
good_bits="0",
pscale_ratio=1.0,
pscale=None,
Expand Down Expand Up @@ -86,13 +86,15 @@ def __init__(
)

self.input_models = input_models
# add sky variance array
resample_utils.add_var_sky_array(self.input_models)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always re-compute var_sky (even if the input contains a var_sky). If that's the goal (always have resample compute var_sky) what about something like:

  • if weight is ivsky compute var_sky for each model
  • when resampling var_sky compute it (if it wasn't computed before)

That way for nonivsky weighting resample won't need to make a separate pass through all the models here (in add_var_sky_array).

self.output_filename = output
self.pscale_ratio = pscale_ratio
self.single = single
self.pixfrac = pixfrac
self.kernel = kernel
self.fillval = fillval
self.weight_type = wht_type
self.weight_type = weight_type
self.good_bits = good_bits
self.in_memory = kwargs.get("in_memory", True)
if "target" in input_models.asn:
Expand Down Expand Up @@ -178,6 +180,8 @@ def __init__(
self.blank_output["individual_image_cal_logs"] = [
model.meta.cal_logs for model in models
]
# add sky variance array
self.blank_output["var_sky"] = np.zeros_like(self.blank_output.var_flat)
for i, m in enumerate(models):
self.input_models.shelve(m, i, modify=False)

Expand Down Expand Up @@ -374,11 +378,11 @@ def resample_many_to_one(self):
self.resample_variance_array("var_rnoise", output_model)
self.resample_variance_array("var_poisson", output_model)
self.resample_variance_array("var_flat", output_model)
self.resample_variance_array("var_sky", output_model)

# Make exposure time image
exptime_tot = self.resample_exposure_time(output_model)

# TODO: fix unit here
output_model.err = np.sqrt(
np.nansum(
[
Expand All @@ -392,7 +396,6 @@ def resample_many_to_one(self):

self.update_exposure_times(output_model, exptime_tot)

# TODO: fix RAD to expect a context image datatype of int32
output_model.context = output_model.context.astype(np.uint32)

return ModelLibrary([output_model])
Expand Down Expand Up @@ -471,7 +474,6 @@ def resample_variance_array(self, name, output_model):

# We now have a sum of the inverse resampled variances. We need the
# inverse of that to get back to units of variance.
# TODO: fix unit here
output_variance = np.reciprocal(inverse_variance_sum)

setattr(output_model, name, output_variance)
Expand Down
2 changes: 1 addition & 1 deletion romancal/resample/resample_step.py
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ class ResampleStep(RomanStep):
pixfrac = float(default=1.0) # change back to None when drizpar reference files are updated
kernel = string(default='square') # change back to None when drizpar reference files are updated
fillval = string(default='INDEF' ) # change back to None when drizpar reference files are updated
weight_type = option('ivm', 'exptime', None, default='ivm') # change back to None when drizpar ref update
weight_type = option('ivm', 'exptime', 'ivsky', None, default='ivm') # change back to None when drizpar ref update
output_shape = int_list(min=2, max=2, default=None) # [x, y] order
crpix = float_list(min=2, max=2, default=None)
crval = float_list(min=2, max=2, default=None)
Expand Down
48 changes: 47 additions & 1 deletion romancal/resample/resample_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
from stcal.alignment.util import wcs_from_footprints

from romancal.assign_wcs.utils import wcs_bbox_from_shape
from romancal.datamodels.library import ModelLibrary

log = logging.getLogger(__name__)
log.setLevel(logging.DEBUG)
Expand Down Expand Up @@ -111,7 +112,7 @@
model : object
The input model.
weight_type : str, optional
The type of weight to use. Allowed values are 'ivm' or 'exptime'.
The type of weight to use. Allowed values are 'ivm', 'exptime', or 'ivsky'.
mairanteodoro marked this conversation as resolved.
Show resolved Hide resolved
Defaults to None.
good_bits : str, optional
The good bits to use for building the mask. Defaults to None.
Expand Down Expand Up @@ -163,6 +164,22 @@
elif weight_type == "exptime":
exptime = model.meta.exposure.exposure_time
result = exptime * dqmask
elif weight_type == "ivsky":
if (
hasattr(model, "var_sky")
and model.var_sky is not None
and model.var_sky.shape == model.data.shape
):
with np.errstate(divide="ignore", invalid="ignore"):
inv_sky_variance = model.var_sky**-1
inv_sky_variance[~np.isfinite(inv_sky_variance)] = 0
else:
warnings.warn(

Check warning on line 177 in romancal/resample/resample_utils.py

View check run for this annotation

Codecov / codecov/patch

romancal/resample/resample_utils.py#L177

Added line #L177 was not covered by tests
"var_sky array is not available. Setting drizzle weight map to 1",
stacklevel=2,
)
inv_sky_variance = 1.0

Check warning on line 181 in romancal/resample/resample_utils.py

View check run for this annotation

Codecov / codecov/patch

romancal/resample/resample_utils.py#L181

Added line #L181 was not covered by tests
result = inv_sky_variance * dqmask
mairanteodoro marked this conversation as resolved.
Show resolved Hide resolved
elif weight_type is None:
result = np.ones(model.data.shape, dtype=model.data.dtype) * dqmask
else:
Expand Down Expand Up @@ -402,3 +419,32 @@
ymax = min(data_shape[0] - 1, int(y2 + 0.5))

return xmin, xmax, ymin, ymax


def add_var_sky_array(input_models: ModelLibrary):
"""
Add sky variance array to each model of a ModelLibrary.

Parameters
----------
input_models : ModelLibrary
A library of models to which the sky variance array will be added.

Returns
-------
None
"""
with input_models:
ref_img = input_models.borrow(index=0)
input_models.shelve(model=ref_img, index=0)
Comment on lines +438 to +439
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ref_img = input_models.borrow(index=0)
input_models.shelve(model=ref_img, index=0)

Is ref_img used?

for i, img in enumerate(input_models):
try:
ok_data = img.data != 0
img["var_sky"] = np.empty_like(img.data)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be simplified by starting with something like:

img["var_sky"] = img.var_rnoise.copy()

since all pixels are either:

  • copied from var_sky if ~(img.data != 0)
  • computed below

img["var_sky"][ok_data] = img.var_rnoise[ok_data] + img.var_poisson[
ok_data
] / img.data[ok_data] * np.median(img.data)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If data contains a nan won't this cause every var_sky pixel (for each index wither data != 0) to be nan?

img["var_sky"][~ok_data] = img.var_rnoise[~ok_data]
except (AttributeError, KeyError, TypeError, ValueError) as e:
raise ValueError("Input model contains invalid data array.") from e

Check warning on line 449 in romancal/resample/resample_utils.py

View check run for this annotation

Codecov / codecov/patch

romancal/resample/resample_utils.py#L448-L449

Added lines #L448 - L449 were not covered by tests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this raises an exception (if for instance var_rnoise isn't available won't this stop resample from proceeding if ivsky is selected and var_rnoise isn't available? That behavior makes sense to me but is at odds with the documentation (which says thew weight will be 1 for all pixels in this case).

Also I suggest checking for var_rnoise and var_poisson instead of using the try/except (since the try/except will also trigger for other reasons.

input_models.shelve(img, i, modify=True)
13 changes: 10 additions & 3 deletions romancal/resample/tests/test_resample.py
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,7 @@ def create_image(self):
},
)
# data from WFISim simulation of SCA #01
l2.data[:] = 0.01
l2.meta.filename = self.filename
l2.meta["wcs"] = create_wcs_object_without_distortion(
fiducial_world=self.fiducial_world,
Expand Down Expand Up @@ -305,7 +306,7 @@ def test_resampledata_init(exposure_1):
pixfrac=pixfrac,
kernel=kernel,
fillval=fillval,
wht_type=wht_type,
weight_type=wht_type,
good_bits=good_bits,
pscale_ratio=pscale_ratio,
pscale=pscale,
Expand Down Expand Up @@ -694,15 +695,19 @@ def get_footprint(model, index):
)


@pytest.mark.parametrize("weight_type", ["ivm", "exptime"])
@pytest.mark.parametrize("weight_type", ["ivm", "exptime", "ivsky"])
def test_resampledata_do_drizzle_default_single_exposure_weight_array(
exposure_1,
weight_type,
):
"""Test that resample methods return non-empty weight arrays."""

# adding a few zero flux pixels
for i, model in enumerate(exposure_1):
model.data[10 + i, 40 - i] = 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you help me understand how setting pixels offset by the index of the exposure in exposure_1 results in the resampled pixel at [10, 40] being equal to 0? Is each exposure offset by 1 pixel in X and 1 in Y?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added a random pixel with zero flux. I was not interested in the position of the pixel, just that we did have a pixel with zero flux.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused by this. What's the need for a random zero flux pixel per image? If I comment out this code the test doesn't fail.


input_models = ModelLibrary(exposure_1)
resample_data = ResampleData(input_models, wht_type=weight_type)
resample_data = ResampleData(input_models, weight_type=weight_type)

output_models_many_to_one = resample_data.resample_many_to_one()
output_models_many_to_many = resample_data.resample_many_to_many()
Expand All @@ -712,6 +717,8 @@ def test_resampledata_do_drizzle_default_single_exposure_weight_array(
many_to_one_model = output_models_many_to_one.borrow(0)
assert np.any(many_to_one_model.weight > 0)
assert np.any(many_to_many_model.weight > 0)
assert many_to_one_model.data[10, 40] == 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this line testing? The referenced pixel at 10, 40 (seen as the single highlighted pixel in the upper left) falls in the "fill" region of the output image.
Screenshot 2025-02-03 at 1 47 59 PM

assert many_to_many_model.data[10, 40] == 0
output_models_many_to_many.shelve(many_to_many_model, 0, modify=False)
output_models_many_to_one.shelve(many_to_one_model, 0, modify=False)

Expand Down
Loading