Issues/25 - Adapt sidecar to snappl infrastructure including save to DB. #46

wmwv · 2025-12-18T22:37:53Z

Now uses snappl for image searching, finding, and saving of DB candidates.

Saving of database candidates is a separate script as an afterburner to the pipeline.

Closes #25

The config object isn't used yet so it's not yet processed.


rknop · 2026-02-05T20:47:11Z

config/sidecar_config.yaml

+      diaobject_provenance_tag: "nov2025_test3"
+      diaobject_process: "sidecar"
+
+    paths:


Ideally, paths should not be under photometry. If you look at nov2025_container_config.yaml, it already has system.paths.images and system.paths.temp_dir; your code should just use those, and have no paths here.

Is dia_out_dir something that we're going to want to save to the database? Is it just temporary stuff? Or something in between?

Something in between. In production, dia_out_dir is a tempdir. In development, it's helpful to have a persistent temporary directory.

Right, dia_out_dir is not something that we want to save to the database.
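One way to reconcile the two uses is to keep a configured directory as persistent and fall back to a throwaway temporary directory otherwise. The following is a minimal sketch, not sidecar's code; `resolve_dia_out_dir` and its argument are hypothetical names, and where the configured value comes from (for example system.paths.temp_dir) is left to the caller.

```python
import tempfile
from pathlib import Path


def resolve_dia_out_dir(configured_dir=None):
    """Return (path, is_temporary) for the DIA output directory.

    Hypothetical helper: a configured directory is created if needed and
    kept after the run (development); with no configured directory a
    fresh temporary directory is created instead (production).
    """
    if configured_dir is not None:
        path = Path(configured_dir)
        path.mkdir(parents=True, exist_ok=True)
        return path, False
    return Path(tempfile.mkdtemp(prefix="dia_out_")), True
```

The boolean lets the caller decide whether to clean up at the end of a run; nothing in the directory is meant to be saved to the database.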

rknop · 2026-02-05T20:51:44Z

sidecar/database.py

+    try:
+        diaobj.save_object()
+    except RuntimeError as e:
+        SNLogger.info(e)


Do you really want to just give an info (not even an error) here?

Note that DiaObject.save_object will not save a new object if there exists an object of the given provenance within 1" (by default).

WAIT. Maybe it doesn't. Maybe it returns an error right now. Is that why you're just moving on when you get exceptions here? If so, we should put an issue into snappl to add an exists_ok flag or some such to DiaObject.save_object().

(What I'm worried about is some exception that comes from something other than the object already existing.)

Yes, when a candidate already exists in the database, you get a generic RuntimeError:

RuntimeError: Error response from server: Error, there already exists a provenance for tag nov2025_test4 and process sidecar

so I have to catch that broad error.

I share your worry about hiding some other error and would love to have a more specific error class returned. I have created a snappl issue:
Roman-Supernova-PIT/snappl#168
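Until a more specific error class exists, one interim option is to inspect the message and re-raise anything that does not look like the "already exists" case. This is only a sketch: the marker string is an assumption taken from the error text quoted above, `save_ignoring_existing` is a hypothetical helper, and the standard `logging` module stands in for SNLogger. Matching on message text is brittle and would break if the server wording changes.

```python
import logging

logger = logging.getLogger("sidecar")  # stand-in for SNLogger

# Assumed marker, copied from the server error text quoted in this thread.
ALREADY_EXISTS_MARKER = "there already exists"


def save_ignoring_existing(diaobj):
    """Save diaobj; return True if saved, False if it already existed.

    A RuntimeError whose message lacks the marker is re-raised, so that
    unrelated failures are not silently logged and skipped.
    """
    try:
        diaobj.save_object()
    except RuntimeError as e:
        if ALREADY_EXISTS_MARKER not in str(e):
            raise
        logger.warning("Skipping existing object: %s", e)
        return False
    return True
```

Logging at warning level rather than info keeps skipped duplicates visible without stopping the rest of the list.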

rknop · 2026-02-05T20:52:59Z

sidecar/pipeline.py

-        SIMS_DIR
-        + "/RomanTDS/images/simple_model/{band}/{pointing}/Roman_TDS_simple_model_{band}_{pointing}_{sca}.fits.gz"
-    )
+    BASE_PATH = "/global/cfs/cdirs/lsst/shared/external/roman-desc-sims/Roman_data/" "RomanTDS/images/simple_model/"


What is BASE_PATH used for? Obviously this is something that will need to go away and be replaced either by something that comes out of the config, or replaced by nothing because the database interface handles it in the background.

Same comment for INPUT_TRUTH_PATTERN below. Truth handling is a bit thorny.

Also, above, the environment variable SIMS_DIR -- what is that used for? That's also something that should go away, and should be handled by snappl in the background.

BASE_PATH is unused.

I would very much like to remove all of this separate truth-file handling in here, to make it clearer which parts are the fundamental subtraction and candidate identification and which parts are just helpful right now for calculating efficiencies and purity.

The capability that I don't know how to do in snappl is to read the specific realized flux truth catalogs for each image for any transient or variable. These are the ".txt" files in the "RomanTDS/truth/..." hierarchy.

My understanding is that snappl will currently return the SNANA ideal truth lightcurve, but not the realized fluxes at the individual image catalog level. If I am wrong, could you point me to the relevant snappl function?

Another confusing thing I'm doing with the truth information is using the truth information for stars to reject known stars. We will have a star catalog available, so this is conceptually fair, but using the simulation truth files is not quite the representative thing to do. So we need a way to get a star catalog (in snappl?). This is actually easier for real data because we could always just send an API call to the online Gaia service.

Can you add an issue to snappl to add a star catalog interface, and implement Gaia? I've done that before, so that's fast for me to implement.

Of course, Gaia doesn't go very deep, so it'll only get the brighter stars. Do you know of existing star catalogs out there that we might use? Legacy survey? I've got a thing set up for LS4 that I could put in pretty quickly as well.

Done, and a good idea.
Roman-Supernova-PIT/snappl#169

There is still (a) a need to use simulated star catalogs for OpenUniverse 2024 and other simulated surveys; and (b) wanting to have both nominal infinite SNR flux and realized flux for testing with simulations and source injection. So some of this awkwardness here will remain for a bit.

Although, maybe we should put this in snappl too. Getting the realized flux image-per-image is useful for all of the photometry routines for checking.

Roman-Supernova-PIT/snappl#170
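To make the star catalog idea concrete, here is a rough sketch of what a backend-agnostic interface could look like, with a trivial in-memory backend that could be filled from simulation truth files. Everything here is hypothetical (`StarCatalog`, `cone_search`, `InMemoryStarCatalog` are not snappl names), and a Gaia or Legacy Survey backend would be another subclass that is not shown.

```python
import math
from abc import ABC, abstractmethod


class StarCatalog(ABC):
    """Hypothetical interface; not part of snappl."""

    @abstractmethod
    def cone_search(self, ra, dec, radius_arcsec):
        """Return a list of (ra, dec) in degrees within radius_arcsec of (ra, dec)."""


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation from the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


class InMemoryStarCatalog(StarCatalog):
    """Backed by a list of (ra, dec) tuples, e.g. read from simulated truth."""

    def __init__(self, stars):
        self.stars = list(stars)

    def cone_search(self, ra, dec, radius_arcsec):
        return [s for s in self.stars
                if angular_separation_arcsec(ra, dec, s[0], s[1]) <= radius_arcsec]
```

With this shape, the star-rejection code would only see `cone_search`, so simulated and real catalogs could be swapped without touching the pipeline.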

rknop · 2026-02-05T20:54:50Z

sidecar/pipeline.py

-    args = parser.parse_args()
+    parser.add_argument("-t", "--temp-dir", type=str, default=None, help="Temporary directory.")
+    parser.add_argument("-o", "--output-dir", type=str, default=None, help="Output path")
+


How do you pass in the provenance tag and process for the saved DiaObjects? Is that hardcoded somewhere? I don't see a command-line option for it.

....I see the arguments below in save_candidates_to_database.py.

What is the pattern for running sidecar? It looks like you have more than one thing with a main. Does it read its own output files or some such?

(Coming into this, knowing nothing about sidecar, I would have guessed that there was just one executable, and saving candidates to the database was just another step that would be done after image subtraction and residual detection.)

Here's an example of how I ran save_candidates_to_database.py once all of the subtractions were done.

python sidecar/save_candidates_to_database.py \
    --image-collection snpitdb --image-provenance-tag ou2024 --image-process load_ou2024_image \
    --diaobject-provenance-tag nov2025_test4 --diaobject-process sidecar \
    --threshold 200 \
    --threshold-column peak_value \
    --data-records run_scripts/nov_R_images.csv \
    --output-dir /dia_out_dir

I just made up a provenance tag. I think this diaobject-provenance-tag connects to the other question you asked about the orchestration for what triggers sidecar.

But, yes, I agree now. I will keep save_candidates_to_database.py as a (clearly documented) alternative to load a bunch of candidates for testing.

But the standard call to load candidates into the database will be in sidecar/pipeline.py, with an option to control it.
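A sketch of what such an option could look like on the pipeline's command line follows. The `--save-to-db` flag is a hypothetical name, not an existing sidecar option; the two diaobject arguments mirror those shown for save_candidates_to_database.py above, and `--output-dir` mirrors the diff.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of pipeline options.")
    parser.add_argument("-o", "--output-dir", type=str, default=None, help="Output path")
    # Hypothetical flag: off by default so reruns never touch the database.
    parser.add_argument("--save-to-db", action="store_true",
                        help="Save candidates to the database after detection.")
    parser.add_argument("--diaobject-provenance-tag", type=str, default=None)
    parser.add_argument("--diaobject-process", type=str, default="sidecar")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Refuse to write to the database without an explicit provenance tag.
    if args.save_to_db and args.diaobject_provenance_tag is None:
        parser.error("--save-to-db requires --diaobject-provenance-tag")
    return args
```

Keeping the flag off by default preserves the current behavior where the pixel-level steps can be rerun freely without changing long-term state.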

rknop · 2026-02-05T20:57:11Z

sidecar/pipeline.py

+        if "template_pointing" not in data_records.columns:
+            data_records = find_templates_for_pointings(
+                image_collection=image_collection,
+                science_pointing=data_records["science_pointing"],


Looking ahead -- in the database, pointing is being replaced with observation_id, and will be a string instead of an int.

It would probably be worth at some point thinking about the tools that decide what to subtract. Do you maintain the things-to-subtract data records by hand right now?

About this whole section -- I haven't paid close enough attention to how sidecar works in general. I would have to figure out how all of this works in the "manually given images" vs. "images from the database" case.

rknop · 2026-02-05T21:01:14Z

sidecar/save_candidates_to_database.py

+       help="Specify directory for output products."
+    )
+    parser.add_argument(
+        "--threshold",


Things like threshold ought to be something that goes into the provenance. Consider (eventually) putting this in the config.
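To illustrate why the threshold belongs with the provenance: if the parameters of a run are hashed into an identifier, two runs that differ only in threshold become distinguishable. The helper below is a generic illustration with hypothetical names; it is not snappl's provenance API and makes no claim about how snappl computes provenance ids.

```python
import hashlib
import json


def params_fingerprint(process, params):
    """Hypothetical helper: stable hash of a process name and its parameters.

    Keys are sorted before hashing so that dictionary ordering does not
    change the result.
    """
    payload = json.dumps({"process": process, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

For example, `params_fingerprint("sidecar", {"threshold": 200, "threshold_column": "peak_value"})` differs from the same call with `"threshold": 100`, which is the distinction that is lost when the threshold lives only on the command line.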

rknop · 2026-02-05T21:02:34Z

sidecar/subtraction.py

-    psf = config.getPSF_Image(size, x, y, **kwargs)
-    psf.write(str(psf_path))
+def get_imsim_psf(x, y, pointing, sca, band, psf_type="ou24PSF", **kwargs):
+    """Return PSF for image as a 2D numpy array.  Will be of size of PSF model."""


Thought is going to be required both here and in snappl to figure out how to get the right PSFs for images.

Ideally, each image subclass should have a way of telling you what kind of PSF to get. We might still want to be able to override that, at least for tests, but there should be defaults in there.

I don't think snappl supports this kind of thing yet.

In any event, I would rename this function, because it's not always going to get an imsim PSF any more.
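A minimal sketch of the "default per image subclass, with an override" idea is below. The classes are hypothetical stand-ins, not snappl's Image hierarchy; only the "ou24PSF" value comes from the diff above.

```python
class BaseImage:
    """Hypothetical stand-in for an image class; not snappl's Image."""

    default_psf_type = None  # subclasses declare their natural PSF type

    def psf_type(self, override=None):
        """Return the override if given, else the subclass default."""
        chosen = override if override is not None else self.default_psf_type
        if chosen is None:
            raise ValueError(f"{type(self).__name__} defines no default PSF type")
        return chosen


class OpenUniverse2024Image(BaseImage):
    default_psf_type = "ou24PSF"  # value taken from the diff above
```

A renamed function such as a hypothetical `get_psf(image, psf_type=None)` could then call `image.psf_type(psf_type)` instead of hardcoding an imsim default, while tests could still pass an explicit type.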

rknop · 2026-02-05T21:04:38Z

sidecar/util.py

+    )
+
+    image_list = image_collection.find_images(ra=ra, dec=dec, dbclient=dbclient)
+    entries = [(im.pointing, im.band, im.sca, im.exptime, im.mjd) for im in image_list]


Note that as of snappl 0.38, pointing is gone and replaced by the string observation_id

(You need snappl 0.39 for the database with ASDF files in it that I put up.)
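During the transition, a small shim could let code such as the list comprehension above work with either attribute. This is a sketch under the assumption that image objects expose `observation_id` in newer snappl and `pointing` in older versions, as described in this thread; `observation_key` is a hypothetical helper name.

```python
def observation_key(image):
    """Return the observation identifier of an image as a string.

    Prefers `observation_id` and falls back to the older integer
    `pointing`, converting it to a string so downstream code sees a
    single type.
    """
    obs = getattr(image, "observation_id", None)
    if obs is None:
        obs = getattr(image, "pointing", None)
    if obs is None:
        raise AttributeError("image has neither observation_id nor pointing")
    return str(obs)
```

Treating the identifier as a string everywhere also avoids surprises when data records are read back from CSV files.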

pyproject.toml

wmwv · 2026-02-05T22:41:14Z

Thanks for taking a look. Half the comments I can address easily and will do so.

wmwv · 2026-02-05T22:45:59Z

The other half touch upon some development vs. production tension here.

I agree with the long-term in-production goals. I'm afraid that requiring every run to go through all the steps, including saving to the database, will be a real hassle and an impediment to fast-turnaround development. E.g., as we test DIASource saving, I don't want to redo the pixel analysis every time, so I want to make it easy to run just the last "save-to-database" step, even though that does make it easy to get the provenance wrong. I would very much welcome suggestions for how to make it easier to run just a subset of steps, or just the last step, in some easy and consistent way.

Right now it's a two-step process because all the things before the database can be run, re-run, tested without changing long-term state. Re-running overwrites previous images and catalogs, which is what I expect to happen. Once we start interacting with the database, though, a second run will not overwrite, it will fail.

Things being a two-step process is one part of why it's helpful for dia_out_dir to be defined and persistent.

We're honestly still at the point where we need to do debugging and looking at pixels to check performance of subtraction and real-bogus.
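One possible shape for "run just a subset of steps" is a single entry point with an ordered list of named steps and a way to select among them. The sketch below is hypothetical (`run_steps` and the step names in the usage note are not sidecar code) and leaves open how each step finds the outputs of earlier ones, for example through a persistent dia_out_dir.

```python
def run_steps(steps, requested=None):
    """Run named steps in their defined order and return the names that ran.

    steps: ordered list of (name, callable) pairs.
    requested: iterable of step names to run, or None to run all of them.
    """
    names = [name for name, _ in steps]
    wanted = set(names if requested is None else requested)
    unknown = wanted - set(names)
    if unknown:
        raise ValueError(f"Unknown steps: {sorted(unknown)}")
    ran = []
    for name, func in steps:
        if name in wanted:
            func()
            ran.append(name)
    return ran
```

With steps such as `[("subtract", ...), ("detect", ...), ("save_to_db", ...)]`, calling `run_steps(steps, requested=["save_to_db"])` would rerun only the database step, while the default runs everything in order.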

wmwv requested a review from a team as a code owner December 18, 2025 22:37

github-actions bot added documentation Improvements or additions to documentation installation testing labels Dec 18, 2025

wmwv added 26 commits February 5, 2026 11:21

Add outline of Config loading.

b70ab8d

The config object isn't used yet so it's not yet processed.

Use SNLogger instead of print.

5b92563

Add snappl.psf.

9ab14ed

Add ImageCollection.

a184e58

Add missing image_collection arg

48729e3

First steps toward using new ImageCollection.

b25b59b

Remove obsolete snpit_utils package.

5d6a777

Take first steps toward using snappl ImageCollection

083cc97

Make test_ra_dec_query work.

759d0ac

Made get_center_and_corners work.

b1e215b

Pass collection for get_templates

2e1ed5b

Update partial get_image_info_for_ra_dec.

213d538

Fix get_templates_for_points test to correct size of image.

e6d7282

Fixed get_earliest_template. filter->band everywhere.

dd847a7

Remove unused base_image_location kwargs from make_data_records_...

7bcdd67

Make all test_query pass

0d80f3e

Use data_records object instead of path.

3022a5d

Adding provenance_tag, process.

0f9e569

Black reformat tests.

6ca40a8

Black reformat sidecar package.

7affbe8

Generalize image_collection calling.

7a55a51

Add test to get image and get PSF.

379adab

Explicitly calculate skysub, detmask, psf file names. No ImageInfo.

0872f39

Add nov 2025 example to README.md

d9fbdc2

Clean up getting a PSF and testing PSF.

baade1d

Start test DIAObject saving.

49d8e42

wmwv added 20 commits February 5, 2026 11:21

Test that input CSV files are read properly.

1dfd6ce

Update to R-band slightly better cleaned score test catalog to test DB saving.

3b38efb

Black format test python files.

841ca71

Find templates for csv file with just science pointings.

c6e4fc6

Bug fix and test finding templates for science images.

6fbb096

Create simple index for find_template dataframe.

fec0a3a

Allow either pointing, sca, band or science_{prointing,sca,band}.

c96ae9b

Start new script to add catalog of detections to database.

e677df9

Minor linter cleanup.

6e366af

Full draft of save_candidates_to_database

d78efc1

Add filter option for threshold.

8df2cae

Pass through threshold, threshold_column args.

05783dd

Skip subtractions where templates can't be found.

ef2ae20

Handle case where there are no unmatched transients.

10a0fe2

Skip saving catalog to DIA object test by default because we don't have a test DB.

b869916

Add 'debug' and 'regular' queue examples of SLURM scripts and input lists.

b385372

Add towncrier entry.

aebf934

Add data_records option to save_candidates_to_database script

726beb7

Catch errors in saving duplicate candidates and carry on to rest of list.

fc215a5

Skip empty dataframes in matching to avoid concatenating empty data frames.

96309ae

wmwv force-pushed the issues/25 branch from 6568f1b to 96309ae Compare February 5, 2026 19:27

Satisfy linter.

304194a

rknop reviewed Feb 5, 2026


Bound snappl to 'pointing' version.

3af876f

wmwv mentioned this pull request Feb 5, 2026

If DIAObject exists, don't throw a RuntimeError Roman-Supernova-PIT/snappl#168

Open

Remove sidecar-specific path config.

cdfa3a4

wmwv force-pushed the issues/25 branch from 6181c0a to cdfa3a4 Compare February 6, 2026 03:10

Pull truth files relative to system.ou24.tds_base, or None.

c706e1b
