1029 create custom handling for hi goodtimes jobs #1034
subagonsouth wants to merge 15 commits into IMAP-Science-Operations-Center:dev from
Conversation
Pull request overview
This PR implements custom handling for Hi Goodtimes jobs to support multi-repoint dependencies. Hi Goodtimes ancillary processing requires data from multiple repoints (past and future), unlike typical jobs that only need data from a single repoint. The implementation allows the repoint parameter to accept either a single integer or a list of integers in dependency query functions, and adds special logic to expand repoint ranges for Hi Goodtimes jobs both when triggering jobs and when retrieving dependencies.
Key changes:
- Extended `get_upstream_dependency_inputs` and `get_files` functions to accept a list of repoint numbers
- Added Hi Goodtimes-specific logic to expand single repoints into ranges when querying dependencies
- Implemented multi-repoint job triggering when Hi L1B DE files arrive
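The int-or-list `repoint` parameter described above can be sketched as a small normalization helper, so query functions can always apply an `IN` filter. This is an illustrative sketch; `normalize_repoints` is a hypothetical name, not part of the actual SDSCode API:

```python
from typing import Optional, Union


def normalize_repoints(
    repoint: Union[int, list[int], None],
) -> Optional[list[int]]:
    """Return the repoint argument as a list (or None) so downstream
    queries can treat single-repoint and multi-repoint calls uniformly."""
    if repoint is None:
        return None
    if isinstance(repoint, int):
        return [repoint]
    return list(repoint)
```

A caller passing `repoint=5` and a caller passing `repoint=[1, 2]` then hit the same query path.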
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| sds_data_manager/lambda_code/SDSCode/pipeline_lambdas/dependency.py | Added configuration constants for Hi Goodtimes repoint ranges; modified get_upstream_dependency_inputs, get_files, and get_jobs to support list of repoints; added special handling to expand repoint ranges for Hi Goodtimes jobs |
| sds_data_manager/lambda_code/SDSCode/pipeline_lambdas/batch_starter.py | Refactored DependencyConfig import; added special handling in s3_processing_event to trigger multiple Hi Goodtimes jobs across a range of repoints when a single Hi L1B DE file arrives |
| tests/lambda_endpoints/test_dependency_api.py | Added fixture for Hi L1B DE files across multiple repoints; added tests for querying files with single and multiple repoints; added test for Hi Goodtimes multi-repoint dependency retrieval |
| tests/lambda_endpoints/test_batch_starter.py | Added test to verify Hi Goodtimes jobs are triggered for multiple repoints when a single Hi L1B DE file arrives |
```python
def test_hi_goodtimes_multi_repoint_trigger(
    mock_submit_all_jobs,
    mock_get_dependencies,
    session,
    s3_client,
    monkeypatch,
):
    """Test Hi Goodtimes multi-repoint trigger logic in s3_processing_event.

    When a Hi L1B DE file with repoint N arrives, it should trigger goodtimes
    jobs for repoints [N-M, N+M] where M is determined by the configuration.
    """
    # Monkeypatch configuration values for testing
    monkeypatch.setattr(dependency, "HI_GOODTIMES_NUM_PAST_REPOINTS", 1)
    monkeypatch.setattr(dependency, "HI_GOODTIMES_NUM_FUTURE_REPOINTS", 2)

    mock_get_dependencies.side_effect = [
        [
            {
                "data_source": "hi",
                "data_type": "ancillary",
                "descriptor": "45sensor-goodtimes",
                "relationship": "UPSTREAM",
            }
        ],
        [],
    ]

    # L1B DE and spice files are added to the DB table by the
    # hi_l1b_de_repoint_files fixture
```
The test function is missing the hi_l1b_de_repoint_files fixture parameter. The comment on line 2332 states that "L1B DE and spice files are added to the DB table by the hi_l1b_de_repoint_files fixture", but the fixture is not included in the function parameters. This means the Hi L1B DE files will not be present in the database when the test runs, causing the test to not properly validate the multi-repoint trigger logic. Add hi_l1b_de_repoint_files to the function parameters after monkeypatch.
```python
):
    repoint_param = list(
        range(
            repoint - HI_GOODTIMES_NUM_PAST_REPOINTS,
```
The range calculation can produce invalid negative or zero repoint numbers. When repoint - HI_GOODTIMES_NUM_PAST_REPOINTS is less than 1 (e.g., repoint=1 and NUM_PAST=1), the range will start at 0, which may be invalid if repoint numbers start at 1. Add validation to ensure the range starts at a valid repoint number, such as using max(1, repoint - HI_GOODTIMES_NUM_PAST_REPOINTS) to prevent querying for invalid repoints.
Suggested change:

```diff
-            repoint - HI_GOODTIMES_NUM_PAST_REPOINTS,
+            max(1, repoint - HI_GOODTIMES_NUM_PAST_REPOINTS),
```
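The suggested clamping can be sketched as a standalone helper. The constant values below (3 past, 3 future, giving the N=7 total mentioned in the change summary) are assumptions for illustration, and `expand_repoint_range` is a hypothetical name, not the actual implementation:

```python
# Assumed values for illustration; the real constants live in dependency.py.
HI_GOODTIMES_NUM_PAST_REPOINTS = 3
HI_GOODTIMES_NUM_FUTURE_REPOINTS = 3


def expand_repoint_range(repoint: int) -> list[int]:
    """Expand a single target repoint into the Hi Goodtimes query range,
    clamped so the range never includes repoint numbers below 1."""
    start = max(1, repoint - HI_GOODTIMES_NUM_PAST_REPOINTS)
    end = repoint + HI_GOODTIMES_NUM_FUTURE_REPOINTS
    return list(range(start, end + 1))
```

Near the start of the mission the window is simply truncated, e.g. `expand_repoint_range(1)` yields `[1, 2, 3, 4]` instead of dipping into invalid repoints.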
sds_data_manager/lambda_code/SDSCode/pipeline_lambdas/batch_starter.py (outdated, resolved)
```python
            target_repoint = call_args.args[4]
            repoints_submitted.add(target_repoint)

    # Verify we submitted for repoints 1, 2, 3
```
The comment says "Verify we submitted for repoints 1, 2, 3" but should include repoint 4 to match the assertion on line 2425.
Suggested change:

```diff
-    # Verify we submitted for repoints 1, 2, 3
+    # Verify we submitted for repoints 1, 2, 3, 4
```
```python
    lambda_handler(events, context)

    # Process the event
    # lambda_handler(events, {})
```
Remove the commented-out code. The lambda_handler is already called on line 2397, making this line redundant.
Suggested change:

```diff
-    # lambda_handler(events, {})
```
```python
    # the correct way to calculate the repoint affected by the new file
    # that triggered here.
    for target_repoint in range(
        repoint - dependency.HI_GOODTIMES_NUM_FUTURE_REPOINTS,
```
The range calculation can produce invalid negative or zero repoint numbers. When repoint - HI_GOODTIMES_NUM_FUTURE_REPOINTS is less than 1 (e.g., repoint=1 and NUM_FUTURE=2), the range will include repoint 0 or negative numbers, which are likely invalid. Add validation to ensure target_repoint values are >= 1, or use max(1, repoint - HI_GOODTIMES_NUM_FUTURE_REPOINTS) as the range start to prevent generating jobs for invalid repoints.
Suggested change:

```diff
-        repoint - dependency.HI_GOODTIMES_NUM_FUTURE_REPOINTS,
+        max(1, repoint - dependency.HI_GOODTIMES_NUM_FUTURE_REPOINTS),
```
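On the trigger side the window inverts: a job at target repoint T needs files in [T - NUM_PAST, T + NUM_FUTURE], so a newly arrived file at repoint N serves jobs with T in [N - NUM_FUTURE, N + NUM_PAST], which is why this loop starts at `repoint - NUM_FUTURE`. A minimal sketch with the suggested clamping applied; the function name is hypothetical:

```python
def goodtimes_target_repoints(
    repoint: int, num_past: int, num_future: int
) -> list[int]:
    """Target repoints whose Hi Goodtimes jobs would consume a file
    arriving at `repoint`, clamped to valid repoint numbers (>= 1)."""
    # A file at repoint N is a "future" input for jobs down to N - num_future
    # and a "past" input for jobs up to N + num_past.
    start = max(1, repoint - num_future)
    return list(range(start, repoint + num_past + 1))
```

With the monkeypatched test configuration (`num_past=1`, `num_future=2`), a file at repoint 3 fans out to targets 1 through 4, matching the four `submit_all_jobs` calls the test asserts.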
```python
        if call_args[0][1]["descriptor"] == "45sensor-goodtimes"
    ]

    # Should have 3 calls for repoints 1, 2, 3
```
The comment says "Should have 3 calls" but the assertion checks for 4 calls. Update the comment to match the assertion.
Suggested change:

```diff
-    # Should have 3 calls for repoints 1, 2, 3
+    # Should have 4 calls for repoints 1, 2, 3, 4
```
lacoak21 left a comment
Looks good! Just a couple of questions related to missing repointing data.
```python
        }
    ],
)
def test_get_jobs_hi_goodtimes_multi_repoint(
```
What would happen if you called it like

```python
science_files = get_files(
    session,
    dependency=dep,
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 2),
    repoint=[1, 2],
)
```

and there was only data for repoint 1?
```python
    # Special handling for Hi Goodtimes - needs L1B DE from multiple repoints
    # Pass a list of repoints instead of a single repoint
    repoint_param = repoint
    if (
```
This is a duplicate if-statement from batch starter, is it possible to combine the checks into one place?
```python
    submit_all_jobs(
        session,
        job,
        trigger_start_time,
```
Do the start and end times need to be changed to match the repoint?
Not here. This is just submitting a potential job for each of the repoints that the trigger file would be used in. It is in the dependency code where the start/end time need to get updated when gathering upstream dependencies for a job.
Force-pushed from c52018c to dc8f6cc
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
sds_data_manager/lambda_code/SDSCode/pipeline_lambdas/dependency.py:1414
- Docstring says this function “replaces the L1B DE files with files from the extended repoint range”, but the implementation currently extends the existing L1B DE input (adds nearest files to the existing lists). Consider updating the docstring wording to reflect the actual behavior (extend/augment rather than replace) to avoid confusion for future maintainers.
> Hi Goodtimes jobs require L1B DE data from N repoints total (target plus
> N-1 nearest). This function takes an existing ProcessingInputCollection
> (from get_upstream_dependency_inputs) and replaces the L1B DE files with
> files from the extended repoint range.
sds_data_manager/lambda_code/SDSCode/pipeline_lambdas/dependency.py (outdated, resolved)
sds_data_manager/lambda_code/SDSCode/pipeline_lambdas/dependency.py (outdated, resolved)
```python
        trigger_is_hi_l1b_de
        and repoint is not None
        and job["data_source"] == "hi"
        and job["data_type"] == "l1c"
```
I think we decided that the ENA goodtimes would be l1b products.
Suggested change:

```diff
-        and job["data_type"] == "l1c"
+        and job["data_type"] == "l1b"
```
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
sds_data_manager/lambda_code/SDSCode/pipeline_lambdas/dependency.py (outdated, resolved)
```python
    # Special handling for Hi Goodtimes - extend L1B DE to N nearest repoints
    if (
        data_type == "l1c"
```
We decided to make Goodtimes L1B
Suggested change:

```diff
-        data_type == "l1c"
+        data_type == "l1b"
```
```python
        f"Submitting Hi Goodtimes job for repoint {target_repoint} "
        f"(triggered by repoint {repoint} file)"
    )
    submit_all_jobs(
```
I think this answers my question from yesterday.
- Reprocess L1B DE specifically -> s3_processing_event fires and fan-out happens normally.
- Reprocess Goodtimes directly goes straight to submit_all_jobs, bypasses s3_processing_event entirely -> no fan-out.
lacoak21 left a comment
This looks great Tim. Have you tested in dev out of curiosity?
```python
    filter_dependencies,

    # Check if trigger file is Hi L1B DE
    trigger_is_hi_l1b_de = (
```
Is L1B DE the only dependency? I'm just wondering whether you needed to handle a case where you had to reprocess Goodtimes because a SPICE kernel got updated.
```python
    return records


def get_n_nearest_files_by_repoint(
```
@subagonsouth I reviewed part of this but had to stop here. I will resume in the morning.
```python
        repoint,
        skip_if_inprogress=True,
    )
    if l1b_de_records is None:
```
Uh oh, I think you are right that other jobs don't check for upstream jobs in progress. The CRID check only handles it when an upstream-upstream job is still processing. Wow, how could I have missed this :(. I'll make sure to keep this in mind when reviewing Tenzin's refactor.
```python
        return []

    distances = np.abs(other_repoints - repoint)
    sort_indices = np.lexsort((other_repoints, distances))
```
Whoa np.lexsort is so useful.
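For reference, the two-key sort above can be packaged as a small standalone helper. `np.lexsort` sorts by its *last* key first, so distance is the primary key and the repoint number breaks ties. This is a sketch that assumes ties favor the smaller repoint number, as the snippet implies; the helper name and return type are illustrative, not the actual `get_n_nearest_files_by_repoint` signature:

```python
import numpy as np


def n_nearest_repoints(other_repoints, repoint, n):
    """Return the n repoint numbers nearest to `repoint`.

    Primary sort key: absolute distance from `repoint`.
    Tie-break: smaller repoint number first (lexsort's earlier keys
    break ties left by later, higher-priority keys).
    """
    other_repoints = np.asarray(other_repoints)
    distances = np.abs(other_repoints - repoint)
    # np.lexsort sorts by the last key (distances) first, then by
    # other_repoints to resolve equal distances deterministically.
    sort_indices = np.lexsort((other_repoints, distances))
    return other_repoints[sort_indices[:n]].tolist()
```

For example, with candidates `[1, 2, 4, 5, 6]` and target repoint 4, the three nearest are 4 (distance 0), 5 (distance 1), and then 2 (distance 2, winning the tie with 6).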
```python
    if upstream_dependencies_output is None:
        logger.info(
            f"No dependencies found for {start_date=} - {end_date=}: {dependencies}"
    # Use a single session for all database operations
```
Change Summary
Overview
This PR implements custom handling for Hi Goodtimes jobs, which require L1B DE data from multiple repoints (N=7 total, including the target repoint).
Key Changes
New Configuration:
New Functions:
Modified Functions:
Hi Goodtimes Job Logic:
Test Coverage
Closes: #1029