File: Surya/downstream_examples/solar_flare_forcasting/dataset.py
Location: Method _get_index_data
Issue 1 — Incorrect label lookup
In _get_index_data, the label is currently retrieved using:
data["label"] = self.index.loc[reference_timestamp, "label_max"]
This line should be corrected to:
data["label"] = self.flare_index.loc[reference_timestamp, "label_max"]
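A minimal sketch of why the lookup has to go through the flare index, assuming the base index carries no label_max column (as would be the case once the join described in Issue 2 is removed). The frames, file names, and timestamps below are hypothetical stand-ins, not the actual Surya data:

```python
import pandas as pd

# Hypothetical base index: one SDO file per hour, no label column.
times = pd.date_range("2014-01-01 00:00", periods=4, freq="60min")
index = pd.DataFrame({"path": [f"sdo_{i}.nc" for i in range(4)]}, index=times)

# Hypothetical flare index: labels exist only for some reference timestamps.
flare_index = pd.DataFrame({"label_max": [0, 1]}, index=times[[1, 3]])

reference_timestamp = times[1]

# Corrected lookup: read the label from the flare index.
data = {}
data["label"] = flare_index.loc[reference_timestamp, "label_max"]

# Original lookup: fails because the base index has no "label_max" column.
try:
    index.loc[reference_timestamp, "label_max"]
    lookup_on_base_index_failed = False
except KeyError:
    lookup_on_base_index_failed = True
```

In this toy setup the corrected lookup returns the label for the reference timestamp, while the lookup against the base index raises KeyError.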
Issue 2 — Unintended modification of self.index
In the constructor, line ~63, the code merges the flare index into the base index:
self.index = self.index.join(self.flare_index, how="inner", validate="one_to_one")
This operation should not modify self.index, because:
The inner join removes every timestamp from self.index that is missing from self.flare_index.
Sequential SDO inputs (e.g., [-60, 0, 60] minutes relative to the reference timestamp) still need those timestamps, so the dataset loader breaks.
When a timestamp required for an input sequence is absent from self.flare_index, the joined index no longer contains it, leading to missing files and invalid sample sequences.
Therefore, this line should be removed entirely.
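The effect can be reproduced on a toy example. This is a sketch under assumed data (hourly timestamps, labels for only two of them), not the real dataset; all names and values are hypothetical:

```python
import pandas as pd

# Hypothetical base index: one SDO file per hour.
times = pd.date_range("2014-01-01 00:00", periods=4, freq="60min")
index = pd.DataFrame({"path": [f"sdo_{i}.nc" for i in range(4)]}, index=times)

# Hypothetical flare index: labels only at 01:00 and 03:00.
flare_index = pd.DataFrame({"label_max": [0, 1]}, index=times[[1, 3]])

# The join from the constructor: keeps only timestamps present in both frames.
joined = index.join(flare_index, how="inner", validate="one_to_one")

# Input sequence for the reference timestamp 01:00 with offsets in minutes.
reference = times[1]
offsets = [-60, 0, 60]
needed = [reference + pd.Timedelta(minutes=m) for m in offsets]

# Timestamps of the sequence that cannot be found in each index.
missing_after_join = [t for t in needed if t not in joined.index]
missing_before_join = [t for t in needed if t not in index.index]
```

Here the untouched base index contains all three timestamps of the sequence, while the joined index has lost the two neighbours of the reference timestamp because they carry no flare label.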