Add Absolute Positional Encoding in Dataloader #7
Sounds good! Just to check: is the plan to move the position encoding out of |
There isn't any position encoding in |
Oops, sorry, I clearly haven't had enough coffee yet! I keep getting confused this morning!
Yeah, I agree: might as well leave them there (at least, for now!)
There are issues with merging the spatial and temporal values together: the x and y shapes do not match. The spatial ones are the x and y coords of the 32 IDs in an array, and the actual spatial features would be along the diagonal. So it should just be a matter of slicing that out and making the spatial one smaller.
Support for 4D tensors also has to be added now, to handle GSP and PV correctly.
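The diagonal slicing described above can be illustrated with a small pure-Python sketch. It rests on an assumption this thread does not confirm: that encoding the x and y coordinates of N IDs as if they were image axes yields an N x N grid, in which only the entries pairing ID i's x with ID i's y are meaningful. The helper names `pairwise_grid` and `take_diagonal` are hypothetical and are not taken from the PR.

```python
def pairwise_grid(x_coords, y_coords):
    """Pair every y with every x, as a grid-style encoder would for image axes."""
    return [[(x, y) for x in x_coords] for y in y_coords]


def take_diagonal(grid):
    """Keep only entry [i][i], i.e. the (x, y) pair that belongs to ID i."""
    return [grid[i][i] for i in range(len(grid))]


# Three hypothetical IDs for brevity (the comment above mentions 32).
x_coords = [10.0, 20.0, 30.0]
y_coords = [1.0, 2.0, 3.0]

grid = pairwise_grid(x_coords, y_coords)  # N x N entries
per_id = take_diagonal(grid)              # N entries, one per ID
```

With tensors the same idea would normally be a single diagonal or indexing call rather than a Python loop; the loop is used here only to keep the sketch dependency-free.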
TIME_DIM = 2
HEIGHT_DIM = 3
WIDTH_DIM = 4
# For GSP and PV, have an ID dimension
ID_DIM = 3
could you work these out from the xr.Dataset?
I don't know if these would be easy to get from the xr.Dataset. The plan is to replace them with NamedTensors once those are fully supported.
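One way the indices might be derived rather than hard-coded is to look them up from dimension names, such as the tuple of names that an xarray `DataArray` exposes as `.dims`. The sketch below is only an illustration under stated assumptions: the dimension names, the helper `dim_indices`, and the single leading batch dimension added by the dataloader are all hypothetical. To stay dependency-free it takes a plain tuple of names instead of a real `xr.Dataset`.

```python
def dim_indices(dims, batch_dims=1):
    """Map each dimension name to its axis index in a batched tensor.

    `batch_dims` is the number of leading axes the dataloader is assumed
    to prepend (hypothetical: one batch axis).
    """
    return {name: axis + batch_dims for axis, name in enumerate(dims)}


# Hypothetical layouts, chosen so the result matches the constants in the diff.
satellite_dims = ("channels", "time", "y", "x")
gsp_dims = ("channels", "time", "id")

satellite_axes = dim_indices(satellite_dims)
gsp_axes = dim_indices(gsp_dims)
```

Here `satellite_axes["time"]` plays the role of `TIME_DIM` and `gsp_axes["id"]` the role of `ID_DIM`, so the constants would follow the data layout rather than being maintained by hand.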
Two style comments, apart from that LGTM 👍
LGTM! Excited to see how this stuff improves model performance!
one tiny comment. LGTM!
x_max = -np.inf
y_min = np.inf
y_max = -np.inf
for lat in [15, 70]:
Further down the line, it might be nice if users could specify geographic bounds in the nowcasting_dataset config; those geographic bounds would then be saved to disk in the config YAML and used here. But that's definitely for another PR, and definitely not super-important! I've started a new issue: openclimatefix/nowcasting_dataset#266
Sounds good!
Pull Request
Description
This adds absolute position encoding in the dataloader, so that it generates position encodings for the past and future timesteps. These can then be used downstream in the models to query the output of Perceiver IO.
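As a rough illustration of what an absolute position encoding for past and future timesteps can look like, here is a minimal sketch using sinusoidal (Fourier) features. It is not the code from this PR: the function names, the number of frequency bands, the doubling frequencies, and the mapping of timesteps onto [-1, 1] are all assumptions made for the example, and it uses plain Python lists instead of tensors.

```python
import math


def fourier_features(position, num_bands=4):
    """Encode a scalar position in [-1, 1] as sines and cosines at doubling frequencies."""
    features = [position]  # keep the raw position as the first feature
    for band in range(num_bands):
        angle = math.pi * (2 ** band) * position
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def encode_timesteps(num_history, num_forecast, num_bands=4):
    """Return (past, future) encodings computed over one shared time axis."""
    total = num_history + num_forecast
    if total < 2:
        raise ValueError("need at least two timesteps to normalise positions")
    # Map step index 0..total-1 linearly onto [-1, 1].
    positions = [2.0 * i / (total - 1) - 1.0 for i in range(total)]
    encoded = [fourier_features(p, num_bands) for p in positions]
    return encoded[:num_history], encoded[num_history:]


# Hypothetical sizes: 6 history steps and 12 forecast steps.
past, future = encode_timesteps(num_history=6, num_forecast=12)
```

Because past and future share one normalised axis, each future encoding identifies a distinct point in time, which is the property a query needs when it asks the model for a specific forecast step.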
Fixes #4
Somewhat related to openclimatefix/nowcasting_dataset#229, in terms of how data is stored on disk and loaded into a format. As such, this PR is somewhat blocked until that PR is merged.
How Has This Been Tested?
Unit tests