This will be changed more, but it serves as a baseline. I don't particularly want to add perceiver-model as a dependency for this repo, so some code duplication is probably the more acceptable option. Additionally, since this will be extended and modified more than the version in perceiver-model, I think the duplication should be okay.
Update pre-commit to ignore certain conventions
@JackKelly @peterdudfield For the datetime features, if we are switching to computing them on the fly, would we want them to be computed here? Or computed in the
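For concreteness, computing datetime features on the fly could look roughly like the sketch below. It is only an illustration, not the code in this PR: the function name `encode_datetime` and the choice of hour-of-day and day-of-year as the two cycles are assumptions.

```python
import math
from datetime import datetime
from typing import List


def encode_datetime(dt: datetime) -> List[float]:
    """Hypothetical helper: cyclic [sin, cos] features for time of day and time of year."""
    # Fraction of the day elapsed, in [0, 1).
    day_fraction = (dt.hour + dt.minute / 60.0) / 24.0
    # Fraction of the year elapsed; 365.25 is a simplification that ignores leap-year detail.
    year_fraction = (dt.timetuple().tm_yday - 1) / 365.25

    features: List[float] = []
    for fraction in (day_fraction, year_fraction):
        angle = 2.0 * math.pi * fraction
        # sin and cos together make 23:59 and 00:00 neighbours in feature space.
        features.extend([math.sin(angle), math.cos(angle)])
    return features
```

Because the result depends only on the timestamp, it can be computed wherever the timestamp is available, which is why the question of where the code lives is mostly an organisational one.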
Good questions! It would be good to be able to use the datetime encodings for CNN models (as well as for self-attention models). So I guess there are two slightly separate issues:
In terms of where to put the code... My initial guess would be to put the "datetime encoding" in I'm less sure about where to put code that computes the attention-specific encoding of the "full" position. Perhaps that code should also live in Maybe one way to distinguish between
(Sorry for not thinking more about this earlier!) And, also, I'm increasingly thinking that maybe we should create a new repo for But I really don't have strong feelings about any of this. What do you guys think?
I think being able to encode the full 'position' for any architecture would be useful too. That is essentially what CoordConv does, and it can help CNNs as well. But yeah, I agree they are slightly separate! The code as it currently stands computes them completely separately and just concatenates them at the end.
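To illustrate the CoordConv idea mentioned above, here is a minimal, dependency-free sketch that appends normalised row and column coordinates as two extra channels. The name `add_coord_channels` and the nested-list image layout are assumptions made for the example; real code would more likely operate on tensors.

```python
from typing import List

Image = List[List[List[float]]]  # assumed layout: [channel][row][col]


def _normalise(index: int, size: int) -> float:
    """Map an index in 0..size-1 to [-1, 1]; a single row or column maps to 0."""
    return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0


def add_coord_channels(image: Image) -> Image:
    """Return the image with a y-coordinate and an x-coordinate channel appended."""
    height = len(image[0])
    width = len(image[0][0])
    y_channel = [[_normalise(r, height) for _ in range(width)] for r in range(height)]
    x_channel = [[_normalise(c, width) for c in range(width)] for _ in range(height)]
    # Concatenate along the channel axis, leaving the input channels untouched.
    return image + [y_channel, x_channel]
```

A one-channel 2x3 image becomes a three-channel image, so a convolution over it can see where each pixel sits in the frame.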
I think wherever we put the encoding for the datetime we should also put the encoding for space, just so that there is one place where all this encoding comes from. As for where, I don't mind too much. For splitting up I still do like keeping the dataloader code near the code that generates the data the dataloader is loading, but if we can set up automated testing that can make sure changes to
Sounds good to me! Do you have any concerns about this approach, @peterdudfield?
I'm always a fan of breaking repos up. But we should be sure there is an easy way to check that 'nowcasting-dataloader' can be triggered when 'nowcasting-dataset' runs. Does either of you know a good way to do this? I'm personally OK with torch being in dataset, and having it as an optional thing. But if we do want to split it up, we need a common place where the interface is defined, i.e. how these files are structured, such as what is in these .nc files. It feels like the interface is pretty fluid at the moment, so it might be better not to split until it's a bit more settled. In general we should also be careful: does 'utils' depend on 'dataset', or the other way round?
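One way to picture a "common place where the interface is defined" is a small shared schema that both repos check against in CI. The sketch below is purely illustrative: `BATCH_SCHEMA`, `check_against_schema`, and every variable and dimension name in it are hypothetical placeholders, not the real contents of the .nc files.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: variable name -> expected dimension names, in order.
# These names are placeholders, not the actual layout of the .nc files.
BATCH_SCHEMA: Dict[str, Tuple[str, ...]] = {
    "example_image": ("example", "time", "y", "x"),
    "example_timeseries": ("example", "time"),
}


def check_against_schema(
    found: Dict[str, Tuple[str, ...]],
    schema: Dict[str, Tuple[str, ...]] = BATCH_SCHEMA,
) -> List[str]:
    """Return human-readable mismatches; an empty list means the file conforms."""
    problems: List[str] = []
    for name, dims in schema.items():
        if name not in found:
            problems.append(f"missing variable: {name}")
        elif tuple(found[name]) != dims:
            problems.append(f"{name}: expected dims {dims}, found {tuple(found[name])}")
    return problems
```

The `found` mapping would be built from whatever library reads the file. Keeping the check itself free of that dependency means the writer and the reader can both run it.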
import pytest


def test_fourier_encoding():
    pass
todo?
Yeah, this PR is very much not done!
Just wanted to get more thoughts on the design before I actually finished this, in case we want to move it elsewhere, simplify it, etc.
The easiest way would be to install the
As for the interface being fluid, yeah, that's a bit of a concern that I have too, but I don't think it's too difficult, the .nc files are defined by
Sounds like it's worth giving it a go. Perhaps we can copy things out to nowcasting-dataloader and then get the various CI working. If it's all OK, it can then be removed from nowcasting-dataset.
Sounds good! I'll start on that and move this PR over to that repo once it's created.
It's started here: https://github.com/openclimatefix/nowcasting_dataloader
I'll move this PR over soon, so closing for now.
Pull Request
Description
This adds the positional encoders to fix #30, as well as utilities to subselect Fourier features for different modalities from one "main" position encoding. It also gives the option of using absolute or relative positional encodings.
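As a rough sketch of the ideas in this description (not the code in this PR), the example below encodes coordinates as Fourier features, switches between absolute and relative normalisation, and picks out the rows of one "main" encoding for a single modality. All function names, the linear frequency schedule, and the reading of "absolute" as fixed bounds and "relative" as per-example bounds are assumptions.

```python
import math
from typing import List, Sequence


def fourier_features(position: float, num_bands: int, max_freq: float) -> List[float]:
    """Encode a position in [-1, 1] as [sin, cos] pairs over num_bands frequencies."""
    features: List[float] = []
    for band in range(num_bands):
        # Assumed schedule: frequencies spaced linearly from 1 to max_freq / 2.
        if num_bands == 1:
            freq = 1.0
        else:
            freq = 1.0 + band * (max_freq / 2.0 - 1.0) / (num_bands - 1)
        angle = math.pi * freq * position
        features.extend([math.sin(angle), math.cos(angle)])
    return features


def normalise(values: Sequence[float], relative: bool, lo: float, hi: float) -> List[float]:
    """Map coordinates to [-1, 1].

    relative=True scales by this example's own min and max; relative=False
    scales by the fixed bounds (lo, hi), so a given place always gets the same value.
    """
    if relative:
        lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a single point
    return [2.0 * (v - lo) / span - 1.0 for v in values]


def encode(values, relative, num_bands=4, max_freq=8.0, lo=0.0, hi=1.0):
    """Fourier-encode every coordinate; returns one feature row per coordinate."""
    positions = normalise(values, relative, lo, hi)
    return [fourier_features(p, num_bands, max_freq) for p in positions]


def subselect(main_encoding, indices):
    """Pick the rows of the main encoding that belong to one modality."""
    return [main_encoding[i] for i in indices]
```

For example, `subselect(encode([0.0, 5.0, 10.0], relative=True), [0, 2])` returns the feature rows for the first and last coordinates only, so every modality shares the same underlying position encoding.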
Fixes #30
How Has This Been Tested?
Unit tests
Checklist: