33 transforms group #51
Conversation
I've added one suggestion.
The tests all passed on my machine.
Looks good so far; I've left a couple of suggested changes. Finishing for now and will pick up again!
configs/datasets.yaml (Outdated)
Thinking about this more, in the interest of cutting down the experiment space: here are the 6 different drop % pairs: (0, 0), (0, 0.5), (0.5, 0), (0.5, 0.5), (0.25, 0.75), (0.75, 0.25).
What purpose do we think each of these serves? Do we need all of them?
Do we get anything further from the last two that we don't already have from (0.5, 0.5)?
I think my rough line of thinking on this is:
- I'd like to test how much of an effect having no vs. some overlapping data has, given our theory of data similarity
- I'd also like to account for the fact that data imbalance may have an effect on transfer success
- We should make sure that these two factors are fully independent of each other in our experiment. If we got rid of the last two groups, we'd only be testing data imbalance in the case where there is overlap between the observations in both datasets (see the sketch below)
That said, this isn't a particularly strong justification, so I'm open to removing them if we need to cut down the experiment space!
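To illustrate the factor structure, here is a rough sketch, assuming each dataset drops its fraction from opposite ends of a shared pool so that the overlapping fraction is max(0, 1 - drop_A - drop_B); the actual sampling driven by configs/datasets.yaml may differ:

```python
# Sketch of how the six drop % pairs cover the two experimental factors.
# Assumption (not from the PR): drops come from opposite ends of a shared
# pool, so the fraction of observations shared by A and B is
# max(0, 1 - drop_a - drop_b).
drop_pairs = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.0),
              (0.5, 0.5), (0.25, 0.75), (0.75, 0.25)]

for drop_a, drop_b in drop_pairs:
    overlap = max(0.0, 1.0 - drop_a - drop_b)   # shared observations
    imbalanced = drop_a != drop_b               # unequal dataset sizes
    print(f"({drop_a}, {drop_b}): overlap={overlap:.2f}, imbalanced={imbalanced}")
```

Under that assumption, only the last two pairs give imbalance with zero overlap, which is why dropping them would confound the two factors.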
Let's see how long things take to run, but it sounds reasonable to keep them for now, thanks!
I've added some new commits that somewhat change the PR, in line with our in-person discussion. We now have:
The pytest tests all passed for me. I also had a look at the generate-scripts files and think I found one tiny issue there. I only ran the generate_metrics_scripts.py file, but I think the issue is across all three files.
The tests all pass for me
Looks good to me! Tests pass and the debug made sense.
Minor comment on removal of a file, then good to go.
I think we can also remove transforms.yaml?
Have removed!
This PR:
- Adds a compose_transforms function to modsim2.utils.config, which draws on the previous code for loading transforms, but allows for None to be supplied as an input (and will return None in this case)
- Adds an opts2dmpairArgs function in scripts/utils.py
- Before computing similarity, DMPair.A.setup() and DMPair.B.setup() may need to be called. DMPair.compute_similarity() will therefore throw an error if transforms do not match those supplied to DMPair() (if transforms are not None)
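For context, a minimal sketch of the None-in/None-out behaviour described above. Only that behaviour comes from the PR; the argument name, the list-of-callables input format, and the torchvision-based composition are assumptions about the real function in modsim2.utils.config:

```python
from typing import Callable, Optional

from torchvision import transforms


def compose_transforms(
    transform_list: Optional[list[Callable]],
) -> Optional[transforms.Compose]:
    # Assumption: transforms arrive as a list of callables; the real
    # function instead draws on the previous config-loading code.
    if transform_list is None:
        return None  # None in, None out, as described in the PR
    return transforms.Compose(transform_list)


# Usage: a composed pipeline, or None when no transforms are supplied
pipeline = compose_transforms([transforms.ToTensor()])
assert compose_transforms(None) is None
```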