
move lengthscale prior to dist_spec #890

Merged
merged 11 commits into main from ls-prior on Dec 20, 2024

Conversation

@sbfnk (Contributor) commented Dec 12, 2024

Description

This PR closes #878.

To provide a prior choice that applies across applications, this PR rescales the lengthscale with half the length of the time series, as done in Riutort-Mayol et al. It also fixes the minimum to zero (for now), which was previously the default but changeable.
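As a rough illustration of the rescaling idea (a hypothetical Python sketch, not the actual EpiNow2 R/Stan code; the function name is made up), a lengthscale prior specified on a normalised scale is mapped back to units of time steps by multiplying with half the series length:

```python
# Hypothetical sketch of the rescaling in Riutort-Mayol et al.:
# the lengthscale prior is defined on a normalised scale and then
# multiplied by half the length of the time series, so one prior
# choice can apply across series of different lengths.

def rescale_lengthscale(ls_normalised: float, n_timepoints: int) -> float:
    """Map a normalised lengthscale to units of time steps."""
    return ls_normalised * n_timepoints / 2

# The same normalised lengthscale on series of different lengths:
print(rescale_lengthscale(0.4, 60))   # 12.0 time steps
print(rescale_lengthscale(0.4, 120))  # 24.0 time steps
```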

Todo before merging:

  • test with different data sets that this works and doesn't affect our ability to fit the model
  • update documentation

Todo after merging (future PRs):

Initial submission checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have tested my changes locally (using devtools::test() and devtools::check()).
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required and rebuilt docs if yes (using devtools::document()).
  • I have followed the established coding standards (and checked using lintr::lint_package()).
  • I have added a news item linked to this PR.

After the initial Pull Request

  • I have reviewed Checks for this PR and addressed any issues as far as I am able.


@sbfnk (Contributor, Author) commented Dec 12, 2024

test with different data sets that this works and doesn't affect our ability to fit the model

The vignettes show no meaningful change in results or number of divergent transitions. Benchmarks also show no significant change in performance. If this looks correct to people we can probably merge and explore further building on this.


@jamesmbaazam (Contributor) commented:

@seabbs Do you want to take this one?

@seabbs seabbs self-requested a review December 13, 2024 14:12
@sbfnk (Contributor, Author) commented Dec 16, 2024

Testing this on a few data sets I encountered no issues, but we will need to keep an eye on this in the near future.

@seabbs (Contributor) commented Dec 17, 2024

@sbfnk can you plot the change in the prior here if you get a chance?

@seabbs (Contributor) left a review:

From a technical standpoint I think this is correct.

From a scientific standpoint I have a lot of reservations about:

  1. How do people reason about this prior? It means that as they fit to new data, I would expect them to need to constantly update the prior, which feels really quite odd to me.
  2. This is really quite a long lengthscale in many settings, and I would expect it to negatively impact identifiability and performance during long periods of constant Rt (i.e. locking in constant Rt beyond our prior expectations).
  3. Setting the max to the length of the time series as the default seems likely to lead to non-identifiability (support for a range of very nearly the same lengthscales).
  4. This has moved the prior to be normal vs lognormal, unless I am missing a transformation. If there is a transformation, what does this mean for the inverse gamma option?

I know you have done some testing, but I don't really believe our testing is deep enough to catch the above.

My preference would be that we update the interface but keep the current defaults. I realise this could be a problem as it means we would need to rescale in the model (I think).

Review threads on NEWS.md and R/opts.R were marked resolved.
@sbfnk (Contributor, Author) commented Dec 17, 2024

  1. How do people reason about this prior? It means that as they fit to new data, I would expect them to need to constantly update the prior, which feels really quite odd to me.

I tend to agree with you here, and I hadn't thought about it like this. Just noting (not taking away from your point) that it's what is done in the Riutort-Mayol paper.

  1. This is really quite a long lengthscale in many settings, and I would expect it to negatively impact identifiability and performance during long periods of constant Rt (i.e. locking in constant Rt beyond our prior expectations).
  2. Setting the max to the length of the time series as the default seems likely to lead to non-identifiability (support for a range of very nearly the same lengthscales).

These could be easily addressed by changing the exact prior (which I meant to choose so as to roughly mirror the current prior). As far as I can see, the example choices in the paper are

Gamma(1.2, 0.2)
InverseGamma(2, 0.5)
Normal(0, 2)
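For intuition about how these example priors compare, their closed-form means can be computed directly. This is an illustrative stdlib-Python sketch, not package code; treating Normal(0, 2) as a half-normal is my assumption, on the basis that the lengthscale is constrained to be positive:

```python
import math

# Closed-form means of the three example priors (my calculation):
#   Gamma(shape=1.2, rate=0.2):       mean = shape / rate
#   InverseGamma(shape=2, scale=0.5): mean = scale / (shape - 1), shape > 1
#   Normal(0, sd=2) truncated to x>0: mean = sd * sqrt(2 / pi)
gamma_mean = 1.2 / 0.2
invgamma_mean = 0.5 / (2 - 1)
halfnormal_mean = 2 * math.sqrt(2 / math.pi)

print(f"Gamma(1.2, 0.2) mean:      {gamma_mean:.2f}")     # 6.00
print(f"InverseGamma(2, 0.5) mean: {invgamma_mean:.2f}")  # 0.50
print(f"Half-Normal(0, 2) mean:    {halfnormal_mean:.2f}")
```

The three choices place very different amounts of mass on long lengthscales, which is the crux of the discussion above.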

  1. This has moved the prior to be normal vs lognormal, unless I am missing a transformation. If there is a transformation, what does this mean for the inverse gamma option?

This wasn't intended; not sure what happened there. The Inverse Gamma option has gone (as the lengthscale can now be fixed).

I know you have done some testing, but I don't really believe our testing is deep enough to catch the above.

Completely agree.

My preference would be that we update the interface but keep the current defaults. I realise this could be a problem as it means we would need to rescale in the model (I think).

We could do this in preprocessing; it would just be a bit tricky as we'd have to implement transformation by multiplication with a constant for all the distributions used. Not impossible, though; it could perhaps even be done at the dist_spec() level.
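As a sketch of what such a preprocessing transformation might involve (hypothetical Python, not the dist_spec() implementation; the function name and parameter fields are my assumptions), multiplying a random variable X by a constant c corresponds to a different parameter transformation for each distribution family:

```python
import math

# Hypothetical sketch: transform a distribution's parameters so the
# result describes c * X. Standard rules:
#   X ~ Normal(mu, sigma)        => c*X ~ Normal(c*mu, c*sigma)
#   X ~ LogNormal(meanlog, sdlog) => c*X ~ LogNormal(meanlog + log(c), sdlog)
#   X ~ Gamma(shape, rate)        => c*X ~ Gamma(shape, rate / c)
def rescale(dist: str, params: dict, c: float) -> dict:
    if dist == "normal":
        return {"mean": params["mean"] * c, "sd": params["sd"] * c}
    if dist == "lognormal":
        return {"meanlog": params["meanlog"] + math.log(c),
                "sdlog": params["sdlog"]}
    if dist == "gamma":
        return {"shape": params["shape"], "rate": params["rate"] / c}
    raise ValueError(f"no rescaling rule for {dist}")

print(rescale("gamma", {"shape": 2.0, "rate": 1.0}, 10))
# {'shape': 2.0, 'rate': 0.1}
```

This illustrates why a single in-model division can be the simpler option: one operation covers every prior family, at some efficiency cost.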


This is how benchmark results would change (along with a 95% confidence interval in relative change) if 13889e8 is merged into main:

  • ✔️default: 23.4s -> 22.1s [-16.35%, +5.25%]
  • ✔️no_delays: 27.5s -> 27.5s [-18.93%, +18.77%]
  • 🚀random_walk: 11.4s -> 9.79s [-26.19%, -1.72%]
  • ✔️stationary: 13.4s -> 13.6s [-16.03%, +18.93%]
  • ✔️uncertain: 36.4s -> 38.8s [-13.46%, +26.53%]
Further explanation regarding interpretation and methodology can be found in the documentation.

@seabbs (Contributor) commented Dec 19, 2024

Just noting (not taking away from your point) that it's what is done in the Riutort-Mayol paper.

Yup, I intentionally didn't implement it like this at the time. My view then was that they weren't really trying to implement something for reuse but just for their experiments and so they might not have thought about the implications of use across rolling periods and different datasets etc.

These could be easily addressed by changing the exact prior (which I mean to choose to roughly mirror the current prior with the examples)

Yes agree

  • just would be a bit tricky as we'd have to implement transformation by multiplying with a constant for all the distributions used.

Yes, agree. It's not ideal. Wouldn't it be divide by a constant? The reason I said in the model is that this makes it much more flexible (as some prior distributions might be hard to rescale). If we wanted a quick fix this would also be much easier (though at the cost of some efficiency).

Having sat on this, I am happy if you want to set this issue aside for this PR (i.e. make it a new issue and gather some feedback on whether this is really a problem for people).

@sbfnk (Contributor, Author) commented Dec 19, 2024

Wouldn't it be divide by a constant? The reason I said in the model is that this makes it much more flexible (as some prior distributions might be hard to rescale). If we wanted a quick fix this would also be much easier (though at the cost of some efficiency).

I already did this in 922f0c9. I also have a version where the scaling happens in R, which I'll put in a separate issue/PR combo.

@seabbs (Contributor) left a review:

Nice, looks good to me.

@sbfnk sbfnk added this pull request to the merge queue Dec 19, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 19, 2024
@seabbs seabbs enabled auto-merge December 20, 2024 14:34
@seabbs seabbs added this pull request to the merge queue Dec 20, 2024
Merged via the queue into main with commit 89124a7 Dec 20, 2024
10 of 13 checks passed
@seabbs seabbs deleted the ls-prior branch December 20, 2024 17:19
Linked issue closed by this pull request: Switch length scale prior to new interface (#878)