
Large scale root-finding #93

Open
GJBoth opened this issue Nov 13, 2024 · 7 comments
Labels
question User queries

Comments

@GJBoth

GJBoth commented Nov 13, 2024

I have a problem where I'm looking for a fixed point of a [1000, 1, 1000, 3] complex field. I started by splitting it into real and imaginary parts to allow implicit differentiation, so essentially I'm working with a [1000, 1, 1000, 3, 2] sized field. Fixed-point iteration works excellently, and so does implicit differentiation.
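[Editor's note: for readers unfamiliar with this trick, the real/imaginary split can be done by stacking the two parts along a new trailing axis. A NumPy sketch with a smaller toy shape (the actual field is [1000, 1, 1000, 3]; the exact implementation in the issue is not shown):]

```python
import numpy as np

# Toy stand-in for the [1000, 1, 1000, 3] complex field (smaller here).
rng = np.random.default_rng(0)
z = rng.normal(size=(4, 1, 4, 3)) + 1j * rng.normal(size=(4, 1, 4, 3))

# Split into a real-valued field with a trailing axis of size 2,
# giving the [..., 3, 2] shape described in the issue.
z_real = np.stack([z.real, z.imag], axis=-1)
assert z_real.shape == (4, 1, 4, 3, 2)

# The inverse map recovers the complex field exactly.
z_back = z_real[..., 0] + 1j * z_real[..., 1]
assert np.array_equal(z, z_back)
```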

I can rewrite this problem more efficiently as a root-finding problem (some things cancel out), but find_root quickly goes out of memory with every method I've tried - for example, Newton tries to allocate 77TB(!) of memory. I'm trying to understand whether this is expected behaviour - the Jacobian is big here, but it doesn't need to be explicitly calculated, does it (matrix-free / JVP-style)? Am I missing something here?
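[Editor's note: a back-of-the-envelope check (my arithmetic, not from the thread) confirms a dense Jacobian at this size really is in the tens-of-terabytes range. On the unsplit [1000, 1, 1000, 3] field at 8 bytes per complex64 entry, the same arithmetic gives 72 TB, in the same ballpark as the reported 77 TB:]

```python
# Number of real degrees of freedom after the real/imaginary split:
n = 1000 * 1 * 1000 * 3 * 2  # 6,000,000

# A dense Jacobian is n x n. Even at 4 bytes per float32 entry:
jacobian_bytes = n * n * 4
print(f"{jacobian_bytes / 1e12:.0f} TB")  # 144 TB
```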

Thanks!

@patrick-kidger
Owner

By default, Optimistix will materialize the Jacobian. This is because the matrix-materializing linear solvers are generally both faster (for problems that fit in memory) and more numerically stable than the matrix-free ones.

In this case you may wish to use a matrix-free linear solver, e.g. optimistix.Newton(linear_solver=lineax.CG()). (Make sure your problem has the right structure for the linear solver you use, see https://docs.kidger.site/lineax/examples/no_materialisation/ for more.)
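[Editor's note: the optimistix call above is the actual fix. Purely to illustrate what "matrix-free / JVP-style" means, here is a hand-rolled NumPy sketch (a toy problem of my own construction, not the user's residual): a Newton solve whose inner conjugate-gradient loop only ever touches the Jacobian through Jacobian-vector products, so the n x n matrix is never formed. The toy residual is chosen so its Jacobian is symmetric positive definite, which plain CG requires:]

```python
import numpy as np

def f(x):
    # Toy residual with a symmetric positive definite Jacobian:
    # J(x) = A + 3 * diag(x**2). A stand-in for the user's residual.
    return A @ x + x**3 - b

def jvp(x, v, eps=1e-6):
    # Matrix-free Jacobian-vector product via forward differences:
    # approximates J(x) @ v without ever forming J.
    return (f(x + eps * v) - f(x)) / eps

def cg(matvec, rhs, tol=1e-10, maxiter=200):
    # Textbook conjugate gradients; `matvec` is a closure, so only
    # O(n) memory is needed, never O(n^2).
    x = np.zeros_like(rhs)
    r = rhs - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(0)
n = 50
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)  # symmetric positive definite
b = rng.normal(size=n)

# Newton iteration: solve J(x) dx = -f(x) matrix-free at each step.
x = np.zeros(n)
for _ in range(20):
    dx = cg(lambda v: jvp(x, v), -f(x))
    x += dx

assert np.linalg.norm(f(x)) < 1e-6
```

In optimistix/lineax the same structure applies, except the JVP comes from autodiff rather than finite differences, and for non-symmetric Jacobians one would reach for `lineax.NormalCG` or `lineax.GMRES` instead of plain CG.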

@patrick-kidger patrick-kidger added the question User queries label Nov 14, 2024
@GJBoth
Author

GJBoth commented Nov 18, 2024

Thanks, this was very helpful - I was unaware this was Optimistix's default behaviour.

Additionally, do you know if there's a difference between the jaxopt matrix-free solvers and the optimistix ones? In my problem, implicit differentiation works fine in jaxopt (using NormalCG, I checked and the gradients are correct), but optimistix returns nans. I can take some time to write something reproducible, but was wondering if I'm missing something obvious?

@patrick-kidger
Owner

The tools on this page might be a useful starting point. But I don't think there should be a serious difference between the two tech stacks. An MWE might be necessary, I'm afraid!

As an aside, note that implicit differentiation also involves a linear solve, separate from the one already discussed, and you can adjust this with adjoint=ImplicitAdjoint(linear_solver=lineax.Foo()).
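[Editor's note: concretely, swapping both linear solvers looks roughly like the following. This is a sketch assuming the standard optimistix API; `fn`, `y0`, and the tolerances are placeholders, and `lineax.Foo()` above stands for any lineax solver, e.g. `NormalCG`:]

```python
import lineax as lx
import optimistix as optx

# Matrix-free solver for the Newton steps themselves:
solver = optx.Newton(
    rtol=1e-6, atol=1e-6,
    linear_solver=lx.NormalCG(rtol=1e-6, atol=1e-6),
)

# ...and, separately, for the linear solve inside implicit differentiation:
sol = optx.root_find(
    fn, solver, y0,
    adjoint=optx.ImplicitAdjoint(linear_solver=lx.NormalCG(rtol=1e-6, atol=1e-6)),
)
```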

@GJBoth
Author

GJBoth commented Nov 18, 2024

Alright, I'll get to it then! My money's on a divide-by-zero somewhere (I do some padding), but it doesn't seem to be an issue for jaxopt, which is what surprises me. I'll get back to you in a few days once I have time to do this.

And yes, I'm aware of that - I noticed jaxopt used NormalCG in the implicit solve, but both CG and NormalCG in the adjoint of optimistix returned NaNs.

I'm wondering if it makes sense to add a warning about optimistix using matrix-materialising solvers by default? For users coming from jaxopt who are not too aware of the inner workings of implicit diff, optimistix will fail where jaxopt succeeded.

@patrick-kidger
Owner

> Alright, I'll get to it then! My money's on a divide-by-zero somewhere (I do some padding), but it doesn't seem to be an issue for jaxopt, which is what surprises me. I'll get back to you in a few days once I have time to do this.

Thank you!

> I'm wondering if it makes sense to add a warning about optimistix using matrix-materialising solvers by default? For users coming from jaxopt who are not too aware of the inner workings of implicit diff, optimistix will fail where jaxopt succeeded.

That makes sense to me! Where would be best? (The documentation? Maybe a warning in code if the input is large? I'm not sure.) If you have thoughts, I'd be happy to take a PR on wherever you would have found it most useful.

@GJBoth
Author

GJBoth commented Nov 20, 2024

A warning in the code could be a good idea, although it would require us to decide on a size limit - which would be fairly system-dependent...

Looking at the documentation, how about adding a section to the FAQ? Or even under Misc in its own section?

@patrick-kidger
Owner

I think adding this to the FAQ sounds good to me!
