-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
L-1 procrustes #115
Comments
Thanks for the nice words! We don't have any implementation to support
Are you looking for the implementation of |
Thank you so much for the resources! I really appreciate it. I guess that second paper is the most general, but doesn't guarantee optimality. I am happy to make a pull request some time if you think this would be relevant to your project. Main pieces for projected gradient are in section 3.3 & Eq. 14 of this paper (same author as your #2.) |
Wow! This is going to be a nice improvement!! Thanks. |
@choltz95 were you going to tackle this issue? |
Hi yes - sorry for the delayed response. Still potentially interested. I ended up doing iteratively reweighted least squares to solve the l1 problem for my application since I did not really need a high quality solution. Let me take another look at the papers originally posted in this thread and get back to you. |
Hey - just started working on a prototype. I implemented a preliminary version of PGD for the smoothened l1 problem (huber) from the Trendafilov & Watson paper in Jax here. I saw that the convergence is mediocre and needs many iterations on even some very simple testcases. I tried some simple things like gridding over geometric step lengths and early stopping. Did you find any other papers that discuss nicer solutions to this problem? There could also be some bugs. When I initialize using solution to l-2 procrustes, result looks a little more reasonable. I am thinking of trying a projected admm-type algorithm to solve the problem directly very quickly. I had some success on a similar problem. I feel that neither is ideal since the other methods in your library all have nice solutions, but it seems there is no way to get around iterating or potentially bad solutions for this one. |
Thank you for your efforts! I think for the implementation, maybe a good check is to reproduce the results listed in the paper you mentioned. I will also do more literature reading to find if we can get better algorithms on this problem. |
Sounds good. Yeah- those plots don't look right to me. Here is the result from my run: Likely there is some issue with the implementation of the descent direction since the foc (eq 15, left plot) should ideally go to zero monotonically. I'll get back to you when I have some time to debug. I agree next step should be reproduce their result. |
Hi, sorry for the late response. Quick follow-up. Took another look at my implementation. I think it may be possible that the implementation is okay - I reproduced the concrete numerical experiments in Example 2 of the paper to get similar error, even though the foc is nonzero. I think the issue may simply be that projected gradient method may not be ideal for this kind of problem with orthogonality constraint. I am thinking this: since there is already some api support for weighted least squares procrustes (with weights on the rows of A / the "samples"), it is probably easier to do something naive like iterative reweighting: https://en.wikipedia.org/wiki/Iteratively_reweighted_least_squares. This way we remain within the procrustes api and require minimal effort while supporting some notion of robust / l1-procrustes. I think there could be some problems like convergence is usually pretty slow, solution won't change much from least squares, but maybe it's better then projected gradient. Let me know if you would like me to proceed! |
Looks like a good feature to have. Please go ahead and make a pull request. Can you tag me when you are done? Thank you for making our package grow! @choltz95 |
Hi, Thanks for the project! I'm interested in solving procrustes using a more robust metric of dispersion than F norm.
I saw a couple papers online that solve e.g. smooth surrogates of l1, but wondering if you have any pointers in particular.
The text was updated successfully, but these errors were encountered: