---
title: "On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis"
section: Original Papers
abstract: 'Bilevel optimization reveals the inner structure of otherwise oblique optimization problems, such as hyperparameter tuning, neural architecture search, and meta-learning. A common goal in bilevel optimization is to minimize a hyper-objective that implicitly depends on the solution set of the lower-level function. Although this hyper-objective approach is widely used, its theoretical properties have not been thoroughly investigated in cases where \textit{the lower-level functions lack strong convexity}. In this work, we first provide hardness results to show that the goal of finding stationary points of the hyper-objective for nonconvex-convex bilevel optimization can be intractable for zero-respecting algorithms. Then we study a class of tractable nonconvex-nonconvex bilevel problems when the lower-level function satisfies the Polyak-Łojasiewicz (PL) condition. We show a simple first-order algorithm can achieve complexity bounds of'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: chen24a
month: 0
tex_title: "On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis"
firstpage: 947
lastpage: 980
page: 947-980
order: 947
cycles: false
bibtex_author: Chen, Lesi and Xu, Jing and Zhang, Jingzhao
author:
- given: Lesi
  family: Chen
- given: Jing
  family: Xu
- given: Jingzhao
  family: Zhang
date: 2024-06-30
address:
container-title: Proceedings of Thirty Seventh Conference on Learning Theory
volume: '247'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 6
  - 30
extras: []
---
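For readers skimming the record, here is a minimal LaTeX sketch of the problem class the abstract describes. The notation ($f$ for the upper-level objective, $g$ for the lower-level function, $\varphi$ for the hyper-objective, $Y^*$ for the lower-level solution set, $\mu$ for the PL constant) follows the standard convention in the bilevel-optimization literature and is an assumption here, not quoted from the paper itself.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Hedged sketch of the setup the abstract describes. The symbols
% (f, g, \varphi, Y^*, \mu) follow standard bilevel-optimization
% notation and are assumptions, not the paper's own definitions.

% Hyper-objective: minimize the upper-level value over the lower-level
% solution set Y^*(x), which need not be a singleton when g(x, .) is
% not strongly convex in y.
\[
  \min_{x \in \mathbb{R}^{d_x}} \varphi(x)
    := \min_{y \in Y^*(x)} f(x, y),
  \qquad
  Y^*(x) := \operatorname*{arg\,min}_{y \in \mathbb{R}^{d_y}} g(x, y).
\]

% Polyak-Lojasiewicz (PL) condition on the lower level: the squared
% gradient norm dominates the suboptimality gap for some \mu > 0.
\[
  \frac{1}{2} \bigl\lVert \nabla_y g(x, y) \bigr\rVert^2
    \ge \mu \Bigl( g(x, y) - \min_{y'} g(x, y') \Bigr)
  \quad \text{for all } x, y.
\]
\end{document}
```

Under this reading, "finding small hyper-gradients" in the title corresponds to producing a point $x$ with $\lVert \nabla \varphi(x) \rVert \le \epsilon$, i.e., an approximate stationary point of the hyper-objective.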