2024-06-30-chen24a.md

---
title: "On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis"
section: Original Papers
abstract: Bilevel optimization reveals the inner structure of otherwise oblique optimization problems, such as hyperparameter tuning, neural architecture search, and meta-learning. A common goal in bilevel optimization is to minimize a hyper-objective that implicitly depends on the solution set of the lower-level function. Although this hyper-objective approach is widely used, its theoretical properties have not been thoroughly investigated in cases where \textit{the lower-level functions lack strong convexity}. In this work, we first provide hardness results to show that the goal of finding stationary points of the hyper-objective for nonconvex-convex bilevel optimization can be intractable for zero-respecting algorithms. Then we study a class of tractable nonconvex-nonconvex bilevel problems when the lower-level function satisfies the Polyak-Łojasiewicz (PL) condition. We show that a simple first-order algorithm can achieve complexity bounds of $\tilde{\mathcal{O}}(\epsilon^{-2})$, $\tilde{\mathcal{O}}(\epsilon^{-4})$, and $\tilde{\mathcal{O}}(\epsilon^{-6})$ in the deterministic, partially stochastic, and fully stochastic settings, respectively.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: chen24a
month: 0
tex_title: "On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis"
firstpage: 947
lastpage: 980
page: 947-980
order: 947
cycles: false
bibtex_author: Chen, Lesi and Xu, Jing and Zhang, Jingzhao
author:
- given: Lesi
  family: Chen
- given: Jing
  family: Xu
- given: Jingzhao
  family: Zhang
date: 2024-06-30
address:
container-title: Proceedings of Thirty Seventh Conference on Learning Theory
volume: '247'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 6
  - 30
pdf:
extras:
---
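
As a reading aid for the abstract's key assumption, here is the standard textbook statement of the Polyak-Łojasiewicz (PL) condition it refers to; the symbols $g$, $y$, and $\mu$ are illustrative notation, not taken from the paper itself:

```latex
% Standard Polyak-Łojasiewicz (PL) condition (textbook form; notation illustrative).
% A differentiable function g satisfies the PL condition with constant mu > 0 if
% half the squared gradient norm dominates the suboptimality gap at every point:
\[
  \frac{1}{2}\,\lVert \nabla g(y) \rVert^2 \;\ge\; \mu \left( g(y) - \min_{y'} g(y') \right)
  \quad \text{for all } y .
\]
% Under this condition the function need not be convex, yet gradient descent
% still converges linearly to the optimal value, which is what makes the
% nonconvex-nonconvex bilevel class described in the abstract tractable.
```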