---
title: Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials
section: Original Papers
abstract: We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy. Prior work had given an algorithm for this problem with complexity $(dk/\epsilon)^{h(k)}$, where the function $h(k)$ scales super-polynomially in $k$. Interestingly, the complexity of our algorithm is near-optimal within the class of Correlational Statistical Query algorithms. At a high level, our algorithm uses tensor decomposition to identify a subspace such that all the $O(k)$-order moments are small in the orthogonal directions. Its analysis makes essential use of the theory of Schur polynomials to show that the higher-moment error tensors are small given that the lower-order ones are.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: diakonikolas24c
month: 0
tex_title: Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials
firstpage: 1364
lastpage: 1378
page: 1364-1378
order: 1364
cycles: false
bibtex_author: Diakonikolas, Ilias and Kane, Daniel M.
author:
- given: Ilias
  family: Diakonikolas
- given: Daniel M.
  family: Kane
date: 2024-06-30
address:
container-title: Proceedings of Thirty Seventh Conference on Learning Theory
volume: '247'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 6
  - 30
pdf:
extras:
---
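The abstract's core idea is that moment tensors of the target function expose the span of the hidden weight vectors. Below is a minimal, hypothetical NumPy sketch of that idea, not the paper's algorithm: it uses only the second-order moment matrix rather than the $O(k)$-order tensors the paper analyzes via Schur polynomials, and it assumes unit-norm weights and positive combining coefficients, a regime with no cancellation between terms.

```python
# Hypothetical illustration (not the paper's algorithm): for a target
# f(x) = sum_i a_i * ReLU(w_i . x) with x ~ N(0, I_d) and unit-norm w_i,
# the moment matrix M = E[f(x) (x x^T - I)] equals
# (1/sqrt(2*pi)) * sum_i a_i w_i w_i^T, so with a_i > 0 its top singular
# subspace recovers span{w_1, ..., w_k}. The paper's algorithm instead
# uses higher-order analogues of this tensor to handle cancellations.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 20, 3, 200_000

# Random unit-norm hidden weights and positive mixing coefficients.
W = rng.normal(size=(k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)
a = rng.uniform(0.5, 1.5, size=k)

X = rng.normal(size=(n, d))               # standard Gaussian inputs
y = np.maximum(X @ W.T, 0.0) @ a          # one-hidden-layer ReLU output

# Empirical second-order moment matrix E[y * (x x^T - I)].
M = (X.T * y) @ X / n - y.mean() * np.eye(d)

# Its top-k singular subspace should approximately contain every w_i.
U = np.linalg.svd(M)[0][:, :k]
residual = np.linalg.norm(W - (W @ U) @ U.T, axis=1)
print("residuals of w_i outside recovered subspace:", residual)
```

With mixed-sign coefficients the second-order matrix can nearly vanish, which is exactly why the paper resorts to higher-order moment tensors and Schur-polynomial arguments to control them.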