Skip to content

Latest commit

 

History

History
55 lines (55 loc) · 2.02 KB

2024-06-30-gordon24a.md

File metadata and controls

55 lines (55 loc) · 2.02 KB
title section abstract layout series publisher issn id month tex_title firstpage lastpage page order cycles bibtex_author author date address container-title volume genre issued pdf extras
Identification of mixtures of discrete product distributions in near-optimal sample and time complexity
Original Papers
We consider the problem of \emph{identifying,} from statistics, a distribution of discrete random variables $X_1 \ldots,X_n$ that is a mixture of $k$ product distributions. The best previous sample complexity for $n \in O(k)$ was $(1/\zeta)^{O(k^2 \log k)}$ (under a mild separation assumption parameterized by $\zeta$). The best known lower bound was $\exp(\Omega(k))$. It is known that $n\geq 2k-1$ is necessary and sufficient for identification. We show, for any $n\geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$. We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$. Our results are obtained by combining (a) a classic method for robust tensor decomposition, (b) a novel way of bounding the condition number of key matrices called Hadamard extensions, by studying their action only on flattened rank-1 tensors.
inproceedings
Proceedings of Machine Learning Research
PMLR
2640-3498
gordon24a
0
Identification of mixtures of discrete product distributions in near-optimal sample and time complexity
2071
2091
2071-2091
2071
false
Gordon, Spencer L. and Jahn, Erik and Mazaheri, Bijan and Rabani, Yuval and Schulman, Leonard J.
given family
Spencer L.
Gordon
given family
Erik
Jahn
given family
Bijan
Mazaheri
given family
Yuval
Rabani
given family
Leonard J.
Schulman
2024-06-30
Proceedings of Thirty Seventh Conference on Learning Theory
247
inproceedings
date-parts
2024
6
30