title: 'The Best Arm Evades: Near-optimal Multi-pass Streaming Lower Bounds for Pure Exploration in Multi-armed Bandits'
section: Original Papers
abstract: 'We give a near-optimal sample-pass trade-off for pure exploration in multi-armed bandits (MABs) via multi-pass streaming algorithms: any streaming algorithm with sublinear memory that uses the optimal sample complexity of $O(n/\Delta^2)$ requires $\Omega(\log{(1/\Delta)}/\log\log{(1/\Delta)})$ passes. Here, $n$ is the number of arms and $\Delta$ is the reward gap between the best and the second-best arms. Our result matches the $O(\log(1/\Delta))$ pass algorithm of Jin et al. [ICML’21] (up to lower order terms) that only uses $O(1)$ memory and answers an open question posed by Assadi and Wang [STOC’20].'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: assadi24a
month: 0
tex_title: 'The Best Arm Evades: Near-optimal Multi-pass Streaming Lower Bounds for Pure Exploration in Multi-armed Bandits'
firstpage: 311
lastpage: 358
page: 311-358
order: 311
cycles: false
bibtex_author: Assadi, Sepehr and Wang, Chen
author:
- given: Sepehr
  family: Assadi
- given: Chen
  family: Wang
date: 2024-06-30
address:
container-title: Proceedings of Thirty Seventh Conference on Learning Theory
volume: 247
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 6
  - 30
pdf: https://proceedings.mlr.press/v247/assadi24a/assadi24a.pdf
extras:


2024-06-30-assadi24a.md
