Skip to content

Latest commit

 

History

History
50 lines (50 loc) · 1.77 KB

2024-06-30-chen24c.md

File metadata and controls

50 lines (50 loc) · 1.77 KB
title section abstract layout series publisher issn id month tex_title firstpage lastpage page order cycles bibtex_author author date address container-title volume genre issued pdf extras
Near-Optimal Learning and Planning in Separated Latent MDPs
Original Papers
We study computational and statistical aspects of learning Latent Markov Decision Processes (LMDPs). In this model, the learner interacts with an MDP drawn at the beginning of each epoch from an unknown mixture of MDPs. To sidestep known impossibility results, we consider several notions of $\delta$-separation of the constituent MDPs. The main thrust of this paper is in establishing a nearly-sharp \textit{statistical threshold} for the horizon length necessary for efficient learning. On the computational side, we show that under a weaker assumption of separability under the optimal policy, there is a quasi-polynomial algorithm with time complexity scaling in terms of the statistical threshold. We further show a near-matching time complexity lower bound under the exponential time hypothesis.
inproceedings
Proceedings of Machine Learning Research
PMLR
2640-3498
chen24c
0
Near-Optimal Learning and Planning in Separated Latent MDPs
995
1067
995-1067
995
false
Chen, Fan and Daskalakis, Constantinos and Golowich, Noah and Rakhlin, Alexander
given family
Fan
Chen
given family
Constantinos
Daskalakis
given family
Noah
Golowich
given family
Alexander
Rakhlin
2024-06-30
Proceedings of Thirty Seventh Conference on Learning Theory
247
inproceedings
date-parts
2024
6
30