Skip to content

Latest commit

 

History

History
56 lines (56 loc) · 2.37 KB

2024-06-30-dai24a.md

File metadata and controls

56 lines (56 loc) · 2.37 KB
title section abstract layout series publisher issn id month tex_title firstpage lastpage page order cycles bibtex_author author date address container-title volume genre issued pdf extras
Refined Sample Complexity for Markov Games with Independent Linear Function Approximation (Extended Abstract)
Original Papers
Markov Games (MG) is an important model for Multi-Agent Reinforcement Learning (MARL). It was long believed that the “curse of multi-agents” (i.e., the algorithmic performance drops exponentially with the number of agents) is unavoidable until several recent works (Daskalakis et al., 2023; Cui et al., 2023; Wang et al., 2023). While these works resolved the curse of multi-agents, when the state spaces are prohibitively large and (linear) function approximations are deployed, they either had a slower convergence rate of $O(T^{-1/4})$ or brought a polynomial dependency on the number of actions $A_{\max}$ – which is avoidable in single-agent cases even when the loss functions can arbitrarily vary with time. This paper first refines the AVLPR framework by Wang et al. (2023), with an insight of designing \textit{data-dependent} (i.e., stochastic) pessimistic estimation of the sub-optimality gap, allowing a broader choice of plug-in algorithms. When specialized to MGs with independent linear function approximations, we propose novel \textit{action-dependent bonuses} to cover occasionally extreme estimation errors. With the help of state-of-the-art techniques from the single-agent RL literature, we give the first algorithm that tackles the curse of multi-agents, attains the optimal $O(T^{-1/2})$ convergence rate, and avoids $\text{poly}(A_{\max})$ dependency simultaneously.
inproceedings
Proceedings of Machine Learning Research
PMLR
2640-3498
dai24a
0
Refined Sample Complexity for Markov Games with Independent Linear Function Approximation (Extended Abstract)
1260
1261
1260-1261
1260
false
Dai, Yan and Cui, Qiwen and Du, Simon S.
given family
Yan
Dai
given family
Qiwen
Cui
given family
Simon S.
Du
2024-06-30
Proceedings of Thirty Seventh Conference on Learning Theory
247
inproceedings
date-parts
2024
6
30