title

section

abstract

layout

series

publisher

issn

id

month

tex_title

firstpage

lastpage

page

order

cycles

bibtex_author

author

date

address

container-title

volume

genre

issued

pdf

extras

Stochastic Constrained Contextual Bandits via Lyapunov Optimization Based Estimation to Decision Framework

Original Papers

This paper studies the problem of stochastic constrained contextual bandits (CCB) under a general realizability condition, where the expected rewards and costs lie within general function classes. We propose LOE2D, a Lyapunov Optimization Based Estimation to Decision framework that uses online regression oracles to learn rewards and constraints. LOE2D establishes $\Tilde O(T^{\frac{3}{4}}U^{\frac{1}{4}})$ regret and constraint violation, which can be further refined to $\Tilde O(\min\{\sqrt{TU}/\varepsilon^2, T^{\frac{3}{4}}U^{\frac{1}{4}}\})$ when the Slater condition holds in the underlying offline problem with the Slater “constant” $\varepsilon=\Omega(\sqrt{U/T})$, where $U$ denotes the error bound of the online regression oracles. These results improve on LagrangeCBwLC in two respects: i) our results hold without any prior information, while LagrangeCBwLC requires knowledge of the Slater constant to design a proper learning rate; ii) our results hold when $\varepsilon=\Omega(\sqrt{U/T})$, while LagrangeCBwLC requires a constant margin $\varepsilon=\Omega(1)$. These improvements stem from two novel techniques: violation-adaptive learning in the E2D module and multi-step Lyapunov drift analysis for bounding the constraint violation. The experiments further show that LOE2D outperforms the baseline algorithm.

inproceedings

Proceedings of Machine Learning Research

PMLR

2640-3498

guo24a

0

Stochastic Constrained Contextual Bandits via Lyapunov Optimization Based Estimation to Decision Framework

2204

2231

2204-2231

2204

false

Guo, Hengquan and Liu, Xin

given	family
Hengquan	Guo

given	family
Xin	Liu

2024-06-30

Proceedings of Thirty Seventh Conference on Learning Theory

247

inproceedings

date-parts

2024

6

30

https://proceedings.mlr.press/v247/guo24a/guo24a.pdf
