Julia Olkhovskaya, Jack Mayo, Tim Van Erven, Gergely Neu, Chen-Yu Wei · First- and Second-Order Bounds for Adversarial Linear Contextual Bandits · SlidesLive

Categories

EN

Log in Get an estimate

First- and Second-Order Bounds for Adversarial Linear Contextual Bandits

Dec 10, 2023

Speakers

About

We consider the adversarial linear contextual bandit setting, whichallows for the loss functions associated with each of K arms to changeover time without restriction. Assuming the d-dimensional contexts aredrawn from a fixed known distribution, the worst-case expected regretover the course of T rounds is known to scale as Õ(√(KdT)). Under the additional assumption that the density of the contextsis log-concave, we obtain a second-order bound of order (K√(d V_T)) in terms of the cumulative second moment of thelearner's losses V_T, and a closely related first-order bound of orderÕ(K√(d L_T^*)) in terms of the cumulative loss of the bestpolicy L_T^*. Since V_T or L_T^* may be significantly smaller thanT, these improve over the worst-case regret whenever the environmentis relatively benign. Our results are obtained using a truncated versionof the continuous exponential weights algorithm over the probabilitysimplex, which we analyse by exploiting a novel connection to the linearbandit setting without contexts.

Organizer

Store presentation

Should this presentation be stored for 1000 years?

How do we store presentations

Sharing

Recommended Videos

Presentations on similar topic, category or speaker