Dec 6, 2021
Speaker · 0 followers
Much work has considered multi-armed bandit problems under linear (or kernelized) assumptions on the underlying reward function. This work considers bandit problems where the unknown reward function is a structured polynomial, including a two-layer neural network with polynomial activations and a low-rank generalized linear bandit problem.

For the low-rank generalized linear bandit problem, we provide an algorithm that is minimax optimal in the dimension, refuting conjectures in Jun et al. (2019) and Lu et al. (2021). Our algorithm is a unified zeroth-order optimization paradigm that applies in general and attains optimal rates (in the dimension) in several structured polynomial settings. We instantiate it on the generalized low-rank bandit problem and show how our techniques extend to RL in the generative-model setting, yielding improved sample complexity over prior approaches. Finally, we show that standard optimistic algorithms (e.g., UCB) are sub-optimal by dimension factors.

In the neural network setting (with polynomial activation functions), in the noise-free case we provide a bandit algorithm whose sample complexity equals the intrinsic algebraic dimension. Again, we show that optimistic approaches have worse sample complexity, polynomial in the extrinsic dimension (which can be exponentially larger in the polynomial degree).
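The abstract does not spell out the algorithm itself, but the "zeroth-order optimization paradigm" it refers to is the general idea of estimating gradients of an unknown reward function from bandit (function-value) feedback alone. A minimal, generic sketch of such a two-point zeroth-order gradient estimator is below; the function name and parameters are illustrative, not the paper's actual method.

```python
import numpy as np

def zeroth_order_gradient(f, x, delta=1e-3, num_samples=100, rng=None):
    """Two-point zeroth-order estimate of the gradient of f at x.

    Uses only evaluations of f (bandit feedback): averages
    d * (f(x + delta*u) - f(x - delta*u)) / (2*delta) * u over random
    unit directions u, which is an unbiased estimate of grad f for
    smooth f as delta -> 0.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(num_samples):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # uniform direction on the unit sphere
        grad += (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u
    return d * grad / num_samples
```

Plugging such an estimator into gradient ascent over the arm set gives a bandit algorithm whose query cost, rather than any explicit confidence set, drives the dimension dependence; the paper's contribution is making this paradigm minimax optimal for structured polynomial rewards.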
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.