Christoph Dann, Chen-Yu Wei, Julian Zimmert · Best of Both Worlds Policy Optimization · SlidesLive

Best of Both Worlds Policy Optimization

July 24, 2023

Speakers

Christoph Dann

Chen-Yu Wei

Julian Zimmert

About the presentation

Policy optimization methods are popular reinforcement learning algorithms in practice, and recent works have built theoretical foundations for them by proving √T regret bounds even when the losses are adversarial. Such bounds are tight in the worst case but often overly pessimistic. In this work, we show that by carefully designing the regularizer, bonus terms, and learning rates, one can achieve a more favorable polylog(T) regret bound when the losses are stochastic, without sacrificing the wor…

Organizer

ICML 2023

Recommended videos

Presentations on a similar topic, in a similar category, or by the same speaker

Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills

Seongun Kim, … · 05:16 · ICML 2023

A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification

Jiachen Sun, … · 05:05 · ICML 2023

Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions

Ezgi Korkmaz, … · 04:52 · ICML 2023

The Triton Programming Language

Philippe Tillet · 24:11 · ICML 2023

Improving ℓ1-Certified Robustness via Randomized Smoothing by Leveraging Box Constraints

Václav Voráček, … · 04:44 · ICML 2023

When is Realizability Sufficient for Off-Policy Reinforcement Learning?

04:36 · ICML 2023