Eddy Hudson, Garrett Warnell, Ishan Durugkar, Peter Stone · ABC: Adversarial Behavioral Cloning for Offline Mode-Seeking Imitation Learning · SlidesLive



December 2, 2022


About the presentation

Given a dataset of interactions with an environment of interest, a viable method to extract an agent policy is to estimate the maximum likelihood policy indicated by this data. This approach is commonly referred to as behavioral cloning (BC). In this work, we describe a key disadvantage of BC that arises from the maximum likelihood objective function: BC is mean-seeking with respect to the state-conditional expert action distribution when the learner's policy is represented with a Gaussian. To address this issue, we develop a modified version of BC, Adversarial Behavioral Cloning (ABC), that exhibits mode-seeking behavior by incorporating elements of GAN (generative adversarial network) training. We evaluate ABC on toy domains and a domain based on Hopper from the DeepMind Control suite, and show that it outperforms BC by being mode-seeking in nature.
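
The mean-seeking behavior described in the abstract can be illustrated with a small sketch. This is not the authors' code or experimental setup: the bimodal expert, the mode locations, the noise level, and the function names `sample_bimodal_expert` and `fit_gaussian_mle` are all assumptions made for illustration. The example fits a single Gaussian by maximum likelihood to expert actions drawn from two well-separated modes for one fixed state, and shows that the fitted mean lands between the modes, where the expert almost never acts.

```python
import numpy as np


def sample_bimodal_expert(n, rng, modes=(-1.0, 1.0), noise=0.05):
    """Hypothetical expert for one fixed state: picks one of two action
    modes with equal probability and adds a little Gaussian noise."""
    centers = rng.choice(modes, size=n)
    return centers + noise * rng.standard_normal(n)


def fit_gaussian_mle(actions):
    """Closed-form maximum likelihood fit of a 1-D Gaussian.

    The MLE of the mean is the sample mean; the MLE of the standard
    deviation is the biased (1/N) estimator, which is np.std's default.
    """
    mu = float(np.mean(actions))
    sigma = float(np.std(actions))
    return mu, sigma


rng = np.random.default_rng(0)
actions = sample_bimodal_expert(10_000, rng)

# Behavioral cloning with a Gaussian policy reduces, for a single state,
# to this maximum likelihood fit.
mu, sigma = fit_gaussian_mle(actions)

# Distance from the fitted mean to the closest expert mode (at -1 or +1).
nearest_mode_distance = min(abs(mu - 1.0), abs(mu + 1.0))

print(f"fitted mean: {mu:.3f}, fitted std: {sigma:.3f}")
print(f"distance from fitted mean to nearest expert mode: {nearest_mode_distance:.3f}")
```

In this toy setting the fitted mean sits close to zero and the fitted standard deviation is wide enough to cover both modes, so the most likely action under the cloned policy is one the expert essentially never takes. A mode-seeking learner would instead concentrate its probability mass on one of the two modes.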
