Dec 6, 2021
Speaker · 0 followers
In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment and with each other to solve a shared sequential decision-making problem. Algorithms for MARL have a wealth of applications in popular domains including gaming, robotics, and finance. In this work, we study a family of distributed nonlinear stochastic approximation schemes useful in MARL and derive a novel law of the iterated logarithm. In particular, our result describes the convergence rate on almost every sample path where the algorithm converges. This result is the first of its kind in the distributed setup and provides deeper insights than the existing ones, which only discuss convergence rates in the expected or the CLT sense. Importantly, our result holds under significantly weaker assumptions: the gossip matrix need not be doubly stochastic, and the stepsizes need not be square summable. As an application, we show that, for the stepsize n^(-γ) with γ ∈ (0, 1), the distributed TD(0) algorithm with linear function approximation has a convergence rate of O(√(n^(-γ) log n)) a.s.; for the 1/n type stepsize, it is O(√(n^(-1) log log n)) a.s. These growth rates do not depend on the graph depicting the interactions among the different agents.
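To make the setting in the abstract concrete, the following is a minimal sketch of one common form of distributed TD(0) with linear function approximation. It is an illustration under stated assumptions, not necessarily the exact scheme analyzed in the talk. The two-state Markov chain, the local rewards, the one-hot features, and the matrix `W` are all hypothetical toy values. `W` is row stochastic but deliberately not doubly stochastic, and the stepsize is n^(-γ) with exponent `step_exp`, named differently from the reward discount factor `discount` to avoid a clash of symbols.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def distributed_td0(num_steps=20000, step_exp=0.7, discount=0.9, seed=0):
    """Toy distributed TD(0): gossip averaging plus a local TD correction."""
    rng = random.Random(seed)
    # Hypothetical two-state Markov chain induced by a fixed policy.
    P = [[0.5, 0.5], [0.2, 0.8]]
    # Agent i only observes its own local reward rewards[i][state].
    rewards = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
    # One-hot features, so the value estimate of state s is theta[s].
    features = [[1.0, 0.0], [0.0, 1.0]]
    # Row-stochastic gossip matrix (assumption); its columns do not sum to 1.
    W = [[0.6, 0.4, 0.0],
         [0.3, 0.4, 0.3],
         [0.0, 0.5, 0.5]]
    m, d = len(W), len(features[0])
    theta = [[0.0] * d for _ in range(m)]
    s = 0
    for n in range(1, num_steps + 1):
        alpha = n ** (-step_exp)  # stepsize n^(-gamma), gamma in (0, 1)
        s_next = 0 if rng.random() < P[s][0] else 1
        phi, phi_next = features[s], features[s_next]
        # Gossip step: each agent averages its neighbours' parameters.
        mixed = [[dot(W[i], [theta[j][k] for j in range(m)]) for k in range(d)]
                 for i in range(m)]
        updated = []
        for i in range(m):
            # Local TD error computed from agent i's own reward and estimate.
            td_error = (rewards[i][s] + discount * dot(phi_next, theta[i])
                        - dot(phi, theta[i]))
            updated.append([mixed[i][k] + alpha * td_error * phi[k]
                            for k in range(d)])
        theta, s = updated, s_next
    return theta


def disagreement(theta):
    """Largest coordinate-wise gap between any two agents' parameters."""
    return max(max(col) - min(col) for col in zip(*theta))


if __name__ == "__main__":
    final_theta = distributed_td0()
    print(final_theta, disagreement(final_theta))
```

In this sketch the agents never share rewards, only parameters, and the gap between their estimates shrinks as the stepsize decays. The sketch does not verify the almost-sure rates stated in the abstract; it only shows the kind of iteration those rates describe.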
Account · 1.9k followers
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.
Presentations with a similar topic, category, or speaker
Kai Xu, …
Aditya Hegde, …