Jul 24, 2023
Mean-field games have been used as a theoretical tool to obtain an approximate Nash equilibrium for symmetric and anonymous N-player games. However, limiting applicability, existing theoretical results assume variations of a “population generative model”, which allows arbitrary modifications of the population distribution by the learning algorithm. Moreover, learning algorithms typically work on abstract simulators with a population distribution instead of the actual N-player game. Instead, we show that N agents running policy mirror ascent converge to the Nash equilibrium of the regularized game within 𝒪(ε⁻²) samples from a single sample trajectory without a population generative model, up to a standard 𝒪(1/√N) error due to the mean field. Taking a divergent approach from the literature, instead of working with the best-response map we first show that a policy mirror ascent map can be used to construct a contractive operator having the Nash equilibrium as its fixed point. We analyze single-path TD learning for N-agent games, proving sample complexity guarantees by only using a sample path from the N-agent simulator without a population generative model. Furthermore, we demonstrate that our methodology allows for independent learning by N agents with finite sample guarantees.
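For intuition, the sketch below shows the kind of fixed-point view described in the abstract: repeatedly applying a policy mirror ascent map against the population distribution induced by the current policy, whose fixed point is a regularized mean-field equilibrium. Everything here is an illustrative assumption (the toy tabular dynamics and reward, the step size η, the entropy weight τ, and the exact policy evaluation, which stands in for the single-trajectory TD learning analyzed in the talk); it is not the speakers' actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A = 5, 3                        # toy numbers of states / actions (assumed)
gamma, eta, tau = 0.95, 0.5, 0.1   # discount, mirror-ascent step, entropy weight (assumed)

# Hypothetical population-dependent game: P[a, s, :] is a transition kernel,
# and the reward penalizes crowding at the agent's current state (anonymity).
P = rng.dirichlet(np.ones(S), size=(A, S))   # shape (A, S, S)
base_r = rng.uniform(size=(S, A))

def reward(mu):
    return base_r - mu[:, None]

def induced_distribution(pi, iters=500):
    """Stationary population distribution when every agent plays pi."""
    P_pi = np.einsum("sa,asn->sn", pi, P)    # state transition matrix under pi
    mu = np.full(S, 1.0 / S)
    for _ in range(iters):
        mu = mu @ P_pi
    return mu

def soft_q(pi, mu, iters=500):
    """Entropy-regularized Q-function of pi against a frozen population mu."""
    r = reward(mu)
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = np.sum(pi * (Q - tau * np.log(pi + 1e-12)), axis=1)
        Q = r + gamma * np.einsum("asn,n->sa", P, V)
    return Q

def mirror_ascent_step(pi, Q):
    """One KL (exponentiated-gradient) policy mirror ascent step."""
    logits = (1.0 - eta * tau) * np.log(pi + 1e-12) + eta * Q
    new_pi = np.exp(logits - logits.max(axis=1, keepdims=True))
    return new_pi / new_pi.sum(axis=1, keepdims=True)

# Iterate the map pi -> mirror_ascent_step(pi, Q^{pi, mu(pi)}); under suitable
# regularization this map is contractive, and its fixed point is the
# regularized mean-field Nash equilibrium.
pi = np.full((S, A), 1.0 / A)
for _ in range(200):
    mu = induced_distribution(pi)
    Q = soft_q(pi, mu)
    new_pi = mirror_ascent_step(pi, Q)
    if np.max(np.abs(new_pi - pi)) < 1e-8:
        break
    pi = new_pi

print("approximate regularized MFG equilibrium policy:\n", np.round(pi, 3))
```

In the actual setting described in the abstract, the exact evaluation step would be replaced by TD learning from a single sample path of the N-agent simulator, run independently by each of the N agents.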