Jun 11, 2019
Regret Circuits: Composability of Regret Minimizers
Regret minimization is a powerful tool for solving large-scale problems; it was recently used in breakthrough results for large-scale extensive-form game solving. This was achieved by composing simplex regret minimizers into an overall regret-minimization framework for extensive-form game strategy spaces. In this paper we study the general composability of regret minimizers. We derive a calculus for constructing regret minimizers for composite convex sets that are obtained from convexity-preserving operations on simpler convex sets. We show that local regret minimizers for the simpler sets can be combined with additional regret minimizers into an aggregate regret minimizer for the composite set. As one application, we show that the CFR framework can be constructed easily from our calculus. We also show ways to include curtailing (constraining) operations into our framework. For one, they enable the construction of a CFR generalization for extensive-form games with general convex strategy constraints that can cut across decision points.

Game Theoretic Optimization via Gradient-based Nikaido-Isoda Function
Computing the Nash equilibrium (NE) of multi-player games has witnessed renewed interest due to recent advances in generative adversarial networks. For non-convex optimization formulations involving such games, a technique to analyze the quality of a solution is to resort to merit functions, among which the Nikaido-Isoda (NI) function is a suitable choice: this function is positive and goes to zero only when each player is at their optimum. However, computing equilibrium conditions efficiently is challenging. To this end, we introduce the Gradient-based Nikaido-Isoda (GNI) function, which (i) serves as a merit function, vanishing only at the first-order stationary points of each player's optimization problem, and (ii) provides error bounds to a first-order NE. Specifically, for bilinear min-max games and multi-player quadratic games, a reformulation using the GNI function results in convex optimization objectives; the application of gradient descent on such reformulations yields linear convergence to an NE. For the general case, we provide conditions under which gradient descent provides sub-linear convergence. We show experiments on different multi-player game settings, demonstrating the effectiveness of our scheme against other popular game-theoretic optimization methods.
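To make the GNI construction above concrete, here is a minimal numpy sketch for a bilinear min-max game f(x, y) = xᵀAy. With one gradient step of size η per player, the GNI function has the closed form GNI(x, y) = η(‖Ay‖² + ‖Aᵀx‖²), and gradient descent on this convex objective converges linearly to a first-order NE, as the abstract claims. The matrix A, step sizes, and iteration count are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

# Toy bilinear min-max game: player 1 minimises f(x, y) = x^T A y and
# player 2 minimises -f(x, y).  With one gradient step of size eta per
# player, the GNI function has the closed form
#   GNI(x, y) = eta * (||A y||^2 + ||A^T x||^2),
# a convex function that vanishes exactly where both players' gradients do.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
eta, alpha = 0.1, 0.2            # inner step size, descent step size

def gni(x, y):
    return eta * (np.linalg.norm(A @ y) ** 2 + np.linalg.norm(A.T @ x) ** 2)

x, y = rng.standard_normal(3), rng.standard_normal(3)
print(f"GNI before: {gni(x, y):.3e}")
for _ in range(5000):
    # Analytic gradients of GNI for this particular game.
    x = x - alpha * 2 * eta * (A @ (A.T @ x))
    y = y - alpha * 2 * eta * (A.T @ (A @ y))
print(f"GNI after:  {gni(x, y):.3e}   # -> 0 at a first-order NE")
```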
Moment-Based Variational Inference for Markov Jump Processes
We propose moment-based variational inference as a flexible framework for approximate smoothing of latent Markov jump processes. The main ingredient of our approach is to partition the set of all transitions of the latent process into classes. This allows us to express the Kullback-Leibler divergence from the approximate to the posterior process in terms of a set of moment functions that arise naturally from the chosen partition. To illustrate possible choices of the partition, we consider special classes of jump processes that frequently occur in applications. We then extend the results to latent parameter inference and demonstrate the method on several examples.

Understanding MCMC Dynamics as Flows on the Wasserstein Space
It is known that the Langevin dynamics used in MCMC is the gradient flow of the KL divergence on the Wasserstein space, which helps convergence analysis and inspires recent particle-based variational inference methods (ParVIs). But no other MCMC dynamics are understood in this way. In this work, by developing novel concepts, we propose a theoretical framework that recognizes a general MCMC dynamics as the fiber-gradient Hamiltonian flow on the Wasserstein space of a fiber-Riemannian Poisson manifold. The "conservation + convergence" structure of the flow gives a clear picture of the behavior of general MCMC dynamics. We analyse existing MCMC instances under the framework. The framework also enables ParVI simulation of MCMC dynamics, which enriches the ParVI family with more efficient dynamics and also adapts ParVI advantages to MCMCs. We develop two ParVI methods for a particular MCMC dynamics and demonstrate the benefits in experiments.

LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations
Due to the ease of modern data collection, practitioners often face a large collection of covariates and the need to understand their relation to some response. Generalized linear models (GLMs) offer a particularly interpretable framework for this analysis. In the high-dimensional case without an overwhelming amount of data per parameter, we expect uncertainty to be non-trivial; a Bayesian approach allows coherent quantification of this uncertainty. Unfortunately, existing methods for Bayesian inference in GLMs require running times roughly cubic in the parameter dimension, thus limiting their applicability in increasingly widespread settings with tens of thousands of parameters. We propose to reduce time and memory costs with a low-rank approximation of the data. We show that our method, which we call LR-GLM, still provides a full Bayesian posterior approximation and admits running time reduced by a full factor of the parameter dimension. We theoretically establish the quality of our approximation via interpretable error bounds and show how the choice of rank allows a tunable computational-statistical trade-off. Experiments support our theory and demonstrate the efficacy of LR-GLM on real, large-scale datasets.

Amortized Monte Carlo Integration
Current approaches to amortizing Bayesian inference focus solely on approximating the posterior distribution. Typically, this approximation is, in turn, used to calculate expectations for one or more target functions, a computational pipeline which is inefficient when the target function(s) are known upfront. In this paper, we address this inefficiency by introducing AMCI, a method for amortizing Monte Carlo integration directly. AMCI operates similarly to amortized inference but produces three distinct amortized proposals, each tailored to a different component of the overall expectation calculation. At run-time, samples are produced separately from each amortized proposal, before being combined into an overall estimate of the expectation. We show that while existing approaches are fundamentally limited in the level of accuracy they can achieve, AMCI can theoretically produce arbitrarily small errors for any integrable target function using only a single sample from each proposal at run-time. Furthermore, AMCI allows not only for amortizing over datasets but also amortizing over target functions.
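As a toy illustration of the AMCI idea above, the following sketch estimates a posterior expectation with separate importance-sampling proposals for the numerator and denominator of E[f] = E_q1[f·w1] / E_q2[w2], where w(x) = p(x)p(y|x)/q(x). The Gaussian model, the hand-picked proposal parameters, and f(x) = x² are illustrative stand-ins: the paper learns the proposals with amortized neural networks and adds a third proposal for the negative part of f (unnecessary here since f ≥ 0).

```python
import numpy as np
from scipy.stats import norm

# Toy model: x ~ N(0,1), y|x ~ N(x,1), observed y = 2, so the exact
# posterior is p(x|y) = N(1, 1/2) and E[x^2 | y] = 0.5 + 1^2 = 1.5.
rng = np.random.default_rng(1)
y, n = 2.0, 50_000

def log_joint(x):
    return norm.logpdf(x, 0, 1) + norm.logpdf(y, x, 1)

# Stand-ins for the amortized proposals (hand-picked here, not learned):
# q1 roughly tracks f(x) * p(x|y); q2 tracks the posterior itself.
q1_mu, q1_sd = 1.6, 0.9
q2_mu, q2_sd = 1.0, np.sqrt(0.5)

x1 = rng.normal(q1_mu, q1_sd, n)
x2 = rng.normal(q2_mu, q2_sd, n)
num = np.mean(x1**2 * np.exp(log_joint(x1) - norm.logpdf(x1, q1_mu, q1_sd)))
den = np.mean(np.exp(log_joint(x2) - norm.logpdf(x2, q2_mu, q2_sd)))
print(f"AMCI-style estimate: {num / den:.3f}  (exact: 1.500)")
```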
Stein Point Markov Chain Monte Carlo
An important task in machine learning and statistics is the approximation of a probability measure by an empirical measure supported on a discrete point set. Stein Points are a class of algorithms for this task, which proceed by sequentially minimising a Stein discrepancy between the empirical measure and the target and, hence, require the solution of a non-convex optimisation problem to obtain each new point. This paper removes the need to solve this optimisation problem by, instead, selecting each new point based on a Markov chain sample path. This significantly reduces the computational cost of Stein Points and leads to a suite of algorithms that are straightforward to implement. The new algorithms are illustrated on a set of challenging Bayesian inference problems, and rigorous theoretical guarantees of consistency are established.

Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations
Natural-gradient methods enable fast and simple algorithms for variational inference, but due to computational difficulties, their use is mostly limited to minimal exponential-family (EF) approximations. In this paper, we extend the application of natural-gradient methods to estimate structured approximations such as mixtures of EF distributions. Such approximations can fit complex, multimodal posterior distributions and are generally more accurate than unimodal EF approximations. By using a minimal conditional-EF representation of such approximations, we derive simple natural-gradient updates. Our empirical results demonstrate a faster convergence of our natural-gradient method compared to black-box gradient-based methods. Our work expands the scope of natural gradients for Bayesian inference and makes them more widely applicable than before.

Particle Flow Bayes' Rule
We present a particle flow realization of Bayes' rule, where an ODE-based neural operator is used to transport particles from a prior to its posterior after a new observation. We prove that such an ODE operator exists and that its neural parameterization can be trained in a meta-learning framework, allowing the operator to reason about the effect of an individual observation on the posterior, and thus generalize across different priors, observations, and to online Bayesian inference. We demonstrate the generalization ability of our particle flow Bayes operator in several canonical and high-dimensional examples.
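A minimal sketch of the natural-gradient variational-inference idea from the "Fast and Simple Natural-Gradient..." abstract above, for the simplest case of a single Gaussian rather than the paper's mixture-of-EF algorithm: for exponential-family approximations, natural-gradient ascent on the ELBO reduces to simple moment-style updates on the mean and precision. The toy target, sample size, and step size are assumptions for illustration.

```python
import numpy as np

# Natural-gradient VI for a 1-D Gaussian approximation q = N(m, 1/p).
# For EF approximations, the natural gradient of the ELBO yields updates
# of the generic form (a standard Gaussian instance, not the paper's
# mixture algorithm):
#   p <- (1 - beta) * p + beta * E_q[-d2/dx2 log ptilde(x)]
#   m <- m + (beta / p) * E_q[d/dx log ptilde(x)]
# Toy unnormalised target: log ptilde(x) = -(x - 2)^4 / 4, mode at 2.
rng = np.random.default_rng(2)
grad = lambda x: -(x - 2) ** 3
hess = lambda x: -3 * (x - 2) ** 2

m, p, beta = 0.0, 1.0, 0.2
for _ in range(500):
    x = m + rng.standard_normal(256) / np.sqrt(p)   # samples from q
    p = (1 - beta) * p + beta * np.mean(-hess(x))
    m = m + beta / p * np.mean(grad(x))

print(f"q = N({m:.3f}, {1 / p:.3f})   # mean should sit near the mode at 2")
```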
Correlated Variational Auto-Encoders
Variational Auto-Encoders (VAEs) are capable of learning latent representations for high-dimensional data. However, due to the i.i.d. assumption, VAEs only optimize the singleton variational distributions and fail to account for correlations between data points, which might be crucial for learning latent representations from datasets where we know a priori that correlations exist. We propose Correlated Variational Auto-Encoders (CVAEs) that take the correlation structure into consideration when learning latent representations with VAEs. CVAEs apply a prior based on the correlation structure. To address the intractability introduced by the correlated prior, we develop an approximation by averaging a set of tractable lower bounds over all maximal acyclic subgraphs of the undirected correlation graph. Experimental results on matching and link prediction on public benchmark rating datasets and spectral clustering on a synthetic dataset show the effectiveness of the proposed method over baseline algorithms.
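To ground the Stein Point MCMC abstract above, here is a 1-D sketch of its greedy step: each new point is taken from a short Metropolis random-walk sample path and chosen to minimise the kernel Stein discrepancy (KSD) of the augmented point set. The target N(0, 1), the RBF base kernel, and the bandwidth are illustrative choices; the paper analyses several kernels and candidate schemes with consistency guarantees.

```python
import numpy as np

# Stein Point MCMC in 1-D for target N(0, 1), whose score is s(x) = -x.
rng = np.random.default_rng(3)
h = 1.0
s = lambda x: -x

def k0(x, y):
    """Langevin Stein kernel for an RBF base kernel with bandwidth h."""
    d = x - y
    k = np.exp(-d**2 / (2 * h**2))
    return (s(x) * s(y) * k
            + s(x) * ( d / h**2) * k                # s(x) * dk/dy
            + s(y) * (-d / h**2) * k                # s(y) * dk/dx
            + (1 / h**2 - d**2 / h**4) * k)         # d2k/dxdy

def mcmc_candidates(x0, n=20, step=1.0):
    """Random-walk Metropolis path targeting N(0, 1)."""
    xs, x = [], x0
    for _ in range(n):
        prop = x + step * rng.standard_normal()
        if np.log(rng.uniform()) < (x**2 - prop**2) / 2:
            x = prop
        xs.append(x)
    return np.array(xs)

points = [0.0]
for _ in range(30):
    cands = mcmc_candidates(points[-1])
    pts = np.array(points)
    # KSD increment from adding candidate c to the current point set.
    score = [k0(c, c) + 2 * k0(pts, c).sum() for c in cands]
    points.append(cands[int(np.argmin(score))])

print(np.mean(points), np.var(points))   # should roughly match N(0, 1)
```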
The International Conference on Machine Learning (ICML) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence known as machine learning. ICML is globally renowned for presenting and publishing cutting-edge research on all aspects of machine learning used in closely related areas like artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, and robotics. ICML is one of the fastest growing artificial intelligence conferences in the world. Participants at ICML span a wide range of backgrounds, from academic and industrial researchers, to entrepreneurs and engineers, to graduate students and postdocs.