Kernelized Wasserstein Natural Gradient

Michael Arbel; Arthur Gretton; Wuchen Li; Guido Montúfar

ICLR 2020

Conference Paper · Spotlight Presentations · Artificial Intelligence · Machine Learning

Details

Abstract

Many machine learning problems can be expressed as the optimization of a cost functional over a parametric family of probability distributions. It is often beneficial to solve such problems with natural gradient methods, which are invariant to the parametrization of the family and can therefore yield more effective optimization. Unfortunately, computing the natural gradient is challenging because it requires inverting a high-dimensional matrix at each iteration. We propose a general framework to approximate the natural gradient for the Wasserstein metric by leveraging a dual formulation of the metric restricted to a Reproducing Kernel Hilbert Space. Our approach leads to an estimator of the natural gradient direction that can trade off accuracy against computational cost, with theoretical guarantees. We verify its accuracy on simple examples and show empirically the advantage of using such an estimator in classification tasks on CIFAR-10 and CIFAR-100.
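
To make the cost mentioned in the abstract concrete, here is a minimal sketch of an exact natural gradient step on a toy problem. It is not the kernelized estimator proposed in the paper. It assumes a one-dimensional Gaussian family N(mu, sigma^2) parametrized by (mu, s) with sigma = exp(s), for which the Wasserstein-2 metric tensor has the closed form diag(1, exp(2s)). The function names and the quadratic toy cost are hypothetical and chosen only for illustration.

```python
import numpy as np


def natural_gradient_step(theta, grad, metric, lr=0.1, damping=1e-8):
    """Return theta - lr * G^{-1} grad, using a linear solve rather than an explicit inverse."""
    G = metric + damping * np.eye(len(theta))  # small damping keeps the solve well-posed
    direction = np.linalg.solve(G, grad)       # cubic cost in the number of parameters
    return theta - lr * direction


def wasserstein_metric_log_sigma(theta):
    """Wasserstein-2 metric tensor of N(mu, sigma^2) in coordinates (mu, s), sigma = exp(s)."""
    _, s = theta
    return np.diag([1.0, np.exp(2.0 * s)])


def toy_grad_log_sigma(theta):
    """Gradient in (mu, s) of the toy cost 0.5*(mu - 1)^2 + 0.5*(sigma - 2)^2 (an assumption)."""
    mu, s = theta
    sigma = np.exp(s)
    # Chain rule: d(cost)/ds = d(cost)/d(sigma) * d(sigma)/ds = (sigma - 2) * sigma
    return np.array([mu - 1.0, (sigma - 2.0) * sigma])


theta = np.array([0.0, 0.0])  # start at mu = 0, sigma = 1
for _ in range(200):
    theta = natural_gradient_step(
        theta, toy_grad_log_sigma(theta), wasserstein_metric_log_sigma(theta)
    )

mu_hat, sigma_hat = theta[0], np.exp(theta[1])
```

In this two-parameter toy the linear solve is trivial. For a model with many parameters the same solve involves a matrix whose side equals the parameter count, which is the per-iteration bottleneck the abstract refers to.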

Keywords

kernel methods
natural gradient
information geometry
Wasserstein metric

Context

Venue: International Conference on Learning Representations
Paper id: 620225699250569822