Author name cluster

Evgeny Burnaev

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

36 papers
2 author rows

Possible papers

ICLR Conference 2025 Conference Paper

A3D: Does Diffusion Dream about 3D Alignment?

  • Savva Victorovich Ignatyev
  • Nina Konovalova
  • Daniil Selikhanovych
  • Oleg Voynov
  • Nikolay Patakin
  • Ilya Olkov
  • Dmitry Senushkin
  • Alexey Artemov

We tackle the problem of text-driven 3D generation from a geometry alignment perspective. Given a set of text prompts, we aim to generate a collection of objects with semantically corresponding parts aligned across them. Recent methods based on Score Distillation have succeeded in distilling the knowledge from 2D diffusion models to high-quality representations of the 3D objects. These methods handle multiple text queries separately, and therefore the resulting objects have a high variability in object pose and structure. However, in some applications, such as 3D asset design, it may be desirable to obtain a set of objects aligned with each other. In order to achieve the alignment of the corresponding parts of the generated objects, we propose to embed these objects into a common latent space and optimize the continuous transitions between these objects. We enforce two kinds of properties of these transitions: smoothness of the transition and plausibility of the intermediate objects along the transition. We demonstrate that both of these properties are essential for good alignment. We provide several practical scenarios that benefit from alignment between the objects, including 3D editing and object hybridization, and experimentally demonstrate the effectiveness of our method.

ICML Conference 2025 Conference Paper

Inverse Bridge Matching Distillation

  • Nikita Gushchin
  • David Li 0004
  • Daniil Selikhanovych
  • Evgeny Burnaev
  • Dmitry Baranchuk
  • Alexander Korotin

Learning diffusion bridge models is easy; making them fast and practical is an art. Diffusion bridge models (DBMs) are a promising extension of diffusion models for applications in image-to-image translation. However, like many modern diffusion and flow models, DBMs suffer from the problem of slow inference. To address it, we propose a novel distillation technique based on the inverse bridge matching formulation and derive a tractable objective to solve it in practice. Unlike previously developed DBM distillation techniques, the proposed method can distill both conditional and unconditional types of DBMs, distill models into a one-step generator, and use only the corrupted images for training. We evaluate our approach for both conditional and unconditional types of bridge matching on a wide set of setups, including super-resolution, JPEG restoration, sketch-to-image, and other tasks, and show that our distillation technique accelerates the inference of DBMs by 4x to 100x and, depending on the particular setup, even provides better generation quality than the teacher model used. We provide the code at https://github.com/ngushchin/IBMD

NeurIPS Conference 2024 Conference Paper

Adversarial Schrödinger Bridge Matching

  • Nikita Gushchin
  • Daniil Selikhanovych
  • Sergei Kholkin
  • Evgeny Burnaev
  • Alexander Korotin

The Schrödinger Bridge (SB) problem offers a powerful framework for combining optimal transport and diffusion models. A promising recent approach to solve the SB problem is the Iterative Markovian Fitting (IMF) procedure, which alternates between Markovian and reciprocal projections of continuous-time stochastic processes. However, the model built by the IMF procedure has a long inference time due to using many steps of numerical solvers for stochastic differential equations. To address this limitation, we propose a novel Discrete-time IMF (D-IMF) procedure in which learning of stochastic processes is replaced by learning just a few transition probabilities in discrete time. Its great advantage is that in practice it can be naturally implemented using the Denoising Diffusion GAN (DD-GAN), an already well-established adversarial generative modeling technique. We show that our D-IMF procedure can provide the same quality of unpaired domain translation as the IMF, using only several generation steps instead of hundreds.

ICML Conference 2024 Conference Paper

Disentanglement Learning via Topology

  • Nikita Balabin
  • Daria Voronkova
  • Ilya Trofimov
  • Evgeny Burnaev
  • Serguei Barannikov

We propose TopDis (Topological Disentanglement), a method for learning disentangled representations via adding a multi-scale topological loss term. Disentanglement is a crucial property of data representations, important for the explainability and robustness of deep learning models and a step towards high-level cognition. The state-of-the-art methods are based on VAE and encourage the joint distribution of latent variables to be factorized. We take a different perspective on disentanglement by analyzing topological properties of data manifolds. In particular, we optimize the topological similarity for data manifold traversals. To the best of our knowledge, our paper is the first one to propose a differentiable topological loss for disentanglement learning. Our experiments have shown that the proposed TopDis loss improves disentanglement scores such as MIG, FactorVAE score, SAP score, and DCI disentanglement score with respect to state-of-the-art results while preserving the reconstruction quality. Our method works in an unsupervised manner, permitting us to apply it to problems without labeled factors of variation. The TopDis loss works even when factors of variation are correlated. Additionally, we show how to use the proposed topological loss to find disentangled directions in a trained GAN.

NeurIPS Conference 2024 Conference Paper

Energy-Guided Continuous Entropic Barycenter Estimation for General Costs

  • Alexander Kolesov
  • Petr Mokrov
  • Igor Udovichenko
  • Milena Gazdieva
  • Gudmund Pammer
  • Anastasis Kratsios
  • Evgeny Burnaev
  • Alexander Korotin

Optimal transport (OT) barycenters are a mathematically grounded way of averaging probability distributions while capturing their geometric properties. In short, the barycenter task is to take the average of a collection of probability distributions w.r.t. given OT discrepancies. We propose a novel algorithm for approximating the continuous Entropic OT (EOT) barycenter for arbitrary OT cost functions. Our approach is built upon the dual reformulation of the EOT problem based on weak OT, which has recently gained the attention of the ML community. Beyond its novelty, our method enjoys several advantageous properties: (i) we establish quality bounds for the recovered solution; (ii) this approach seamlessly interconnects with the Energy-Based Models (EBMs) learning procedure enabling the use of well-tuned algorithms for the problem of interest; (iii) it provides an intuitive optimization scheme avoiding min-max, reinforce and other intricate technical tricks. For validation, we consider several low-dimensional scenarios and image-space setups, including non-Euclidean cost functions. Furthermore, we investigate the practical task of learning the barycenter on an image manifold generated by a pretrained generative model, opening up new directions for real-world applications. Our code is available at https://github.com/justkolesov/EnergyGuidedBarycenters.
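
For orientation, the barycenter task summarized in this abstract is commonly written as a weighted minimization over candidate distributions. The notation below (input distributions $P_k$, weights $\lambda_k$, candidate $Q$, cost functions $c_k$, regularization strength $\varepsilon$) is assumed here for illustration and is not taken from the listing.

```latex
Q^{*} \in \operatorname*{arg\,min}_{Q} \; \sum_{k=1}^{K} \lambda_k \, \mathrm{EOT}_{c_k,\varepsilon}(P_k, Q),
\qquad \lambda_k \ge 0, \quad \sum_{k=1}^{K} \lambda_k = 1 .
```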

ICLR Conference 2024 Conference Paper

Energy-guided Entropic Neural Optimal Transport

  • Petr Mokrov
  • Alexander Korotin
  • Alexander Kolesov
  • Nikita Gushchin
  • Evgeny Burnaev

Energy-based models (EBMs) have been known in the Machine Learning community for decades. Since the seminal works devoted to EBMs dating back to the noughties, there have been a lot of efficient methods which solve the generative modelling problem by means of energy potentials (unnormalized likelihood functions). In contrast, the realm of Optimal Transport (OT) and, in particular, neural OT solvers is much less explored and limited to a few recent works (excluding WGAN-based approaches which utilize OT as a loss function and do not model OT maps themselves). In our work, we bridge the gap between EBMs and Entropy-regularized OT. We present a novel methodology which allows utilizing the recent developments and technical improvements of the former in order to enrich the latter. From the theoretical perspective, we prove generalization bounds for our technique. In practice, we validate its applicability in toy 2D and image domains. To showcase the scalability, we empower our method with a pre-trained StyleGAN and apply it to high-res AFHQ $512\times512$ unpaired I2I translation. For simplicity, we choose simple short- and long-run EBMs as a backbone of our Energy-guided Entropic OT approach, leaving the application of more sophisticated EBMs for future research. Our code is available at: https://github.com/PetrMokrov/Energy-guided-Entropic-OT

ICML Conference 2024 Conference Paper

Estimating Barycenters of Distributions with Neural Optimal Transport

  • Alexander Kolesov
  • Petr Mokrov
  • Igor Udovichenko
  • Milena Gazdieva
  • Gudmund Pammer
  • Evgeny Burnaev
  • Alexander Korotin

Given a collection of probability measures, a practitioner sometimes needs to find an "average" distribution which adequately aggregates reference distributions. A theoretically appealing notion of such an average is the Wasserstein barycenter, which is the primary focus of our work. By building upon the dual formulation of Optimal Transport (OT), we propose a new scalable approach for solving the Wasserstein barycenter problem. Our methodology is based on the recent Neural OT solver: it has a bi-level adversarial learning objective and works for general cost functions. These are key advantages of our method since the typical adversarial algorithms leveraging barycenter tasks utilize tri-level optimization and focus mostly on quadratic cost. We also establish theoretical error bounds for our proposed approach and showcase its applicability and effectiveness in illustrative scenarios and image data setups. Our source code is available at https://github.com/justkolesov/NOTBarycenters.

ICML Conference 2024 Conference Paper

Light and Optimal Schrödinger Bridge Matching

  • Nikita Gushchin
  • Sergei Kholkin
  • Evgeny Burnaev
  • Alexander Korotin

Schrödinger Bridges (SB) have recently gained the attention of the ML community as a promising extension of classic diffusion models which is also interconnected to the Entropic Optimal Transport (EOT). Recent solvers for SB exploit the pervasive bridge matching procedures. Such procedures aim to recover a stochastic process transporting the mass between distributions given only a transport plan between them. In particular, given the EOT plan, these procedures can be adapted to solve SB. This fact is heavily exploited by recent works, giving rise to matching-based SB solvers. The cornerstone here is recovering the EOT plan: recent works either use heuristic approximations (e.g., minibatch OT) or establish iterative matching procedures which by design accumulate the error during training. We address these limitations and propose a novel procedure to learn SB which we call the optimal Schrödinger bridge matching. It exploits the optimal parameterization of the diffusion process and provably recovers the SB process (a) with a single bridge matching step and (b) with an arbitrary transport plan as the input. Furthermore, we show that the optimal bridge matching objective coincides with the recently discovered energy-based modeling (EBM) objectives to learn EOT/SB. Inspired by this observation, we develop a light solver (which we call LightSB-M) to implement optimal matching in practice using the Gaussian mixture parameterization of the adjusted Schrödinger potential. We experimentally showcase the performance of our solver in a range of practical tasks.

ICLR Conference 2024 Conference Paper

Light Schrödinger Bridge

  • Alexander Korotin
  • Nikita Gushchin
  • Evgeny Burnaev

Despite the recent advances in the field of computational Schrödinger Bridges (SB), most existing SB solvers are still heavy-weighted and require complex optimization of several neural networks. It turns out that there is no principal solver which plays the role of a simple-yet-effective baseline for SB just like, e.g., the $k$-means method in clustering, logistic regression in classification or the Sinkhorn algorithm in discrete optimal transport. We address this issue and propose a novel fast and simple SB solver. Our development is a smart combination of two ideas which recently appeared in the field: (a) parameterization of the Schrödinger potentials with sum-exp quadratic functions and (b) viewing the log-Schrödinger potentials as the energy functions. We show that combined together these ideas yield a lightweight, simulation-free and theoretically justified SB solver with a simple straightforward optimization objective. As a result, it allows solving SB in moderate dimensions in a matter of minutes on CPU without a painful hyperparameter selection. Our light solver resembles the Gaussian mixture model which is widely used for density estimation. Inspired by this similarity, we also prove an important theoretical result showing that our light solver is a universal approximator of SBs. Furthermore, we conduct the analysis of the generalization error of our light solver. The code for our solver can be found at https://github.com/ngushchin/LightSB.
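
The abstract cites the Sinkhorn algorithm as the standard simple baseline for discrete optimal transport. As a point of reference, here is a minimal pure-Python sketch of that baseline, not of the LightSB solver itself. The function name `sinkhorn`, the toy histograms, and the values of `eps` and `n_iters` are illustrative assumptions.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, n_iters=500):
    """Approximate the entropic OT plan between discrete histograms a and b.

    a, b  : lists of non-negative weights with equal total mass
    cost  : cost[i][j] is the cost of moving mass from bin i to bin j
    eps   : entropic regularization strength (assumed value)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so the plan's row sums match a ...
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # ... then rescale columns so its column sums match b.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The plan is diag(u) * K * diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: two-bin histograms with a 0/1 cost.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(a))) for j in range(len(b))]
```

After enough iterations, `row_sums` and `col_sums` should agree with `a` and `b` up to floating-point error, which is a quick sanity check that the returned matrix is a valid transport plan.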

NeurIPS Conference 2024 Conference Paper

Light Unbalanced Optimal Transport

  • Milena Gazdieva
  • Arip Asadulaev
  • Evgeny Burnaev
  • Alexander Korotin

While the continuous Entropic Optimal Transport (EOT) field has been actively developing in recent years, it became evident that the classic EOT problem is prone to issues such as sensitivity to outliers and class imbalance in the source and target measures. This fact inspired the development of solvers that deal with the *unbalanced* EOT (UEOT) problem, a generalization of EOT that mitigates these issues by relaxing the marginal constraints. Surprisingly, it turns out that the existing solvers are either based on heuristic principles or heavy-weighted with complex optimization objectives involving several neural networks. We address this challenge and propose a novel theoretically-justified, lightweight, unbalanced EOT solver. Our advancement consists of developing a novel view on the optimization of the UEOT problem yielding a tractable, non-minimax optimization objective. We show that, combined with a light parametrization recently proposed in the field, our objective leads to a fast, simple, and effective solver which allows solving the continuous UEOT problem in minutes on CPU. We prove that our solver provides a universal approximation of UEOT solutions and obtain its generalization bounds. We give illustrative examples of the solver's performance.

ICLR Conference 2024 Conference Paper

Neural Optimal Transport with General Cost Functionals

  • Arip Asadulaev
  • Alexander Korotin
  • Vage Egiazarian
  • Petr Mokrov
  • Evgeny Burnaev

We introduce a novel neural network-based algorithm to compute optimal transport (OT) plans for general cost functionals. In contrast to common Euclidean costs, i.e., $\ell^1$ or $\ell^2$, such functionals provide more flexibility and allow using auxiliary information, such as class labels, to construct the required transport map. Existing methods for general cost functionals are discrete and do not provide an out-of-sample estimation. We address the challenge of designing a continuous OT approach for general cost functionals in high-dimensional spaces, such as images. We construct two example functionals: one to map distributions while preserving the class-wise structure and the other one to preserve the given data pairs. Additionally, we provide the theoretical error analysis for our recovered transport plans. Our implementation is available at https://github.com/machinestein/gnot

EAAI Journal 2024 Journal Article

Pose estimation in robotic electric vehicle plug-in charging tasks using auto-annotation and deep learning-based keypoint detector

  • Viktor Rakhmatulin
  • Miguel Altamirano Cabrera
  • Andrei Puchkov
  • Evgeny Burnaev
  • Dzmitry Tsetserukou

The rapid growth of electric vehicles and advancements in self-driving technologies necessitate the development of specialized infrastructure to support autonomy. Accurate pose estimation of electric vehicle sockets is crucial for efficient and reliable plug-in charging operations. This process is inherently complex due to several factors, including the textureless black color of the socket, environment-dependent lighting conditions, and the presence of small geometrical features. To address these challenges, we propose a comprehensive method that combines Deep Neural Network-based pose estimation and auto-annotation methodology. Auto-annotation facilitates the generation of diverse training data, enhancing the accuracy, robustness, and generalization capabilities of the deep learning model. The advantages of the proposed method were evaluated through a comprehensive series of experiments conducted in both simulation and controlled laboratory settings. In our evaluation, we implemented a robotic charging system consisting of a manipulator with a charging plug and a hand-eye monocular camera and conducted plug-in testing in three scenarios: (A) uniform lighting conditions, (B) dark, and (C) highly uneven illumination of the socket surface. The experimental results show that our method can achieve precise and reliable pose estimation with mean absolute errors less than 0.73 mm and 0.82 deg, and an average Insertion Success Rate of over 97.5%.

NeurIPS Conference 2024 Conference Paper

Rethinking Optimal Transport in Offline Reinforcement Learning

  • Arip Asadulaev
  • Rostislav Korst
  • Alexander Korotin
  • Vage Egiazarian
  • Andrey Filchenkov
  • Evgeny Burnaev

We propose a novel algorithm for offline reinforcement learning using optimal transport. Typically, in offline reinforcement learning, the data is provided by various experts and some of them can be sub-optimal. To extract an efficient policy, it is necessary to *stitch* the best behaviors from the dataset. To address this problem, we rethink offline reinforcement learning as an optimal transportation problem. Based on this, we present an algorithm that aims to find a policy that maps states to a *partial* distribution of the best expert actions for each given state. We evaluate the performance of our algorithm on continuous control problems from the D4RL suite and demonstrate improvements over existing methods.

ICML Conference 2024 Conference Paper

Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation

  • Sergei Shumilin
  • Alexander Ryabov
  • Nikolay B. Yavich
  • Evgeny Burnaev
  • Vladimir Vanovskiy

Due to the high computational load of modern numerical simulation, there is a demand for approaches that would reduce the size of discrete problems while keeping the accuracy reasonable. In this work, we present an original algorithm to coarsen an unstructured grid based on the concepts of differentiable physics. We achieve this by employing $k$-means clustering, autodifferentiation and stochastic minimization algorithms. We demonstrate performance of the designed algorithm on two PDEs: a linear parabolic equation which governs slightly compressible fluid flow in porous media and the wave equation. Our results show that in the considered scenarios, we reduced the number of grid points up to 10 times while preserving the modeled variable dynamics in the points of interest. The proposed approach can be applied to the simulation of an arbitrary system described by evolutionary partial differential equations.

NeurIPS Conference 2023 Conference Paper

Building the Bridge of Schrödinger: A Continuous Entropic Optimal Transport Benchmark

  • Nikita Gushchin
  • Alexander Kolesov
  • Petr Mokrov
  • Polina Karpikova
  • Andrei Spiridonov
  • Evgeny Burnaev
  • Alexander Korotin

Over the last several years, there has been significant progress in developing neural solvers for the Schrödinger Bridge (SB) problem and applying them to generative modelling. This new research field is justifiably fruitful as it is interconnected with the practically well-performing diffusion models and theoretically grounded entropic optimal transport (EOT). Still, the area lacks non-trivial tests allowing a researcher to understand how well the methods solve SB or its equivalent continuous EOT problem. We fill this gap and propose a novel way to create pairs of probability distributions for which the ground truth OT solution is known by construction. Our methodology is generic and works for a wide range of OT formulations; in particular, it covers the EOT which is equivalent to SB (the main interest of our study). This development allows us to create continuous benchmark distributions with the known EOT and SB solutions on high-dimensional spaces such as spaces of images. As an illustration, we use these benchmark pairs to test how well existing neural EOT/SB solvers actually compute the EOT solution. Our code for constructing benchmark pairs under different setups is available at: https://github.com/ngushchin/EntropicOTBenchmark

NeurIPS Conference 2023 Conference Paper

Entropic Neural Optimal Transport via Diffusion Processes

  • Nikita Gushchin
  • Alexander Kolesov
  • Alexander Korotin
  • Dmitry P. Vetrov
  • Evgeny Burnaev

We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between probability distributions which are accessible by samples. Our algorithm is based on the saddle point reformulation of the dynamic version of EOT which is known as the Schrödinger Bridge problem. In contrast to the prior methods for large-scale EOT, our algorithm is end-to-end and consists of a single learning step, has fast inference procedure, and allows handling small values of the entropy regularization coefficient which is of particular importance in some applied problems. Empirically, we show the performance of the method on several large-scale EOT tasks. The code for the ENOT solver can be found at https://github.com/ngushchin/EntropicNeuralOptimalTransport

NeurIPS Conference 2023 Conference Paper

Extremal Domain Translation with Neural Optimal Transport

  • Milena Gazdieva
  • Alexander Korotin
  • Daniil Selikhanovych
  • Evgeny Burnaev

In many unpaired image domain translation problems, e.g., style transfer or super-resolution, it is important to keep the translated image similar to its respective input image. We propose the extremal transport (ET) which is a mathematical formalization of the theoretically best possible unpaired translation between a pair of domains w.r.t. the given similarity function. Inspired by the recent advances in neural optimal transport (OT), we propose a scalable algorithm to approximate ET maps as a limit of partial OT maps. We test our algorithm on toy examples and on the unpaired image-to-image translation task. The code is publicly available at https://github.com/milenagazdieva/ExtremalNeuralOptimalTransport

NeurIPS Conference 2023 Conference Paper

Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts

  • Eduard Tulchinskii
  • Kristian Kuznetsov
  • Laida Kushnareva
  • Daniil Cherniavskii
  • Sergey Nikolenko
  • Evgeny Burnaev
  • Serguei Barannikov
  • Irina Piontkovskaya

Rapidly increasing quality of AI-generated content makes it difficult to distinguish between human and AI-generated texts, which may lead to undesirable consequences for society. Therefore, it becomes increasingly important to study the properties of human texts that are invariant over text domains and various proficiency of human writers, can be easily calculated for any language, and can robustly separate natural and AI-generated texts regardless of the generation model and sampling method. In this work, we propose such an invariant of human texts, namely the intrinsic dimensionality of the manifold underlying the set of embeddings of a given text sample. We show that the average intrinsic dimensionality of fluent texts in natural language is hovering around the value $9$ for several alphabet-based languages and around $7$ for Chinese, while the average intrinsic dimensionality of AI-generated texts for each language is $\approx 1.5$ lower, with a clear statistical separation between human-generated and AI-generated distributions. This property allows us to build a score-based artificial text detector. The proposed detector's accuracy is stable over text domains, generator models, and human writer proficiency levels, outperforming SOTA detectors in model-agnostic and cross-domain scenarios by a significant margin.

ICLR Conference 2023 Conference Paper

Kernel Neural Optimal Transport

  • Alexander Korotin
  • Daniil Selikhanovych
  • Evgeny Burnaev

We study the Neural Optimal Transport (NOT) algorithm which uses the general optimal transport formulation and learns stochastic transport plans. We show that NOT with the weak quadratic cost may learn fake plans which are not optimal. To resolve this issue, we introduce kernel weak quadratic costs. We show that they provide improved theoretical guarantees and practical performance. We test NOT with kernel costs on the unpaired image-to-image translation task.

ICLR Conference 2023 Conference Paper

Learning topology-preserving data representations

  • Ilya Trofimov
  • Daniil Cherniavskii
  • Eduard Tulchinskii
  • Nikita Balabin
  • Evgeny Burnaev
  • Serguei Barannikov

We propose a method for learning topology-preserving data representations (dimensionality reduction). The method aims to provide topological similarity between the data manifold and its latent representation via enforcing the similarity in topological features (clusters, loops, 2D voids, etc.) and their localization. The core of the method is the minimization of the Representation Topology Divergence (RTD) between original high-dimensional data and low-dimensional representation in latent space. RTD minimization provides closeness in topological features with strong theoretical guarantees. We develop a scheme for RTD differentiation and apply it as a loss term for the autoencoder. The proposed method "RTD-AE" better preserves the global structure and topology of the data manifold than state-of-the-art competitors as measured by linear correlation, triplet distance ranking accuracy, and Wasserstein distance between persistence barcodes.

ICLR Conference 2023 Conference Paper

Neural Optimal Transport

  • Alexander Korotin
  • Daniil Selikhanovych
  • Evgeny Burnaev

We present a novel neural-networks-based algorithm to compute optimal transport maps and plans for strong and weak transport costs. To justify the usage of neural networks, we prove that they are universal approximators of transport plans between probability distributions. We evaluate the performance of our optimal transport algorithm on toy examples and on the unpaired image-to-image translation.

JBHI Journal 2022 Journal Article

Analysis of Video Game Players’ Emotions and Team Performance: An Esports Tournament Case Study

  • Simon Abramov
  • Alexander Korotin
  • Andrey Somov
  • Evgeny Burnaev
  • Anton Stepanov
  • Dmitry Nikolaev
  • Maria A. Titova

Video gaming and eSports is a quickly developing industry already involving billions of players worldwide. Gaming and eSports tournaments require strong mental abilities to avoid severe stress and other negative consequences upon completing the game. In this article, we report on the impact of emotions on team performance. For this reason, we collect audio recordings and game logs from the players in real conditions at an eSports tournament. This data is further used in trained machine learning models for analysis of players’ emotional conditions from the voice during the game. We considered recognition of several types of emotions as well as the background sounds. To do this, we trained a classifier with 92.7% accuracy on the six most common classes of emotions and sounds in eSports audio and applied it to eSports data. As a result, we demonstrate that there is an opportunity to measure the eSports team’s performance from the players’ emotional conditions obtained from the voice communication. We found that there is a strong correlation among the performance of the team, communication between the players, and emotional sentiment of communication. Teams achieved much better results when they had much more internal conversation during the game.

ICLR Conference 2022 Conference Paper

Generative Modeling with Optimal Transport Maps

  • Litu Rout
  • Alexander Korotin
  • Evgeny Burnaev

With the discovery of Wasserstein GANs, Optimal Transport (OT) has become a powerful tool for large-scale generative modeling tasks. In these tasks, OT cost is typically used as the loss for training GANs. In contrast to this approach, we show that the OT map itself can be used as a generative model, providing comparable performance. Previous analogous approaches consider OT maps as generative models only in the latent spaces due to their poor performance in the original high-dimensional ambient space. In contrast, we apply OT maps directly in the ambient space, e.g., a space of high-dimensional images. First, we derive a min-max optimization algorithm to efficiently compute OT maps for the quadratic cost (Wasserstein-2 distance). Next, we extend the approach to the case when the input and output distributions are located in the spaces of different dimensions and derive error bounds for the computed OT map. We evaluate the algorithm on image generation and unpaired image restoration tasks. In particular, we consider denoising, colorization, and inpainting, where the optimality of the restoration map is a desired attribute, since the output (restored) image is expected to be close to the input (degraded) one.

NeurIPS Conference 2022 Conference Paper

Kantorovich Strikes Back! Wasserstein GANs are not Optimal Transport?

  • Alexander Korotin
  • Alexander Kolesov
  • Evgeny Burnaev

Wasserstein Generative Adversarial Networks (WGANs) are popular generative models built on the theory of Optimal Transport (OT) and the Kantorovich duality. Despite the success of WGANs, it is still unclear how well the underlying OT dual solvers approximate the OT cost (Wasserstein-1 distance, W1) and the OT gradient needed to update the generator. In this paper, we address these questions. We construct 1-Lipschitz functions and use them to build ray monotone transport plans. This strategy yields pairs of continuous benchmark distributions with the analytically known OT plan, OT cost and OT gradient in high-dimensional spaces such as spaces of images. We thoroughly evaluate popular WGAN dual form solvers (gradient penalty, spectral normalization, entropic regularization, etc.) using these benchmark pairs. Even though these solvers perform well in WGANs, none of them faithfully computes W1 in high dimensions. Nevertheless, many provide a meaningful approximation of the OT gradient. These observations suggest that these solvers should not be treated as good estimators of W1 but to some extent they indeed can be used in variational problems requiring the minimization of W1.
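
The question raised in this abstract is whether a solver's W1 estimate matches a known ground truth. In one dimension with equal-size samples, W1 between two empirical distributions has a closed form: sort both samples and average the absolute differences. The sketch below shows only that toy ground truth. It is not the benchmark construction from the paper, and the function name `w1_empirical_1d` is an illustrative assumption.

```python
def w1_empirical_1d(xs, ys):
    """Exact Wasserstein-1 distance between two equal-size 1D samples.

    For uniform empirical measures on the real line, the optimal coupling
    matches the i-th smallest point of xs with the i-th smallest of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)


xs = [0.0, 1.0, 3.0]
ys = [5.0, 4.0, 2.0]
dist = w1_empirical_1d(xs, ys)

# Shifting every point by the same offset moves the distribution by exactly that offset.
shifted = [x + 2.0 for x in xs]
shift_dist = w1_empirical_1d(xs, shifted)
```

A learned dual-form estimate on such a toy pair can be compared against `dist` directly. In high dimensions no such closed form exists, which is why the abstract describes constructing benchmark pairs with analytically known solutions.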

ICML Conference 2022 Conference Paper

Representation Topology Divergence: A Method for Comparing Neural Network Representations

  • Serguei Barannikov
  • Ilya Trofimov
  • Nikita Balabin
  • Evgeny Burnaev

Comparison of data representations is a complex multi-aspect problem. We propose a method for comparing two data representations. We introduce the Representation Topology Divergence (RTD) score measuring the dissimilarity in multi-scale topology between two point clouds of equal size with a one-to-one correspondence between points. The two data point clouds can lie in different ambient spaces. The RTD score is one of the few practical methods based on topological data analysis that are applicable to real machine learning datasets. Experiments show the agreement of RTD with the intuitive assessment of data representation similarity. The proposed RTD score is sensitive to the data representation’s fine topological structure. We use the RTD score to gain insights on neural network representations in computer vision and NLP domains for various problems: training dynamics analysis, data distribution shift, transfer learning, ensemble learning, disentanglement assessment.

NeurIPS Conference 2022 Conference Paper

Wasserstein Iterative Networks for Barycenter Estimation

  • Alexander Korotin
  • Vage Egiazarian
  • Lingxiao Li
  • Evgeny Burnaev

Wasserstein barycenters have become popular due to their ability to represent the average of probability measures in a geometrically meaningful way. In this paper, we present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model. Previous approaches rely either on regularization (entropic/quadratic), which introduces bias, or on input convex neural networks, which are not expressive enough for large-scale tasks. In contrast, our algorithm does not introduce bias and allows using arbitrary neural networks. In addition, based on the celebrity faces dataset, we construct the Ave, celeba! dataset, which can be used for quantitative evaluation of barycenter algorithms by using standard metrics of generative models such as FID.

NeurIPS Conference 2021 Conference Paper

BooVAE: Boosting Approach for Continual Learning of VAE

  • Evgenii Egorov
  • Anna Kuzina
  • Evgeny Burnaev

The variational autoencoder (VAE) is a deep generative model for unsupervised learning that encodes observations into a meaningful latent space. VAE is prone to catastrophic forgetting when tasks arrive sequentially, and only the data for the current one is available. We address this problem of continual learning for VAEs. It is known that the choice of the prior distribution over the latent space is crucial for VAE in the non-continual setting. We argue that it can also be helpful to avoid catastrophic forgetting. We learn the approximation of the aggregated posterior as a prior for each task. This approximation is parametrised as an additive mixture of distributions induced by an encoder evaluated at trainable pseudo-inputs. We use a greedy boosting-like approach with entropy regularisation to learn the components. This method encourages component diversity, which is essential as we aim at memorising the current task with the fewest components possible. Based on the learnable prior, we introduce an end-to-end approach for continual learning of VAEs and provide empirical studies on commonly used benchmarks (MNIST, Fashion MNIST, NotMNIST) and CelebA datasets. For each dataset, the proposed method avoids catastrophic forgetting in a fully automatic way.

ICLR Conference 2021 Conference Paper

Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization

  • Alexander Korotin
  • Lingxiao Li
  • Justin Solomon
  • Evgeny Burnaev

Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport. In this paper, we present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures, which are not restricted to being discrete. While past approaches rely on entropic or quadratic regularization, we employ input convex neural networks and cycle-consistency regularization to avoid introducing bias. As a result, our approach does not resort to minimax optimization. We provide theoretical analysis on error bounds as well as empirical evidence of the effectiveness of the proposed approach in low-dimensional qualitative scenarios and high-dimensional quantitative experiments.
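
To make the notion of a Wasserstein-2 barycenter concrete, here is a minimal sketch of the one-dimensional special case, where the barycenter's quantile function is the weighted average of the input quantile functions. This is not the paper's algorithm (which uses input convex neural networks for continuous high-dimensional measures); the function name `barycenter_1d` and the Gaussian test data are assumptions made for illustration.

```python
import random


def barycenter_1d(samples, weights):
    """W2 barycenter of 1D empirical measures with equal sample counts.

    In one dimension, sorting each sample set gives its empirical quantiles,
    and the barycenter's quantiles are the weighted average of those.
    """
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples[0])
    if any(len(s) != n for s in sorted_samples):
        raise ValueError("all sample sets must have the same size")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_samples))
        for i in range(n)
    ]


rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # samples centred near 0
b = [rng.gauss(4.0, 1.0) for _ in range(1000)]  # samples centred near 4
bary = barycenter_1d([a, b], [0.5, 0.5])
bary_mean = sum(bary) / len(bary)
print(bary_mean)  # close to 2, the midpoint of the two centres
```

Unlike a plain mixture of the two sample sets, which would be bimodal, the barycenter here is a single bump located between the inputs.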

NeurIPS Conference 2021 Conference Paper

Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2 Benchmark

  • Alexander Korotin
  • Lingxiao Li
  • Aude Genevay
  • Justin M. Solomon
  • Alexander Filippov
  • Evgeny Burnaev

Despite the recent popularity of neural network-based solvers for optimal transport (OT), there is no standard quantitative way to evaluate their performance. In this paper, we address this issue for quadratic-cost transport, specifically the computation of the Wasserstein-2 distance, a commonly used formulation of optimal transport in machine learning. To overcome the challenge of computing ground truth transport maps between continuous measures needed to assess these solvers, we use input-convex neural networks (ICNN) to construct pairs of measures whose ground truth OT maps can be obtained analytically. This strategy yields pairs of continuous benchmark measures in high-dimensional spaces such as spaces of images. We thoroughly evaluate existing optimal transport solvers using these benchmark measures. Even though these solvers perform well in downstream tasks, many do not faithfully recover optimal transport maps. To investigate the cause of this discrepancy, we further test the solvers in a setting of image generation. Our study reveals crucial limitations of existing solvers and shows that increased OT accuracy does not necessarily correlate to better results downstream.

NeurIPS Conference 2021 Conference Paper

Large-Scale Wasserstein Gradient Flows

  • Petr Mokrov
  • Alexander Korotin
  • Lingxiao Li
  • Aude Genevay
  • Justin M. Solomon
  • Evgeny Burnaev

Wasserstein gradient flows provide a powerful means of understanding and solving many diffusion equations. Specifically, Fokker-Planck equations, which model the diffusion of probability measures, can be understood as gradient descent over entropy functionals in Wasserstein space. This equivalence, introduced by Jordan, Kinderlehrer and Otto, inspired the so-called JKO scheme to approximate these diffusion processes via an implicit discretization of the gradient flow in Wasserstein space. Solving the optimization problem associated with each JKO step, however, presents serious computational challenges. We introduce a scalable method to approximate Wasserstein gradient flows, targeted to machine learning applications. Our approach relies on input-convex neural networks (ICNNs) to discretize the JKO steps, which can be optimized by stochastic gradient descent. In contrast to previous work, our method does not require domain discretization or particle simulation. As a result, we can sample from the measure at each time step of the diffusion and compute its probability density. We demonstrate the performance of our algorithm by computing diffusions following the Fokker-Planck equation and apply it to unnormalized density sampling as well as nonlinear filtering.
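
For readers unfamiliar with the JKO scheme mentioned above, a single step can be written as follows. This is the standard textbook form; the symbols (step size h, functional F, iterate index k) are notation chosen here and may differ from the paper's.

```latex
\rho^{(k+1)} \;=\; \operatorname*{arg\,min}_{\rho}
  \; \Big[ \, F(\rho) \;+\; \frac{1}{2h}\, W_2^2\big(\rho,\ \rho^{(k)}\big) \Big]
```

Here F is the functional being minimized (a free-energy functional in the Fokker-Planck case) and W_2 is the Wasserstein-2 distance. The step is the Wasserstein-space analogue of an implicit (backward) Euler step of gradient descent.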

NeurIPS Conference 2021 Conference Paper

Manifold Topology Divergence: a Framework for Comparing Data Manifolds

  • Serguei Barannikov
  • Ilya Trofimov
  • Grigorii Sotnikov
  • Ekaterina Trimbach
  • Alexander Korotin
  • Alexander Filippov
  • Evgeny Burnaev

We propose a framework for comparing data manifolds, aimed, in particular, towards the evaluation of deep generative models. We describe a novel tool, Cross-Barcode(P, Q), that, given a pair of distributions in a high-dimensional space, tracks multiscale topology spatial discrepancies between manifolds on which the distributions are concentrated. Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence) and apply it to assess the performance of deep generative models in various domains: images, 3D-shapes, time-series, and on different datasets: MNIST, Fashion MNIST, SVHN, CIFAR10, FFHQ, market stock data, ShapeNet. We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance. Our algorithm scales well (essentially linearly) with the increase of the dimension of the ambient high-dimensional space. It is one of the first TDA-based methodologies that can be applied universally to datasets of different sizes and dimensions, including the ones on which the most recent GANs in the visual domain are trained. The proposed method is domain agnostic and does not rely on pre-trained networks.

IROS Conference 2021 Conference Paper

Random Fourier Features based SLAM

  • Yermek Kapushev
  • Anastasia Kishkun
  • Gonzalo Ferrer
  • Evgeny Burnaev

This work is dedicated to simultaneous continuous-time trajectory estimation and mapping based on Gaussian Processes (GP). State-of-the-art GP-based models for Simultaneous Localization and Mapping (SLAM) are computationally efficient but can only be used with a restricted class of kernel functions. This paper provides an algorithm based on GP with Random Fourier Features (RFF) approximation for SLAM without any constraints. The advantages of RFF for continuous-time SLAM are that we can consider a broader class of kernels and, at the same time, maintain computational complexity at a reasonably low level by operating in the Fourier space of features. The accuracy-speed trade-off can be controlled by the number of features. Our experimental results on synthetic and real-world benchmarks demonstrate the cases in which our approach provides better results compared to the current state-of-the-art.

ICLR Conference 2021 Conference Paper

Wasserstein-2 Generative Networks

  • Alexander Korotin
  • Vage Egiazarian
  • Arip Asadulaev
  • Alexander Safin
  • Evgeny Burnaev

We propose a novel end-to-end non-minimax algorithm for training optimal transport mappings for the quadratic cost (Wasserstein-2 distance). The algorithm uses input convex neural networks and a cycle-consistency regularization to approximate Wasserstein-2 distance. In contrast to popular entropic and quadratic regularizers, cycle-consistency does not introduce bias and scales well to high dimensions. From the theoretical side, we estimate the properties of the generative mapping fitted by our algorithm. From the practical side, we evaluate our algorithm on a wide range of tasks: image-to-image color transfer, latent space optimal transport, image-to-image style transfer, and domain adaptation.

ICML Conference 2020 Conference Paper

Bayesian Sparsification of Deep C-valued Networks

  • Ivan Nazarov
  • Evgeny Burnaev

With continual miniaturization, ever more applications of deep learning can be found in embedded systems, where it is common to encounter data with natural representation in the complex domain. To this end, we extend Sparse Variational Dropout to complex-valued neural networks and verify the proposed Bayesian technique by conducting a large numerical study of the performance-compression trade-off of C-valued networks on two tasks: image recognition on MNIST-like and CIFAR10 datasets and music transcription on MusicNet. We replicate the state-of-the-art result by Trabelsi et al. (2018) on MusicNet with a complex-valued network compressed by 50-100x at a small performance penalty.

ICML Conference 2018 Conference Paper

Anonymous Walk Embeddings

  • Sergey Ivanov
  • Evgeny Burnaev

The task of representing entire graphs has seen a surge of prominent results, mainly due to learning convolutional neural networks (CNNs) on graph-structured data. While CNNs demonstrate state-of-the-art performance in graph classification tasks, such methods are supervised and therefore steer away from the original problem of network representation in a task-agnostic manner. Here, we coherently propose an approach for embedding entire graphs and show that our feature representations with an SVM classifier increase classification accuracy of CNN algorithms and traditional graph kernels. For this we describe a recently discovered graph object, anonymous walk, on which we design task-independent algorithms for learning graph representations in an explicit and distributed way. Overall, our work represents a new scalable unsupervised learning of state-of-the-art representations of entire graphs.
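
The abstract does not define an anonymous walk. A minimal sketch of the usual definition: each node in a random walk is replaced by the index of its first appearance, so the walk keeps its revisit pattern but loses node identities. The function name `anonymous_walk` and the example walks are chosen here for illustration, and the 1-based indexing is an assumption about convention.

```python
def anonymous_walk(walk):
    """Replace each node by the (1-based) order in which it first appeared."""
    first_seen = {}
    pattern = []
    for node in walk:
        if node not in first_seen:
            # A new node gets the next unused index.
            first_seen[node] = len(first_seen) + 1
        pattern.append(first_seen[node])
    return tuple(pattern)


walk_1 = ["a", "b", "c", "b", "c"]
walk_2 = ["c", "d", "b", "d", "b"]
print(anonymous_walk(walk_1))  # (1, 2, 3, 2, 3)
print(anonymous_walk(walk_2))  # (1, 2, 3, 2, 3)
```

Two walks over different nodes map to the same anonymous walk when they revisit nodes in the same pattern, which is what makes the object usable as a node-label-free graph feature.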

NeurIPS Conference 2018 Conference Paper

Quadrature-based features for kernel approximation

  • Marina Munkhoeva
  • Yermek Kapushev
  • Evgeny Burnaev
  • Ivan Oseledets

We consider the problem of improving kernel approximation via randomized feature maps. These maps arise as Monte Carlo approximation to integral representations of kernel functions and scale up kernel methods for larger datasets. Based on an efficient numerical integration technique, we propose a unifying approach that reinterprets the previous random features methods and extends to better estimates of the kernel approximation. We derive the convergence behavior and conduct an extensive empirical study that supports our hypothesis.
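
As background for the "Monte Carlo approximation to integral representations" mentioned above, here is a minimal sketch of classic random Fourier features for the Gaussian (RBF) kernel. It illustrates the baseline that the paper generalizes, not the quadrature-based method itself; the function names, the lengthscale parameter, and the test points are assumptions made for illustration.

```python
import math
import random


def make_rff(dim, n_features, lengthscale, rng):
    """Build a random feature map z(.) with z(x) . z(y) ~= k(x, y).

    Target kernel: k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2)).
    Its spectral density is Gaussian with standard deviation 1 / lengthscale.
    """
    freqs = [
        [rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
        for _ in range(n_features)
    ]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(freqs, phases)
        ]

    return features


def rbf_kernel(x, y, lengthscale):
    """Exact Gaussian kernel value, for comparison."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


rng = random.Random(0)
features = make_rff(dim=3, n_features=5000, lengthscale=1.0, rng=rng)
x, y = [0.1, -0.4, 0.7], [0.3, 0.2, -0.1]
approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = rbf_kernel(x, y, 1.0)
print(approx, exact)  # the two values should be close
```

The approximation error shrinks roughly as one over the square root of the number of features, which is the Monte Carlo rate that better integration rules aim to improve on.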