Arrow Research

Author name cluster

Yusu Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
1 author row

Possible papers

14

NeurIPS Conference 2025 Conference Paper

Differentiable extensions with rounding guarantees for combinatorial optimization over permutations

  • Robert (Riley) Nerem
  • Zhishang Luo
  • Akbar Rafiey
  • Yusu Wang

Continuously extending combinatorial optimization objectives is a powerful technique commonly applied to the optimization of set functions. However, few such methods exist for extending functions on permutations, despite the fact that many combinatorial optimization problems, such as the quadratic assignment problem (QAP) and the traveling salesperson problem (TSP), are inherently optimization problems over permutations. We present Birkhoff Extension (BE), an almost-everywhere-differentiable, continuous, polytime-computable extension of any real-valued function on permutations to doubly stochastic matrices. Key to this construction is our introduction of a continuous variant of the well-known Birkhoff decomposition. Our extension has several nice properties making it appealing for optimization problems. First, BE provides a rounding guarantee, namely that any solution to the extension can be efficiently rounded to a permutation without increasing the function value. Furthermore, an approximate solution in the relaxed case gives rise to an approximate solution in the space of permutations. Second, using BE, any real-valued optimization objective on permutations can be extended to an almost-everywhere-differentiable objective function over the space of doubly stochastic matrices. This makes BE amenable not only to gradient-descent-based optimization, but also to unsupervised neural combinatorial optimization, where training often requires a differentiable loss. Third, based on the above properties, we present a simple optimization procedure which can be readily combined with existing optimization approaches to offer local improvements (i.e., the quality of the final solution is no worse than that of the initial solution). Finally, we also adapt our extension to optimization problems over a class of trees, such as Steiner tree and optimization-based hierarchical clustering. We present experimental results to verify our theoretical results on several combinatorial optimization problems related to permutations.
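
The rounding idea becomes concrete with the classical Birkhoff decomposition: if the extension value at a doubly stochastic matrix is a weighted average of the objective over the permutations in a decomposition, the best of those permutations can be no worse than that average. A minimal sketch under that assumption (the paper's continuous variant differs); `birkhoff_round` and the objective `f` are illustrative names:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def birkhoff_round(D, f, tol=1e-9, max_terms=100):
    """Round a doubly stochastic matrix D to a permutation via a greedy
    Birkhoff decomposition, keeping the permutation with the smallest
    objective f(perm). If the extension value at D is the weighted
    average of f over the decomposition, the result cannot be worse.
    """
    R = D.astype(float).copy()
    best_perm, best_val = None, np.inf
    for _ in range(max_terms):
        if R.max() < tol:
            break
        # Max-weight assignment restricted to positive residual entries
        # (np.inf marks forbidden near-zero entries for the solver).
        cost = np.where(R > tol, -R, np.inf)
        rows, cols = linear_sum_assignment(cost)
        perm = cols                      # rows come back as 0..n-1
        val = f(perm)
        if val < best_val:
            best_perm, best_val = perm, val
        # Remove this permutation with its bottleneck coefficient.
        R[rows, cols] -= R[rows, cols].min()
    return best_perm, best_val
```

For a QAP instance with flow matrix A and distance matrix B, a hypothetical objective would be `f = lambda p: np.sum(A * B[np.ix_(p, p)])`.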

NeurIPS Conference 2025 Conference Paper

Effective Neural Approximations for Geometric Optimization Problems

  • Samantha Chen
  • Oren Ciolli
  • Anastasios Sidiropoulos
  • Yusu Wang

Neural networks offer a promising data-driven approach to tackle computationally challenging optimization problems. In this work, we introduce neural approximation frameworks for a family of geometric "extent measure" problems, including shape-fitting descriptors (e.g., the minimum enclosing ball or annulus). Central to our approach is the *alignment* of our neural model with a new variant of the classical $\varepsilon$-kernel technique from computational geometry. In particular, we develop a new relaxed-$\varepsilon$-kernel theory that maintains the approximation guarantees of the classical $\varepsilon$-kernels but with the crucial benefit that it can be easily implemented with *bounded model complexity* (i.e., a constant number of parameters) by the simple SumFormer neural network. This yields a simple neural model that approximates quantities such as the directional width of any input point set and empirically shows excellent out-of-distribution generalization. Many geometric extent measures, such as the minimum enclosing spherical shell, cannot be directly captured by $\varepsilon$-kernels. To this end, we show that an encode-process-decode framework with our kernel-approximating NN used as the "process" module can approximate such extent measures, again with bounded model complexity, where parameters scale only with the approximation error $\varepsilon$ and not the size of the input set. Empirical results on diverse point-cloud datasets demonstrate the practical performance of our models.
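
For intuition, the directional width that $\varepsilon$-kernels preserve is a one-liner, and a crude coreset can be built by keeping extreme points along sampled directions. A toy sketch of the classical idea only, not the paper's relaxed-$\varepsilon$-kernel theory or its SumFormer implementation; all names are illustrative:

```python
import numpy as np

def directional_width(P, u):
    """Width of point set P (n x d) along unit direction u."""
    proj = P @ u
    return proj.max() - proj.min()

def direction_sample_kernel(P, num_dirs=64, seed=0):
    """Naive coreset: keep the extreme points of P along randomly
    sampled directions. With enough directions this approximately
    preserves directional widths; it is a simplified stand-in for
    the classical epsilon-kernel construction.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(size=(num_dirs, P.shape[1]))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    proj = P @ U.T                       # (n, num_dirs) projections
    idx = np.unique(np.concatenate([proj.argmax(0), proj.argmin(0)]))
    return P[idx]
```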

JMLR Journal 2025 Journal Article

Enhancing Graph Representation Learning with Localized Topological Features

  • Zuoyu Yan
  • Qi Zhao
  • Ze Ye
  • Tengfei Ma
  • Liangcai Gao
  • Zhi Tang
  • Yusu Wang
  • Chao Chen

Representation learning on graphs is a fundamental problem that can be crucial in various tasks. Graph neural networks, the dominant approach for graph representation learning, are limited in their representation power. Therefore, it can be beneficial to explicitly extract and incorporate high-order topological and geometric information into these models. In this paper, we propose a principled approach to extract the rich connectivity information of graphs based on the theory of persistent homology. Our method utilizes the topological features to enhance the representation learning of graph neural networks and achieves state-of-the-art performance on various node classification and link prediction benchmarks. We also explore the option of end-to-end learning of the topological features, i.e., treating topological computation as a differentiable operator during learning. Our theoretical analysis and empirical study provide insights and potential guidelines for employing topological features in graph learning tasks.
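
The persistence computation being used here has a compact exact form in dimension zero: a union-find sweep with the elder rule over a node filtration. A textbook sketch of that standard computation (not the paper's full pipeline), assuming nodes keyed by hashable identifiers:

```python
def zero_dim_persistence(filtration, edges):
    """0-dimensional persistence of a node-filtered graph via
    union-find and the elder rule: each merge kills the component
    with the younger (larger) birth value.

    filtration : dict node -> float (function value per node)
    edges      : iterable of (u, v) pairs
    Returns a list of (birth, death) pairs for the finite bars.
    """
    parent = {v: v for v in filtration}
    birth = dict(filtration)             # birth value of each root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    pairs = []
    # In a lower-star filtration an edge appears at the max value of
    # its endpoints, so process edges in that order.
    for u, v in sorted(edges, key=lambda e: max(filtration[e[0]], filtration[e[1]])):
        t = max(filtration[u], filtration[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                      # cycle edge: no 0-dim event
        if birth[ru] > birth[rv]:
            ru, rv = rv, ru               # ru is now the elder root
        pairs.append((birth[rv], t))      # younger component dies at t
        parent[rv] = ru
    return pairs
```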

NeurIPS Conference 2025 Conference Paper

Seeds of Structure: Patch PCA Reveals Universal Compositional Cues in Diffusion Models

  • Qingsong Wang
  • Zhengchao Wan
  • Misha Belkin
  • Yusu Wang

Diffusion models transform random noise into images of remarkable fidelity, yet the structure of this noise-to-image map remains largely unexplored. We investigate this relationship using patch-wise Principal Component Analysis (PCA) and empirically demonstrate that low-frequency components of the initial noise predominantly influence the compositional structure of generated images. Our analyses reveal that noise seeds inherently contain universal compositional cues, evident when identical seeds produce images with similar structural attributes across different datasets and model architectures. Leveraging these insights, we develop and theoretically justify a simple yet effective Patch PCA denoiser that extracts underlying structure from noise using only generic natural image statistics. The robustness of these structural cues is observed to persist across both pixel-space models and latent diffusion models, highlighting their fundamental nature. Finally, we introduce a zero-shot editing method that enables injecting compositional control over generated images, providing an intuitive approach to guided generation without requiring model fine-tuning or additional training.
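
As a rough illustration of the patch-wise PCA analysis, one can project image patches onto a few principal components and look at the surviving low-frequency structure. A toy sketch that fits the PCA basis on the image's own patches; the paper's denoiser instead uses bases from generic natural image statistics:

```python
import numpy as np

def patch_pca_denoise(img, patch=8, keep=4):
    """Schematic patch-wise PCA 'denoiser': project non-overlapping
    patches of a (H, W) array onto their top principal components,
    keeping only the dominant (low-frequency) structure.
    """
    H, W = img.shape
    H, W = H - H % patch, W - W % patch
    # Collect non-overlapping patches as rows of a matrix.
    blocks = (img[:H, :W]
              .reshape(H // patch, patch, W // patch, patch)
              .transpose(0, 2, 1, 3)
              .reshape(-1, patch * patch))
    mean = blocks.mean(axis=0)
    U, S, Vt = np.linalg.svd(blocks - mean, full_matrices=False)
    # Reconstruct each patch from its top `keep` components.
    recon = (U[:, :keep] * S[:keep]) @ Vt[:keep] + mean
    return (recon.reshape(H // patch, W // patch, patch, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(H, W))
```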

AAAI Conference 2024 Conference Paper

Learning Ultrametric Trees for Optimal Transport Regression

  • Samantha Chen
  • Puoya Tabaghi
  • Yusu Wang

Optimal transport provides a metric which quantifies the dissimilarity between probability measures. For measures supported in discrete metric spaces, finding the optimal transport distance has cubic time complexity in the size of the space. However, measures supported on trees admit a closed-form optimal transport that can be computed in linear time. In this paper, we aim to find an optimal tree structure for a given discrete metric space so that the tree-Wasserstein distance approximates the optimal transport distance in the original space. One of our key ideas is to cast the problem in ultrametric spaces. This helps us optimize over the space of ultrametric trees (a mixed discrete and continuous optimization problem) via projected gradient descent over the space of ultrametric matrices. During optimization, we project the parameters to the ultrametric space via a hierarchical minimum spanning tree algorithm, equivalent to the closest projection to ultrametrics under the supremum norm. Experimental results on real datasets show that our approach outperforms previous approaches (e.g., Flowtree, Quadtree) in approximating optimal transport distances. Finally, experiments on synthetic data generated on ground truth trees show that our algorithm can accurately uncover the underlying trees.
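
The MST-based projection step has a well-known closed form: single-linkage cophenetic distances give the subdominant ultrametric (the maximal ultrametric dominated by the input metric), which is the classical minimum-spanning-tree construction the abstract refers to. A sketch using SciPy, assuming a symmetric distance matrix as input:

```python
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

def project_to_ultrametric(D):
    """Project a symmetric distance matrix D onto the ultrametrics
    using single linkage: the cophenetic distances of the resulting
    hierarchy form the subdominant ultrametric of D.
    """
    condensed = squareform(D, checks=False)   # upper-triangle vector
    Z = linkage(condensed, method="single")   # MST-based hierarchy
    return squareform(cophenet(Z))            # ultrametric matrix
```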

AAAI Conference 2024 Conference Paper

NN-Steiner: A Mixed Neural-Algorithmic Approach for the Rectilinear Steiner Minimum Tree Problem

  • Andrew B. Kahng
  • Robert R. Nerem
  • Yusu Wang
  • Chien-Yi Yang

Recent years have witnessed rapid advances in the use of neural networks to solve combinatorial optimization problems. Nevertheless, designing the "right" neural model that can effectively handle a given optimization problem can be challenging, and often there is no theoretical understanding or justification of the resulting neural model. In this paper, we focus on the rectilinear Steiner minimum tree (RSMT) problem, which is of critical importance in IC layout design and as a result has attracted numerous heuristic approaches in the VLSI literature. Our contributions are two-fold. On the methodology front, we propose NN-Steiner, a novel mixed neural-algorithmic framework for computing RSMTs that leverages the celebrated PTAS algorithmic framework of Arora to solve this problem (and other geometric optimization problems). Our NN-Steiner replaces key algorithmic components within Arora's PTAS by suitable neural components. In particular, NN-Steiner only needs four neural network (NN) components that are called repeatedly within an algorithmic framework. Crucially, each of the four NN components is only of bounded size independent of input size, and thus easy to train. Furthermore, as each NN component is learning a generic algorithmic step, once learned, the resulting mixed neural-algorithmic framework generalizes to much larger instances not seen in training. Our NN-Steiner, to the best of our knowledge, is the first neural architecture of bounded size that has the capacity to approximately solve RSMT (and variants). On the empirical front, we show how NN-Steiner can be implemented and demonstrate the effectiveness of our resulting approach, especially in terms of generalization, by comparing with state-of-the-art methods (both neural and non-neural).
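
To make the optimization target concrete: the rectilinear minimum spanning tree (RMST) is computable in polynomial time and classically bounds the RSMT, whose cost is at least 2/3 of the RMST cost (Hwang's ratio). A baseline sketch for orientation only; it is not NN-Steiner:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def rmst_length(points):
    """Length of the rectilinear (L1) minimum spanning tree over a set
    of 2-D terminals, a classical upper bound of 1.5x on the RSMT cost.
    Assumes distinct points (zero entries are treated as non-edges).
    """
    P = np.asarray(points, dtype=float)
    # Pairwise L1 distances, then an MST over the complete graph.
    D = np.abs(P[:, None, :] - P[None, :, :]).sum(axis=-1)
    return minimum_spanning_tree(D).sum()
```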

NeurIPS Conference 2023 Conference Paper

Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions

  • Samantha Chen
  • Yusu Wang

Learning distance functions between complex objects, such as the Wasserstein distance to compare point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions, e.g., permutations or rigid transformations. Therefore, continuous and symmetric *product* functions (such as distance functions) on such complex objects must also be invariant to the *product* of such group actions. We call these functions symmetric and factor-wise group invariant functions (SFGI functions for short). In this paper, we first present a general neural network architecture for approximating SFGI functions. The main contribution of this paper combines this general NN with a sketching idea in order to develop a specific and efficient neural network which can approximate the $p$-th Wasserstein distance between point sets. Very importantly, the required model complexity is *independent* of the sizes of the input point sets. On the theoretical front, to the best of our knowledge, this is the first result showing that there exists a neural network with the capacity to approximate Wasserstein distance with bounded model complexity. Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions. On the empirical front, we present a range of results showing that our newly proposed neural network architecture performs comparably to or better than other models (including a SOTA Siamese Autoencoder based approach). In particular, our NN generalizes significantly better and trains much faster than the SOTA Siamese AE. Finally, this line of investigation could be useful in exploring effective neural network design for solving a broad range of geometric optimization problems (e.g., $k$-means in a metric space).
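
The target function itself, the $p$-th Wasserstein distance between uniform measures on equal-size point sets, reduces to a minimum-cost perfect matching. A reference sketch of the exact computation the network learns to approximate, assuming equal sizes and uniform weights:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def wasserstein_p(X, Y, p=1):
    """Exact p-Wasserstein distance between uniform measures on two
    equal-size point sets, via a min-cost perfect matching on the
    matrix of p-th-power Euclidean distances.
    """
    assert len(X) == len(Y), "uniform-measure case needs equal sizes"
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1) ** p
    rows, cols = linear_sum_assignment(C)
    return C[rows, cols].mean() ** (1.0 / p)
```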

NeurIPS Conference 2022 Conference Paper

Neural Approximation of Graph Topological Features

  • Zuoyu Yan
  • Tengfei Ma
  • Liangcai Gao
  • Zhi Tang
  • Yusu Wang
  • Chao Chen

Topological features based on persistent homology capture high-order structural information so as to augment graph neural network methods. However, computing extended persistent homology summaries remains slow for large and dense graphs and can be a serious bottleneck for the learning pipeline. Inspired by recent success in neural algorithmic reasoning, we propose a novel graph neural network to estimate extended persistence diagrams (EPDs) on graphs efficiently. Our model is built on algorithmic insights, and benefits from better supervision and closer alignment with the EPD computation algorithm. We validate our method with convincing empirical results on approximating EPDs and downstream graph representation learning tasks. Our method is also efficient; on large and dense graphs, we accelerate the computation by nearly 100 times.

NeurIPS Conference 2021 Conference Paper

NN-Baker: A Neural-network Infused Algorithmic Framework for Optimization Problems on Geometric Intersection Graphs

  • Evan McCarty
  • Qi Zhao
  • Anastasios Sidiropoulos
  • Yusu Wang

Recent years have witnessed a surge of approaches that use neural networks to help tackle combinatorial optimization problems, including graph optimization problems. However, theoretical understanding of such approaches remains limited. In this paper, we consider the geometric setting, where graphs are induced by points in a fixed-dimensional Euclidean space. We show that several graph optimization problems can be approximated by an algorithm that is polynomial in the graph size n via a framework we propose, called the Baker-paradigm. More importantly, a key advantage of the Baker-paradigm is that it decomposes the input problem into (at most a linear number of) small sub-problems of fixed size (independent of the size of the input). For the family of such fixed-size sub-problems, we can now design neural networks with universal approximation guarantees to solve them. This leads to a mixed algorithmic-ML framework, which we call NN-Baker, that has the capacity to approximately solve a family of graph optimization problems (e.g., maximum independent set and minimum vertex cover) in time linear in the input graph size and only polynomial in the approximation parameter. We instantiate NN-Baker with a CNN version and a GNN version, and demonstrate the effectiveness and efficiency of our approach via a range of experiments.
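
The decomposition idea can be sketched directly: cut the plane into cells, make cells independent by discarding a boundary strip, and solve each small cell exactly. A single-shift toy version for maximum independent set on a unit-disk graph; the full paradigm tries several grid shifts and keeps the best, and all function names here are illustrative:

```python
import numpy as np
from itertools import combinations

def mis_bruteforce(points, r):
    """Exact maximum independent set of a unit-disk graph on a tiny
    point set (two points conflict if within distance r)."""
    n = len(points)
    D = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    for k in range(n, 0, -1):
        for subset in combinations(range(n), k):
            if all(D[i, j] > r for i, j in combinations(subset, 2)):
                return list(subset)
    return []

def baker_style_mis(points, r, cell=8.0):
    """One shift of a Baker-style decomposition: drop points within r
    of a cell boundary so cells become independent, then solve each
    small cell exactly and take the union of the solutions.
    """
    points = np.asarray(points, dtype=float)
    ids = np.floor(points / cell).astype(int)      # cell index per point
    offset = points - ids * cell                   # position within cell
    interior = np.all((offset > r) & (offset < cell - r), axis=1)
    solution = []
    for c in {tuple(t) for t in ids[interior]}:
        idx = np.flatnonzero(interior & np.all(ids == c, axis=1))
        chosen = mis_bruteforce(points[idx], r)
        solution.extend(idx[i] for i in chosen)
    return solution
```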

IJCAI Conference 2019 Conference Paper

Heuristic Search for Homology Localization Problem and Its Application in Cardiac Trabeculae Reconstruction

  • Xudong Zhang
  • Pengxiang Wu
  • Changhe Yuan
  • Yusu Wang
  • Dimitris Metaxas
  • Chao Chen

Cardiac trabeculae are fine rod-like muscles whose ends are attached to the inner walls of ventricles. Accurate extraction of trabeculae is important yet challenging, due to the background noise and limited resolution of cardiac images. Existing work handles this task by modeling the trabeculae as topological handles for better extraction. Computing optimal representations of these handles is essential yet very expensive. In this work, we formulate the problem as a heuristic search problem, and propose novel heuristic functions based on advanced topological techniques. We show in experiments that the proposed heuristic functions improve the computation in both time and memory.
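
The search component is ordinary best-first search; the paper's contribution lies in the topology-aware heuristics plugged into it. A generic A* skeleton with placeholder `neighbors`, `heuristic`, and `is_goal` callables (all assumptions, not the paper's implementation); states must be hashable:

```python
import heapq
from itertools import count

def a_star(start, neighbors, heuristic, is_goal):
    """Generic A*: `neighbors(s)` yields (state, step_cost) pairs and
    `heuristic(s)` lower-bounds the remaining cost to a goal."""
    tie = count()                              # heap tie-breaker
    frontier = [(heuristic(start), next(tie), 0.0, start)]
    best = {start: 0.0}
    while frontier:
        _, _, g, s = heapq.heappop(frontier)
        if is_goal(s):
            return s, g
        if g > best.get(s, float("inf")):
            continue                           # stale queue entry
        for t, c in neighbors(s):
            if g + c < best.get(t, float("inf")):
                best[t] = g + c
                heapq.heappush(frontier,
                               (g + c + heuristic(t), next(tie), g + c, t))
    return None, float("inf")
```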

NeurIPS Conference 2019 Conference Paper

Learning metrics for persistence-based summaries and applications for graph classification

  • Qi Zhao
  • Yusu Wang

Recently a new feature representation and data analysis methodology based on a topological tool called persistent homology (and its persistence diagram summary) has gained much momentum. A series of methods have been developed to map a persistence diagram to a vector representation so as to facilitate the downstream use of machine learning tools. In these approaches, the importance (weight) of different persistence features is usually pre-set. However, in practice the choice of the weight function should often depend on the nature of the specific data at hand. It is thus highly desirable to learn the best weight function (and thus metric for persistence diagrams) from labelled data. We study this problem and develop a new weighted kernel, called WKPI, for persistence summaries, as well as an optimization framework to learn the weight (and thus the kernel). We apply the learned kernel to the challenging task of graph classification, and show that our WKPI-based classification framework obtains similar or (sometimes significantly) better results than the best results from a range of previous graph classification frameworks on a collection of benchmark datasets.
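
The flavor of a weighted persistence kernel can be sketched as a sum of Gaussian similarities between diagram points, each scaled by a weight function that would be learned from labels. A schematic in the spirit of WKPI, not its exact definition:

```python
import numpy as np

def weighted_diagram_kernel(D1, D2, w, sigma=1.0):
    """Schematic weighted kernel between two persistence diagrams
    (arrays of (birth, death) points): each pair of points contributes
    a Gaussian similarity scaled by a weight function w on each point.
    """
    K = 0.0
    for p in D1:
        for q in D2:
            K += w(p) * w(q) * np.exp(-np.sum((p - q) ** 2) / sigma)
    return K

# Example weight emphasizing high-persistence features; in WKPI the
# weight function is learned rather than fixed like this.
w = lambda p: p[1] - p[0]
```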

NeurIPS Conference 2016 Conference Paper

Graphons, mergeons, and so on!

  • Justin Eldridge
  • Mikhail Belkin
  • Yusu Wang

In this work we develop a theory of hierarchical clustering for graphs. Our modelling assumption is that graphs are sampled from a graphon, which is a powerful and general model for generating graphs and analyzing large networks. Graphons are a far richer class of graph models than stochastic blockmodels, the primary setting for recent progress in the statistical theory of graph clustering. We define what it means for an algorithm to produce the "correct" clustering, give sufficient conditions under which a method is statistically consistent, and provide an explicit algorithm satisfying these properties.
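
Sampling from a graphon is the standard generative step this theory builds on: draw a uniform label per node and connect pairs independently with probability given by the graphon. A short sketch, with a piecewise-constant graphon (a two-block stochastic blockmodel) as the example:

```python
import numpy as np

def sample_graphon(W, n, seed=0):
    """Sample an n-node graph from a graphon W: [0,1]^2 -> [0,1].
    Assumes W is vectorized over numpy arrays."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=n)                # latent node labels
    P = W(x[:, None], x[None, :])          # edge probability matrix
    A = (rng.uniform(size=(n, n)) < P).astype(int)
    A = np.triu(A, 1)                      # one independent draw per pair
    return A + A.T

# A two-block SBM as a piecewise-constant graphon.
sbm = lambda x, y: np.where((x < 0.5) == (y < 0.5), 0.8, 0.1)
```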

NeurIPS Conference 2014 Conference Paper

Learning with Fredholm Kernels

  • Qichao Que
  • Mikhail Belkin
  • Yusu Wang

In this paper we propose a framework for supervised and semi-supervised learning based on reformulating the learning problem as a regularized Fredholm integral equation. Our approach fits naturally into the kernel framework and can be interpreted as constructing new data-dependent kernels, which we call Fredholm kernels. We proceed to discuss the "noise assumption" for semi-supervised learning and provide evidence, both theoretical and experimental, that Fredholm kernels can effectively utilize unlabeled data under the noise assumption. We demonstrate that methods based on Fredholm learning show very competitive performance in the standard semi-supervised learning setting.
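
A data-dependent kernel in this spirit can be sketched as an outer kernel composed with an inner kernel evaluated on unlabeled points, $k_F(x, z) = \frac{1}{u^2} \sum_{i,j} k(x, u_i)\, k_H(u_i, u_j)\, k(u_j, z)$. The sketch below assumes this discretized form with Gaussian kernels and illustrative bandwidths; it is not presented as the paper's exact construction:

```python
import numpy as np

def gauss(A, B, s):
    """Gaussian kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * s ** 2))

def fredholm_kernel(X, Z, U, s_out=1.0, s_in=1.0):
    """Fredholm-style data-dependent kernel between rows of X and Z,
    averaging an inner kernel over the unlabeled sample U."""
    Kxu = gauss(X, U, s_out)     # k(x, u_i)
    Kuu = gauss(U, U, s_in)      # inner kernel on unlabeled data
    Kuz = gauss(U, Z, s_out)     # k(u_j, z)
    return (Kxu @ Kuu @ Kuz) / (len(U) ** 2)
```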

NeurIPS Conference 2011 Conference Paper

Data Skeletonization via Reeb Graphs

  • Xiaoyin Ge
  • Issam Safa
  • Mikhail Belkin
  • Yusu Wang

Recovering hidden structure from complex and noisy non-linear data is one of the most fundamental problems in machine learning and statistical inference. While such data is often high-dimensional, it is of interest to approximate it with a low-dimensional or even one-dimensional space, since many important aspects of data are often intrinsically low-dimensional. Furthermore, there are many scenarios where the underlying structure is graph-like, e.g., river/road networks or various trajectories. In this paper, we develop a framework to extract, as well as to simplify, a one-dimensional "skeleton" from unorganized data using the Reeb graph. Our algorithm is very simple, does not require complex optimizations and can be easily applied to unorganized high-dimensional data such as point clouds or proximity graphs. It can also represent arbitrary graph structures in the data. We also give theoretical results to justify our method. We provide a number of experiments to demonstrate the effectiveness and generality of our algorithm, including comparisons to existing methods, such as principal curves. We believe that the simplicity and practicality of our algorithm will help to promote skeleton graphs as a data analysis tool for a broad range of applications.
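
A standard discrete stand-in for the Reeb graph is the Mapper construction: cover the range of a filter function with overlapping intervals, split each preimage into connected components of a neighborhood graph (the nodes), and link components that share points (the edges). A toy sketch under those assumptions, not the paper's exact algorithm:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def reeb_like_skeleton(P, f, eps=0.5, n_bins=10, overlap=0.3):
    """Mapper-style 1-D skeleton of a point cloud P (n x d) with
    per-point filter values f (length n). Returns (nodes, edges),
    where nodes are sets of point indices.
    """
    lo, hi = f.min(), f.max()
    width = (hi - lo) / n_bins
    nodes, edges = [], set()
    for b in range(n_bins):
        # Overlapping interval of the filter range.
        sel = np.flatnonzero((f >= lo + (b - overlap) * width) &
                             (f <= lo + (b + 1 + overlap) * width))
        if len(sel) == 0:
            continue
        Q = P[sel]
        D = np.linalg.norm(Q[:, None] - Q[None, :], axis=-1)
        _, labels = connected_components(csr_matrix(D <= eps))
        for c in np.unique(labels):
            members = set(sel[labels == c])
            # Link this cluster to earlier clusters sharing points.
            for j, other in enumerate(nodes):
                if members & other:
                    edges.add((j, len(nodes)))
            nodes.append(members)
    return nodes, edges
```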