Arrow Research search

Author name cluster

Xiaoming Huo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
2 author rows

Possible papers

13

NeurIPS Conference 2025 Conference Paper

Kernel-based Equalized Odds: A Quantification of Accuracy-Fairness Trade-off in Fair Representation Learning

  • Yijin Ni
  • Xiaoming Huo

This paper introduces a novel kernel-based formulation of the Equalized Odds (EO) criterion, denoted $\operatorname{EO}_k$, for fair representation learning (FRL) in supervised settings. The central goal of FRL is to mitigate discrimination with respect to a sensitive attribute $S$ while preserving prediction accuracy for the target variable $Y$. Our proposed criterion enables a rigorous and interpretable quantification of three core fairness objectives: independence ($\widehat{Y} \perp S$), separation, also known as equalized odds ($\widehat{Y} \perp S \mid Y$), and calibration ($Y \perp S \mid \widehat{Y}$). We show that under the unbiased condition ($Y \perp S$), $\operatorname{EO}_k$ satisfies both independence and separation, while under the biased condition ($Y \not\perp S$) it uniquely preserves predictive accuracy while lower-bounding independence and calibration, thereby offering a unified analytical characterization of the trade-offs among these fairness criteria. We further define the empirical counterpart, $\widehat{\operatorname{EO}}_k$, a kernel-based statistic that can be computed in quadratic time, with linear-time approximations also available. A concentration inequality for $\widehat{\operatorname{EO}}_k$ is derived, providing performance guarantees and error bounds that serve as practical certificates of fairness compliance. While our focus is on theoretical development, the results lay essential groundwork for principled and provably fair algorithmic design in future empirical studies.
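
The exact estimator $\widehat{\operatorname{EO}}_k$ is defined in the paper; as a rough illustration of how a quadratic-time kernel statistic can quantify a separation (equalized-odds) violation, the following numpy sketch averages, over label strata, a squared MMD between the prediction distributions of the two sensitive groups. The kernel choice, bandwidth, and all names here are illustrative assumptions, not the paper's construction.

```python
# Hypothetical sketch: a quadratic-time, MMD-style proxy for an
# equalized-odds violation, computed per label stratum. The paper's
# EO_k estimator may differ from this construction.
import numpy as np

def gaussian_kernel(a, b, bw=1.0):
    # k(a, b) = exp(-(a - b)^2 / (2 bw^2)) for 1-D score vectors
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2 * bw**2))

def mmd2(u, v, bw=1.0):
    # Biased squared MMD between two samples of scalar predictions.
    return (gaussian_kernel(u, u, bw).mean()
            + gaussian_kernel(v, v, bw).mean()
            - 2 * gaussian_kernel(u, v, bw).mean())

def eo_proxy(scores, s, y, bw=1.0):
    # Average over labels of the MMD between the two sensitive groups'
    # score distributions; large values indicate a separation violation.
    vals = []
    for label in np.unique(y):
        m = y == label
        u, v = scores[m & (s == 0)], scores[m & (s == 1)]
        if len(u) and len(v):
            vals.append(mmd2(u, v, bw))
    return float(np.mean(vals))

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 400)
s = rng.integers(0, 2, 400)
scores = y + 0.3 * s + rng.normal(0, 0.5, 400)  # scores leak s given y
print(eo_proxy(scores, s, y))                   # clearly above zero here
```

Values near zero would indicate that $\widehat{Y}$ carries little group information once $Y$ is fixed.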

NeurIPS Conference 2025 Conference Paper

PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation

  • Ziyan Wang
  • Sizhe Wei
  • Xiaoming Huo
  • Hao Wang

Diffusion models have made significant advancements in recent years. However, their performance often deteriorates when trained or fine-tuned on imbalanced datasets. This degradation is largely due to the disproportionate representation of majority and minority data in image-text pairs. In this paper, we propose a general fine-tuning approach, dubbed PoGDiff, to address this challenge. Rather than directly minimizing the KL divergence between the predicted and ground-truth distributions, PoGDiff replaces the ground-truth distribution with a Product of Gaussians (PoG), which is constructed by combining the original ground-truth targets with the predicted distribution conditioned on a neighboring text embedding. Experiments on real-world datasets demonstrate that our method effectively addresses the imbalance problem in diffusion models, improving both generation accuracy and quality.
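
The Product-of-Gaussians construction rests on a standard identity: the normalized product of two Gaussian densities is again Gaussian, with precisions that add and a precision-weighted mean. Here is a minimal sketch of that identity with illustrative values; PoGDiff's actual use of it, combining the ground-truth target with a prediction conditioned on a neighboring text embedding inside the diffusion objective, is more involved.

```python
# The Gaussian product identity PoGDiff builds on: N(mu1, var1) * N(mu2, var2)
# is proportional to N(mu, var) with added precisions and a precision-weighted
# mean. Values below are illustrative only.
def product_of_gaussians(mu1, var1, mu2, var2):
    prec = 1.0 / var1 + 1.0 / var2   # precisions add
    var = 1.0 / prec
    mu = var * (mu1 / var1 + mu2 / var2)
    return mu, var

# e.g. fusing a ground-truth target with a neighboring-embedding prediction:
mu, var = product_of_gaussians(mu1=0.0, var1=1.0, mu2=2.0, var2=0.25)
print(mu, var)  # -> 1.6 0.2: pulled toward the lower-variance component
```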

ICLR Conference 2025 Conference Paper

Towards Domain Adaptive Neural Contextual Bandits

  • Ziyan Wang
  • Xiaoming Huo
  • Hao Wang 0014

Contextual bandit algorithms are essential for solving real-world decision-making problems. In practice, collecting a contextual bandit's feedback from different domains may involve different costs: for example, measuring drug reactions in mice (a source domain) versus in humans (a target domain). Unfortunately, adapting a contextual bandit algorithm from a source domain to a target domain with distribution shift remains a major challenge and is largely unexplored. In this paper, we introduce the first general domain adaptation method for contextual bandits. Our approach learns a bandit model for the target domain by collecting feedback from the source domain. Our theoretical analysis shows that our algorithm maintains a sub-linear regret bound even when adapting across domains. Empirical results show that our approach outperforms state-of-the-art contextual bandit algorithms on real-world datasets. Code will soon be available at https://github.com/Wang-ML-Lab/DABand.
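
For background, here is a minimal numpy sketch of a classic single-domain LinUCB-style contextual bandit, the kind of baseline a domain-adaptive method would extend; this is not the DABand algorithm, and all parameter choices are illustrative.

```python
# Minimal single-domain LinUCB-style loop (classic baseline, not DABand).
import numpy as np

rng = np.random.default_rng(1)
d, n_arms, T, alpha = 5, 4, 2000, 1.0
theta_true = rng.normal(size=(n_arms, d))        # unknown reward parameters

A = np.stack([np.eye(d) for _ in range(n_arms)]) # per-arm Gram matrices
b = np.zeros((n_arms, d))
regret = 0.0

for t in range(T):
    x = rng.normal(size=d)                       # observed context
    means = theta_true @ x                       # true expected rewards
    ucb = np.empty(n_arms)
    for arm in range(n_arms):
        A_inv = np.linalg.inv(A[arm])
        theta_hat = A_inv @ b[arm]               # ridge estimate
        ucb[arm] = theta_hat @ x + alpha * np.sqrt(x @ A_inv @ x)
    a = int(np.argmax(ucb))                      # optimistic arm choice
    r = means[a] + rng.normal(0, 0.1)            # noisy feedback
    A[a] += np.outer(x, x)
    b[a] += r * x
    regret += means.max() - means[a]

print(f"total regret over {T} rounds: {regret:.1f}")
```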

JMLR Journal 2024 Journal Article

Adjusted Wasserstein Distributionally Robust Estimator in Statistical Learning

  • Yiling Xie
  • Xiaoming Huo

We propose an adjusted Wasserstein distributionally robust estimator, based on a nonlinear transformation of the Wasserstein distributionally robust (WDRO) estimator in statistical learning. The classic WDRO estimator is asymptotically biased, while our adjusted WDRO estimator is asymptotically unbiased, resulting in a smaller asymptotic mean squared error. Further, under certain conditions, our adjustment technique provides a general principle for de-biasing asymptotically biased estimators. Specifically, we investigate how the adjusted WDRO estimator is developed for generalized linear models, including logistic, linear, and Poisson regression. Numerical experiments demonstrate the favorable practical performance of the adjusted estimator over the classic one.
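
For context, the classic WDRO estimator being adjusted admits a well-known regularized reformulation in linear regression: under a suitable transport cost, the worst-case objective reduces to $(\sqrt{\mathrm{MSE}} + \delta \lVert \beta \rVert_2)^2$. The sketch below fits that baseline objective and contrasts it with ordinary least squares; the radius $\delta$ and data are illustrative, and the paper's de-biasing adjustment itself is not shown.

```python
# Hedged sketch of the classic (unadjusted) WDRO linear-regression objective
# in its regularized dual form; the paper's adjustment de-biases this.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d, delta = 200, 3, 0.1
X = rng.normal(size=(n, d))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(0, 0.5, n)

def wdro_objective(beta):
    # (sqrt(empirical MSE) + delta * ||beta||_2)^2
    rmse = np.sqrt(np.mean((y - X @ beta) ** 2))
    return (rmse + delta * np.linalg.norm(beta)) ** 2

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_wdro = minimize(wdro_objective, beta_ols, method="BFGS").x
print("OLS: ", np.round(beta_ols, 3))
print("WDRO:", np.round(beta_wdro, 3))  # shrunk toward zero, hence biased
```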

JMLR Journal 2024 Journal Article

Classification of Data Generated by Gaussian Mixture Models Using Deep ReLU Networks

  • Tian-Yi Zhou
  • Xiaoming Huo

This paper studies the binary classification of unbounded data from ${\mathbb R}^d$ generated under Gaussian Mixture Models (GMMs) using deep ReLU neural networks. We obtain, for the first time, non-asymptotic upper bounds and convergence rates for the excess risk (excess misclassification error) of this classification problem, without restrictions on the model parameters. While the majority of existing generalization analyses of classification algorithms rely on a bounded domain, we handle an unbounded domain by leveraging the analyticity and fast decay of Gaussian distributions. To facilitate our analysis, we give a novel approximation error bound for general analytic functions using ReLU networks, which may be of independent interest. Gaussian distributions are a natural model for data arising in applications such as speech, images, and text; our results provide a theoretical verification of the observed efficiency of deep neural networks in practical classification problems.
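
As a toy instance of the setting analyzed (not the paper's construction), the sketch below samples two-class GMM data in ${\mathbb R}^2$ and trains a small two-layer ReLU network by full-batch gradient descent on the logistic loss; all sizes and step sizes are illustrative.

```python
# Toy GMM binary classification with a two-layer ReLU network in numpy.
import numpy as np

rng = np.random.default_rng(0)
n, d, h, lr, steps = 500, 2, 32, 0.5, 500

y = rng.integers(0, 2, n)                      # balanced class labels
means = np.array([[-1.5, 0.0], [1.5, 0.0]])    # the two mixture means
X = means[y] + rng.normal(size=(n, d))         # unit-variance components

W1 = rng.normal(0, 1 / np.sqrt(d), (d, h)); b1 = np.zeros(h)
w2 = rng.normal(0, 1 / np.sqrt(h), h);      b2 = 0.0

for _ in range(steps):
    z = X @ W1 + b1
    a = np.maximum(z, 0)                       # ReLU features
    p = 1 / (1 + np.exp(-(a @ w2 + b2)))       # sigmoid output
    g = (p - y) / n                            # grad of mean logistic loss
    gw2, gb2 = a.T @ g, g.sum()
    ga = np.outer(g, w2) * (z > 0)             # backprop through ReLU
    gW1, gb1 = X.T @ ga, ga.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    w2 -= lr * gw2; b2 -= lr * gb2

print(f"training accuracy: {np.mean((p > 0.5) == y):.3f}")
```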

NeurIPS Conference 2024 Conference Paper

High-dimensional (Group) Adversarial Training in Linear Regression

  • Yiling Xie
  • Xiaoming Huo

Adversarial training can achieve robustness against adversarial perturbations and has been widely used in machine-learning models. This paper delivers a non-asymptotic consistency analysis of the adversarial training procedure under $\ell_\infty$-perturbation in high-dimensional linear regression. We show that, under the restricted eigenvalue condition, the associated convergence rate of the prediction error achieves the minimax rate up to a logarithmic factor for high-dimensional linear regression over the class of sparse parameters. Additionally, we analyze the group adversarial training procedure and prove that, compared with classic adversarial training, it enjoys a better prediction-error upper bound under certain group-sparsity patterns.
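
The key fact behind $\ell_\infty$-adversarial training in linear regression is that the inner maximization has a closed form, $\max_{\lVert\delta\rVert_\infty \le \varepsilon} (y - (x+\delta)^\top \beta)^2 = (|y - x^\top \beta| + \varepsilon \lVert \beta \rVert_1)^2$, so the procedure reduces to a lasso-like penalized problem. A sketch of that reduced objective on synthetic sparse data follows; the solver and constants are illustrative.

```python
# l_inf adversarial training in linear regression via its closed-form
# inner maximization: minimize mean((|y - X b| + eps * ||b||_1)^2).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d, eps = 100, 20, 0.2
beta_true = np.zeros(d); beta_true[:3] = [2.0, -1.0, 1.5]  # sparse signal
X = rng.normal(size=(n, d))
y = X @ beta_true + rng.normal(0, 0.3, n)

def adv_loss(beta):
    return np.mean((np.abs(y - X @ beta) + eps * np.abs(beta).sum()) ** 2)

# Derivative-free solver, since the objective is non-smooth.
beta_adv = minimize(adv_loss, np.zeros(d), method="Powell").x
print("large coordinates:", np.where(np.abs(beta_adv) > 0.1)[0])
```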

ICML Conference 2024 Conference Paper

Universal Consistency of Wide and Deep ReLU Neural Networks and Minimax Optimal Convergence Rates for Kolmogorov-Donoho Optimal Function Classes

  • Hyunouk Ko
  • Xiaoming Huo

In this paper, we prove the universal consistency of wide and deep ReLU neural network classifiers. We also give sufficient conditions on a class of probability measures under which classifiers based on neural networks achieve minimax optimal rates of convergence. The result applies to a wide range of known function classes. In particular, while most previous works impose explicit smoothness assumptions on the regression function, our framework encompasses more general settings. The proposed neural networks are minimizers of the $0$-$1$ loss that exhibit a benign overfitting behavior.

ICLR Conference 2023 Conference Paper

Approximation and non-parametric estimation of functions over high-dimensional spheres via deep ReLU networks

  • Namjoon Suh
  • Tian-Yi Zhou 0009
  • Xiaoming Huo

We develop a new approximation and estimation analysis of deep feed-forward neural networks (FNNs) with the Rectified Linear Unit (ReLU) activation. The functions of interest for approximation and estimation are assumed to belong to Sobolev spaces defined over the $d$-dimensional unit sphere with smoothness index $r>0$. In the regime where $r$ is of constant order (i.e., $r=\mathcal{O}(1)$), we show that at most $d^d$ active parameters are required to attain a $d^{-C}$ approximation rate for some constant $C>0$. In contrast, in the regime where the index $r$ grows in the order of $d$ (i.e., $r=\mathcal{O}(d)$), we prove that the approximation error decays at the rate $d^{-d^{\beta}}$ with $0<\beta<1$, up to a constant factor independent of $d$, while the number of active parameters required for the approximation grows only polynomially in $d$ as $d\rightarrow\infty$. In addition, we show that the bound on the excess risk carries a $d^d$ factor when $r=\mathcal{O}(1)$, but only a $d^{\mathcal{O}(1)}$ factor when $r=\mathcal{O}(d)$. We emphasize our findings by comparing them with results on the approximation and estimation errors of deep ReLU FNNs for functions from Sobolev spaces defined over the $d$-dimensional cube: there, with the current state-of-the-art results, a $d^{d}$ factor remains in both the approximation and estimation errors, regardless of the order of $r$.

ICML Conference 2023 Conference Paper

Conformalization of Sparse Generalized Linear Models

  • Etash Kumar Guha
  • Eugène Ndiaye
  • Xiaoming Huo

Given a sequence of observable variables $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, the conformal prediction method estimates a confidence set for $y_{n+1}$ given $x_{n+1}$ that is valid for any finite sample size by merely assuming that the joint distribution of the data is permutation invariant. Although attractive, computing such a set is computationally infeasible in most regression problems. Indeed, in these cases, the unknown variable $y_{n+1}$ can take an infinite number of possible candidate values, and generating conformal sets requires retraining a predictive model for each candidate. In this paper, we focus on a sparse linear model with only a subset of variables for prediction and use numerical continuation techniques to approximate the solution path efficiently. The critical property we exploit is that the set of selected variables is invariant under a small perturbation of the input data. Therefore, it is sufficient to enumerate and refit the model only at the change points of the set of active features and smoothly interpolate the rest of the solution via a Predictor-Corrector mechanism. We show how our path-following algorithm accurately approximates conformal prediction sets and illustrate its performance using synthetic and real data examples.
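
To see the cost the path-following algorithm removes, consider the naive full-conformal baseline: refit the sparse model once per candidate value of $y_{n+1}$ on a grid and keep the candidates whose conformity p-value is large enough. The sketch below uses scikit-learn's Lasso with an illustrative grid and level; the paper replaces this grid with exact enumeration of the change points of the active set.

```python
# Naive grid-based full conformal prediction with a lasso (the brute-force
# baseline; the paper's predictor-corrector path avoids the per-candidate
# refits). Grid, level, and lasso penalty are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d, alpha_cov = 50, 5, 0.1
X = rng.normal(size=(n + 1, d))                # last row is x_{n+1}
beta = np.array([1.0, -1.0, 0.0, 0.0, 2.0])
y = X[:n] @ beta + rng.normal(0, 0.3, n)

conformal_set = []
for y_cand in np.linspace(-6, 6, 121):         # candidate grid for y_{n+1}
    y_aug = np.append(y, y_cand)
    model = Lasso(alpha=0.05).fit(X, y_aug)    # refit on augmented data
    resid = np.abs(y_aug - model.predict(X))
    # conformal p-value: rank of the candidate's residual
    if np.mean(resid >= resid[-1]) > alpha_cov:
        conformal_set.append(y_cand)

print(f"approx. 90% set: [{min(conformal_set):.2f}, {max(conformal_set):.2f}]")
```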

TMLR Journal 2023 Journal Article

Solving a Special Type of Optimal Transport Problem by a Modified Hungarian Algorithm

  • Yiling Xie
  • Yiling Luo
  • Xiaoming Huo

Computing the empirical Wasserstein distance in the Wasserstein-distance-based independence test is an optimal transport (OT) problem with a special structure. This observation inspires us to study a special type of OT problem and propose a modified Hungarian algorithm to solve it exactly. For the OT problem involving two marginals with $m$ and $n$ atoms ($m\geq n$), respectively, the computational complexity of the proposed algorithm is $\mathcal{O}(m^2n)$. Computing the empirical Wasserstein distance in the independence test requires solving this special type of OT problem with $m=n^2$; the associated computational complexity of the proposed algorithm is $\mathcal{O}(n^5)$, while applying the classic Hungarian algorithm has order $\mathcal{O}(n^6)$. Beyond this special type of OT problem, we show that the modified Hungarian algorithm can be adopted to solve a wider range of OT problems. Broader applications of the proposed algorithm are discussed, namely the one-to-many and many-to-many assignment problems. We conduct numerical experiments to validate our theoretical results; the experiments demonstrate that the proposed modified Hungarian algorithm compares favorably with the classic Hungarian algorithm, the well-known Sinkhorn algorithm, and the network simplex algorithm.
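
As a point of reference, the special OT instance (uniform marginals with $m = kn$ atoms) can be cast as a square assignment problem by duplicating each target atom $k$ times and handed to a classic Hungarian-style solver. The SciPy sketch below shows that baseline, which has the higher-order complexity the modified algorithm improves on; the data are illustrative.

```python
# Classic-solver baseline for the special OT instance: expand the n target
# atoms k times each and solve an m x m assignment (m = k * n). This is the
# baseline the paper's modified Hungarian algorithm outperforms.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n, k = 6, 6                      # m = k * n; e.g. m = n^2 when k = n
m = k * n
xs = rng.normal(size=(m, 2))     # source atoms, mass 1/m each
ys = rng.normal(size=(n, 2))     # target atoms, mass 1/n each

cost = np.linalg.norm(xs[:, None] - ys[None, :], axis=2) ** 2
cost_expanded = np.repeat(cost, k, axis=1)      # duplicate target columns
rows, cols = linear_sum_assignment(cost_expanded)
w2_sq = cost_expanded[rows, cols].mean()        # each atom carries mass 1/m
print(f"empirical squared Wasserstein distance: {w2_sq:.4f}")
```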

ICLR Conference 2022 Conference Paper

A Non-Parametric Regression Viewpoint: Generalization of Overparametrized Deep ReLU Network Under Noisy Observations

  • Namjoon Suh
  • Hyunouk Ko
  • Xiaoming Huo

We study the generalization properties of overparameterized deep neural networks (DNNs) with Rectified Linear Unit (ReLU) activations. Under the non-parametric regression framework, it is assumed that the ground-truth function lies in a reproducing kernel Hilbert space (RKHS) induced by the neural tangent kernel (NTK) of a ReLU DNN, and that the observations are noisy. Without a delicate use of early stopping, we prove that the overparametrized DNN trained by vanilla gradient descent does not recover the ground-truth function: the estimated DNN's $L_{2}$ prediction error is bounded away from $0$. As a complement to this result, we show that $\ell_{2}$-regularized gradient descent enables the overparametrized DNN to achieve the minimax optimal convergence rate for the $L_{2}$ prediction error, without early stopping. Notably, the rate we obtain is faster than the $\mathcal{O}(n^{-1/2})$ rate known in the literature.
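
A toy analogue of the dichotomy studied: in the NTK regime, vanilla gradient descent behaves like kernel interpolation of the noisy data, while $\ell_2$-regularized training behaves like kernel ridge regression. The sketch below substitutes an RBF kernel for the NTK purely for illustration, and all constants are assumptions.

```python
# Kernel (near-)interpolation vs. kernel ridge on noisy data; an RBF kernel
# stands in for the NTK here, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 40
x = np.sort(rng.uniform(-1, 1, n))
f = np.sin(3 * x)                          # ground-truth function
y = f + rng.normal(0, 0.3, n)              # noisy observations

def rbf(a, b, s=0.2):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * s**2))

K = rbf(x, x)
for lam in [1e-6, 1e-1]:                   # ~interpolation vs. ridge
    alpha = np.linalg.solve(K + lam * np.eye(n), y)
    pred = K @ alpha
    print(f"lambda={lam:g}: L2 error to truth = {np.mean((pred - f)**2):.4f}")
```

The near-interpolating fit tracks the noise, while the ridge fit stays closer to the ground truth, mirroring the contrast between vanilla and $\ell_2$-regularized gradient descent.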

IROS Conference 2011 Conference Paper

Multi-scale LPA* with low worst-case complexity guarantees

  • Yibiao Lu
  • Xiaoming Huo
  • Oktay Arslan
  • Panagiotis Tsiotras

In this paper we consider dynamic shortest-path planning problems on a graph with a single endpoint pair and potentially changing edge weights over time. Several incremental algorithms in the literature solve this problem, notably the Lifelong Planning A* (LPA*) algorithm. Although in most cases LPA* requires a relatively small number of updates, in other cases the amount of work required by LPA* to find the optimal path can be overwhelming. To address this issue, we propose an extension of the baseline LPA* algorithm that makes efficient use of a multiscale representation of the environment.
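
For background, the static computation that LPA* incrementally repairs is a standard A* search; a compact sketch on a 4-connected grid with unit edge weights follows (the multiscale extension proposed in the paper is not shown).

```python
# Static A* on a 4-connected grid: the from-scratch search whose repeated
# recomputation LPA* (and its multiscale extension) is designed to avoid.
import heapq

def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    open_heap = [(h(start), 0, start)]
    g = {start: 0}
    while open_heap:
        f, gc, cur = heapq.heappop(open_heap)
        if cur == goal:
            return gc
        if gc > g.get(cur, float("inf")):
            continue                        # stale heap entry
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != '#':
                ng = gc + 1
                if ng < g.get(nxt, float("inf")):
                    g[nxt] = ng
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None                             # goal unreachable

grid = ["....#....",
        ".##.#.##.",
        ".#.....#.",
        ".#.###.#.",
        "....#...."]
print(astar(grid, (0, 0), (4, 8)))          # shortest path length, or None
```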

IROS Conference 2011 Conference Paper

Solving shortest path problems with curvature constraints using beamlets

  • Oktay Arslan
  • Panagiotis Tsiotras
  • Xiaoming Huo
