Arrow Research search

Author name cluster

Bei Jiang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

18 papers
2 author rows

Possible papers (18)

TMLR Journal 2026 Journal Article

A Bayesian Bootstrap Framework for Mutual Information Neural Estimation: Bridging Classical Mutual Information Learning and Bayesian Nonparametric Learning

  • Forough Fazeli-Asl
  • Michael Minyi Zhang
  • Linglong Kong
  • Bei Jiang

In this work, we introduce a Bayesian bootstrap resampling framework for estimating mutual information (MI) via "mutual information neural estimation" (MINE), making MINE directly applicable in a Bayesian nonparametric learning (BNPL) framework. The resulting estimator shows low variability across batch sizes and high-dimensional settings, as demonstrated through extensive numerical studies. In particular, our proposed bootstrap version yields tighter and lower-variance estimates than the original MINE formulation, both theoretically and empirically. We further demonstrate its practical value in a downstream task by improving VAE-GAN training within BNPL, leading to higher-quality outputs. Beyond enabling MI-based BNPL, the proposed bootstrap estimator also performs competitively against state-of-the-art frequentist benchmarks. Overall, our findings establish the first principled framework for Bayesian bootstrap-based MI estimation and highlight its effectiveness as a reliable tool for future BNPL studies.
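The Bayesian bootstrap underlying this framework replaces frequentist resampling with Dirichlet(1, ..., 1) weights over the observations. A minimal stdlib-only sketch (illustrative only; the paper applies the idea to MINE's neural objective, whereas here it is applied to a simple sample mean):

```python
import random
import statistics

def bayesian_bootstrap_means(data, n_draws=2000, seed=0):
    """Posterior draws of the mean under the Bayesian bootstrap.

    Each draw reweights the observations with Dirichlet(1, ..., 1)
    weights, generated by normalizing i.i.d. Exp(1) variables.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        raw = [rng.expovariate(1.0) for _ in data]
        total = sum(raw)
        weights = [r / total for r in raw]
        draws.append(sum(w * x for w, x in zip(weights, data)))
    return draws

data = [1.0, 2.0, 3.0, 4.0, 5.0]
draws = bayesian_bootstrap_means(data)
print(round(statistics.mean(draws), 2))  # centred near the sample mean 3.0
```

Because every draw is a convex combination of the observations, the posterior draws always stay inside the sample's range.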

ICLR Conference 2025 Conference Paper

CBMA: Improving Conformal Prediction through Bayesian Model Averaging

  • Pankaj Bhagwat
  • Linglong Kong
  • Bei Jiang

Conformal prediction has emerged as a popular technique for facilitating valid predictive inference across a spectrum of machine learning models, under the minimal assumption of exchangeability. Recently, Hoff (2023) showed that full conformal Bayes provides the most efficient prediction sets (smallest by expected volume) among all prediction sets that are valid at the $(1 - \alpha)$ level if the model is correctly specified. However, a critical issue arises when the Bayesian model itself may be mis-specified, resulting in prediction intervals that may be suboptimal, even though they still enjoy the frequentist coverage guarantee. To address this limitation, we propose an innovative solution that combines Bayesian model averaging (BMA) with conformal prediction. This hybrid not only leverages the strengths of Bayesian conformal prediction but also introduces a layer of robustness through model averaging. Theoretically, we prove that the resulting prediction interval will converge to the optimal level of efficiency, if the true model is included among the candidate models. This assurance of optimality, even under potential model uncertainty, provides a significant improvement over existing methods, ensuring more reliable and precise uncertainty quantification.
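For context, the split conformal construction below shows the basic recipe that full conformal Bayes and CBMA refine: a calibration-set quantile of absolute residuals yields a finite-sample valid interval. This is a hedged sketch of the generic method, not the paper's procedure; the toy residuals are invented for the example:

```python
import math

def split_conformal_interval(cal_residuals, prediction, alpha=0.1):
    """Split conformal prediction interval from calibration residuals.

    Half-width is the ceil((n + 1)(1 - alpha))-th smallest absolute
    residual, which gives finite-sample 1 - alpha coverage under
    exchangeability.
    """
    n = len(cal_residuals)
    scores = sorted(abs(r) for r in cal_residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:  # too few calibration points for this alpha
        return (float("-inf"), float("inf"))
    q = scores[rank - 1]
    return (prediction - q, prediction + q)

# Residuals of some fitted model on a held-out calibration set (toy numbers).
cal = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9, -1.0]
lo, hi = split_conformal_interval(cal, prediction=5.0, alpha=0.1)
```

Smaller calibration residuals shrink the interval, which is the sense in which a better-specified model (here, a better Bayesian model average) yields more efficient prediction sets.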

ICML Conference 2025 Conference Paper

Differentially Private Analysis for Binary Response Models: Optimality, Estimation, and Inference

  • Ce Zhang
  • Yixin Han
  • Yafei Wang
  • Xiaodong Yan
  • Linglong Kong
  • Ting Li
  • Bei Jiang

Randomized response (RR) mechanisms constitute a fundamental and effective technique for ensuring label differential privacy (LabelDP). However, existing RR methods primarily focus on the response labels while overlooking the influence of covariates and often do not fully address optimality. To address these challenges, this paper explores optimal LabelDP procedures using RR mechanisms, focusing on achieving optimal estimation and inference in binary response models. We first analyze the asymptotic behaviors of RR binary response models and then optimize the procedure by maximizing the trace of the Fisher Information Matrix within the $\varepsilon$- and $(\varepsilon, \delta)$-LabelDP constraints. Our theoretical results indicate that the proposed methods achieve optimal LabelDP guarantees while maintaining statistical accuracy in binary response models under mild conditions. Furthermore, we develop private confidence intervals with nominal coverage for statistical inference. Extensive simulation studies and real-world applications confirm that our methods outperform existing approaches in terms of precise estimation, privacy protection, and reliable inference.
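The binary randomized response mechanism at the heart of LabelDP can be sketched in a few lines: keep a label with probability $e^{\varepsilon}/(1 + e^{\varepsilon})$, flip it otherwise, then debias downstream estimates. This illustrates the basic mechanism only, not the paper's covariate-aware optimal design; the data below are invented:

```python
import math
import random

def randomized_response(label, epsilon, rng):
    """Flip a binary label with the epsilon-LabelDP randomized response."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return label if rng.random() < p_keep else 1 - label

def debiased_mean(private_labels, epsilon):
    """Unbiased estimate of the true label mean from privatized labels."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    p_hat = sum(private_labels) / len(private_labels)
    return (p_hat - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)

rng = random.Random(0)
true_labels = [1] * 3000 + [0] * 7000          # true label mean is 0.3
private = [randomized_response(y, 1.0, rng) for y in true_labels]
print(round(debiased_mean(private, 1.0), 2))   # close to 0.3
```

The debiasing step inflates variance as $\varepsilon$ shrinks, which is the estimation-accuracy cost that the paper's Fisher-information-maximizing design seeks to minimize.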

NeurIPS Conference 2025 Conference Paper

Intrinsic Benefits of Categorical Distributional Loss: Uncertainty-aware Regularized Exploration in Reinforcement Learning

  • Ke Sun
  • Yingnan Zhao
  • Enze Shi
  • Yafei Wang
  • Xiaodong Yan
  • Bei Jiang
  • Linglong Kong

The remarkable empirical performance of distributional reinforcement learning (RL) has garnered increasing attention to understanding its theoretical advantages over classical RL. By decomposing the categorical distributional loss commonly employed in distributional RL, we find that the potential superiority of distributional RL can be attributed to a derived distribution-matching entropy regularization. This less-studied entropy regularization aims to capture additional knowledge of return distribution beyond only its expectation, contributing to an augmented reward signal in policy optimization. In contrast to the vanilla entropy regularization in MaxEnt RL, which explicitly encourages exploration by promoting diverse actions, the novel entropy regularization derived from categorical distributional loss implicitly updates policies to align the learned policy with (estimated) environmental uncertainty. Finally, extensive experiments verify the significance of this uncertainty-aware regularization from distributional RL on the empirical benefits over classical RL. Our study offers an innovative exploration perspective to explain the intrinsic benefits of distributional learning in RL.

NeurIPS Conference 2025 Conference Paper

Understanding Fairness and Prediction Error through Subspace Decomposition and Influence Analysis

  • Enze Shi
  • Pankaj Bhagwat
  • Zhixian Yang
  • Linglong Kong
  • Bei Jiang

Machine learning models have achieved widespread success but often inherit and amplify historical biases, resulting in unfair outcomes. Traditional fairness methods typically impose constraints at the prediction level, without addressing underlying biases in data representations. In this work, we propose a principled framework that adjusts data representations to balance predictive utility and fairness. Using sufficient dimension reduction, we decompose the feature space into target-relevant, sensitive, and shared components, and control the fairness–utility trade-off by selectively removing sensitive information. We provide a theoretical analysis of how prediction error and fairness gaps evolve as shared subspaces are added, and employ influence functions to quantify their effects on the asymptotic behavior of parameter estimates. Experiments on both synthetic and real-world datasets validate our theoretical insights and show that the proposed method effectively improves fairness while preserving predictive performance.

AAAI Conference 2024 Conference Paper

Analysis of Differentially Private Synthetic Data: A Measurement Error Approach

  • Yangdi Jiang
  • Yi Liu
  • Xiaodong Yan
  • Anne-Sophie Charest
  • Linglong Kong
  • Bei Jiang

Differentially private (DP) synthetic datasets have been receiving significant attention from academia, industry, and government. However, little is known about how to perform statistical inference using DP synthetic datasets. Naive approaches that do not take into account the induced uncertainty due to the DP mechanism will result in biased estimators and invalid inferences. In this paper, we present a class of maximum likelihood estimator (MLE)-based easy-to-implement bias-corrected DP estimators with valid asymptotic confidence intervals (CI) for parameters in regression settings, by establishing the connection between additive DP mechanisms and measurement error models. Our simulation shows that our estimator has comparable performance to the widely used sufficient statistic perturbation (SSP) algorithm in some scenarios but with the advantage of releasing a synthetic dataset and obtaining statistically valid asymptotic CIs, which can achieve better coverage when compared to the naive CIs obtained by ignoring the DP mechanism.

NeurIPS Conference 2024 Conference Paper

Distributional Reinforcement Learning with Regularized Wasserstein Loss

  • Ke Sun
  • Yingnan Zhao
  • Wulong Liu
  • Bei Jiang
  • Linglong Kong

The empirical success of distributional reinforcement learning (RL) highly relies on the choice of distribution divergence equipped with an appropriate distribution representation. In this paper, we propose Sinkhorn distributional RL (SinkhornDRL), which leverages Sinkhorn divergence, a regularized Wasserstein loss, to minimize the difference between current and target Bellman return distributions. Theoretically, we prove the contraction properties of SinkhornDRL, aligning with the interpolation nature of Sinkhorn divergence between the Wasserstein distance and Maximum Mean Discrepancy (MMD). SinkhornDRL enriches the family of distributional RL algorithms, and our investigation of its relationships to existing approaches helps interpret their algorithmic behaviors. Empirically, we show that SinkhornDRL consistently outperforms or matches existing algorithms on the Atari games suite and particularly stands out in the multi-dimensional reward setting. Code is available at https://github.com/datake/SinkhornDistRL.
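Sinkhorn divergence is built from entropic-regularized optimal transport, computed by alternating matrix-scaling updates. A small illustrative implementation for discrete distributions (a sketch of the underlying OT computation, not the paper's SinkhornDRL agent; `reg` is the entropic regularization strength):

```python
import math

def sinkhorn(a, b, cost, reg=0.1, n_iters=500):
    """Entropic-regularized OT cost between discrete distributions a and b,
    computed with Sinkhorn's alternating matrix-scaling iterations."""
    K = [[math.exp(-c / reg) for c in row] for row in cost]  # Gibbs kernel
    n, m = len(a), len(b)
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P_ij = u_i * K_ij * v_j; return <P, cost>.
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))
```

For identical distributions the transported cost is near zero, while moving all mass across a unit-cost gap yields a cost near one; as `reg` grows the quantity interpolates away from the Wasserstein distance, which is the interpolation property the paper's contraction analysis builds on.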

JMLR Journal 2024 Journal Article

Inference on High-dimensional Single-index Models with Streaming Data

  • Dongxiao Han
  • Jinhan Xie
  • Jin Liu
  • Liuquan Sun
  • Jian Huang
  • Bei Jiang
  • Linglong Kong

Traditional statistical methods are faced with new challenges due to streaming data. The major challenge is the rapidly growing volume and velocity of data, which makes storing such huge data sets in memory impossible. The paper presents an online inference framework for regression parameters in high-dimensional semiparametric single-index models with unknown link functions. The proposed online procedure updates only the current data batch and summary statistics of historical data instead of re-accessing the entire raw data set. At the same time, we do not need to estimate the unknown link function, which is a highly challenging task. In addition, a generalized convex loss function is used in the proposed inference procedure. To illustrate the proposed method, we use the Huber loss function and the negative log-likelihood of the logistic regression model. In this study, the asymptotic normality of the proposed online debiased Lasso estimators and the bounds of the proposed online Lasso estimators are investigated. To evaluate the performance of the proposed method, extensive simulation studies have been conducted. We provide applications to Nasdaq stock prices and financial distress data sets.

NeurIPS Conference 2024 Conference Paper

Probing Social Bias in Labor Market Text Generation by ChatGPT: A Masked Language Model Approach

  • Lei Ding
  • Yang Hu
  • Nicole Denier
  • Enze Shi
  • Junxi Zhang
  • Qirui Hu
  • Karen D. Hughes
  • Linglong Kong

As generative large language models (LLMs) such as ChatGPT gain widespread adoption in various domains, their potential to propagate and amplify social biases, particularly in high-stakes areas such as the labor market, has become a pressing concern. Not only are AI algorithms widely used in the selection of job applicants; individual job seekers may also use generative LLMs to help develop their job application materials. Against this backdrop, this research builds on a novel experimental design to examine social biases within ChatGPT-generated job applications in response to real job advertisements. By simulating the process of job application creation, we examine the language patterns and biases that emerge when the model is prompted with diverse job postings. Notably, we present a novel bias evaluation framework based on Masked Language Models to quantitatively assess social bias based on validated inventories of social cues/words, enabling a systematic analysis of the language used. Our findings show that the increasing adoption of generative AI, not only by employers but also increasingly by individual job seekers, can reinforce and exacerbate gender and social inequalities in the labor market through the use of biased and gendered language.

AAAI Conference 2024 Conference Paper

Responsible Bandit Learning via Privacy-Protected Mean-Volatility Utility

  • Shanshan Zhao
  • Wenhai Cui
  • Bei Jiang
  • Linglong Kong
  • Xiaodong Yan

To ensure user safety by protecting privacy, traditional privacy-preserving bandit algorithms that aim to maximize the mean reward have been widely studied in scenarios such as online ride-hailing, advertising recommendations, and personalized healthcare. However, classical bandit learning is irresponsible in such practical applications because it fails to account for risks in online decision-making and ignores external system information. This paper first proposes a privacy-protected mean-volatility utility as the objective of bandit learning and proves its responsibility, because it aims at achieving the maximum probability of utility by taking risk into account. Theoretically, our proposed responsible bandit learning is expected to achieve the fastest convergence rate among current bandit algorithms and generates more statistical power than the classical normality-based test. Finally, simulation studies provide supporting evidence for the theoretical results and demonstrate stronger performance under stricter privacy budgets.

ICML Conference 2024 Conference Paper

Sample Average Approximation for Conditional Stochastic Optimization with Dependent Data

  • Yafei Wang
  • Bo Pan
  • Mei Li
  • Jianya Lu
  • Lingchen Kong
  • Bei Jiang
  • Linglong Kong

Conditional Stochastic Optimization (CSO) is a powerful modelling paradigm for optimization under uncertainty. The existing literature on CSO is mainly based on the independence assumption of data, which shows that the solution of CSO is asymptotically consistent and enjoys a finite sample guarantee. The independence assumption, however, does not typically hold in many important applications with dependence patterns, such as time series analysis, operational control, and reinforcement learning. In this paper, we aim to fill this gap and consider a Sample Average Approximation (SAA) for CSO with dependent data. Leveraging covariance inequalities and independent block sampling technique, we provide theoretical guarantees of SAA for CSO with dependent data. In particular, we show that SAA for CSO retains asymptotic consistency and a finite sample guarantee under mild conditions. In addition, we establish the sample complexity $O(d / \varepsilon^4)$ of SAA for CSO, which is shown to be of the same order as independent cases. Through experiments on several applications, we verify the theoretical results and demonstrate that dependence does not degrade the performance of the SAA approach in real data applications.

NeurIPS Conference 2023 Conference Paper

Gaussian Differential Privacy on Riemannian Manifolds

  • Yangdi Jiang
  • Xiaotian Chang
  • Yi Liu
  • Lei Ding
  • Linglong Kong
  • Bei Jiang

We develop an advanced approach for extending Gaussian Differential Privacy (GDP) to general Riemannian manifolds. The concept of GDP stands out as a prominent privacy definition that strongly warrants extension to manifold settings, due to its central limit properties. By harnessing the power of the renowned Bishop-Gromov theorem in geometric analysis, we propose a Riemannian Gaussian distribution that integrates the Riemannian distance, allowing us to achieve GDP in Riemannian manifolds with bounded Ricci curvature. To the best of our knowledge, this work marks the first instance of extending the GDP framework to accommodate general Riemannian manifolds, encompassing curved spaces, and circumventing the reliance on tangent space summaries. We provide a simple algorithm to evaluate the privacy budget $\mu$ on any one-dimensional manifold and introduce a versatile Markov Chain Monte Carlo (MCMC)-based algorithm to calculate $\mu$ on any Riemannian manifold with constant curvature. Through simulations on one of the most prevalent manifolds in statistics, the unit sphere $S^d$, we demonstrate the superior utility of our Riemannian Gaussian mechanism in comparison to the previously proposed Riemannian Laplace mechanism for implementing GDP.
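In the familiar Euclidean special case, GDP is achieved by the Gaussian mechanism with noise scale sensitivity/$\mu$; the paper's contribution is replacing the Euclidean distance in this recipe with the Riemannian one. A minimal Euclidean sketch (the data and sensitivity bound below are assumptions made for the toy example):

```python
import random
import statistics

def gaussian_mechanism(value, sensitivity, mu, rng):
    """Release value + N(0, (sensitivity / mu)^2) noise.

    In Euclidean space this noise calibration satisfies mu-GDP; on a
    Riemannian manifold the Euclidean distance in the calibration is
    replaced by the Riemannian distance.
    """
    return value + rng.gauss(0.0, sensitivity / mu)

rng = random.Random(0)
data = [0.2, 0.4, 0.6, 0.8]  # each record assumed to lie in [0, 1]
true_mean = statistics.mean(data)
# The mean of n records bounded in [0, 1] has sensitivity 1/n.
release = gaussian_mechanism(true_mean, sensitivity=1 / len(data), mu=1.0, rng=rng)
```

A smaller privacy budget $\mu$ means stronger privacy and proportionally larger noise, which is why evaluating $\mu$ accurately (the paper's MCMC algorithm) matters for utility.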

AAAI Conference 2023 Conference Paper

The Sufficiency of Off-Policyness and Soft Clipping: PPO Is Still Insufficient according to an Off-Policy Measure

  • Xing Chen
  • Dongcui Diao
  • Hechang Chen
  • Hengshuai Yao
  • Haiyin Piao
  • Zhixiao Sun
  • Zhiwei Yang
  • Randy Goebel

The popular Proximal Policy Optimization (PPO) algorithm approximates the solution in a clipped policy space. Do better policies exist outside of this space? By using a novel surrogate objective that employs the sigmoid function (which provides an interesting way of exploration), we found that the answer is "YES", and the better policies are in fact located very far from the clipped space. We show that PPO is insufficient in "off-policyness", according to an off-policy metric called DEON. Our algorithm explores in a much larger policy space than PPO, and it maximizes the Conservative Policy Iteration (CPI) objective better than PPO during training. To the best of our knowledge, all current PPO methods have the clipping operation and optimize in the clipped policy space. Our method is the first of this kind, which advances the understanding of CPI optimization and policy gradient methods. Code is available at https://github.com/raincchio/P3O.

NeurIPS Conference 2022 Conference Paper

Conformalized Fairness via Quantile Regression

  • Meichen Liu
  • Lei Ding
  • Dengdeng Yu
  • Wulong Liu
  • Linglong Kong
  • Bei Jiang

Algorithmic fairness has received increased attention in socially sensitive domains. While rich literature on mean fairness has been established, research on quantile fairness remains sparse but vital. To address this need and underscore the significance of quantile fairness, we propose a novel framework to learn a real-valued quantile function under the fairness requirement of Demographic Parity with respect to sensitive attributes, such as race or gender, and thereby derive a reliable fair prediction interval. Using optimal transport and functional synchronization techniques, we establish theoretical guarantees of distribution-free coverage and exact fairness for the induced prediction interval constructed by fair quantiles. A hands-on pipeline is provided to incorporate flexible quantile regressions with an efficient fairness adjustment post-processing algorithm. We demonstrate the superior empirical performance of this approach on several benchmark datasets. Our results show the model's ability to uncover the mechanism underlying the fairness-accuracy trade-off in a wide range of societal and medical applications.

NeurIPS Conference 2022 Conference Paper

Identification, Amplification and Measurement: A bridge to Gaussian Differential Privacy

  • Yi Liu
  • Ke Sun
  • Bei Jiang
  • Linglong Kong

Gaussian differential privacy (GDP) is a single-parameter family of privacy notions that provides coherent guarantees to avoid the exposure of sensitive individual information. Despite the extra interpretability and tighter composition bounds that GDP provides, many widely used mechanisms (e.g., the Laplace mechanism) inherently provide GDP guarantees but often fail to take advantage of this new framework because their privacy guarantees were derived under a different background. In this paper, we study the asymptotic properties of privacy profiles and develop a simple criterion to identify algorithms with GDP properties. We propose an efficient method for GDP algorithms to narrow down possible values of an optimal privacy measurement $\mu$, with an arbitrarily small and quantifiable margin of error. For non-GDP algorithms, we provide a post-processing procedure that can amplify existing privacy guarantees to meet the GDP condition. As applications, we compare two single-parameter families of privacy notions, $\epsilon$-DP and $\mu$-GDP, and show that all $\epsilon$-DP algorithms are intrinsically also GDP. Lastly, we show that the combination of our measurement process and the composition theorem of GDP is a powerful and convenient tool to handle compositions compared to the traditional standard and advanced composition theorems.

AAAI Conference 2022 Conference Paper

Sample Average Approximation for Stochastic Optimization with Dependent Data: Performance Guarantees and Tractability

  • Yafei Wang
  • Bo Pan
  • Wei Tu
  • Peng Liu
  • Bei Jiang
  • Chao Gao
  • Wei Lu
  • Shangling Jui

Sample average approximation (SAA), a popular method for tractably solving stochastic optimization problems, enjoys strong asymptotic performance guarantees in settings with independent training samples. However, these guarantees are not known to hold generally with dependent samples, such as in online learning with time series data or distributed computing with Markovian training samples. In this paper, we show that SAA remains tractable when the distribution of unknown parameters is only observable through dependent instances and still enjoys asymptotic consistency and finite sample guarantees. Specifically, we provide a rigorous probability error analysis to derive $1 - \beta$ confidence bounds for the out-of-sample performance of SAA estimators and show that these estimators are asymptotically consistent. We then, using monotone operator theory, study the performance of a class of stochastic first-order algorithms trained on a dependent source of data. We show that approximation error for these algorithms is bounded and concentrates around zero, and establish deviation bounds for iterates when the underlying stochastic process is $\varphi$-mixing. The algorithms presented can be used to handle numerically inconvenient loss functions such as the sum of a smooth and non-smooth function or of non-smooth functions with constraints. To illustrate the usefulness of our results, we present several stochastic versions of popular algorithms, such as stochastic proximal gradient descent (S-PGD) and the stochastic relaxed Peaceman–Rachford splitting algorithm (S-rPRS), together with numerical experiments.
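SAA itself is simple to state: replace the expectation in the objective with an empirical average over the (possibly dependent) samples, then minimize that. A self-contained newsvendor sketch (illustrative only; the problem, prices, and demand distribution are invented for the example):

```python
import random

def saa_newsvendor(demands, price=5.0, cost=3.0):
    """Sample average approximation (SAA) for a newsvendor problem.

    The expectation E[cost * q - price * min(q, D)] is replaced by its
    empirical average over observed demands; since that objective is
    piecewise linear in q with kinks at the samples, the SAA minimizer
    can be found among the observed demand values.
    """
    def avg_loss(q):
        return sum(cost * q - price * min(q, d) for d in demands) / len(demands)
    return min(sorted(demands), key=avg_loss)

rng = random.Random(0)
demands = [rng.uniform(0.0, 100.0) for _ in range(1000)]
q_hat = saa_newsvendor(demands)
# Critical ratio (price - cost) / price = 0.4, so the true optimum for
# Uniform(0, 100) demand is near 40.
```

The paper's question is what happens when `demands` comes from a dependent process (e.g. a time series) rather than i.i.d. draws; its results show the empirical minimizer remains consistent under mild mixing conditions.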

AAAI Conference 2022 Conference Paper

Word Embeddings via Causal Inference: Gender Bias Reducing and Semantic Information Preserving

  • Lei Ding
  • Dengdeng Yu
  • Jinhan Xie
  • Wenxing Guo
  • Shenggang Hu
  • Meichen Liu
  • Linglong Kong
  • Hongsheng Dai

With widening deployments of natural language processing (NLP) in daily life, inherited social biases from NLP models have become more severe and problematic. Previous studies have shown that word embeddings trained on human-generated corpora have strong gender biases that can produce discriminative results in downstream tasks. Previous debiasing methods focus mainly on modeling bias and only implicitly consider semantic information while completely overlooking the complex underlying causal structure among bias and semantic components. To address these issues, we propose a novel methodology that leverages a causal inference framework to effectively remove gender bias. The proposed method allows us to construct and analyze the complex causal mechanisms facilitating gender information flow while retaining oracle semantic information within word embeddings. Our comprehensive experiments show that the proposed method achieves state-of-the-art results in gender-debiasing tasks. In addition, our methods yield better performance in word similarity evaluation and various extrinsic downstream NLP tasks.

NeurIPS Conference 2021 Conference Paper

Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

  • Ke Sun
  • Yafei Wang
  • Yi Liu
  • Yingnan Zhao
  • Bo Pan
  • Shangling Jui
  • Bei Jiang
  • Linglong Kong

Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms for accelerating convergence and improving the sampling efficiency of deep RL. Despite its heuristic improvement of convergence, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration schemes built on Anderson mixing that improve the convergence of deep RL algorithms. Our main results establish a connection between Anderson mixing and quasi-Newton methods and prove that Anderson mixing increases the convergence radius of policy iteration schemes by an extra contraction factor. The analysis is rooted in the fixed-point iteration nature of RL. We further propose a stabilization strategy by introducing a stable regularization term in Anderson mixing and a differentiable, non-expansive MellowMax operator that can allow both faster convergence and more stable behavior. Extensive experiments demonstrate that our proposed method enhances the convergence, stability, and performance of RL algorithms.
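Anderson mixing accelerates a fixed-point iteration x = g(x) by combining recent iterates with weights chosen to minimize the mixed residual. A scalar window-1 sketch with damping beta (illustrative only; the paper applies the scheme to policy iteration, not to this toy problem, where window-1 mixing reduces to the secant method):

```python
import math

def anderson_fixed_point(g, x0, beta=1.0, tol=1e-10, max_iter=100):
    """Anderson mixing (window m = 1) with damping beta for x = g(x).

    The weight alpha is chosen to zero out the mixed residual
    alpha * f(x) + (1 - alpha) * f(x_prev), where f(x) = g(x) - x.
    """
    x_prev, x = x0, g(x0)
    f_prev = g(x_prev) - x_prev
    for _ in range(max_iter):
        f = g(x) - x
        if abs(f) < tol:
            return x
        denom = f - f_prev
        alpha = -f_prev / denom if abs(denom) > 1e-15 else 1.0
        x_new = ((1 - beta) * (alpha * x + (1 - alpha) * x_prev)
                 + beta * (alpha * g(x) + (1 - alpha) * g(x_prev)))
        x_prev, f_prev, x = x, f, x_new
    return x

# Accelerate the classic slow iteration x = cos(x).
root = anderson_fixed_point(math.cos, 1.0)
```

Plain iteration of x = cos(x) converges only linearly; the mixed update above reaches the fixed point in a handful of steps, which is the acceleration effect the paper quantifies for RL's Bellman fixed points.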