Author name cluster

Jiashuo Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers

2 author rows

AAAI Conference 2026 Conference Paper

Error Slice Discovery via Manifold Compactness

Han Yu
Hao Zou
Jiashuo Liu
Renzhe Xu
Yue He
Xingxuan Zhang
Peng Cui

Despite the great performance of deep learning models in many areas, they still make mistakes and underperform on certain subsets of data, i.e. error slices. Given a trained model, it is important to identify its semantically coherent error slices that are easy to interpret, which is referred to as the error slice discovery problem. However, there is no proper metric of slice coherence without relying on extra information like predefined slice labels. Current evaluation of slice coherence requires access to predefined slices formulated by metadata like attributes or subclasses. Its validity heavily relies on the quality and abundance of metadata, where some possible patterns could be ignored. Besides, current algorithms cannot directly incorporate the constraint of coherence into their optimization objective due to absence of an explicit coherence metric, which could potentially hinder their effectiveness. In this paper, we propose manifold compactness, a coherence metric without reliance on extra information by incorporating the data geometry property into its design, and experiments on typical datasets empirically validate the rationality of the metric. Then we develop Manifold Compactness based error Slice Discovery (MCSD), a novel algorithm that directly treats risk and coherence as the optimization objective, and is flexible to be applied to models of various tasks. Extensive experiments on the benchmark and case studies on other typical datasets demonstrate the superiority of MCSD.

PDF Details DOI

ICLR Conference 2025 Conference Paper

Going Beyond Static: Understanding Shifts with Time-Series Attribution

Jiashuo Liu
Nabeel Seedat
Peng Cui 0001
Mihaela van der Schaar

Distribution shifts in time-series data are complex due to temporal dependencies, multivariable interactions, and trend changes. However, robust methods often rely on structural assumptions that lack thorough empirical validation, limiting their practical applicability. In order to support an empirically grounded inductive approach to research, we introduce our **T**ime-**S**eries **S**hift **A**ttribution (TSSA) framework, which analyzes *problem-specific* patterns of distribution shifts. Our framework attributes performance degradation from various types of shifts to each *temporal data property* in a detailed manner, supported by theoretical analysis of unbiasedness and asymptotic properties. Empirical studies in real-world healthcare applications highlight how the TSSA framework enhances the understanding of time-series shifts, facilitating reliable model deployment and driving targeted improvements from both algorithmic and data-centric perspectives.

Details

ICML Conference 2025 Conference Paper

Topology-Aware Dynamic Reweighting for Distribution Shifts on Graph

Weihuang Zheng
Jiashuo Liu
Jiaxing Li
Jiayun Wu
Peng Cui 0001
Youyong Kong

Graph Neural Networks (GNNs) are widely used for node classification tasks but often fail to generalize when training and test nodes come from different distributions, limiting their practicality. To address this challenge, recent approaches have adopted invariant learning and sample reweighting techniques from the out-of-distribution (OOD) generalization field. However, invariant learning-based methods face difficulties when applied to graph data, as they rely on the impractical assumption of obtaining real environment labels and strict invariance, which may not hold in real-world graph structures. Moreover, current sample reweighting methods tend to overlook topological information, potentially leading to suboptimal results. In this work, we introduce the Topology-Aware Dynamic Reweighting (TAR) framework to address distribution shifts by leveraging the inherent graph structure. TAR dynamically adjusts sample weights through gradient flow on the graph edges during training. Instead of relying on strict invariance assumptions, we theoretically prove that our method is able to provide distributional robustness, thereby enhancing the out-of-distribution generalization performance on graph data. Our framework’s superiority is demonstrated through standard testing on extensive node classification OOD datasets, exhibiting marked improvements over existing methods.

Details

NeurIPS Conference 2024 Conference Paper

Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift

Jiayun Wu
Jiashuo Liu
Peng Cui
Zhiwei S. Wu

We establish a new model-agnostic optimization framework for out-of-distribution generalization via multicalibration, a criterion that ensures a predictor is calibrated across a family of overlapping groups. Multicalibration is shown to be associated with robustness of statistical inference under covariate shift. We further establish a link between multicalibration and robustness for prediction tasks both under and beyond covariate shift. We accomplish this by extending multicalibration to incorporate grouping functions that consider covariates and labels jointly. This leads to an equivalence of the extended multicalibration and invariance, an objective for robust learning in existence of concept shift. We show a linear structure of the grouping function class spanned by density ratios, resulting in a unifying framework for robust learning by designing specific grouping functions. We propose MC-Pseudolabel, a post-processing algorithm to achieve both extended multicalibration and out-of-distribution generalization. The algorithm, with lightweight hyperparameters and optimization through a series of supervised regression steps, achieves superior performance on real-world datasets with distribution shift.

PDF Details DOI

ICML Conference 2024 Conference Paper

Domain-wise Data Acquisition to Improve Performance under Distribution Shift

Yue He 0001
Dongbai Li
Pengfei Tian
Han Yu 0009
Jiashuo Liu
Hao Zou 0001
Peng Cui 0001

Despite notable progress in enhancing the capability of machine learning against distribution shifts, training data quality remains a bottleneck for cross-distribution generalization. Recently, from a data-centric perspective, there have been considerable efforts to improve model performance through refining the preparation of training data. Inspired by realistic scenarios, this paper addresses a practical requirement of acquiring training samples from various domains on a limited budget to facilitate model generalization to target test domain with distribution shift. Our empirical evidence indicates that the advance in data acquisition can significantly benefit the model performance on shifted data. Additionally, by leveraging unlabeled test domain data, we introduce a Domain-wise Active Acquisition framework. This framework iteratively optimizes the data acquisition strategy as training samples are accumulated, theoretically ensuring the effective approximation of test distribution. Extensive real-world experiments demonstrate our proposal’s advantages in machine learning applications. The code is available at https: //github. com/dongbaili/DAA.

Details

ICML Conference 2024 Conference Paper

Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications

Jiashuo Liu
Jiayun Wu
Tianyu Wang
Hao Zou 0001
Bo Li 0064
Peng Cui 0001

Machine learning algorithms minimizing average risk are susceptible to distributional shifts. Distributionally Robust Optimization (DRO) addresses this issue by optimizing the worst-case risk within an uncertainty set. However, DRO suffers from over-pessimism, leading to low-confidence predictions, poor parameter estimations as well as poor generalization. In this work, we conduct a theoretical analysis of a probable root cause of over-pessimism: excessive focus on noisy samples. To alleviate the impact of noise, we incorporate data geometry into calibration terms in DRO, resulting in our novel Geometry-Calibrated DRO (GCDRO) for regression. We establish the connection between our risk objective and the Helmholtz free energy in statistical physics, and this free-energy-based risk can extend to standard DRO methods. Leveraging gradient flow in Wasserstein space, we develop an approximate minimax optimization algorithm with a bounded error ratio and elucidate how our approach mitigates noisy sample effects. Comprehensive experiments confirm GCDRO’s superiority over conventional DRO methods.

Details

ICML Conference 2024 Conference Paper

Stability Evaluation through Distributional Perturbation Analysis

Jose H. Blanchet
Peng Cui 0001
Jiajin Li
Jiashuo Liu

The performance of learning models often deteriorates when deployed in out-of-sample environments. To ensure reliable deployment, we propose a stability evaluation criterion based on distributional perturbations. Conceptually, our stability evaluation criterion is defined as the minimal perturbation required on our observed dataset to induce a prescribed deterioration in risk evaluation. In this paper, we utilize the optimal transport (OT) discrepancy with moment constraints on the (sample, density) space to quantify this perturbation. Therefore, our stability evaluation criterion can address both data corruptions and sub-population shifts—the two most common types of distribution shifts in real-world scenarios. To further realize practical benefits, we present a series of tractable convex formulations and computational methods tailored to different classes of loss functions. The key technical tool to achieve this is the strong duality theorem provided in this paper. Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications. These empirical studies showcase the criterion’s ability not only to compare the stability of different learning models and features but also to provide valuable guidelines and strategies to further improve models.

Details

ICLR Conference 2024 Conference Paper

Towards Robust Out-of-Distribution Generalization Bounds via Sharpness

Yingtian Zou
Kenji Kawaguchi
Yingnan Liu 0002
Jiashuo Liu
Mong-Li Lee
Wynne Hsu

Generalizing to out-of-distribution (OOD) data or unseen domain, termed OOD generalization, still lacks appropriate theoretical guarantees. Canonical OOD bounds focus on different distance measurements between source and target domains but fail to consider the optimization property of the learned model. As empirically shown in recent work, sharpness of learned minimum influences OOD generalization. To bridge this gap between optimization and OOD generalization, we study the effect of sharpness on how a model tolerates data change in domain shift which is usually captured by "robustness" in generalization. In this paper, we give a rigorous connection between sharpness and robustness, which gives better OOD guarantees for robust algorithms. It also provides a theoretical backing for "flat minima leads to better OOD generalization". Overall, we propose a sharpness-based OOD generalization bound by taking robustness into consideration, resulting in a tighter bound than non-robust guarantees. Our findings are supported by the experiments on a ridge regression model, as well as the experiments on deep learning classification tasks.

Details

ICLR Conference 2023 Conference Paper

Measure the Predictive Heterogeneity

Jiashuo Liu
Jiayun Wu
Renjie Pi
Renzhe Xu
Xingxuan Zhang
Bo Li 0064
Peng Cui 0001

As an intrinsic and fundamental property of big data, data heterogeneity exists in a variety of real-world applications, such as in agriculture, sociology, health care, etc. For machine learning algorithms, the ignorance of data heterogeneity will significantly hurt the generalization performance and the algorithmic fairness, since the prediction mechanisms among different sub-populations are likely to differ. In this work, we focus on the data heterogeneity that affects the prediction of machine learning models, and first formalize the Predictive Heterogeneity, which takes into account the model capacity and computational constraints. We prove that it can be reliably estimated from finite data with PAC bounds even in high dimensions. Additionally, we propose the Information Maximization (IM) algorithm, a bi-level optimization algorithm, to explore the predictive heterogeneity of data. Empirically, the explored predictive heterogeneity provides insights for sub-population divisions in agriculture, sociology, and object recognition, and leveraging such heterogeneity benefits the out-of-distribution generalization performance.

Details

NeurIPS Conference 2023 Conference Paper

On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets

Jiashuo Liu
Tianyu Wang
Peng Cui
Hongseok Namkoong

Different distribution shifts require different algorithmic and operational interventions. Methodological research must be grounded by the specific shifts they address. Although nascent benchmarks provide a promising empirical foundation, they \emph{implicitly} focus on covariate shifts, and the validity of empirical findings depends on the type of shift, e. g. , previous observations on algorithmic performance can fail to be valid when the $Y|X$ distribution changes. We conduct a thorough investigation of natural shifts in 5 tabular datasets over 86, 000 model configurations, and find that $Y|X$-shifts are most prevalent. To encourage researchers to develop a refined language for distribution shifts, we build ``WhyShift``, an empirical testbed of curated real-world shifts where we characterize the type of shift we benchmark performance over. Since $Y|X$-shifts are prevalent in tabular settings, we \emph{identify covariate regions} that suffer the biggest $Y|X$-shifts and discuss implications for algorithmic and data-based interventions. Our testbed highlights the importance of future research that builds an understanding of why distributions differ.

PDF Details

NeurIPS Conference 2022 Conference Paper

Distributionally Robust Optimization with Data Geometry

Jiashuo Liu
Jiayun Wu
Bo Li
Peng Cui

Distributionally Robust Optimization (DRO) serves as a robust alternative to empirical risk minimization (ERM), which optimizes the worst-case distribution in an uncertainty set typically specified by distance metrics including $f$-divergence and the Wasserstein distance. The metrics defined in the ostensible high dimensional space lead to exceedingly large uncertainty sets, resulting in the underperformance of most existing DRO methods. It has been well documented that high dimensional data approximately resides on low dimensional manifolds. In this work, to further constrain the uncertainty set, we incorporate data geometric properties into the design of distance metrics, obtaining our novel Geometric Wasserstein DRO (GDRO). Empowered by Gradient Flow, we derive a generically applicable approximate algorithm for the optimization of GDRO, and provide the bounded error rate of the approximation as well as the convergence rate of our algorithm. We also theoretically characterize the edge cases where certain existing DRO methods are the degeneracy of GDRO. Extensive experiments justify the superiority of our GDRO to existing DRO methods in multiple settings with strong distributional shifts, and confirm that the uncertainty set of GDRO adapts to data geometry.

PDF Details

ICML Conference 2021 Conference Paper

Heterogeneous Risk Minimization

Jiashuo Liu
Zheyuan Hu
Peng Cui 0001
Bo Li 0064
Zheyan Shen

Machine learning algorithms with empirical risk minimization usually suffer from poor generalization performance due to the greedy exploitation of correlations among the training data, which are not stable under distributional shifts. Recently, some invariant learning methods for out-of-distribution (OOD) generalization have been proposed by leveraging multiple training environments to find invariant relationships. However, modern datasets are frequently assembled by merging data from multiple sources without explicit source labels. The resultant unobserved heterogeneity renders many invariant learning methods inapplicable. In this paper, we propose Heterogeneous Risk Minimization (HRM) framework to achieve joint learning of latent heterogeneity among the data and invariant relationship, which leads to stable prediction despite distributional shifts. We theoretically characterize the roles of the environment labels in invariant learning and justify our newly proposed HRM framework. Extensive experimental results validate the effectiveness of our HRM framework.

Details

NeurIPS Conference 2021 Conference Paper

Integrated Latent Heterogeneity and Invariance Learning in Kernel Space

Jiashuo Liu
Zheyuan Hu
Peng Cui
Bo Li
Zheyan Shen

The ability to generalize under distributional shifts is essential to reliable machine learning, while models optimized with empirical risk minimization usually fail on non-$i. i. d$ testing data. Recently, invariant learning methods for out-of-distribution (OOD) generalization propose to find causally invariant relationships with multi-environments. However, modern datasets are frequently multi-sourced without explicit source labels, rendering many invariant learning methods inapplicable. In this paper, we propose Kernelized Heterogeneous Risk Minimization (KerHRM) algorithm, which achieves both the latent heterogeneity exploration and invariant learning in kernel space, and then gives feedback to the original neural network by appointing invariant gradient direction. We theoretically justify our algorithm and empirically validate the effectiveness of our algorithm with extensive experiments.

PDF Details

AAAI Conference 2021 Conference Paper

Stable Adversarial Learning under Distributional Shifts

Jiashuo Liu
Zheyan Shen
Peng Cui
Linjun Zhou
Kun Kuang
Bo Li
Yishi Lin

Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts due to the greedy adoption of all the correlations found in training data. Recently, there are robust learning methods aiming at this problem by minimizing the worst-case risk over an uncertainty set. However, they equally treat all covariates to form the decision sets regardless of the stability of their correlations with the target, resulting in the overwhelmingly large set and low confidence of the learner. In this paper, we propose Stable Adversarial Learning (SAL) algorithm that leverages heterogeneous data sources to construct a more practical uncertainty set and conduct differentiated robustness optimization, where covariates are differentiated according to the stability of their correlations with the target. We theoretically show that our method is tractable for stochastic gradientbased optimization and provide the performance guarantees for our method. Empirical studies on both simulation and real datasets validate the effectiveness of our method in terms of uniformly good performance across unknown distributional shifts.

PDF Details