Arrow Research search

Author name cluster

Kui Yu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

18 papers
2 author rows

Possible papers

18

TIST Journal 2026 Journal Article

Mutual Information-Guided Style Augmentation for Single Domain Generalization

  • Shuai Yang
  • Zhen Zhang
  • Kui Yu
  • Lichuan Gu
  • Xindong Wu

Single domain generalization aims to develop a robust model trained on a source domain that generalizes well to unseen target domains. Recent progress in single domain generalization has focused on expanding the scope of training data through style (e.g., backgrounds) augmentation. However, existing methods struggle to generate data with large style shifts because they lack precise correlation measures between the generated and original data, and they struggle to effectively capture the consistency between the generated and original data when learning feature representations. In this article, we propose a novel Mutual Information-guided Style Augmentation (MISA) based single domain generalization method. Specifically, MISA incorporates a style diversity module, which uses the matrix-based Rényi’s \(\alpha\)-order entropy functionals to compute an approximate mutual information value between the augmented and original data, minimizing it to guide style generator learning. Moreover, MISA combines the merits of random convolution and affine transformation to further improve the texture diversity of the augmented data. Additionally, MISA introduces a representation learning module, which minimizes the approximate mutual information value between the prediction logits of the original sample and its corresponding residual component to capture the consistency between the generated and original data for feature representation optimization. Extensive experiments on five real-world datasets demonstrate the effectiveness of MISA in comparison with state-of-the-art methods.
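The matrix-based Rényi’s \(\alpha\)-order entropy functional the abstract refers to has a known closed form over a trace-normalized kernel Gram matrix, with mutual information obtained from the Hadamard product of two Gram matrices. A minimal NumPy sketch of this estimator follows; the RBF kernel, bandwidth, and \(\alpha = 2\) are illustrative choices, not MISA's actual settings:

```python
import numpy as np

def gram(X, sigma=1.0):
    # RBF kernel Gram matrix, normalized to unit trace.
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * X @ X.T
    K = np.exp(-D / (2 * sigma**2))
    return K / np.trace(K)

def renyi_entropy(A, alpha=2.0):
    # Matrix-based Renyi alpha-order entropy of a unit-trace PSD matrix:
    # S_alpha(A) = log2(sum_i lambda_i^alpha) / (1 - alpha).
    lam = np.clip(np.linalg.eigvalsh(A), 0, None)
    return np.log2(np.sum(lam**alpha)) / (1 - alpha)

def mutual_information(X, Y, alpha=2.0, sigma=1.0):
    # I(X; Y) = S(A) + S(B) - S(A o B), with the Hadamard product
    # renormalized to unit trace.
    A, B = gram(X, sigma), gram(Y, sigma)
    AB = A * B
    AB = AB / np.trace(AB)
    return renyi_entropy(A, alpha) + renyi_entropy(B, alpha) - renyi_entropy(AB, alpha)
```

Because the estimate needs only eigenvalues of Gram matrices, it avoids density estimation, which is what makes it usable as a differentiable training signal.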

TIST Journal 2026 Journal Article

PMARL: Multi-Agent Reinforcement Learning in Large-Scale Systems

  • Baofu Fang
  • Qiong Wang
  • Hao Wang
  • Kui Yu
  • Zaijun Wang

Large-scale multi-agent systems face two core challenges: inefficient policy learning and the explosion of state dimensions. Existing methods often rely on manually designed task sequences to guide agents’ learning in stages, but these designs lack adaptability to agents’ learning abilities, making it difficult to ensure appropriate task difficulty. Moreover, the representation capability of current network structures is limited, making it challenging to efficiently handle high-dimensional state information and complex interaction relationships. To address these issues, we propose a Progressive Multi-Agent Reinforcement Learning (PMARL) framework. PMARL introduces a task adapter that adaptively selects task difficulty based on agents’ learning abilities, eliminating reliance on manual experience. Additionally, a Dynamic Dimension Adaptive Network (DDAN) is designed, incorporating hypernetwork and self-attention mechanisms to achieve adaptive feature extraction of high-dimensional states and efficient representation of agent interaction relationships. Experimental results demonstrate that PMARL exhibits higher efficiency and better adaptability than existing methods on large-scale multi-agent tasks.

NeurIPS Conference 2025 Conference Paper

Automatic Visual Instrumental Variable Learning for Confounding-Resistant Domain Generalization

  • Fuyuan Cao
  • Shichang Qiao
  • Kui Yu
  • Jiye Liang

Many confounding-resistant domain generalization methods for image classification have been developed based on causal interventions. However, their reliance on strong assumptions limits their effectiveness in handling unobserved confounders. Although recent work introduces instrumental variables (IVs) to overcome this limitation, the reliance on manually predefined instruments, particularly in the context of visual data, may result in severe bias or invalidity when IV conditions are violated. To address these issues, we propose a novel approach to automatically learning Visual Instrumental Variables for confounding-resistant Domain Generalization (VIV-DG). We observe that certain non-causal visual attributes in image data naturally satisfy the basic conditions required for valid IVs. Motivated by this insight, we propose the visual instrumental variable, a novel concept that extends classical IV theory to the visual domain. Furthermore, we develop an automatic visual instrumental variable learner that enforces IV conditions on learned representations, enabling the automatic learning of valid visual instrumental variables from image data. Ultimately, VIV-DG inherits the strengths of classical IVs to mitigate unobserved confounding and avoids the significant bias caused by violations of IV conditions in predefined IVs. Extensive experiments on multiple benchmarks verify that VIV-DG achieves superior generalization ability.
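For background on the "basic conditions required for valid IVs" that the abstract invokes, a variable \(Z\) is a valid instrument for the effect of \(X\) on \(Y\) if it satisfies relevance (\(Z\) is dependent on \(X\)), exclusion (\(Z\) affects \(Y\) only through \(X\)), and independence from the unobserved confounders. In the classical linear setting these conditions yield the textbook IV estimator:

\[
\hat{\beta}_{\mathrm{IV}} \;=\; \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
\]

VIV-DG's contribution, per the abstract, is to learn representations that satisfy these conditions automatically rather than relying on a manually predefined \(Z\).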

ICML Conference 2025 Conference Paper

Federated Causal Structure Learning with Non-identical Variable Sets

  • Yunxia Wang
  • Fuyuan Cao
  • Kui Yu
  • Jiye Liang

Federated causal structure learning aims to infer causal relationships from data stored on individual clients, with privacy concerns. Most existing methods assume identical variable sets across clients and present federated strategies for aggregating local updates. However, in practice, clients often observe overlapping but non-identical variable sets, and non-overlapping variables may introduce spurious dependencies. Moreover, existing strategies typically reflect only the overall quality of local graphs, ignoring the varying importance of relationships within each graph. In this paper, we study federated causal structure learning with non-identical variable sets, aiming to design an effective strategy for aggregating “correct” and “good” (non-)causal relationships across distributed datasets. Specifically, we first develop theories for detecting spurious dependencies, examining whether the learned relationships are “correct” or not. Furthermore, we define stable relationships as those that are both “correct” and “good” across multiple graphs, and finally design a two-level priority selection strategy for aggregating local updates, obtaining a global causal graph over the integrated variables. Experimental results on synthetic, benchmark and real-world data demonstrate the effectiveness of our method.

AAAI Conference 2025 Conference Paper

Federated Causally Invariant Feature Learning

  • Xianjie Guo
  • Kui Yu
  • Lizhen Cui
  • Han Yu
  • Xiaoxiao Li

Federated feature selection (FFS) is a promising field for selecting informative features while preserving data privacy in federated learning (FL) settings. Existing FFS methods focus on capturing the correlations between features and labels. They struggle to achieve satisfactory performance in the face of data distribution heterogeneity among FL clients, and cannot address the out-of-distribution (OOD) problem that arises when a significant portion of clients do not actively participate in FL training. To address these limitations, we propose Federated Causally Invariant Feature Learning (FedCIFL), a novel approach for learning causally invariant features in a privacy-preserving manner. We design a sample reweighting strategy to eliminate spurious correlations introduced by selection bias and iteratively estimate the federated causal effect between each feature and the labels (with the remaining features initially treated as confounders). By iteratively refining the confounding feature set to identify the true confounders, FedCIFL mitigates the impact of limited local data on the accuracy of federated causal effect estimation. Theoretical analysis proves the correctness of FedCIFL under reasonable assumptions. Extensive experiments on synthetic and real-world datasets demonstrate the superiority of FedCIFL against eight state-of-the-art baselines, beating the best-performing approach by 3.19%, 9.07% and 2.65% in terms of average test Accuracy, RMSE and F1 score, respectively. It is a first-of-its-kind FFS approach capable of handling Non-IID and OOD data simultaneously. The source code is available at https://github.com/Xianjie-Guo/FedCIFL.

IJCAI Conference 2025 Conference Paper

Hybrid Local Causal Discovery

  • Zhaolong Ling
  • Honghui Peng
  • Yiwen Zhang
  • Debo Cheng
  • Xingyu Wu
  • Peng Zhou
  • Kui Yu

Local causal discovery aims to identify and distinguish the direct causes and effects of a target variable from observational data. Due to the inherent incompleteness of local information, popular methods from global causal discovery often face new challenges in local causal discovery tasks, such as 1) erroneous symmetry constraint tests and the resulting cascading errors in constraint-based methods, and 2) confusion within score-based approaches caused by local spurious equivalence classes. To address the above issues, we propose a Hybrid Local Causal Discovery algorithm, called HLCD. Specifically, HLCD initially utilizes a constraint-based approach with the OR rule to obtain a candidate skeleton, which is subsequently refined using a score-based method to eliminate redundant structures. Furthermore, during the local causal orientation phase, HLCD distinguishes between V-structures and equivalence classes by comparing local structure scores between the two, thereby avoiding orientation interference caused by local equivalence class ambiguities. Comprehensive experiments on 14 benchmark Bayesian networks and two real datasets validate that the proposed algorithm outperforms the existing local causal discovery methods.
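The OR rule mentioned in the abstract is a standard symmetry relaxation in skeleton construction: an edge between \(X\) and \(Y\) is kept if either endpoint appears in the other's estimated parents-and-children (PC) set, trading extra false positives (later pruned by the score-based refinement) for fewer missed edges. A toy sketch of the rule, assuming the PC sets have already been estimated:

```python
def or_rule_skeleton(pc):
    # pc: dict mapping each node to its estimated parents-and-children set.
    # OR rule: keep the undirected edge {X, Y} if Y is in PC(X) OR X is
    # in PC(Y) -- i.e., the union of asymmetric PC estimates.
    edges = set()
    for x, neighbors in pc.items():
        for y in neighbors:
            edges.add(frozenset((x, y)))
    return edges
```

The stricter AND rule would instead require both memberships; HLCD's choice of OR at the skeleton stage is what makes the subsequent score-based pruning step necessary.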

AAAI Conference 2025 Conference Paper

Local Causal Discovery Without Causal Sufficiency

  • Zhaolong Ling
  • Jiale Yu
  • Yiwen Zhang
  • Debo Cheng
  • Peng Zhou
  • Xingyu Wu
  • Bingbing Jiang
  • Kui Yu

Local causal discovery is crucial for revealing the causal relationships between specific variables from data. Existing local causal discovery algorithms are designed under the assumption of causal sufficiency, which states that there are no latent common causes for two or more of the observed variables in data. However, the assumption of causal sufficiency is often violated in practice. To address this issue, we first propose the local Maximal Ancestral Graph (MAG), referred to as LocalMAG, to describe the local causal relationships of the target variable in the MAG. Then, we propose a local causal discovery algorithm without the assumption of causal sufficiency, called LatentLCD, to learn the LocalMAG. Specifically, LatentLCD first uses the traditional parents and children discovery algorithm to identify the local causal skeleton that includes latent variables and verifies it theoretically. It then identifies bidirectional edges by determining whether both the target variable and its adjacent variables are colliders, thereby identifying latent variables in the local structure of the target variable. Extensive experiments on synthetic datasets have validated that the proposed LatentLCD algorithm significantly outperforms the state-of-the-art methods.

ICLR Conference 2024 Conference Paper

Causal Inference with Conditional Front-Door Adjustment and Identifiable Variational Autoencoder

  • Ziqi Xu 0001
  • Debo Cheng
  • Jiuyong Li
  • Jixue Liu
  • Lin Liu 0003
  • Kui Yu

An essential and challenging problem in causal inference is causal effect estimation from observational data. The problem becomes more difficult with the presence of unobserved confounding variables. The front-door adjustment is an approach for dealing with unobserved confounding variables. However, the restriction for the standard front-door adjustment is difficult to satisfy in practice. In this paper, we relax some of the restrictions by proposing the concept of conditional front-door (CFD) adjustment and develop the theorem that guarantees the causal effect identifiability of CFD adjustment. By leveraging the ability of deep generative models, we propose CFDiVAE to learn the representation of the CFD adjustment variable directly from data with the identifiable Variational AutoEncoder and formally prove the model identifiability. Extensive experiments on synthetic datasets validate the effectiveness of CFDiVAE and its superiority over existing methods. The experiments also show that the performance of CFDiVAE is less sensitive to the causal strength of unobserved confounding variables. We further apply CFDiVAE to a real-world dataset to demonstrate its potential application.
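For reference, the standard front-door adjustment that CFD relaxes identifies the causal effect of \(X\) on \(Y\) through a mediator \(Z\), even under unobserved confounding of \(X\) and \(Y\):

\[
P(y \mid do(x)) \;=\; \sum_{z} P(z \mid x) \sum_{x'} P(y \mid x', z)\, P(x')
\]

The restriction that a fully observed mediator \(Z\) satisfying the front-door criterion be available in the data is what the abstract describes as "difficult to satisfy in practice," motivating the conditional variant learned by CFDiVAE.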

AAAI Conference 2024 Conference Paper

FedCSL: A Scalable and Accurate Approach to Federated Causal Structure Learning

  • Xianjie Guo
  • Kui Yu
  • Lin Liu
  • Jiuyong Li

As an emerging research direction, federated causal structure learning (CSL) aims at learning causal relationships from decentralized data across multiple clients while preserving data privacy. Existing federated CSL algorithms suffer from scalability and accuracy issues, since they require computationally expensive CSL algorithms to be executed at each client. Furthermore, in real-world scenarios, the number of samples held by each client varies significantly, and existing methods still assign equal weights to the learned structural information from each client, which severely harms the learning accuracy of those methods. To address these two limitations, we propose FedCSL, a scalable and accurate method for federated CSL. Specifically, FedCSL consists of two novel strategies: (1) a federated local-to-global learning strategy that enables FedCSL to scale to high-dimensional data for tackling the scalability issue, and (2) a novel weighted aggregation strategy that does not rely on any complex encryption techniques while preserving data privacy for tackling the accuracy issue. Extensive experiments on benchmark datasets, high-dimensional synthetic datasets and a real-world dataset verify the efficacy of the proposed FedCSL method. The source code is available at https://github.com/Xianjie-Guo/FedCSL.
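The weighted-aggregation idea in the abstract, giving clients with more samples more influence over the global structure, can be illustrated with a simple sample-count-weighted vote over local adjacency matrices. This is a hypothetical sketch of the general idea; FedCSL's actual aggregation strategy is specified in the paper:

```python
import numpy as np

def weighted_vote(local_adjacency, sample_counts, threshold=0.5):
    # local_adjacency: list of 0/1 adjacency matrices, one per client.
    # sample_counts: number of samples held by each client.
    # Each client's vote on every edge is weighted by its share of the
    # total samples; edges whose weighted vote clears the threshold
    # enter the global graph.
    w = np.asarray(sample_counts, dtype=float)
    w /= w.sum()
    votes = sum(wi * A for wi, A in zip(w, local_adjacency))
    return (votes >= threshold).astype(int)
```

Note that only adjacency matrices and counts cross the network, no raw samples, which is the sense in which such schemes preserve data privacy without encryption.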

AAAI Conference 2024 Conference Paper

Learning Robust Rationales for Model Explainability: A Guidance-Based Approach

  • Shuaibo Hu
  • Kui Yu

Selective rationalization can be regarded as a straightforward self-explaining approach for enhancing model explainability in natural language processing tasks. It aims to provide explanations that are more accessible and understandable to non-technical users by first selecting subsets of input texts as rationales and then predicting based on chosen subsets. However, existing methods that follow this select-then-predict framework may suffer from the rationalization degeneration problem, resulting in sub-optimal or unsatisfactory rationales that do not align with human judgments. This problem may further lead to rationalization failure, resulting in meaningless rationales that ultimately undermine people's trust in the rationalization model. To address these challenges, we propose a Guidance-based Rationalization method (G-RAT) that effectively improves robustness against failure situations and the quality of rationales by using a guidance module to regularize selections and distributions. Experimental results on two synthetic settings prove that our method is robust to the rationalization degeneration and failure problems, while the results on two real datasets show its effectiveness in providing rationales in line with human judgments. The source code is available at https://github.com/shuaibo919/g-rat.

IJCAI Conference 2024 Conference Paper

Sample Quality Heterogeneity-aware Federated Causal Discovery through Adaptive Variable Space Selection

  • Xianjie Guo
  • Kui Yu
  • Hao Wang
  • Lizhen Cui
  • Han Yu
  • Xiaoxiao Li

Federated causal discovery (FCD) aims to uncover causal relationships among variables from decentralized data across multiple clients, while preserving data privacy. In practice, the sample quality of each client's local data may vary across different variable spaces, referred to as sample quality heterogeneity. Thus, data from different clients might be suitable for learning different causal relationships among variables. Model aggregation under existing FCD methods requires the entire model parameters from each client, and is thereby unable to handle the sample quality heterogeneity issue. In this paper, we propose the Federated Adaptive Causal Discovery (FedACD) method to bridge this gap. During federated model aggregation, it adaptively selects the causal relationships learned under the "good" variable space (i.e., one with high-quality samples) from each client, while masking those learned under the "bad" variable space (i.e., one with low-quality samples). This way, each client only needs to send the optimal learning results to the server, achieving accurate FCD. Extensive experiments on various types of datasets demonstrate significant advantages of FedACD over existing methods. The source code is available at https://github.com/Xianjie-Guo/FedACD.

TIST Journal 2023 Journal Article

Causal Feature Selection in the Presence of Sample Selection Bias

  • Shuai Yang
  • Xianjie Guo
  • Kui Yu
  • Xiaoling Huang
  • Tingting Jiang
  • Jin He
  • Lichuan Gu

Almost all existing causal feature selection methods are proposed without considering the problem of sample selection bias. However, in practice, as the data-gathering process cannot be fully controlled, sample selection bias often occurs, leading to spurious correlations between features and the class variable, which seriously deteriorates the performance of those existing methods. In this article, we study the problem of causal feature selection under sample selection bias and propose a novel Progressive Causal Feature Selection (PCFS) algorithm which has three phases. First, PCFS learns sample weights to balance the treated-group and control-group distributions corresponding to each feature, removing spurious correlations. Second, based on the sample weights, PCFS uses a weighted cross-entropy model to estimate the causal effect of each feature and removes some irrelevant features from the confounder set. Third, PCFS progressively repeats the first two phases to remove more irrelevant features and finally obtains a causal feature set. Experiments on synthetic and real-world datasets validate the effectiveness of PCFS in comparison with several state-of-the-art classical and causal feature selection methods.
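Once sample weights balance the treated and control distributions for a binary feature, its causal effect can be approximated by a weighted difference in mean outcomes. The sketch below illustrates that final estimation step only; it is a simplified proxy for, not a reproduction of, PCFS's weighted cross-entropy model:

```python
import numpy as np

def weighted_effect(treatment, outcome, weights):
    # treatment: binary indicator for one feature (the "treated" group
    # has the feature, the "control" group does not).
    # weights: per-sample weights, assumed to already balance the
    # covariate distributions of the two groups.
    # Returns the weighted difference in mean outcomes, a simple
    # estimate of the feature's causal effect under balance.
    t = np.asarray(treatment, dtype=bool)
    w = np.asarray(weights, dtype=float)
    y = np.asarray(outcome, dtype=float)
    mu_treated = np.average(y[t], weights=w[t])
    mu_control = np.average(y[~t], weights=w[~t])
    return mu_treated - mu_control
```

Features whose estimated effect is negligible after balancing are the ones a method of this kind would flag as spurious and drop.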

AAAI Conference 2022 Conference Paper

Efficient Causal Structure Learning from Multiple Interventional Datasets with Unknown Targets

  • Yunxia Wang
  • Fuyuan Cao
  • Kui Yu
  • Jiye Liang

We consider the problem of reducing the false discovery rate in multiple high-dimensional interventional datasets with unknown targets. Traditional algorithms directly merge the multiple learned causal graphs, ignoring the contradictions between different datasets and producing many inconsistent edge directions. To reduce this contradictory information, we propose a new algorithm that first learns an interventional Markov equivalence class (I-MEC) before merging the graphs. It utilizes the full power of the constraints available in interventional data, combining ideas from local learning, intervention, and search-and-score techniques in a principled and effective way across different intervention experiments. Specifically, local learning on multiple datasets is used to build a causal skeleton. Perfect interventions destroy some possible triangles, enabling the identification of more V-structures, from which a theoretically correct I-MEC is learned. Search-and-score techniques based on the learned I-MEC then orient the remaining undirected edges. Both theoretical analysis and experiments on benchmark Bayesian networks with 20 to 724 variables validate the effectiveness of our algorithm in reducing the false discovery rate in high-dimensional interventional data.

ECAI Conference 2020 Conference Paper

Causal Query in Observational Data with Hidden Variables

  • Debo Cheng
  • Jiuyong Li
  • Lin Liu 0003
  • Jixue Liu
  • Kui Yu
  • Thuc Duy Le

This paper discusses the problem of causal query in observational data with hidden variables, with the aim of seeking the change of an outcome when “manipulating” a variable while given a set of plausible confounding variables which affect the manipulated variable and the outcome. Such an “experiment on data” to estimate the causal effect of the manipulated variable is useful for validating an experiment design using historical data or for exploring confounders when studying a new relationship. However, existing data-driven methods for causal effect estimation face some major challenges, including poor scalability with high dimensional data, low estimation accuracy due to heuristics used by the global causal structure learning algorithms, and the assumption of causal sufficiency when hidden variables are inevitable in data. In this paper, we develop theorems for using local search to find a superset of the adjustment (or confounding) variables for causal effect estimation from observational data under a realistic pretreatment assumption. The theorems ensure that the unbiased estimate of causal effect is included in the set of causal effects estimated by the superset of adjustment variables. Based on the developed theorems, we propose a data-driven algorithm for causal query. Experiments show that the proposed algorithm is faster and produces better causal effect estimation than an existing data-driven causal effect estimation method with hidden variables. The causal effects estimated by the proposed algorithm are as accurate as those by the state-of-the-art methods using domain knowledge.

AAAI Conference 2020 Conference Paper

Multi-Label Causal Feature Selection

  • Xingyu Wu
  • Bingbing Jiang
  • Kui Yu
  • Huanhuan Chen
  • Chunyan Miao

Multi-label feature selection has received considerable attention during the past decade. However, existing algorithms do not attempt to uncover the underlying causal mechanism; they solve different types of variable relationships individually, ignoring the mutual effects between them. Furthermore, these algorithms lack interpretability: they can only select features for all labels, and cannot explain the correlation between a selected feature and a particular label. To address these problems, in this paper we theoretically study the causal relationships in multi-label data and propose a novel Markov blanket based multi-label causal feature selection (MB-MCF) algorithm. MB-MCF first mines the causal mechanism of labels and features to obtain a complete representation of information about labels. Based on the causal relationships, MB-MCF then selects predictive features and simultaneously distinguishes common features shared by multiple labels from label-specific features owned by single labels. Experiments on real-world datasets validate that MB-MCF can automatically determine the number of selected features while achieving the best performance compared with state-of-the-art methods. An experiment on the Emotions dataset further demonstrates the interpretability of MB-MCF.

TIST Journal 2019 Journal Article

BAMB

  • Zhaolong Ling
  • Kui Yu
  • Hao Wang
  • Lin Liu
  • Wei Ding
  • Xindong Wu

The discovery of the Markov blanket (MB) for feature selection has attracted much attention in recent years, since the MB of the class attribute is the optimal feature subset for feature selection. However, almost all existing MB discovery algorithms focus on either improving computational efficiency or boosting learning accuracy, instead of both. In this article, we propose a novel MB discovery algorithm for balancing efficiency and accuracy, called BAlanced Markov Blanket (BAMB) discovery. To achieve this goal, given a class attribute of interest, BAMB finds candidate PC (parents and children) and spouses and removes false positives from the candidate MB set in one go. Specifically, once a feature is successfully added to the current PC set, BAMB finds the spouses with regard to this feature, then uses the updated PC and spouse sets to remove false positives from the current MB set. This keeps the PC and spouses of the target as small as possible, achieving a trade-off between computational efficiency and learning accuracy. In the experiments, we first compare BAMB with 8 state-of-the-art MB discovery algorithms on 7 benchmark Bayesian networks, then use 10 real-world datasets to compare BAMB with 12 feature selection algorithms, including 8 state-of-the-art MB discovery algorithms and 4 other well-established feature selection methods. On prediction accuracy, BAMB outperforms all 12 compared feature selection algorithms. On computational efficiency, BAMB is close to the IAMB algorithm while being much faster than the remaining seven MB discovery algorithms.
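The structure BAMB searches for, PC plus spouses, is just the graph-theoretic definition of a Markov blanket. BAMB discovers it from data via conditional-independence tests; the toy function below only illustrates the definition itself by reading the blanket off a known DAG (the dict-of-parent-sets representation is an assumption for the example):

```python
def markov_blanket(parents, target):
    # parents: dict mapping each node to its parent set in a known DAG.
    # The Markov blanket of `target` is the union of its parents, its
    # children, and the other parents (spouses) of those children.
    children = {v for v, ps in parents.items() if target in ps}
    spouses = {p for c in children for p in parents[c]} - {target}
    return set(parents.get(target, ())) | children | spouses
```

For the chain A -> T -> C with B -> C, the blanket of T is {A, C, B}: parent A, child C, and spouse B, exactly the three candidate sets BAMB maintains and prunes.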

AAAI Conference 2019 Conference Paper

Joint Semi-Supervised Feature Selection and Classification through Bayesian Approach

  • Bingbing Jiang
  • Xingyu Wu
  • Kui Yu
  • Huanhuan Chen

With increasing data dimensionality, feature selection has become a fundamental task for dealing with high-dimensional data. Semi-supervised feature selection focuses on the problem of learning a relevant feature subset when abundant unlabeled data accompany few labeled data. In recent years, many semi-supervised feature selection algorithms have been proposed. However, these algorithms separate the processes of feature selection and classifier training, and so cannot simultaneously select features and learn a classifier with the selected features. Moreover, they ignore differences in reliability among unlabeled samples and use them all directly in the training stage, which may cause performance degradation. In this paper, we propose a joint semi-supervised feature selection and classification algorithm (JSFS) which adopts a Bayesian approach to automatically select the relevant features and simultaneously learn a classifier. Instead of using all unlabeled samples indiscriminately, JSFS associates each unlabeled sample with a self-adjusting weight to distinguish among them, effectively eliminating irrelevant unlabeled samples via a left-truncated Gaussian prior. Experiments on various datasets demonstrate the effectiveness and superiority of JSFS.