Arrow Research search

Author name cluster

Furui Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers
2 author rows

Possible papers

17

AAAI Conference 2026 Conference Paper

DSAP: Enhancing Generalization in Goal-Conditioned Reinforcement Learning

  • Yiming Wang
  • Kaiyan Zhao
  • Ming Yang
  • Yan Li
  • Furui Liu
  • Jiayu Chen
  • Leong Hou U

Goal-conditioned Reinforcement Learning (RL) is a promising direction for training agents capable of tackling a variety of tasks. However, generalizing to new goals in different environments remains a central challenge for goal-conditioned RL agents. Existing methods often rely on state abstraction, which involves learning abstracted state representations by excluding irrelevant features, to improve generalization. Despite their success in simplified settings, these methods often fail to generalize effectively to realistic environments with varied goals. In this work, we propose to enhance generalization through state abstraction from the perspective of causal inference. We hypothesize that the generalization gap arises in part due to unobserved confounders: latent variables that simultaneously influence both the global and goal states. To address this, we introduce Deconfounded State Abstraction for Policy learning (DSAP), a novel framework that mitigates backdoor confounding by employing a learned causal graph as a *proxy* for the hidden confounders. We provide theoretical analysis demonstrating that DSAP improves both the learning process and the generalization capability of goal-conditioned policies. Extensive experiments across different settings of multiple benchmarks show that our method significantly outperforms existing methods.

ECAI Conference 2025 Conference Paper

DiTAC: Discrete Teamwork Abstraction for Ad Hoc Collaboration

  • Jing Wang 0055
  • Pengjie Gu
  • Mengchen Zhao
  • Guangyong Chen
  • Furui Liu
  • Pheng-Ann Heng

Training autonomous agents to collaborate with unknown teammates in cooperative multi-agent environments remains a fundamental challenge in ad hoc teamwork research. Conventional approaches rely heavily on online interactions with arbitrary teammates under the assumption of full observability. However, in real-world scenarios, teammate policies are often inaccessible, making historical trajectory rollouts a more practical alternative. We propose DiTAC, a method that learns discrete teamwork abstractions for ad hoc collaboration by automatically extracting latent cooperation patterns from short trajectory segments and adapting effectively to diverse teammate behaviors. To mitigate the out-of-distribution challenge, we constrain learned representations within a discrete codebook. Furthermore, we employ a masked bidirectional transformer architecture to infer teammate behaviors from local observations, thereby relaxing the full observability assumption. Empirical results demonstrate that DiTAC significantly outperforms existing baselines and its variants across widely-used ad hoc teamwork tasks.

AAMAS Conference 2025 Conference Paper

Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer

  • Yaodong Yang
  • Guangyong Chen
  • Hongyao Tang
  • Furui Liu
  • Danruo Deng
  • Pheng-Ann Heng

Overestimation in single-agent reinforcement learning has been extensively studied. In contrast, overestimation in the multiagent setting has received comparatively little attention, although it increases with the number of agents and leads to severe learning instability. Previous works concentrate on reducing overestimation in the estimation process of the target Q-value; they ignore the follow-up optimization process of the online Q-network, making it hard to fully address the complex multiagent overestimation problem. To solve this challenge, we first establish an iterative estimation-optimization analysis framework for multiagent value-mixing Q-learning. Our analysis reveals that multiagent overestimation not only comes from the computation of the target Q-value but also accumulates in the online Q-network's optimization. Motivated by this, we propose the Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer algorithm to tackle multiagent overestimation from two aspects. First, we extend the random ensemble technique into the estimation of target individual and global Q-values to derive a lower update target. Second, we propose a novel hypernet regularizer on hypernetwork weights and biases to constrain the optimization of the online global Q-network and prevent overestimation accumulation. Extensive experiments in MPE and SMAC show that the proposed method successfully addresses overestimation across various tasks.
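The "random ensemble" step described above, taking a minimum over a randomly sampled subset of target Q-estimates to obtain a lower, less overestimation-prone update target, can be sketched as follows. The ensemble size, subset size, and array shapes are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

def random_ensemble_target(q_estimates: np.ndarray, subset_size: int,
                           rng: np.random.Generator) -> np.ndarray:
    """Lower update target: element-wise min over a random subset of
    ensemble target Q-estimates.

    q_estimates: shape (ensemble_size, batch), one row per ensemble member.
    Returns: shape (batch,), the minimum over the sampled subset.
    """
    ensemble_size = q_estimates.shape[0]
    idx = rng.choice(ensemble_size, size=subset_size, replace=False)
    return q_estimates[idx].min(axis=0)

rng = np.random.default_rng(0)
# Ten ensemble members, each giving a (possibly overestimated) target for 4 states.
q = rng.normal(loc=1.0, scale=0.5, size=(10, 4))
target = random_ensemble_target(q, subset_size=2, rng=rng)
# Minimizing over a random subset damps the upward bias while keeping the
# target less pessimistic than a min over the full ensemble.
```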

JBHI Journal 2025 Journal Article

scSTD: A Swin Transformer-Based Diffusion Model for Recovering scRNA-Seq Data

  • Yang Li
  • Furui Liu
  • Junlei Zhou
  • Fangyuan Shi
  • Zhenhua Yu

Dropout events and technical noise are pervasive challenges in single-cell RNA sequencing (scRNA-seq) data, often obscuring true gene expression profiles and undermining the reliability of downstream analyses. Existing imputation and denoising methods offer partial relief but frequently struggle with over-smoothing and fail to fully capture the complex heterogeneity of cellular states. To address these limitations, we introduce scSTD, a novel imputation and denoising framework that uniquely combines the Swin Transformer (SwinT) architecture with a latent diffusion model. In scSTD, a deep autoencoder first encodes each cell into a compact latent embedding, which is then modeled via a SwinT-based latent diffusion process designed to learn the rich, multimodal distribution of scRNA-seq data. This integration enables scSTD to accurately recover gene expression profiles while preserving subtle biological variation. By synthesizing realistic latent neighbors for each cell and aggregating their decoded outputs, scSTD achieves high-fidelity imputation and denoising. Comprehensive evaluations on both synthetic and real scRNA-seq datasets demonstrate that scSTD significantly outperforms existing methods in recovering true gene expression profiles and maintaining the topological integrity of cellular landscapes.

AAAI Conference 2024 Conference Paper

ANEDL: Adaptive Negative Evidential Deep Learning for Open-Set Semi-supervised Learning

  • Yang Yu
  • Danruo Deng
  • Furui Liu
  • Qi Dou
  • Yueming Jin
  • Guangyong Chen
  • Pheng Ann Heng

Semi-supervised learning (SSL) methods assume that labeled data, unlabeled data and test data are from the same distribution. Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers). Most previous works focused on outlier detection via binary classifiers, which suffer from insufficient scalability and inability to distinguish different types of uncertainty. In this paper, we propose a novel framework, Adaptive Negative Evidential Deep Learning (ANEDL) to tackle these limitations. Concretely, we first introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference. Furthermore, we propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers. As demonstrated empirically, our proposed method outperforms existing state-of-the-art methods across four datasets.

AAAI Conference 2024 Conference Paper

DR-Label: Label Deconstruction and Reconstruction of GNN Models for Catalysis Systems

  • Bowen Wang
  • Chen Liang
  • Jiaze Wang
  • Jiezhong Qiu
  • Furui Liu
  • Shaogang Hao
  • Dong Li
  • Guangyong Chen

Attaining the equilibrium geometry of a catalyst-adsorbate system is key to fundamentally assessing its effective properties, such as adsorption energy. While machine learning methods with advanced representation or supervision strategies have been applied to boost and guide the relaxation processes of catalysis systems, existing methods that produce linearly aggregated geometry predictions are susceptible to edge-representation ambiguity and are therefore vulnerable to graph variations. In this paper, we present DR-Label, a novel graph neural network (GNN) supervision and prediction strategy. Our approach mitigates the multiplicity of solutions in edge representations and encourages model predictions that are independent of graph structural variations. DR-Label first Deconstructs finer-grained equilibrium-state information by projecting the node-level supervision signal onto each edge. Reversely, the model Reconstructs a more robust equilibrium-state prediction by converting edge-level predictions back to the node level via a sphere-fitting algorithm. When applied to three fundamentally different models, DR-Label consistently enhanced performance. Leveraging the graph-structure invariance of the DR-Label strategy, we further propose DRFormer, which applies explicit intermediate positional updates and achieves a new state-of-the-art performance on the Open Catalyst 2020 (OC20) dataset and the Cu-based single-atom alloys CO adsorption (SAA) dataset. We expect our work to highlight vital principles for advancing geometric GNN models for catalysis systems and beyond. Our code is available at https://github.com/bowenwang77/DR-Label.

NeurIPS Conference 2024 Conference Paper

Rethinking Exploration in Reinforcement Learning with Effective Metric-Based Exploration Bonus

  • Yiming Wang
  • Kaiyan Zhao
  • Furui Liu
  • Leong Hou U

Enhancing exploration in reinforcement learning (RL) through intrinsic rewards, specifically by leveraging *state discrepancy* measures in various metric spaces as exploration bonuses, has emerged as a prevalent strategy to encourage agents to visit novel states. The critical factor lies in how to quantify the difference between adjacent states as *novelty* to promote effective exploration. Nonetheless, existing methods that evaluate state discrepancy in the latent space under the $L_1$ or $L_2$ norm often depend on count-based episodic terms as scaling factors for exploration bonuses, significantly limiting their scalability. Additionally, methods that utilize the bisimulation metric for evaluating state discrepancies face a theory-practice gap due to improper approximations in metric learning, and particularly struggle with *hard exploration* tasks. To overcome these challenges, we introduce the **E**ffective **M**etric-based **E**xploration-bonus (EME). EME critically examines and addresses the inherent limitations and approximation inaccuracies of current metric-based state-discrepancy methods, proposing a robust metric for state-discrepancy evaluation backed by comprehensive theoretical analysis. Furthermore, we propose a diversity-enhanced scaling factor for the exploration bonus that is dynamically adjusted by the variance of predictions from an ensemble of reward models, enhancing exploration effectiveness in particularly challenging scenarios. Extensive experiments on hard exploration tasks in Atari games, Minigrid, Robosuite, and Habitat illustrate our method's scalability to various scenarios. The project website can be found at https://sites.google.com/view/effective-metric-exploration.
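The diversity-enhanced scaling idea in the abstract above, scaling a metric-based state-discrepancy bonus by the disagreement (variance) of a reward-model ensemble, can be sketched as a few lines of numpy. The function name, shapes, and toy numbers are assumptions for illustration, not EME's actual implementation.

```python
import numpy as np

def exploration_bonus(dist: np.ndarray, reward_preds: np.ndarray) -> np.ndarray:
    """Metric-based bonus scaled by reward-ensemble disagreement.

    dist:         shape (batch,), metric distance between consecutive states.
    reward_preds: shape (ensemble, batch), predictions from an ensemble of
                  reward models; their variance measures epistemic novelty.
    """
    scale = reward_preds.var(axis=0)  # high disagreement -> larger bonus
    return dist * scale

dist = np.array([0.1, 0.5, 0.5])
preds = np.array([[1.0, 0.0, 1.0],
                  [1.0, 2.0, 1.0]])  # members disagree on the 2nd state only
bonus = exploration_bonus(dist, preds)
# Only the state where the ensemble disagrees receives a nonzero bonus.
```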

AAAI Conference 2024 Conference Paper

Where and How to Attack? A Causality-Inspired Recipe for Generating Counterfactual Adversarial Examples

  • Ruichu Cai
  • Yuxuan Zhu
  • Jie Qiao
  • Zefeng Liang
  • Furui Liu
  • Zhifeng Hao

Deep neural networks (DNNs) have been demonstrated to be vulnerable to well-crafted adversarial examples, generated through either $L_p$-norm-restricted or unrestricted attacks. Nevertheless, the majority of these approaches assume that adversaries can modify any features as they wish and neglect the causal generating process of the data, which is unrealistic and impractical. For instance, a modification in income would inevitably impact features like the debt-to-income ratio within a banking system. By considering this underappreciated causal generating process, we first pinpoint the source of the vulnerability of DNNs through the lens of causality and give theoretical results that answer *where* to attack. Second, considering the consequences of attack interventions on the current state of the examples in order to generate more realistic adversarial examples, we propose CADE, a framework that generates Counterfactual ADversarial Examples to answer *how* to attack. The empirical results demonstrate CADE's effectiveness, as evidenced by its competitive performance across diverse attack scenarios, including white-box, transfer-based, and random intervention attacks.

NeurIPS Conference 2023 Conference Paper

Efficient Potential-based Exploration in Reinforcement Learning using Inverse Dynamic Bisimulation Metric

  • Yiming Wang
  • Ming Yang
  • Renzhi Dong
  • Binbin Sun
  • Furui Liu
  • Leong Hou U

Reward shaping is an effective technique for integrating domain knowledge into reinforcement learning (RL). However, traditional approaches like potential-based reward shaping rely entirely on manually designed shaping reward functions, which significantly restricts exploration efficiency and introduces human cognitive biases. A number of RL methods boost exploration by designing an intrinsic reward signal as an exploration bonus; nevertheless, these methods heavily rely on a count-based episodic term in the bonus, which falls short in scalability. To address these limitations, we propose a general end-to-end potential-based exploration bonus for deep RL via potentials of state discrepancy, which motivates the agent to discover novel states and provides denser rewards without manual intervention. Specifically, we measure the novelty of adjacent states by calculating their distance using a bisimulation-metric-based potential function, which enhances the agent's exploration and ensures policy invariance. In addition, we offer a theoretical guarantee on our inverse dynamic bisimulation metric, bounding the value difference and ensuring that the agent explores states with higher TD error, thus significantly improving training efficiency. The proposed approach, named **LIBERTY** (exp**L**oration v**I**a **B**isimulation m**E**t**R**ic-based s**T**ate discrepanc**Y**), is comprehensively evaluated on the MuJoCo and Arcade Learning Environments. Extensive experiments verify the superiority and scalability of our algorithm compared with other competitive methods.
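The policy invariance claimed above comes from the classic potential-based form $F(s, s') = \gamma\,\Phi(s') - \Phi(s)$ (Ng, Harada, and Russell, 1999). LIBERTY derives $\Phi$ from its inverse dynamic bisimulation metric; the sketch below stands in a toy distance-to-goal potential, which is a hypothetical placeholder, not the paper's metric.

```python
GAMMA = 0.99

def shaping_reward(phi_s: float, phi_next: float, gamma: float = GAMMA) -> float:
    """Potential-based shaping F(s, s') = gamma * Phi(s') - Phi(s).

    Shaping of this form provably preserves the optimal policy; the choice
    of potential Phi only changes how dense the guidance signal is.
    """
    return gamma * phi_next - phi_s

# Toy potential (hypothetical): negative distance to a goal state at x = 10.
def phi(x: float) -> float:
    return -abs(10.0 - x)

# Moving toward the goal yields a positive shaping reward,
# moving away yields a negative one; along any fixed trajectory the
# terms telescope, so total shaped return differs only by boundary terms.
toward = shaping_reward(phi(3.0), phi(4.0))
away = shaping_reward(phi(4.0), phi(3.0))
```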

NeurIPS Conference 2023 Conference Paper

Invariant Learning via Probability of Sufficient and Necessary Causes

  • Mengyue Yang
  • Zhen Fang
  • Yonggang Zhang
  • Yali Du
  • Furui Liu
  • Jean-Francois Ton
  • Jianhong Wang
  • Jun Wang

Out-of-distribution (OOD) generalization is indispensable for learning models in the wild, where the testing distribution is typically unknown and different from the training distribution. Recent methods derived from causality have shown great potential in achieving OOD generalization. However, existing methods mainly focus on the invariance property of causes, while largely overlooking sufficiency and necessity conditions. Namely, a necessary but insufficient cause (feature) is invariant to distribution shift, yet it may not achieve the required accuracy. By contrast, a sufficient yet unnecessary cause (feature) tends to fit specific data well but may carry a risk when adapting to a new domain. To capture the information of sufficient and necessary causes, we employ a classical concept, the probability of sufficient and necessary causes (PNS), which indicates the probability that a feature is both a necessary and a sufficient cause. To associate PNS with OOD generalization, we propose the PNS risk and formulate an algorithm to learn representations with high PNS values. We theoretically analyze and prove the generalizability of the PNS risk. Experiments on both synthetic and real-world benchmarks demonstrate the effectiveness of the proposed method. The detailed implementation can be found at the GitHub repository: https://github.com/ymy4323460/CaSN.
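For reference, the PNS quantity the abstract builds on has a standard definition in Pearl's counterfactual notation (quoted from the causal-inference literature, not from the paper itself):

```latex
% Probability of necessity and sufficiency (PNS) for binary X, Y:
\mathrm{PNS} \;=\; P\!\left(Y_{x} = y,\; Y_{x'} = y'\right)
% i.e. the probability that Y would equal y had X been x, AND would
% equal y' had X been x'. Under exogeneity and monotonicity it is
% identifiable from observational data:
\mathrm{PNS} \;=\; P(y \mid x) \;-\; P(y \mid x')
```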

AAAI Conference 2023 Conference Paper

Learning from Good Trajectories in Offline Multi-Agent Reinforcement Learning

  • Qi Tian
  • Kun Kuang
  • Furui Liu
  • Baoxiang Wang

Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets, an important step toward deploying multi-agent systems in real-world applications. However, in practice, each individual behavior policy that generates multi-agent joint trajectories usually performs at a different level; e.g., one agent may follow a random policy while the others follow medium-quality policies. In cooperative games with a global reward, an agent trained by existing offline MARL methods often inherits this random policy, jeopardizing the utility of the entire team. In this paper, we investigate offline MARL with explicit consideration of the diversity of agent-wise trajectories and propose a novel framework called Shared Individual Trajectories (SIT) to address this problem. Specifically, an attention-based reward decomposition network assigns credit to each agent through a differentiable key-value memory mechanism in an offline manner. These decomposed credits are then used to reconstruct the joint offline dataset into a prioritized experience replay of individual trajectories, so that agents can share their good trajectories and conservatively train their policies with a graph attention network (GAT) based critic. We evaluate our method in both discrete control (i.e., StarCraft II and the multi-agent particle environment) and continuous control (i.e., Multi-Agent MuJoCo). The results indicate that our method achieves significantly better results on complex and mixed offline multi-agent datasets, especially when the difference in data quality between individual trajectories is large.

AAAI Conference 2023 Conference Paper

Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation

  • Anpeng Wu
  • Kun Kuang
  • Ruoxuan Xiong
  • Minqin Zhu
  • Yuxuan Liu
  • Bo Li
  • Furui Liu
  • Zhihua Wang

The advent of the big data era brought new opportunities and challenges for estimating treatment effects in data fusion, that is, on a mixed dataset collected from multiple sources (each source with an independent treatment assignment mechanism). Due to possibly omitted source labels and unmeasured confounders, traditional methods cannot estimate the individual treatment assignment probability or infer treatment effects effectively. Therefore, we propose to reconstruct the source label and model it as a Group Instrumental Variable (GIV) to implement IV-based regression for treatment effect estimation. In this paper, we conceptualize this line of thought and develop a unified framework (Meta-EM) to (1) map the raw data into a representation space to construct linear mixed models for the assigned treatment variable; (2) estimate the distribution differences and model the GIV for the different treatment assignment mechanisms; and (3) adopt an alternating training strategy to iteratively optimize the representations and the joint distribution to model the GIV for IV regression. Empirical results demonstrate the advantages of our Meta-EM compared with state-of-the-art methods. The project page with the code and supplementary materials is available at https://github.com/causal-machine-learning-lab/meta-em.

ICML Conference 2023 Conference Paper

Uncertainty Estimation by Fisher Information-based Evidential Deep Learning

  • Danruo Deng
  • Guangyong Chen
  • Yang Yu
  • Furui Liu
  • Pheng-Ann Heng

Uncertainty estimation is a key factor in making deep learning reliable in practical applications. Recently proposed evidential neural networks explicitly account for different uncertainties by treating the network's outputs as evidence to parameterize a Dirichlet distribution, and achieve impressive performance in uncertainty estimation. However, for samples with high data uncertainty that are nevertheless annotated with one-hot labels, the evidence-learning process for the mislabeled classes is over-penalized and remains hindered. To address this problem, we propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL). In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, according to which we dynamically reweight the objective loss terms to make the network focus more on the representation learning of uncertain classes. The generalization ability of our network is further improved by optimizing the PAC-Bayesian bound. As demonstrated empirically, our proposed method consistently outperforms traditional EDL-related algorithms in multiple uncertainty estimation tasks, especially in the more challenging few-shot classification settings.
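The Dirichlet parameterization that evidential networks rest on (subjective-logic EDL in the style of Sensoy et al.; the Fisher-information reweighting that is this paper's contribution is not reproduced here) can be sketched in a few lines:

```python
import numpy as np

def dirichlet_uncertainty(evidence: np.ndarray):
    """Subjective-logic EDL: evidence -> Dirichlet params, beliefs, vacuity.

    evidence: shape (K,), non-negative per-class evidence from the network.
    Returns (belief, uncertainty): belief masses b_k = e_k / S and
    vacuity u = K / S, where alpha_k = e_k + 1 and S = sum(alpha),
    so that sum(belief) + uncertainty == 1.
    """
    alpha = evidence + 1.0
    strength = alpha.sum()
    belief = evidence / strength
    uncertainty = len(evidence) / strength
    return belief, uncertainty

# Confident sample: lots of evidence for class 0 -> low vacuity.
b_conf, u_conf = dirichlet_uncertainty(np.array([50.0, 1.0, 1.0]))
# Uncertain sample: almost no evidence at all -> vacuity near 1.
b_vac, u_vac = dirichlet_uncertainty(np.array([0.1, 0.1, 0.1]))
```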

NeurIPS Conference 2022 Conference Paper

ConfounderGAN: Protecting Image Data Privacy with Causal Confounder

  • Qi Tian
  • Kun Kuang
  • Kelu Jiang
  • Furui Liu
  • Zhihua Wang
  • Fei Wu

The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet. However, it also means that users' private data may be collected by commercial organizations without consent and used to train their models. It is therefore important to develop methods or tools that prevent unauthorized data exploitation. In this paper, we propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners. Specifically, the noise produced by the generator for each image has the confounder property: it builds spurious correlations between images and labels, so that a model cannot learn the correct mapping from images to labels on this noise-added dataset. Meanwhile, the discriminator is used to ensure that the generated noise is small and imperceptible, thereby preserving the normal utility of the encrypted image for humans. Experiments are conducted on six image classification datasets, including three natural object datasets and three medical datasets. The results demonstrate that our method not only outperforms state-of-the-art methods in standard settings, but can also be applied to fast encryption scenarios. Moreover, we present a series of transferability and stability experiments to further illustrate the effectiveness and superiority of our method.

ICML Conference 2022 Conference Paper

Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning

  • Jiahui Li 0003
  • Kun Kuang 0001
  • Baoxiang Wang 0001
  • Furui Liu
  • Long Chen 0016
  • Changjie Fan
  • Fei Wu 0001
  • Jun Xiao 0001

Value decomposition (VD) methods have been widely used in cooperative multi-agent reinforcement learning (MARL), where credit assignment plays an important role in guiding the agents’ decentralized execution. In this paper, we investigate VD from a novel perspective of causal inference. We first show that the environment in existing VD methods is an unobserved confounder as the common cause factor of the global state and the joint value function, which leads to the confounding bias on learning credit assignment. We then present our approach, deconfounded value decomposition (DVD), which cuts off the backdoor confounding path from the global state to the joint value function. The cut is implemented by introducing the trajectory graph, which depends only on the local trajectories, as a proxy confounder. DVD is general enough to be applied to various VD methods, and extensive experiments show that DVD can consistently achieve significant performance gains over different state-of-the-art VD methods on StarCraft II and MACO benchmarks.
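The backdoor deconfounding that DVD performs (and that DSAP above also appeals to) rests on the standard backdoor adjustment identity; with a proxy confounder $Z$ (DVD's trajectory graph) blocking the backdoor path from the global state $s$ to the joint value, the interventional distribution is the usual one from causal inference (a textbook identity, not the paper's exact estimator):

```latex
% Backdoor adjustment: Z blocks all backdoor paths from s to the value Q.
P\big(Q \mid \mathrm{do}(s)\big) \;=\; \sum_{z} P\big(Q \mid s, z\big)\, P(z)
```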

JMLR Journal 2022 Journal Article

Weakly Supervised Disentangled Generative Causal Representation Learning

  • Xinwei Shen
  • Furui Liu
  • Hanze Dong
  • Qing Lian
  • Zhitang Chen
  • Tong Zhang

This paper proposes a Disentangled gEnerative cAusal Representation (DEAR) learning method under appropriate supervised information. Unlike existing disentanglement methods that enforce independence of the latent variables, we consider the general case where the underlying factors of interest can be causally related. We show that previous methods with independent priors fail to disentangle causally related factors even under supervision. Motivated by this finding, we propose a new disentangled learning method called DEAR that enables causal controllable generation and causal representation learning. The key ingredient of this new formulation is the use of a structural causal model (SCM) as the prior distribution for a bidirectional generative model. The prior is then trained jointly with a generator and an encoder using a suitable GAN algorithm, incorporating supervised information on the ground-truth factors and their underlying causal structure. We provide theoretical justification of the identifiability and asymptotic convergence of the proposed method. We conduct extensive experiments on both synthesized and real data sets to demonstrate the effectiveness of DEAR in causal controllable generation, and the benefits of the learned representations for downstream tasks in terms of sample efficiency and distributional robustness.

TIST Journal 2015 Journal Article

Causal Discovery on Discrete Data with Extensions to Mixture Model

  • Furui Liu
  • Laiwan Chan

In this article, we deal with the causal discovery problem on discrete data. First, we present a causal discovery method for traditional additive noise models that identifies the causal direction by analyzing the supports of the conditional distributions. Then, we present a causal mixture model to address the problem that the function transforming cause to effect varies across the observations. We propose a novel method called Support Analysis (SA) for causal discovery with the mixture model. Experiments using synthetic and real data are presented to demonstrate the performance of our proposed algorithm.