Arrow Research search

Author name cluster

Guan Yuan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
1 author row

Possible papers

AAAI Conference 2026 Conference Paper

Noise-Aware Graph-Based Cognitive Diagnostic Framework Through Low-Rank Alignment

  • Guixian Zhang
  • Yanmei Zhang
  • Guan Yuan
  • Shang Liu
  • Xiaojing Du
  • Debo Cheng

Graph Neural Networks (GNNs) have effectively improved the performance of Cognitive Diagnosis Models (CDMs). Existing works have proposed a series of Graph-based Cognitive Diagnosis Frameworks (GCDFs) to enhance robustness to noise. However, these robust designs are often general methods for GNNs and are not designed for cognitive diagnosis, which undermines real cognitive information during the denoising process. Interestingly, a noteworthy phenomenon has been overlooked: even without robustness designs, GCDFs can still learn correct information in noisy environments. In this paper, we conduct a comprehensive empirical analysis of this issue. We find that noise primarily accumulates in the lower singular components, while the principal subspaces of the representations remain stable even in noisy environments. Based on these findings, we propose a Noise-aware Cognitive Diagnostic framework based on Low-rank Alignment, named NCDLA. The framework first performs low-rank reconstruction of the interaction matrix between students and exercises, retaining only the larger singular values to achieve noise reduction. Then, the reconstructed interaction matrix and the original interaction matrix are each combined with the Q matrix to form a noise-reduced heterogeneous graph and an original heterogeneous graph. To distinguish between the interaction patterns of correct and incorrect responses, we decompose the heterogeneous graph according to the type of response. NCDLA denoises student and exercise representations through a self-supervised strategy based on low-rank reconstruction and a spectral anchor regularisation method. Extensive experiments on three datasets demonstrate that NCDLA achieves optimal prediction performance and robustness.
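The low-rank reconstruction step described in this abstract can be illustrated with a truncated SVD. The following is a minimal sketch, not the NCDLA implementation: the function name `low_rank_reconstruct`, the toy response matrix `R`, and the choice `k=2` are assumptions made for illustration only.

```python
import numpy as np

def low_rank_reconstruct(R, k):
    """Rebuild R from its k largest singular values (truncated SVD)."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Hypothetical student-by-exercise response matrix (1 = correct, 0 = incorrect).
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

R_hat = low_rank_reconstruct(R, k=2)
print(np.round(R_hat, 2))
```

By the Eckart-Young theorem, the Frobenius error of this rank-2 approximation equals the root sum of squares of the discarded singular values, which is where the abstract reports noise tends to accumulate.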

AAAI Conference 2026 Conference Paper

Stability-Aware Reinforcement Learning for Robust Class Integration Test Order Generation

  • Yanru Ding
  • Yanmei Zhang
  • Guan Yuan
  • Shujuan Jiang
  • Wei Dai
  • Luciano Baresi

Generating a class integration test order (CITO) is essential to reduce the overhead of test stub construction (the primary cost in integration testing) and to ensure system reliability in complex software systems. Although reinforcement learning (RL) has shown promise in automating CITO generation, existing methods suffer from unstable policy learning and limited robustness against structural perturbations and defect injection. These challenges stem from insufficient reward shaping and the lack of reliable oracles for validation. To address these limitations, we propose LM-CITO, a stability-aware RL framework that integrates Lyapunov-guided reward shaping with semantic validation through metamorphic testing (MT). Specifically, we design a Lyapunov energy function over class dependency graphs to promote monotonic structural convergence during training, and define metamorphic relations (MRs) to verify behavioral consistency under controlled perturbations. Extensive experiments on six real-world systems demonstrate that LM-CITO consistently produces more effective policies, yielding CITOs with significantly reduced stubbing costs compared to baseline models. Furthermore, MT verifies the capability of our MRs to detect defects in 19 injected bug variants, confirming the robustness of LM-CITO under various fault-induced perturbations. These results highlight the synergy of stability guidance and MR-based validation, offering an effective, principled solution for oracle-free RL in software testing.
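The idea of shaping a stub-cost reward with an energy function can be sketched as potential-based shaping on a toy dependency graph. Everything here is an assumption for illustration: the abstract does not give the form of the Lyapunov energy function used by LM-CITO, so `energy`, `new_stubs`, `shaped_reward`, and the three-class graph `deps` are hypothetical stand-ins.

```python
def new_stubs(placed, cls, deps):
    """Stubs needed when integrating cls: its dependencies not yet integrated."""
    return sum(1 for a, b in deps if a == cls and b not in placed)

def energy(placed, deps):
    """Hypothetical energy: dependency edges with at least one unplaced endpoint."""
    return sum(1 for a, b in deps if a not in placed or b not in placed)

def shaped_reward(placed, cls, deps):
    """Base reward (negative stub cost) plus the drop in energy after placing cls."""
    base = -new_stubs(placed, cls, deps)
    after = placed | {cls}
    return base + (energy(placed, deps) - energy(after, deps))

def episode_return(order, deps):
    """Sum of shaped rewards when classes are integrated in the given order."""
    placed, total = set(), 0
    for cls in order:
        total += shaped_reward(placed, cls, deps)
        placed = placed | {cls}
    return total

# Hypothetical dependencies: (a, b) means class a depends on class b.
deps = [("A", "B"), ("B", "C"), ("A", "C")]
print(episode_return(["C", "B", "A"], deps))  # dependencies first: no stubs
print(episode_return(["A", "B", "C"], deps))  # dependents first: three stubs
```

In this sketch the energy never increases as classes are placed, and the shaping terms telescope over an episode, so the ranking of complete orders by stub cost is unchanged while intermediate steps receive denser feedback.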

TIST Journal 2026 Journal Article

Towards Fair Graph Representation Learning by Overcoming Social Homophily

  • Guixian Zhang
  • Guan Yuan
  • Debo Cheng
  • Lin Liu
  • Jiuyong Li
  • Shichao Zhang

With the widespread use of Graph Neural Networks (GNNs) for representation learning from network data, the fairness of GNN models has attracted great attention lately. Fair GNNs aim to ensure that node representations can be accurately classified, but not easily associated with a specific group. Existing advanced approaches essentially enhance the generalisation of node representations through data augmentation strategies and do not directly impose constraints on the fairness of GNNs. In this work, we identify that a fundamental reason for the unfairness of GNNs is the phenomenon of social homophily, i.e., users in the same group are more inclined to congregate. The message-passing mechanism of GNNs can cause users in the same group to have similar representations due to social homophily, leading model predictions to establish spurious correlations with sensitive attributes. Motivated by this observation, we propose a method called Equity-Aware GNN (EAGNN) for fair graph representation learning. Specifically, to ensure that model predictions are independent of sensitive attributes while maintaining prediction performance, we introduce constraints for fair representation learning based on three principles: sufficiency, independence and separation. We theoretically demonstrate that our EAGNN method can effectively achieve group fairness. Extensive experiments on three datasets with varying levels of social homophily illustrate that our EAGNN method achieves state-of-the-art performance across two fairness metrics and offers competitive effectiveness.
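The independence and separation principles named in this abstract are commonly measured with group-fairness gaps. The sketch below assumes binary predictions and a binary sensitive attribute, and uses statistical parity difference and equal opportunity difference as examples; the abstract does not name its two fairness metrics, so this pairing, the function names, and the toy data are assumptions.

```python
def positive_rate(preds):
    """Fraction of positive predictions; 0.0 for an empty group."""
    return sum(preds) / len(preds) if preds else 0.0

def statistical_parity_diff(y_pred, s):
    """Independence check: gap in positive prediction rate between the two groups."""
    g0 = [p for p, g in zip(y_pred, s) if g == 0]
    g1 = [p for p, g in zip(y_pred, s) if g == 1]
    return abs(positive_rate(g0) - positive_rate(g1))

def equal_opportunity_diff(y_pred, y_true, s):
    """Separation check (positives only): gap in true positive rate between groups."""
    g0 = [p for p, y, g in zip(y_pred, y_true, s) if g == 0 and y == 1]
    g1 = [p for p, y, g in zip(y_pred, y_true, s) if g == 1 and y == 1]
    return abs(positive_rate(g0) - positive_rate(g1))

# Hypothetical predictions, labels, and sensitive attribute for eight nodes.
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
y_true = [1, 0, 1, 0, 1, 1, 0, 0]
s      = [0, 0, 0, 0, 1, 1, 1, 1]

print(statistical_parity_diff(y_pred, s))
print(equal_opportunity_diff(y_pred, y_true, s))
```

A value of zero for either gap means the corresponding criterion holds exactly on this sample; smaller is fairer.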

IJCAI Conference 2025 Conference Paper

Causality-Inspired Disentanglement for Fair Graph Neural Networks

  • Guixian Zhang
  • Debo Cheng
  • Guan Yuan
  • Shang Liu
  • Yanmei Zhang

Fair graph neural networks aim to eliminate discriminatory biases in predictions. Existing approaches often rely on adversarial learning to mitigate dependencies between sensitive attributes and labels but face challenges due to optimisation difficulties. A key limitation lies in neglecting intrinsic causality, which may lead to the entanglement of sensitive and causal factors, discarding causal factors or retaining sensitive factors in the final prediction, especially on unbalanced datasets. To address this issue, we propose a Causality-inspired Disentangled framework for Fair Graph neural networks (CDFG). In CDFG, node representations are conceptualised as a combination of causal and sensitive factors, enabling fair representation learning by only utilising the causal factors. We first use a counterfactual data generation mechanism to generate counterfactual data with similar causal factors but completely different sensitive factors. Then, we input real-world data and counterfactual data into the factor disentanglement module to achieve independence and disentanglement between the causal factors and sensitive factors. Finally, an adaptive mask module extracts the causal representation for fair and accurate graph-based predictions. Extensive experiments on three widely used datasets demonstrate that CDFG consistently outperforms existing methods, achieving competitive utility and significantly improved fairness.

IJCAI Conference 2025 Conference Paper

RobustHAR: Multi-scale Spatial-temporal Masked Self-supervised Pre-training for Robust Human Activity Recognition

  • Xiao Liu
  • Guan Yuan
  • Yanmei Zhang
  • Shang Liu
  • Qiuyan Yan

Human activity recognition (HAR) is prone to performance degradation in real-world applications because data can go missing both within and across sensor channels. Masked modeling, one mainstream paradigm of self-supervised pre-training, can learn robust representations across sensors in missing-data scenarios by reconstructing the masked content from the unmasked part. However, existing methods predominantly emphasize the temporal dynamics of human activities, which limits their ability to capture the spatial interdependencies among multiple sensors. Moreover, different human activities often span various spatial-temporal scales, so activity recognizers fail to capture intricate spatial-temporal semantic information. To address these issues, we propose RobustHAR, a new HAR model with multi-scale spatial-temporal masked self-supervised pre-training designed to improve model performance when data are missing. RobustHAR involves three main steps: (1) RobustHAR constructs location-inspired spatial-temporal 3D-variation modeling to capture spatial-temporal correlated information in human activity data. (2) RobustHAR then designs multi-scale spatial-temporal masked self-supervised pre-training with semantic-consistent multi-scale feature co-learning for learning robust features at different scales. (3) Finally, RobustHAR fine-tunes the pre-trained model with adaptive multi-scale feature fusion for human activity recognition. Extensive experiments on three public multi-sensor datasets demonstrate that RobustHAR outperforms existing state-of-the-art methods.

EAAI Journal 2023 Journal Article

Imbalanced data classification: Using transfer learning and active sampling

  • Yang Liu
  • Guoping Yang
  • Shaojie Qiao
  • Meiqi Liu
  • Lulu Qu
  • Nan Han
  • Guan Yuan
  • Tao Wu

Recently, deep learning models have made great breakthroughs in the field of computer vision, relying on large-scale class-balanced datasets. However, most of them do not consider class-imbalanced data. In reality, a class-imbalanced distribution can degrade model performance and reduce the generalization of these models. In addition, in the era of big data, many applications need to use real-time visual data. These data come from different mobile devices, which continuously generate a huge amount of visual data. However, few studies use real-time data from information systems: real-time data is easy to capture but difficult to use. To solve the above problems, we propose a new model (Transfer Learning Classifier, TLC) based on transfer learning to deal with class-imbalanced data. The model includes an active sampling module, a real-time data augmentation module and a DenseNet module. Among them, (1) the newly proposed active sampling module can dynamically adjust the number of samples with skewed distribution; (2) the data augmentation module can expand the real-time data to avoid over-fitting and insufficient data; (3) the DenseNet module is a standard DenseNet network pre-trained on the ImageNet dataset and transferred to TLC for relearning, and we then adjust the memory usage of the standard DenseNet to make it more efficient. In addition, we have applied a new end-to-end real-time data storage and analysis system. A large number of experiments have been carried out on four different long-tailed datasets. Experimental results show that the proposed TLC model can effectively deal with static data as well as real-time data, and that its classification performance on imbalanced data is better than that of existing models.

EAAI Journal 2020 Journal Article

A point-of-interest suggestion algorithm in Multi-source geo-social networks

  • Xi Xiong
  • Shaojie Qiao
  • Yuanyuan Li
  • Nan Han
  • Guan Yuan
  • Yongqing Zhang

Newly emerging location-based social network (LBSN) services provide users with new platforms to share interests and individual experience based on their activity history. The problems of data sparsity and user distrust in LBSNs create a severe challenge for traditional recommender systems. Moreover, users’ behaviors in LBSNs show an obvious spatio-temporal pattern. Valuable extra information from microblog-based social networks (MBSNs) can be utilized to improve the effectiveness of POI suggestion. In this study, we propose a latent probabilistic generative model called MTAS, which can accurately capture the underlying information in users’ words extracted from both LBSNs and MBSNs by taking into consideration the decision probability, a latent variable indicating a user’s tendency to publish a review in LBSNs or MBSNs. Then, the parameters of the MTAS model can be inferred efficiently by the Gibbs sampling method. Based on MTAS, we design an effective framework to fulfill the top-k suggestion. Extensive experiments on two real geo-social networks show that MTAS achieves better performance than existing state-of-the-art methods.

AIIM Journal 2019 Journal Article

A novel Chinese herbal medicine clustering algorithm via artificial bee colony optimization

  • Nan Han
  • Shaojie Qiao
  • Guan Yuan
  • Ping Huang
  • Dingxiang Liu
  • Kun Yue

Traditional Chinese medicine (TCM) has become popular and been viewed as an effective clinical treatment across the world. Accordingly, there is an ever-increasing interest in performing data analysis over TCM data. To cope with the problem that traditional clustering algorithms depend excessively on empirical values when selecting cluster centers, an improved artificial bee colony algorithm is proposed to select cluster centers automatically, and it is applied to cluster Chinese herbal medicines. The proposed method integrates the following new techniques: (1) improving the artificial bee colony algorithm by applying a new searching strategy of neighbour nectar, (2) employing the improved artificial bee colony algorithm to optimize the parameters of the clustering algorithm by fast search and finding of density peaks (called the DP algorithm), namely the cutoff distance d_c, the local density ρ_i and the minimum distance δ_i between element i and any other element with higher density, to find the optimal cluster centers, in order to cluster herbal medicines accurately with a guarantee of runtime performance. Extensive experiments were conducted on the UCI benchmark datasets and the TCM datasets, and the results verify the effectiveness of the proposed method by comparing it with classical clustering algorithms including K-means, K-medoids and DBSCAN on multiple evaluation metrics, that is, Silhouette Coefficient, Entropy, Purity, Precision, Recall and F1-Measure. The results show that the IABC-DP algorithm outperforms other approaches with good clustering quality and accuracy on both the UCI and the TCM datasets. In addition, the improved artificial bee colony algorithm can effectively reduce the number of iterations when compared to the traditional bee colony algorithm. In particular, the IABC-DP algorithm is applied to cluster multi-dimensional Chinese herbal medicines and the result shows that it outperforms other clustering algorithms in clustering Chinese herbal medicines, which can contribute to a larger effort targeted at advancing the study of discovering composition rules of traditional Chinese prescriptions.
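The density-peaks quantities named in this abstract (cutoff distance, local density, and minimum distance to a denser element) can be computed directly for a small point set. This is a minimal sketch of the plain DP quantities only, with a fixed cutoff; the artificial bee colony search that the paper uses to tune the cutoff is not reproduced. The function name `density_peaks`, the toy points, and the cutoff value 1.5 are assumptions.

```python
import math

def density_peaks(points, d_c):
    """Return (rho, delta) for each point under cutoff distance d_c."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # Distance to the nearest point with strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Points with no denser neighbour take their largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Two well-separated hypothetical groups on a line.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
rho, delta = density_peaks(points, d_c=1.5)

# Candidate centers: points where both density and separation are large.
scores = [r * d for r, d in zip(rho, delta)]
centers = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:2]
print(rho, delta, centers)
```

The middle point of each group has the highest density and the largest separation, so ranking by the product of the two picks one center per group.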