Arrow Research search

Author name cluster

Zhengkui Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers (9)

AAAI Conference 2026 Conference Paper

Multi-graph Fusion Cross-model Contrastive Learning for Recommendation

  • Shengjun Ma
  • Yuhai Zhao
  • Fenglong Ma
  • Baoyin Liu
  • Zhengkui Wang
  • Wen Shan

Knowledge Graph (KG)-supported Graph Neural Network models are becoming crucial in recommendation systems due to their ability to mitigate the data sparsity challenge. However, these models remain suboptimal because they overlook the representation differences between the inherent user-item Bipartite Graph (BG) and the external head-relation-tail KG, leading to semantic misalignment. Moreover, they indiscriminately incorporate various types of relations from the KG, which may introduce noisy information into the model, ultimately degrading recommendation performance. To address these challenges, we propose an end-to-end model named Multi-graph Fusion Cross-model Contrastive Learning (MFCCL). To uncover users' interest in items and explore the associations between items, we first construct a user-interest graph by integrating information from both the BG and KG, and an item-association graph derived from the BG. We devise a multi-graph representation learning module that incorporates rich semantics into user and item representations in parallel. Simultaneously, a classical collaborative filtering module is introduced to fully leverage user-item collaborative signals. Additionally, we design a novel augmentation-free cross-model contrastive learning scheme to facilitate the exchange of complementary information between different models. Empirical evaluations on three widely used benchmarks demonstrate that MFCCL achieves significant improvements over the baselines.
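
As a rough illustration of the cross-model contrastive idea above, here is a minimal sketch in which the two models' embeddings of the same entity act as positive pairs, so no structural augmentation is needed. The function name, the temperature value, and the symmetrized InfoNCE form are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def cross_model_infonce(z_graph: torch.Tensor, z_cf: torch.Tensor,
                        temperature: float = 0.2) -> torch.Tensor:
    """z_graph, z_cf: [N, d] embeddings of the same N entities from two models."""
    z1 = F.normalize(z_graph, dim=-1)
    z2 = F.normalize(z_cf, dim=-1)
    logits = z1 @ z2.t() / temperature     # [N, N] cosine similarities
    targets = torch.arange(z1.size(0))     # diagonal entries are the positives
    # Symmetrize so each model is pulled toward the other.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Hypothetical usage: z_g from the multi-graph module, z_c from the CF module;
# the contrastive term would be added to the recommendation loss.
loss = cross_model_infonce(torch.randn(64, 32), torch.randn(64, 32))
```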

AAAI Conference 2026 Conference Paper

Self-Supervised Contrastive Re-Learning for Multi-Graph Multi-Label Classification

  • Meixia Wang
  • Yuhai Zhao
  • Zhengkui Wang
  • Yejiang Wang
  • Miaomiao Huang
  • Fenglong Ma
  • Fazal Wahab
  • Wen Shan

Multi-graph multi-label learning (MGML) represents each object as a bag-of-graphs with multiple labels, but demands large-scale labeled data whose acquisition is often difficult and costly. Self-supervised contrastive learning (SCL) mitigates label dependence by leveraging data augmentation to construct discriminative pretext tasks, proving effective for multi-instance learning. However, when applied to MGML, SCL faces two key challenges: (1) it distinguishes individual instances by their differences, whereas MGML requires modeling label correlations; (2) it assumes semantic invariance under augmentation, but structural perturbations in MGML alter label semantics. To tackle these challenges, we propose a self-suPervised contrastive rE-learning framework for mulTi-grAph multi-labeL classification (PETAL). Specifically, to model label correlations, we first define a unified label space to learn label prototypes and align features with them, yielding prototype-aligned representations. We then design a multi-granularity contrastive loss over these representations, which captures label dependencies by contrasting at the bag level, graph level, and bag-graph level. Moreover, to ensure semantic invariance, we develop a contrastive re-learning strategy based on prototype-aligned representations to generate augmentation-free positive samples. This guarantees consistent multi-label distributions without structural perturbations. Experiments on six datasets demonstrate that PETAL achieves an average improvement of 4.12% over state-of-the-art self-supervised and supervised baselines.
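
As a rough illustration of the prototype alignment described above, here is a minimal sketch: learnable label prototypes define a unified label space, and features are softly projected onto them to yield prototype-aligned representations. The dimensions, the softmax alignment, and all names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeAligner(nn.Module):
    def __init__(self, num_labels: int, dim: int):
        super().__init__()
        # One learnable prototype per label in the unified label space.
        self.prototypes = nn.Parameter(torch.randn(num_labels, dim))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        """feats: [N, dim] -> prototype-aligned representations [N, dim]."""
        protos = F.normalize(self.prototypes, dim=-1)
        attn = F.softmax(F.normalize(feats, dim=-1) @ protos.t(), dim=-1)  # [N, L]
        return attn @ protos  # each feature re-expressed in the label space

aligner = PrototypeAligner(num_labels=10, dim=64)
aligned = aligner(torch.randn(32, 64))  # augmentation-free positive view of feats
```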

AAAI Conference 2026 Short Paper

When Equal Isn’t Fair: Mitigating Over-Normalization in Large Language Models (Student Abstract)

  • Ravada Satyadev
  • Aditya Ganesh Kumar
  • Avinash Anand
  • Rajiv Ratn Shah
  • Zhengkui Wang
  • Mukesh Prasad

Bias in Large Language Models (LLMs) is increasingly addressed through fairness-oriented techniques. However, in some cases, these approaches may inadvertently remove genuine cultural differences between groups, leading to “over-normalization”, where models lose important socio-cultural distinctions. In this work, we introduce OverNormEval, a benchmark designed to detect when an LLM exhibits such over-normalization. We further explore the use of Direct Preference Optimization (DPO) to mitigate over-normalization.
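
For reference, here is a minimal sketch of the standard DPO objective the abstract proposes as a mitigation; in this setting, the "chosen" response would be a culturally faithful answer and the "rejected" one an over-normalized answer. Inputs are summed token log-probabilities; the beta value and variable names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected,
             beta: float = 0.1) -> torch.Tensor:
    """All inputs: [B] sequence log-probs under the policy / reference model."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # Maximize the log-odds that the policy prefers chosen over rejected,
    # relative to the frozen reference model.
    return -F.logsigmoid(beta * margin).mean()

loss = dpo_loss(torch.tensor([-10.0]), torch.tensor([-9.0]),
                torch.tensor([-10.5]), torch.tensor([-9.2]))
```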

ICML Conference 2025 Conference Paper

Equivalence is All: A Unified View for Self-supervised Graph Learning

  • Yejiang Wang
  • Yuhai Zhao
  • Zhengkui Wang
  • Ling Li
  • Jiapu Wang
  • Fangting Li
  • Miaomiao Huang
  • Shirui Pan

Node equivalence is common in graphs, such as computing networks, encompassing automorphic equivalence (preserving adjacency under node permutations) and attribute equivalence (nodes with identical attributes). Despite their importance for learning node representations, these equivalences are largely ignored by existing graph models. To bridge this gap, we propose a GrAph self-supervised Learning framework with Equivalence (GALE) and analyze its connections to existing techniques. Specifically, we: 1) unify automorphic and attribute equivalence into a single equivalence class; 2) enforce the equivalence principle to make representations within the same class more similar while separating those across classes; 3) introduce approximate equivalence classes with linear time complexity to address the NP-hardness of exact automorphism detection and handle node-feature variation; 4) analyze existing graph encoders, noting limitations in message passing neural networks and graph transformers regarding equivalence constraints; 5) show that graph contrastive learning is a degenerate form of equivalence constraint; and 6) demonstrate that GALE achieves superior performance over baselines.
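
As a rough illustration of point 3, here is a minimal sketch of approximate equivalence classes: instead of exact (NP-hard) automorphism detection, nodes are grouped by hashing their attributes together with iteratively refined neighbor colors, in the style of 1-WL color refinement, which costs time linear in the number of edges per round. The number of rounds and the hashing scheme are illustrative assumptions, not the paper's construction.

```python
def approx_equiv_classes(adj: dict[int, list[int]], attrs: dict[int, tuple],
                         rounds: int = 2) -> dict[int, int]:
    """adj: node -> neighbor list; attrs: node -> attribute tuple."""
    color = {v: hash(attrs[v]) for v in adj}          # attribute equivalence
    for _ in range(rounds):                           # structural refinement
        color = {v: hash((color[v], tuple(sorted(color[u] for u in adj[v]))))
                 for v in adj}
    # Nodes sharing a final color form one approximate equivalence class.
    ids: dict[int, int] = {}
    return {v: ids.setdefault(c, len(ids)) for v, c in color.items()}

classes = approx_equiv_classes({0: [1], 1: [0, 2], 2: [1]},
                               {0: ("a",), 1: ("b",), 2: ("a",)})
print(classes)  # nodes 0 and 2 land in the same class
```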

ICML Conference 2025 Conference Paper

N2GON: Neural Networks for Graph-of-Net with Position Awareness

  • Yejiang Wang
  • Yuhai Zhao
  • Zhengkui Wang
  • Wen Shan
  • Ling Li
  • Qian Li 0043
  • Miaomiao Huang
  • Meixia Wang

Graphs, fundamental in modeling various research subjects such as computing networks, consist of nodes linked by edges. However, they typically function as components within larger structures in real-world scenarios, such as in protein-protein interactions, where each protein is a graph in a larger network. This study delves into the Graph-of-Net (GON), a structure that extends the concept of traditional graphs by representing each node as a graph itself. It provides a multi-level perspective on the relationships between objects, encapsulating both the detailed structure of individual nodes and the broader network of dependencies. To learn node representations within the GON, we propose a position-aware neural network for Graph-of-Net, which processes both intra-graph and inter-graph connections and incorporates additional data such as node labels. Our model employs dual encoders and graph constructors to build and refine a constraint network, where nodes are adaptively arranged based on their positions, as determined by the network’s constraint system. In empirical evaluations on various datasets, our model demonstrates significant improvements over baselines.
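
As a rough illustration of the two-level structure above, here is a minimal sketch: an inner encoder summarizes each node's own graph into a vector, and an outer encoder then propagates those vectors over the macro graph. Both encoders here are plain mean aggregations standing in for the paper's dual encoders; the graph constructors and position-aware constraint network are not reproduced, and all shapes are assumptions.

```python
import torch

def inner_encode(node_graph_feats: torch.Tensor) -> torch.Tensor:
    """node_graph_feats: [n_i, d] features of one node's internal graph."""
    return node_graph_feats.mean(dim=0)                      # [d]

def outer_encode(node_vecs: torch.Tensor, macro_adj: torch.Tensor) -> torch.Tensor:
    """node_vecs: [N, d]; macro_adj: [N, N] adjacency of the graph-of-net."""
    deg = macro_adj.sum(dim=1, keepdim=True).clamp(min=1)
    return (macro_adj @ node_vecs) / deg                     # one propagation step

# Four macro nodes, each carrying its own 5-node graph with 8-dim features.
inner = torch.stack([inner_encode(torch.randn(5, 8)) for _ in range(4)])  # [4, 8]
out = outer_encode(inner, torch.eye(4))                                   # [4, 8]
```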

AAAI Conference 2024 Conference Paper

Limited-Supervised Multi-Label Learning with Dependency Noise

  • Yejiang Wang
  • Yuhai Zhao
  • Zhengkui Wang
  • Wen Shan
  • Xingwei Wang

Limited-supervised multi-label learning (LML) leverages weak or noisy supervision to train multi-label classification models over data with label noise, which may contain missing and/or redundant labels. Existing studies usually solve LML problems by assuming that label noise is independent of the input features and class labels, ignoring the fact that noisy labels may depend on the input features (instance-dependent) and the classes (label-dependent) in many real-world applications. In this paper, we propose limited-supervised Multi-label Learning with Dependency Noise (MLDN) to simultaneously identify instance-dependent and label-dependent label noise by factorizing the noise matrix as the output of a mapping from the feature and label representations. Meanwhile, we regularize the problem with a manifold constraint on the noise matrix to preserve local relationships and uncover the manifold structure. Theoretically, we bound the noise recovery error for the resulting problem. We solve the problem using a first-order scheme based on the proximal operator, whose convergence rate is at least sub-linear. Extensive experiments conducted on various datasets demonstrate the superiority of our proposed method.
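
As a rough illustration of the first-order proximal scheme mentioned above, here is a minimal sketch for one common special case: a gradient step on a quadratic data-fit term followed by the proximal operator of the l1 norm (soft-thresholding) applied to the noise estimate. The actual MLDN factorization and regularizers are not reproduced; the quadratic loss, step size, and weight are illustrative assumptions.

```python
import torch

def soft_threshold(x: torch.Tensor, tau: float) -> torch.Tensor:
    """Proximal operator of tau * ||x||_1 (element-wise soft-thresholding)."""
    return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

def proximal_step(noise: torch.Tensor, observed: torch.Tensor,
                  clean: torch.Tensor, step: float = 0.5,
                  lam: float = 0.1) -> torch.Tensor:
    grad = clean + noise - observed      # gradient of 0.5*||Y - (F + E)||^2 in E
    return soft_threshold(noise - step * grad, step * lam)

Y = torch.randn(6, 4)                    # observed noisy label matrix
E = torch.zeros_like(Y)                  # noise estimate
for _ in range(50):                      # first-order iterations
    E = proximal_step(E, Y, clean=torch.zeros_like(Y))
```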

IJCAI Conference 2024 Conference Paper

Towards Robust Multi-Label Learning against Dirty Label Noise

  • Yuhai Zhao
  • Yejiang Wang
  • Zhengkui Wang
  • Wen Shan
  • Miaomiao Huang
  • Meixia Wang
  • Min Huang
  • Xingwei Wang

In multi-label learning, a major challenge is that the data are associated with label noise, including random noisy labels (e.g., data encoding errors) and noisy labels created by annotators (e.g., missing, extra, or erroneous labels), where the noise follows different structures (e.g., Gaussian, sparse, or subjective). Existing methods are tailored to handle noise with one specific structure. However, they overlook the fact that real-world data often carry dirty noisy labels that are simultaneously Gaussian, sparse, and subjective. In this paper, we formalize multi-label learning with dirty noise as a new learning problem, namely Noisy Multi-label Learning (NML). To solve the NML problem, we decompose a corrupted label matrix into a noise matrix plus a true label matrix (possibly high-rank). For the noise matrix, a mixed norm penalty is developed as a regularizer for the dirty noise distribution. Under this norm, the conditions required for exact noise recovery are provided theoretically. For the true label matrix, which is not necessarily low-rank, we apply a non-linear mapping to ensure its low-rankness so that high-order label correlations can be utilized. Experimental results show that the proposed method significantly outperforms the state-of-the-art methods.
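
As a rough illustration of a mixed norm penalty in this spirit, here is a minimal sketch with one term per noise structure: a squared Frobenius norm for Gaussian noise, an l1 norm for sparse noise, and a row-wise l2,1 norm for subjective (annotator-structured) noise. The three-part decomposition of the noise matrix and the weights are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def mixed_norm_penalty(E_gauss: torch.Tensor, E_sparse: torch.Tensor,
                       E_subj: torch.Tensor, w=(1.0, 1.0, 1.0)) -> torch.Tensor:
    frob = E_gauss.pow(2).sum()          # Gaussian noise: squared Frobenius norm
    l1 = E_sparse.abs().sum()            # sparse noise: element-wise l1 norm
    l21 = E_subj.norm(dim=1).sum()       # subjective noise: row-wise l2,1 norm
    return w[0] * frob + w[1] * l1 + w[2] * l21

# The corrupted label matrix would be modeled as
# Y = true_labels + E_gauss + E_sparse + E_subj, with this penalty on the noise.
penalty = mixed_norm_penalty(*(torch.randn(8, 5) for _ in range(3)))
```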

IJCAI Conference 2023 Conference Paper

Explainable Text Classification via Attentive and Targeted Mixing Data Augmentation

  • Songhao Jiang
  • Yan Chu
  • Zhengkui Wang
  • Tianxing Ma
  • Hanlin Wang
  • Wenxuan Lu
  • Tianning Zang
  • Bo Wang

Mixing data augmentation methods have been widely used in text classification recently. However, existing methods do not control the quality of augmented data and offer low model explainability. To tackle these issues, this paper proposes an explainable text classification solution based on attentive and targeted mixing data augmentation, ATMIX. Instead of selecting data for augmentation without control, ATMIX focuses on the misclassified training samples as the targets for augmentation to better improve the model's capability. Meanwhile, to generate meaningful augmented samples, it adopts a self-attention mechanism to assess the importance of the subsentences in a text, and cuts and mixes subsentences between misclassified and correctly classified samples accordingly. Furthermore, it employs a novel dynamic augmented-data selection framework based on the loss function gradient to dynamically optimize the augmented samples for model training. Finally, we develop a new model explainability evaluation method based on subsentence attention and conduct extensive evaluations over multiple real-world text datasets. The results indicate that ATMIX is more effective, with higher explainability, than typical classification models and hidden-level and input-level mixup models.
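
As a rough illustration of attentive subsentence mixing, here is a minimal sketch of one plausible cut-and-mix policy: the least informative subsentence of a misclassified sample (lowest attention score) is replaced by the most informative subsentence of a correctly classified sample. The swap rule and the treatment of subsentences as plain strings are illustrative assumptions about the mechanism, not the paper's exact procedure.

```python
def atmix_mix(misclassified: list[str], mis_attn: list[float],
              correct: list[str], cor_attn: list[float]) -> list[str]:
    """Each list holds one text's subsentences and their attention scores."""
    cut = min(range(len(mis_attn)), key=mis_attn.__getitem__)    # weakest part
    paste = max(range(len(cor_attn)), key=cor_attn.__getitem__)  # strongest part
    mixed = list(misclassified)
    mixed[cut] = correct[paste]
    return mixed

mixed = atmix_mix(["the plot was", "oddly paced"], [0.7, 0.3],
                  ["the acting was", "superb throughout"], [0.4, 0.6])
print(" ".join(mixed))  # "the plot was superb throughout"
```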

AAAI Conference 2023 Conference Paper

Robust Self-Supervised Multi-Instance Learning with Structure Awareness

  • Yejiang Wang
  • Yuhai Zhao
  • Zhengkui Wang
  • Meixia Wang

Multi-instance learning (MIL) is a form of supervised learning in which each example is a labeled bag of many instances. Typical MIL strategies train an instance-level feature extractor and then aggregate instance features into a bag-level representation using label information. However, learning such a bag-level representation depends heavily on large amounts of labeled data, which are difficult to obtain in real-world scenarios. In this paper, we make the first attempt to propose a robust Self-supervised Multi-Instance LEarning architecture with Structure awareness (SMILEs) that learns unsupervised bag representations. Our proposed approach is: 1) permutation invariant to the order of instances in a bag; 2) structure-aware, encoding the topological structures among the instances; and 3) robust against instance noise or permutation. Specifically, to yield a robust MIL model without label information, we augment each multi-instance bag and train the representation encoder to maximize the correspondence between the representations of the same bag in its different augmented forms. Moreover, to capture topological structures from nearby instances in bags, our framework learns optimal graph structures for the bags, and these graphs are optimized together with message passing layers and the ordered weighted averaging operator under a contrastive loss. Our main theorem characterizes the permutation invariance of the bag representation. Compared with state-of-the-art supervised MIL baselines, SMILEs achieves average improvements of 4.9% and 4.4% in classification accuracy on 5 benchmark datasets and the 20 Newsgroups datasets, respectively. In addition, we show that the model is robust to input corruption.
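
As a rough illustration of the self-supervised recipe above, here is a minimal sketch: each bag is augmented twice (here by random instance dropout), encoded by a permutation-invariant pooling, and the two views of the same bag are pulled together by a contrastive loss. The dropout augmentation and mean pooling stand in for the paper's learned graph structures, message passing layers, and ordered weighted averaging; the temperature and shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def augment_bag(bag: torch.Tensor, keep: float = 0.8) -> torch.Tensor:
    """Randomly drop instances from a [n, d] bag; never drop the whole bag."""
    mask = torch.rand(bag.size(0)) < keep
    return bag[mask] if mask.any() else bag

def encode(bag: torch.Tensor) -> torch.Tensor:
    """Mean pooling: permutation invariant to the order of instances."""
    return F.normalize(bag.mean(dim=0), dim=-1)

bags = [torch.randn(12, 16) for _ in range(8)]     # 8 bags of 12 instances
z1 = torch.stack([encode(augment_bag(b)) for b in bags])
z2 = torch.stack([encode(augment_bag(b)) for b in bags])
logits = z1 @ z2.t() / 0.5                         # temperature 0.5 (assumed)
loss = F.cross_entropy(logits, torch.arange(len(bags)))  # same-bag views match
```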