Author name cluster

Yaming Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers

1 author row

AAAI Conference 2026 Conference Paper

ProGMLP: A Progressive Framework for GNN-to-MLP Knowledge Distillation with Efficient Trade-offs

Weigang Lu
Ziyu Guan
Wei Zhao
Yaming Yang
Yujie Sun
Zheng Liang
Yibing Zhan
Dapeng Tao

GNN-to-MLP (G2M) methods have emerged as a promising approach to accelerate Graph Neural Networks (GNNs) by distilling their knowledge into simpler Multi-Layer Perceptrons (MLPs). These methods bridge the gap between the expressive power of GNNs and the computational efficiency of MLPs, making them well-suited for resource-constrained environments. However, existing G2M methods are limited by their inability to flexibly adjust inference cost and accuracy dynamically, a critical requirement for real-world applications where computational resources and time constraints can vary significantly. To address this, we introduce a Progressive framework designed to offer flexible and on-demand trade-offs between inference cost and accuracy for GNN-to-MLP knowledge distillation (ProGMLP). ProGMLP employs a Progressive Training Structure (PTS), where multiple MLP students are trained in sequence, each building on the previous one. Furthermore, ProGMLP incorporates Progressive Knowledge Distillation (PKD) to iteratively refine the distillation process from GNNs to MLPs, and Progressive Mixup Augmentation (PMA) to enhance generalization by progressively generating harder mixed samples. Our approach is validated through comprehensive experiments on eight real-world graph datasets, demonstrating that ProGMLP maintains high accuracy while dynamically adapting to varying runtime scenarios, making it highly effective for deployment in diverse application settings.

AAAI Conference 2025 Conference Paper

AGMixup: Adaptive Graph Mixup for Semi-supervised Node Classification

Weigang Lu
Ziyu Guan
Wei Zhao
Yaming Yang
Yibing Zhan
Yiheng Lu
Dapeng Tao

Mixup is a data augmentation technique that enhances model generalization by interpolating between data points using a mixing ratio lambda in the image domain. Recently, the concept of mixup has been adapted to the graph domain through node-centric interpolations. However, these approaches often fail to address the complexity of interconnected relationships, potentially damaging the graph's natural topology and undermining node interactions. Furthermore, current graph mixup methods employ a one-size-fits-all strategy with a randomly sampled lambda for all mixup pairs, ignoring the diverse needs of different pairs. This paper proposes an Adaptive Graph Mixup (AGMixup) framework for semi-supervised node classification. AGMixup introduces a subgraph-centric approach, which treats each subgraph similarly to how images are handled in Euclidean domains, thus facilitating a more natural integration of mixup into graph-based learning. We also propose an adaptive mechanism to tune the mixing ratio lambda for diverse mixup pairs, guided by the contextual similarity and uncertainty of the involved subgraphs. Extensive experiments across seven datasets on semi-supervised node classification benchmarks demonstrate AGMixup's superiority over state-of-the-art graph mixup methods.
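
The interpolation this abstract refers to can be sketched as follows. This is a minimal illustration of generic mixup on feature vectors and one-hot labels, not AGMixup's subgraph-centric or adaptive mechanism. The function names `mixup` and `sample_lambda` are hypothetical, and drawing lambda from a Beta(alpha, alpha) distribution is an assumption based on common mixup practice rather than something the abstract states.

```python
import random


def mixup(x_i, x_j, y_i, y_j, lam):
    """Interpolate two feature vectors and their one-hot labels with ratio lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Same convex combination is applied to features and to labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


def sample_lambda(alpha=1.0, rng=random):
    """Draw one mixing ratio from Beta(alpha, alpha) (assumed sampling scheme)."""
    return rng.betavariate(alpha, alpha)


if __name__ == "__main__":
    lam = sample_lambda(alpha=1.0)
    x, y = mixup([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam)
    print(lam, x, y)
```

A single randomly sampled lambda shared by every pair corresponds to the one-size-fits-all strategy the abstract criticizes; an adaptive scheme would compute a separate `lam` per pair before calling `mixup`.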

NeurIPS Conference 2025 Conference Paper

Defining and Discovering Hyper-meta-paths for Heterogeneous Hypergraphs

Yaming Yang
Ziyu Zheng
Weigang Lu
Zhe Wang
Xinyan Huang
Wei Zhao
Ziyu Guan

A heterogeneous hypergraph is a kind of structural data that contains multiple types of nodes and multiple types of hyperedges. Each hyperedge type corresponds to a specific multi-ary relation (called a hyper-relation) among subsets of nodes, which goes beyond traditional pair-wise relations in simple graphs. Existing representation learning methods for heterogeneous hypergraphs typically learn embeddings for nodes and hyperedges based on graph neural networks. Although they achieve promising performance, they are still limited in capturing more complex structural features and richer semantics conveyed by the composition of various hyper-relations. To fill this research gap, we propose the concept of a hyper-meta-path for heterogeneous hypergraphs, defined as the composition of a sequence of hyper-relations. We also design an attention-based heterogeneous hypergraph neural network (HHNN) to automatically learn the importance of hyper-meta-paths. By exploiting useful ones, HHNN is able to capture more complex structural features to boost the model's performance, as well as leverage their conveyed semantics to improve the model's interpretability. Extensive experiments show that HHNN can achieve significantly better performance than state-of-the-art baselines, and the discovered hyper-meta-paths bring good interpretability for the model predictions. To facilitate the reproducibility of this work, we provide our dataset as well as anonymized source code at: https://github.com/zhengziyu77/HHNN.

AAAI Conference 2025 Conference Paper

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Yaming Yang
Dilxat Muhtar
Yelong Shen
Yuefeng Zhan
Jianfeng Liu
Yujing Wang
Hao Sun
Weiwei Deng

Parameter-efficient fine-tuning (PEFT) has been widely employed for domain adaptation, with LoRA being one of the most prominent methods due to its simplicity and effectiveness. However, in multi-task learning (MTL) scenarios, LoRA tends to obscure the distinction between tasks by projecting sparse high-dimensional features from different tasks into the same dense low-dimensional intrinsic space. This leads to task interference and suboptimal performance for LoRA and its variants. To tackle this challenge, we propose MTL-LoRA, which retains the advantages of low-rank adaptation while significantly enhancing MTL capabilities. MTL-LoRA augments LoRA by incorporating additional task-adaptive parameters that differentiate task-specific information and capture shared knowledge across various tasks within low-dimensional spaces. This approach enables pretrained models to jointly adapt to different target domains with a limited number of trainable parameters. Comprehensive experimental results, including evaluations on public academic benchmarks for natural language understanding, commonsense reasoning, and image-text understanding, as well as real-world industrial text Ads relevance datasets, demonstrate that MTL-LoRA outperforms LoRA and its variants with comparable or even fewer learnable parameters in the MTL setting.
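
For readers unfamiliar with the baseline this abstract builds on, the sketch below shows the standard LoRA reparameterization, in which a frozen weight matrix W is adapted as W + (alpha / r) * B A with low-rank factors B and A. It illustrates plain LoRA only; the task-adaptive parameters of MTL-LoRA are not described in enough detail here to reproduce. The helper names `matmul` and `lora_effective_weight` are hypothetical, and matrices are plain lists of rows to keep the example dependency-free.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_effective_weight(w, b_mat, a_mat, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the adapted weight in standard LoRA.

    w is d x k (frozen), b_mat is d x rank, a_mat is rank x k (both trainable).
    """
    scale = alpha / rank
    delta = matmul(b_mat, a_mat)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2 x 2 weight
    b_mat = [[1.0], [2.0]]         # 2 x 1 factor
    a_mat = [[3.0, 4.0]]           # 1 x 2 factor
    print(lora_effective_weight(w, b_mat, a_mat, alpha=2.0, rank=1))
```

With rank r, the trainable parameter count is r * (d + k) instead of d * k. Every task shares the same r-dimensional space in this formulation, which is the setting the abstract identifies as the source of task interference.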

AAAI Conference 2024 Conference Paper

Entropy Induced Pruning Framework for Convolutional Neural Networks

Yiheng Lu
Ziyu Guan
Yaming Yang
Wei Zhao
Maoguo Gong
Cai Xu

Structured pruning techniques have achieved great compression performance on convolutional neural networks for image classification tasks. However, the majority of existing methods are sensitive to the model parameters, and their pruning results may be unsatisfactory when the original model is trained poorly. That is, they need the original model to be fully trained to obtain useful weight information. This is time-consuming, and makes the effectiveness of the pruning results dependent on the degree of model optimization. To address this issue, we propose a novel metric named Average Filter Information Entropy (AFIE). It decomposes the weight matrix of each layer into a low-rank space, and quantifies the filter importance based on the distribution of the normalized eigenvalues. Intuitively, the eigenvalues capture the covariance among filters, and therefore could be a good guide for pruning. Since the distribution of eigenvalues is robust to the updating of parameters, AFIE can yield a stable evaluation of the importance of each filter regardless of whether the original model is fully trained. We implement our AFIE-based pruning method for three popular CNN models, AlexNet, VGG-16, and ResNet-50, and test them on three widely used image datasets, MNIST, CIFAR-10, and ImageNet, respectively. The experimental results are encouraging. We surprisingly observe that for our method, even when the original model is trained for only one epoch, the AFIE score of each filter remains identical to the result obtained when the model is fully trained. This indicates the effectiveness of the proposed pruning method.
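
The idea of scoring a layer by the distribution of its normalized eigenvalues can be illustrated with a Shannon entropy over a normalized spectrum. The sketch below is an assumption-laden illustration, not the paper's AFIE definition: it takes a precomputed list of eigenvalues (how they are obtained from the weight matrix is left out), normalizes their magnitudes to sum to one, and returns the entropy in nats. The function name `normalized_spectrum_entropy` is hypothetical.

```python
import math


def normalized_spectrum_entropy(eigenvalues):
    """Shannon entropy (in nats) of eigenvalue magnitudes normalized to sum to 1."""
    magnitudes = [abs(v) for v in eigenvalues]
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("at least one eigenvalue must be non-zero")
    probs = [m / total for m in magnitudes]
    # Zero-probability terms contribute nothing to the entropy, so skip them.
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    # A flat spectrum gives the maximum entropy, log(n).
    print(normalized_spectrum_entropy([1.0, 1.0, 1.0, 1.0]))
    # A spectrum dominated by one eigenvalue gives entropy near zero.
    print(normalized_spectrum_entropy([5.0, 0.0, 0.0]))
```

Because the input is normalized, rescaling all eigenvalues by a constant leaves the score unchanged, which is one simple way a spectrum-based score can be less sensitive to the raw magnitude of the weights.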

AAAI Conference 2024 Conference Paper

NodeMixup: Tackling Under-Reaching for Graph Neural Networks

Weigang Lu
Ziyu Guan
Wei Zhao
Yaming Yang
Long Jin

Graph Neural Networks (GNNs) have become mainstream methods for solving the semi-supervised node classification problem. However, due to the uneven location distribution of labeled nodes in the graph, labeled nodes are only accessible to a small portion of unlabeled nodes, leading to the under-reaching issue. In this study, we first reveal under-reaching by conducting an empirical investigation on various well-known graphs. Then, through systematic experimental analysis, we demonstrate that under-reaching results in unsatisfactory distribution alignment between labeled and unlabeled nodes, significantly degrading GNNs' performance. To tackle under-reaching for GNNs, we propose an architecture-agnostic method dubbed NodeMixup. The fundamental idea is to (1) increase the reachability of labeled nodes through mixup of labeled-unlabeled pairs, (2) leverage graph structures by fusing the neighbor connections of intra-class node pairs to improve the performance gains of mixup, and (3) use neighbor label distribution similarity incorporating node degrees to determine sampling weights for node mixup. Extensive experiments demonstrate the efficacy of NodeMixup in assisting GNNs in handling under-reaching. The source code is available at https://github.com/WeigangLu/NodeMixup.

AAAI Conference 2023 Conference Paper

Progressive Deep Multi-View Comprehensive Representation Learning

Cai Xu
Wei Zhao
Jinglong Zhao
Ziyu Guan
Yaming Yang
Long Chen
Xiangyu Song

Multi-view Comprehensive Representation Learning (MCRL) aims to synthesize information from multiple views to learn comprehensive representations of data items. Prevalent deep MCRL methods typically concatenate synergistic view-specific representations or average aligned view-specific representations in the fusion stage. However, synergistic fusion methods inevitably degrade or even fail when some views are missing in real-world applications, and alignment-based fusion methods usually cannot fully exploit the complementarity of multi-view data. To eliminate these drawbacks, in this work we present a Progressive Deep Multi-view Fusion (PDMF) method. Considering that the multi-view comprehensive representation should contain complete information while the view-specific data contain partial information, we deem it unstable to directly learn the mapping from partial information to complete information. Hence, PDMF employs a progressive learning strategy, which contains pre-training and fine-tuning stages. In the pre-training stage, PDMF decodes the auxiliary comprehensive representation to the view-specific data. It also captures the consistency and complementarity by learning the relations between the dimensions of the auxiliary comprehensive representation and all views. In the fine-tuning stage, PDMF learns the mapping from the original data to the comprehensive representation with the help of the auxiliary comprehensive representation and relations. Experiments conducted on a synthetic toy dataset and 4 real-world datasets show that PDMF outperforms state-of-the-art baseline methods. The code is released at https://github.com/winterant/PDMF.

AAAI Conference 2022 Conference Paper

Graph Pointer Neural Networks

Tianmeng Yang
Yujing Wang
Zhihan Yue
Yaming Yang
Yunhai Tong
Jing Bai

Graph Neural Networks (GNNs) have shown advantages in various graph-based applications. Most existing GNNs assume strong homophily of graph structure and apply permutation-invariant local aggregation of neighbors to learn a representation for each node. However, they fail to generalize to heterophilic graphs, where most neighboring nodes have different labels or features, and the relevant nodes are distant. A few recent studies attempt to address this problem by combining multiple hops of hidden representations of central nodes (i.e., multi-hop-based approaches) or sorting the neighboring nodes based on attention scores (i.e., ranking-based approaches). However, these approaches have some apparent limitations. On the one hand, multi-hop-based approaches do not explicitly distinguish relevant nodes from a large number of multi-hop neighborhoods, leading to a severe over-smoothing problem. On the other hand, ranking-based models do not jointly optimize node ranking with end tasks and result in sub-optimal solutions. In this work, we present Graph Pointer Neural Networks (GPNN) to tackle the challenges mentioned above. We leverage a pointer network to select the most relevant nodes from a large number of multi-hop neighborhoods, which constructs an ordered sequence according to the relationship with the central node. 1D convolution is then applied to extract high-level features from the node sequence. The pointer-network-based ranker in GPNN is jointly optimized with other parts in an end-to-end manner. Extensive experiments are conducted on six public node classification datasets with heterophilic graphs. The results show that GPNN significantly improves the classification performance of state-of-the-art methods. In addition, analyses also reveal the advantage of the proposed GPNN in filtering out irrelevant neighbors and reducing over-smoothing.

NeurIPS Conference 2022 Conference Paper

Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering

Yaming Yang
Ziyu Guan
Zhe Wang
Wei Zhao
Cai Xu
Weigang Lu
Jianbin Huang

Recent self-supervised pre-training methods on Heterogeneous Information Networks (HINs) have shown promising competitiveness over traditional semi-supervised Heterogeneous Graph Neural Networks (HGNNs). Unfortunately, their performance heavily depends on careful customization of various strategies for generating high-quality positive examples and negative examples, which notably limits their flexibility and generalization ability. In this work, we present SHGP, a novel Self-supervised Heterogeneous Graph Pre-training approach, which does not need to generate any positive examples or negative examples. It consists of two modules that share the same attention-aggregation scheme. In each iteration, the Att-LPA module produces pseudo-labels through structural clustering, which serve as the self-supervision signals to guide the Att-HGNN module to learn object embeddings and attention coefficients. The two modules can effectively utilize and enhance each other, promoting the model to learn discriminative embeddings. Extensive experiments on four real-world datasets demonstrate the superior effectiveness of SHGP against state-of-the-art unsupervised baselines and even semi-supervised baselines. We release our source code at: https://github.com/kepsail/SHGP.

NeurIPS Conference 2021 Conference Paper

WRENCH: A Comprehensive Benchmark for Weak Supervision

Jieyu Zhang
Yue Yu
Yujing Wang
Yaming Yang
Mao Yang
Alexander Ratner

Recent Weak Supervision (WS) approaches have had widespread success in easing the bottleneck of labeling training data for machine learning by synthesizing labels from multiple potentially noisy supervision sources. However, proper measurement and analysis of these approaches remain a challenge. First, datasets used in existing works are often private and/or custom, limiting standardization. Second, WS datasets with the same name and base data often vary in terms of the labels and weak supervision sources used, a significant "hidden" source of evaluation variance. Finally, WS studies often diverge in terms of the evaluation protocol and ablations used. To address these problems, we introduce a benchmark platform, WRENCH, for thorough and standardized evaluation of WS approaches. It consists of 22 varied real-world datasets for classification and sequence tagging; a range of real, synthetic, and procedurally-generated weak supervision sources; and a modular, extensible framework for WS evaluation, including implementations for popular WS methods. We use WRENCH to conduct extensive comparisons over more than 120 method variants to demonstrate its efficacy as a benchmark platform. The code is available at https://github.com/JieyuZ2/wrench.

AAAI Conference 2020 Conference Paper

TextNAS: A Neural Architecture Search Space Tailored for Text Representation

Yujing Wang
Yaming Yang
Yiren Chen
Jing Bai
Ce Zhang
Guinan Su
Xiaoyu Kou
Yunhai Tong

Learning text representation is crucial for text classification and other language related tasks. There are a diverse set of text representation networks in the literature, and how to find the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most of the existing works of NAS focus on the search algorithms and pay little attention to the search space. In this paper, we argue that the search space is also an important human prior to the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets on text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatic network agree well with human intuition.
