Arrow Research search

Author name cluster

Junbin Gao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers
2 author rows

Possible papers


NeurIPS Conference 2025 Conference Paper

ACT as Human: Multimodal Large Language Model Data Annotation with Critical Thinking

  • Lequan Lin
  • Dai Shi
  • Andi Han
  • Feng Chen
  • Qiuzheng Chen
  • Jiawen Li
  • Zhaoyang Li
  • Jiyuan Zhang

Supervised learning relies on high-quality labeled data, but obtaining such data through human annotation is both expensive and time-consuming. Recent work explores using large language models (LLMs) for annotation, but LLM-generated labels still fall short of human-level quality. To address this problem, we propose the Annotation with Critical Thinking (ACT) data pipeline, where LLMs serve not only as annotators but also as judges to critically identify potential errors. Human effort is then directed towards reviewing only the most "suspicious" cases, significantly improving the human annotation efficiency. Our major contributions are as follows: (1) ACT is applicable to a wide range of domains, including natural language processing (NLP), computer vision (CV), and multimodal understanding, by leveraging multimodal-LLMs (MLLMs). (2) Through empirical studies, we derive 7 insights on how to enhance annotation quality while efficiently reducing the human cost, and then translate these findings into user-friendly guidelines. (3) We theoretically analyze how to modify the loss function so that models trained on ACT data achieve similar performance to those trained on fully human-annotated data. Our experiments show that the performance gap can be reduced to less than 2% on most benchmark datasets while saving up to 90% of human costs.

ICLR Conference 2025 Conference Paper

Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning

  • Lequan Lin
  • Dai Shi
  • Andi Han
  • Zhiyong Wang 0001
  • Junbin Gao

Graph Neural Networks (GNNs) are proficient in graph representation learning and achieve promising performance on versatile tasks such as node classification and link prediction. Usually, comprehensive hyperparameter tuning is essential for fully unlocking a GNN's top performance, especially for complicated tasks such as node classification on large graphs and long-range graphs. This is usually associated with high computational and time costs and careful design of appropriate search spaces. This work introduces a graph-conditioned latent diffusion framework (GNN-Diff) to generate high-performing GNNs based on the model checkpoints of sub-optimal hyperparameters selected by a light-tuning coarse search. We validate our method through 166 experiments across four graph tasks: node classification on small, large, and long-range graphs, as well as link prediction. Our experiments involve 10 classic and state-of-the-art target models and 20 publicly available datasets. The results consistently demonstrate that GNN-Diff: (1) boosts the performance of GNNs with efficient hyperparameter tuning; and (2) presents high stability and generalizability on unseen data across multiple generation runs. The code is available at https://github.com/lequanlin/GNN-Diff.

AAAI Conference 2025 Conference Paper

HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation

  • Tengfei Liu
  • Jiapu Wang
  • Yongli Hu
  • Mingjie Li
  • Junfei Yi
  • Xiaojun Chang
  • Junbin Gao
  • Baocai Yin

Radiology report generation (RRG) models typically focus on individual exams, often overlooking the integration of historical visual or textual data, which is crucial for patient follow-ups. Traditional methods usually struggle with long sequence dependencies when incorporating historical information, but large language models (LLMs) excel at in-context learning, making them well-suited for analyzing longitudinal medical data. In light of this, we propose a novel Historical-Constrained Large Language Models (HC-LLM) framework for RRG, empowering LLMs with longitudinal report generation capabilities by constraining the consistency and differences between longitudinal images and their corresponding reports. Specifically, our approach extracts both time-shared and time-specific features from longitudinal chest X-rays and diagnostic reports to capture disease progression. Then, we ensure consistent representation by applying intra-modality similarity constraints and aligning various features across modalities with multimodal contrastive and structural constraints. These combined constraints effectively guide the LLMs in generating diagnostic reports that accurately reflect the progression of the disease, achieving state-of-the-art results on the Longitudinal-MIMIC dataset. Notably, our approach performs well even without historical data during testing and can be easily adapted to other multimodal large models, enhancing its versatility.

NeurIPS Conference 2025 Conference Paper

Hybrid-Collaborative Augmentation and Contrastive Sample Adaptive-Differential Awareness for Robust Attributed Graph Clustering

  • Tianxiang Zhao
  • Youqing Wang
  • Jinlu Wang
  • Jiapu Wang
  • Mingliang Cui
  • Junbin Gao
  • Jipeng Guo

Due to its powerful capability of self-supervised representation learning and clustering, contrastive attributed graph clustering (CAGC) has achieved great success, which mainly depends on effective data augmentation and contrastive objective setting. However, most CAGC methods utilize edges as auxiliary information to obtain node-level embedding representations and only focus on node-level embedding augmentation. This approach overlooks edge-level embedding augmentation and the interactions between node-level and edge-level embedding augmentations across various granularities. Moreover, they often treat all contrastive sample pairs equally, neglecting the significant differences between hard and easy positive-negative sample pairs, which ultimately limits their discriminative capability. To tackle these issues, a novel robust attributed graph clustering (RAGC) method, incorporating hybrid-collaborative augmentation (HCA) and contrastive sample adaptive-differential awareness (CSADA), is proposed. First, node-level and edge-level embedding representations and augmentations are simultaneously executed to establish a more comprehensive similarity measurement criterion for subsequent contrastive learning. In turn, the discriminative similarity further consciously guides edge augmentation. Second, by leveraging pseudo-label information with high confidence, a CSADA strategy is elaborately designed, which adaptively identifies all contrastive sample pairs and differentially treats them by an innovative weight modulation function. The HCA and CSADA modules mutually reinforce each other in a virtuous cycle, thereby enhancing discriminability in representation learning. Comprehensive graph clustering evaluations over six benchmark datasets demonstrate the effectiveness of the proposed RAGC against several state-of-the-art CAGC methods. The code of RAGC is available at https://github.com/TianxiangZhao0474/RAGC.git.

ICLR Conference 2025 Conference Paper

When Graph Neural Networks Meet Dynamic Mode Decomposition

  • Dai Shi
  • Lequan Lin
  • Andi Han
  • Zhiyong Wang 0001
  • Yi Guo 0001
  • Junbin Gao

Graph Neural Networks (GNNs) have emerged as fundamental tools for a wide range of prediction tasks on graph-structured data. Recent studies have drawn analogies between GNN feature propagation and diffusion processes, which can be interpreted as dynamical systems. In this paper, we delve deeper into this perspective by connecting the dynamics in GNNs to modern Koopman theory and its numerical method, Dynamic Mode Decomposition (DMD). We illustrate how DMD can estimate a low-rank, finite-dimensional linear operator based on multiple states of the system, effectively approximating potential nonlinear interactions between nodes in the graph. This approach allows us to capture complex dynamics within the graph accurately and efficiently. We theoretically establish a connection between the DMD-estimated operator and the original dynamic operator between system states. Building upon this foundation, we introduce a family of DMD-GNN models that effectively leverage the low-rank eigenfunctions provided by the DMD algorithm. We further discuss the potential of enhancing our approach by incorporating domain-specific constraints such as symmetry into the DMD computation, allowing the corresponding GNN models to respect known physical properties of the underlying system. Our work paves the way for applying advanced dynamical system analysis tools via GNNs. We validate our approach through extensive experiments on various learning tasks, including directed graphs, large-scale graphs, long-range interactions, and spatial-temporal graphs. We also empirically verify that our proposed models can serve as powerful encoders for link prediction tasks. The results demonstrate that our DMD-enhanced GNNs achieve state-of-the-art performance, highlighting the effectiveness of integrating DMD into GNN frameworks.
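For readers unfamiliar with DMD itself, the operator estimation it performs is compact enough to sketch. The snippet below is the standard exact-DMD algorithm in plain NumPy (function and variable names our own), not the paper's DMD-GNN models: given paired snapshot matrices, it projects onto the leading singular vectors and estimates a low-rank linear operator between consecutive states.

```python
import numpy as np

def exact_dmd(X, Xp, rank):
    """Estimate a rank-`rank` linear operator A with Xp ≈ A X from paired
    snapshot matrices X (states at time t) and Xp (states at time t+1)."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    # Project the full operator onto the leading POD modes: A_tilde = U* Xp V S^{-1}.
    A_tilde = (U.conj().T @ Xp @ Vh.conj().T) / s
    eigvals, W = np.linalg.eig(A_tilde)
    # DMD modes lifted back to the original state space.
    modes = Xp @ Vh.conj().T @ np.diag(1.0 / s) @ W
    return A_tilde, eigvals, modes
```

On snapshots generated exactly by a linear map, the eigenvalues of `A_tilde` recover the map's spectrum, which is the sense in which DMD "approximates the dynamics" between system states.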

ICML Conference 2024 Conference Paper

Diffusion Models Demand Contrastive Guidance for Adversarial Purification to Advance

  • Mingyuan Bai
  • Wei Huang
  • Tenghui Li
  • Andong Wang
  • Junbin Gao
  • Cesar F. Caiafa
  • Qibin Zhao

In adversarial defense, adversarial purification can be viewed as a special generation task whose purpose is to remove adversarial perturbations, and diffusion models excel at adversarial purification thanks to their strong generative power. For different predetermined generation requirements, various types of guidance have been proposed, but few of them focus on adversarial purification. In this work, we propose to guide diffusion models for adversarial purification using contrastive guidance. We theoretically derive, from a feature learning perspective, the proper noise level to add in the forward process of diffusion models for adversarial purification. For the reverse process, it is implied that the role of contrastive loss guidance is to facilitate the evolution towards the signal direction. From the theoretical findings and implications, we design the forward process with the proper amount of Gaussian noise added and the reverse process with the gradient of contrastive loss as the guidance of diffusion models for adversarial purification. Empirically, extensive experiments on CIFAR-10, CIFAR-100, the German Traffic Sign Recognition Benchmark and ImageNet datasets with ResNet and WideResNet classifiers show that our method outperforms most current adversarial training and adversarial purification methods by a large margin.

TMLR Journal 2024 Journal Article

From Continuous Dynamics to Graph Neural Networks: Neural Diffusion and Beyond

  • Andi Han
  • Dai Shi
  • Lequan Lin
  • Junbin Gao

Graph neural networks (GNNs) have demonstrated significant promise in modelling relational data and have been widely applied in various fields of interest. The key mechanism behind GNNs is the so-called message passing, where information is iteratively aggregated to central nodes from their neighbourhood. Such a scheme has been found to be intrinsically linked to a physical process known as heat diffusion, where the propagation of GNNs naturally corresponds to the evolution of heat density. Analogizing the process of message passing to heat dynamics allows one to fundamentally understand the power and pitfalls of GNNs and consequently informs better model design. Recently, a plethora of works has emerged that propose GNNs inspired by the continuous dynamics formulation, in an attempt to mitigate the known limitations of GNNs, such as oversmoothing and oversquashing. In this survey, we provide the first systematic and comprehensive review of studies that leverage the continuous perspective of GNNs. To this end, we introduce foundational ingredients for adapting continuous dynamics to GNNs, along with a general framework for the design of graph neural dynamics. We then review and categorize existing works based on their driven mechanisms and underlying dynamics. We also summarize how the limitations of classic GNNs can be addressed under the continuous framework. We conclude by identifying multiple open research directions.
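The heat-diffusion analogy in this survey can be made concrete in a few lines: explicit Euler integration of the graph heat equation dx/dt = -Lx is the continuous process that linear message-passing layers discretize. A minimal NumPy sketch (function name and step size our own):

```python
import numpy as np

def heat_diffusion_steps(A, x, tau=0.1, steps=50):
    """Explicit-Euler integration of dx/dt = -L x on a graph with
    adjacency matrix A; each Euler step x <- x - tau * L x plays the
    role of one linear propagation (message-passing) layer."""
    d = A.sum(axis=1)
    L = np.diag(d) - A           # combinatorial graph Laplacian
    for _ in range(steps):
        x = x - tau * L @ x
    return x
```

Because the Laplacian's rows sum to zero, total "heat" is conserved while the signal smooths toward a constant vector, which is exactly the oversmoothing behaviour the survey discusses.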

AAAI Conference 2024 Conference Paper

Graph Neural Networks with Soft Association between Topology and Attribute

  • Yachao Yang
  • Yanfeng Sun
  • Shaofan Wang
  • Jipeng Guo
  • Junbin Gao
  • Fujiao Ju
  • Baocai Yin

Graph Neural Networks (GNNs) have shown great performance in learning representations for graph-structured data. However, recent studies have found that the interference between topology and attribute can lead to distorted node representations. Most GNNs are designed based on homophily assumptions, thus they cannot be applied to graphs with heterophily. This research critically analyzes the propagation principles of various GNNs and the corresponding challenges from an optimization perspective. A novel GNN called Graph Neural Networks with Soft Association between Topology and Attribute (GNN-SATA) is proposed. Different embeddings are utilized to gain insights into attributes and structures while establishing their interconnections through soft association. Further as integral components of the soft association, a Graph Pruning Module (GPM) and Graph Augmentation Module (GAM) are developed. These modules dynamically remove or add edges to the adjacency relationships to make the model better fit with graphs with homophily or heterophily. Experimental results on homophilic and heterophilic graph datasets convincingly demonstrate that the proposed GNN-SATA effectively captures more accurate adjacency relationships and outperforms state-of-the-art approaches. Especially on the heterophilic graph dataset Squirrel, GNN-SATA achieves a 2.81% improvement in accuracy, utilizing merely 27.19% of the original number of adjacency relationships. Our code is released at https://github.com/wwwfadecom/GNN-SATA.

TMLR Journal 2024 Journal Article

Revisiting Generalized p-Laplacian Regularized Framelet GCNs: Convergence, Energy Dynamic and as Non-Linear Diffusion

  • Dai Shi
  • Zhiqi Shao
  • Yi Guo
  • Qibin Zhao
  • Junbin Gao

This paper presents a comprehensive theoretical analysis of the graph p-Laplacian regularized framelet network (pL-UFG) to establish a solid understanding of its properties. We conduct a convergence analysis on pL-UFG, addressing the gap in the understanding of its asymptotic behaviors. Further, by investigating the generalized Dirichlet energy of pL-UFG, we demonstrate that the Dirichlet energy remains non-zero throughout convergence, ensuring the avoidance of over-smoothing issues. Additionally, we elucidate the energy dynamic perspective, highlighting the synergistic relationship between the implicit layer in pL-UFG and graph framelets. This synergy enhances the model's adaptability to both homophilic and heterophilic data. Notably, we reveal that pL-UFG can be interpreted as a generalized non-linear diffusion process, thereby bridging the gap between pL-UFG and differential equations on the graph. Importantly, these multifaceted analyses lead to unified conclusions that offer novel insights for understanding and implementing pL-UFG, as well as other graph neural network (GNN) models. Finally, based on our dynamic analysis, we propose two novel pL-UFG models with manually controlled energy dynamics. We demonstrate empirically and theoretically that our proposed models not only inherit the advantages of pL-UFG but also significantly reduce computational costs for training on large-scale graph datasets.

TMLR Journal 2024 Journal Article

SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP

  • Jie Chen
  • Mingyuan Bai
  • Shouzhen Chen
  • Junbin Gao
  • Junping Zhang
  • Jian Pu

The recursive node fetching and aggregation in message passing cause inference latency when deploying Graph Neural Networks (GNNs) to large-scale graphs. One promising inference acceleration direction is to distill GNNs into message-passing-free student Multi-Layer Perceptrons (MLPs). However, an MLP student without graph dependency cannot fully learn the structure knowledge from GNNs, which causes inferior performance in heterophilic and online scenarios. To address this problem, we first design a simple yet effective Structure-Aware MLP (SA-MLP) as a student model. It utilizes linear layers as encoders and decoders to capture features and graph structures without message passing among nodes. Furthermore, we introduce a novel structure-mixing knowledge distillation technique. It generates virtual samples imbued with a hybrid of structure knowledge from teacher GNNs, thereby enhancing the learning ability of MLPs for structure information. Extensive experiments on eight benchmark datasets under both transductive and online settings show that our SA-MLP can consistently achieve similar or even better results than teacher GNNs while maintaining inference speed as fast as MLPs. Our findings reveal that SA-MLP efficiently assimilates graph knowledge through distillation from GNNs in an end-to-end manner, eliminating the need for complex model architectures and preprocessing of features/structures. Our code is available at https://github.com/JC-202/SA-MLP.

TMLR Journal 2023 Journal Article

Nonconvex-nonconcave min-max optimization on Riemannian manifolds

  • Andi Han
  • Bamdev Mishra
  • Pratik Jawanpuria
  • Junbin Gao

This work studies nonconvex-nonconcave min-max problems on Riemannian manifolds. We first characterize the local optimality of nonconvex-nonconcave problems on manifolds with a generalized notion of local minimax points. We then define the stability and convergence criteria of dynamical systems on manifolds and provide necessary and sufficient conditions of strictly stable equilibrium points for both continuous and discrete dynamics. Additionally, we propose several novel second-order methods on manifolds that provably converge to local minimax points asymptotically. We validate the empirical benefits of the proposed methods with extensive experiments.

AAAI Conference 2021 Conference Paper

Hierarchical Graph Convolution Network for Traffic Forecasting

  • Kan Guo
  • Yongli Hu
  • Yanfeng Sun
  • Sean Qian
  • Junbin Gao
  • Baocai Yin

Traffic forecasting is attracting considerable interest due to its widespread application in intelligent transportation systems. Given the complex and dynamic traffic data, many methods focus on how to establish a spatial-temporal model to express the non-stationary traffic patterns. Recently, the latest Graph Convolution Network (GCN) has been introduced to learn spatial features while time neural networks are used to learn temporal features. These GCN-based methods obtain state-of-the-art performance. However, the current GCN-based methods ignore the natural hierarchical structure of traffic systems, which is composed of the micro layers of road networks and the macro layers of region networks; in the latter, the nodes are obtained through a pooling method and can include hot traffic regions such as downtown and CBD, while the current GCN is only applied on the micro graph of road networks. In this paper, we propose a novel Hierarchical Graph Convolution Network (HGCN) for traffic forecasting by operating on both the micro and macro traffic graphs. The proposed method is evaluated on two complex city traffic speed datasets. Compared to the latest GCN-based methods like Graph WaveNet, the proposed HGCN achieves higher traffic forecasting precision with lower computational cost. The code is available at https://github.com/guokan987/HGCN.git.

ICML Conference 2021 Conference Paper

How Framelets Enhance Graph Neural Networks

  • Xuebin Zheng
  • Bingxin Zhou
  • Junbin Gao
  • Yu Guang Wang 0001
  • Pietro Liò
  • Ming Li 0065
  • Guido Montúfar

This paper presents a new approach for assembling graph neural networks based on framelet transforms. The latter provides a multi-scale representation for graph-structured data. We decompose an input graph into low-pass and high-pass frequency coefficients for network training, which then defines a framelet-based graph convolution. The framelet decomposition naturally induces a graph pooling strategy by aggregating the graph feature into low-pass and high-pass spectra, which considers both the feature values and geometry of the graph data and conserves the total information. The graph neural networks with the proposed framelet convolution and pooling achieve state-of-the-art performance in many node and graph prediction tasks. Moreover, we propose shrinkage as a new activation for the framelet convolution, which thresholds high-frequency information at different scales. Compared to ReLU, shrinkage activation improves model performance on denoising and signal compression: noises in both node and structure can be significantly reduced by accurately cutting off the high-pass coefficients from framelet decomposition, and the signal can be compressed to less than half its original size with well-preserved prediction performance.
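The shrinkage activation described above is, in standard signal-processing terms, soft-thresholding of the high-pass coefficients. A minimal sketch (the threshold parameter `lam` and the function name are our notation, not the paper's):

```python
import numpy as np

def shrinkage(x, lam):
    """Soft-thresholding: move every coefficient toward zero by `lam`
    and zero out anything smaller, suppressing small (noisy)
    high-frequency components while keeping large ones."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
```

Unlike ReLU, this is odd-symmetric and sparsifying, which is why it suits denoising and compression of framelet coefficients.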

NeurIPS Conference 2021 Conference Paper

On Riemannian Optimization over Positive Definite Matrices with the Bures-Wasserstein Geometry

  • Andi Han
  • Bamdev Mishra
  • Pratik Kumar Jawanpuria
  • Junbin Gao

In this paper, we comparatively analyze the Bures-Wasserstein (BW) geometry with the popular Affine-Invariant (AI) geometry for Riemannian optimization on the symmetric positive definite (SPD) matrix manifold. Our study begins with an observation that the BW metric has a linear dependence on SPD matrices in contrast to the quadratic dependence of the AI metric. We build on this to show that the BW metric is a more suitable and robust choice for several Riemannian optimization problems over ill-conditioned SPD matrices. We show that the BW geometry has a non-negative curvature, which further improves convergence rates of algorithms over the non-positively curved AI geometry. Finally, we verify that several popular cost functions, which are known to be geodesic convex under the AI geometry, are also geodesic convex under the BW geometry. Extensive experiments on various applications support our findings.
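For concreteness, the BW distance on SPD matrices has the closed form d_BW(A, B)^2 = tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2}), a standard formula rather than anything specific to this paper. A NumPy sketch (helper names our own; the symmetric square root is computed by eigendecomposition):

```python
import numpy as np

def sqrtm_spd(M):
    """Matrix square root of a symmetric positive (semi)definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bures_wasserstein_dist(A, B):
    """d_BW(A,B)^2 = tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2})."""
    Ah = sqrtm_spd(A)
    cross = sqrtm_spd(Ah @ B @ Ah)
    d2 = np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)
    return np.sqrt(max(d2, 0.0))  # clamp tiny negative round-off
```

On commuting (e.g. diagonal) matrices this reduces to the Euclidean distance between the matrix square roots, which illustrates the linear dependence on SPD matrices noted in the abstract.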

IJCAI Conference 2021 Conference Paper

Riemannian Stochastic Recursive Momentum Method for non-Convex Optimization

  • Andi Han
  • Junbin Gao

We propose a stochastic recursive momentum method for Riemannian non-convex optimization that achieves a near-optimal complexity for finding an epsilon-approximate solution with one sample. The new algorithm requires one-sample gradient evaluations per iteration and does not require restarting with a large batch gradient, which is commonly used to obtain a faster rate. Extensive experimental results demonstrate the superiority of the proposed algorithm. Extensions to nonsmooth and constrained optimization settings are also discussed.
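The recursive momentum estimator at the core of such methods is easiest to see in its Euclidean form; the Riemannian version additionally needs a retraction and vector transport between tangent spaces, which this sketch omits. All names and step sizes below are our own illustrative choices, not the paper's algorithm:

```python
def storm_step(x, d_prev, grad_fn, xi, x_prev, a=0.1, lr=0.01):
    """One recursive-momentum update (Euclidean sketch):
    d_t = grad(x_t; xi) + (1 - a) * (d_{t-1} - grad(x_{t-1}; xi)),
    evaluating the SAME sample xi at both points to cancel noise,
    followed by a plain gradient step. Returns (x_new, d_new)."""
    d = grad_fn(x, xi) + (1.0 - a) * (d_prev - grad_fn(x_prev, xi))
    return x - lr * d, d
```

The key point mirrored from the abstract: each iteration touches only one sample `xi`, and no periodic large-batch restart is needed because the correction term keeps the variance of `d` under control.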

IJCAI Conference 2020 Conference Paper

Hype-HAN: Hyperbolic Hierarchical Attention Network for Semantic Embedding

  • Chengkun Zhang
  • Junbin Gao

Hyperbolic space is a well-defined space with constant negative curvature. Recent research demonstrates its promise in capturing complex hierarchical structures, thanks to its exceptionally high capacity and continuous tree-like properties. This paper connects hyperbolic space's advantages to the power-law structure of documents by introducing a hyperbolic neural network architecture named Hyperbolic Hierarchical Attention Network (Hype-HAN). Hype-HAN defines three levels of embeddings (word/sentence/document) and two layers of hyperbolic attention mechanism (word-to-sentence/sentence-to-document) on Riemannian geometries of the Lorentz model, Klein model and Poincaré model. Situated on the evolving embedding spaces, we utilize both conventional GRUs (Gated Recurrent Units) and hyperbolic GRUs with Möbius operations. Hype-HAN is applied to large-scale datasets. The empirical experiments show the effectiveness of our method.
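The Möbius operations mentioned above replace ordinary vector addition inside a hyperbolic GRU. A sketch of Möbius addition on the Poincaré ball with curvature -1 (a textbook formula, not code from the paper; the function name is ours):

```python
import numpy as np

def mobius_add(u, v):
    """Mobius addition on the Poincare ball (curvature -1):
    u (+) v = ((1 + 2<u,v> + |v|^2) u + (1 - |u|^2) v)
              / (1 + 2<u,v> + |u|^2 |v|^2)."""
    uv = np.dot(u, v); uu = np.dot(u, u); vv = np.dot(v, v)
    num = (1 + 2 * uv + vv) * u + (1 - uu) * v
    return num / (1 + 2 * uv + uu * vv)
```

It behaves like addition with the origin as identity (0 ⊕ v = v, u ⊕ (-u) = 0) while keeping results strictly inside the unit ball, which is what lets GRU-style updates stay on the manifold.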

AAAI Conference 2020 Conference Paper

Shared Generative Latent Representation Learning for Multi-View Clustering

  • Ming Yin
  • Weitian Huang
  • Junbin Gao

Clustering multi-view data has been a fundamental research topic in the computer vision community. It has been shown that better accuracy can be achieved by integrating information from all the views than by using any one view individually. However, the existing methods often struggle with handling large-scale datasets and with poor performance in reconstructing samples. This paper proposes a novel multi-view clustering method by learning a shared generative latent representation that obeys a mixture of Gaussian distributions. The motivation is based on the fact that multi-view data share a common latent embedding despite the diversity among the various views. Specifically, benefitting from the success of deep generative learning, the proposed model can not only extract nonlinear features from the views, but also capture the correlations among all the views. The extensive experimental results on several datasets with different scales demonstrate that the proposed method outperforms the state-of-the-art methods under a range of performance criteria.

IJCAI Conference 2018 Conference Paper

Cascaded Low Rank and Sparse Representation on Grassmann Manifolds

  • Boyue Wang
  • Yongli Hu
  • Junbin Gao
  • Yanfeng Sun
  • Baocai Yin

Inspired by the success of low rank representation and sparse subspace clustering, some works attempt to simultaneously impose low rank and sparse constraints on the affinity matrix to improve performance. However, this amounts to a mere trade-off between the two constraints. In this paper, we propose a novel Cascaded Low Rank and Sparse Representation (CLRSR) method for subspace clustering, which seeks a sparse representation on the previously learned low rank latent representation. To make our proposed method suitable for multi-dimensional or imageset data, we extend CLRSR onto Grassmann manifolds. An effective solution and its convergence analysis are also provided. The excellent experimental results demonstrate that the proposed method is more robust than other state-of-the-art clustering methods on imageset data.

AAAI Conference 2018 Conference Paper

Locality Preserving Projection Based on F-norm

  • Xiangjie Hu
  • Yanfeng Sun
  • Junbin Gao
  • Yongli Hu
  • Baocai Yin

Locality preserving projection (LPP) is a well-known method for dimensionality reduction in which the neighborhood graph structure of data is preserved. Traditional LPP employs the squared F-norm for distance measurement, which can exaggerate distance errors and makes the model sensitive to outliers. To deal with this issue, we propose two novel F-norm-based models, termed F-LPP and F-2DLPP, which are developed for vector-based and matrix-based data, respectively. In F-LPP and F-2DLPP, the distance of data projected to a low-dimensional space is measured by the F-norm, so both methods are expected to reduce the influence of outliers. To solve the F-norm-based models, we propose an iterative optimization algorithm and give the convergence analysis of the algorithm. The experimental results on three public databases have demonstrated the effectiveness of our proposed methods.
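For reference, the classic squared-F-norm LPP that this paper modifies reduces to a generalized eigenvalue problem. A minimal NumPy sketch (the affinity matrix `W` is assumed given, and the small ridge term is our own addition for numerical stability):

```python
import numpy as np

def lpp(X, W, dim):
    """Classic squared-F-norm LPP. X is a d x n data matrix, W an n x n
    symmetric affinity matrix. Solves X L X^T a = lam X D X^T a for the
    `dim` smallest generalized eigenpairs and returns the d x dim
    projection matrix."""
    D = np.diag(W.sum(axis=1))
    L = D - W                                       # Laplacian of the affinity graph
    XL = X @ L @ X.T
    XD = X @ D @ X.T + 1e-8 * np.eye(X.shape[0])    # ridge keeps XD positive definite
    # Reduce to a standard symmetric problem via Cholesky: XD = C C^T.
    C = np.linalg.cholesky(XD)
    Ci = np.linalg.inv(C)
    w, V = np.linalg.eigh(Ci @ XL @ Ci.T)           # ascending eigenvalues
    return (Ci.T @ V)[:, :dim]
```

The F-norm variants proposed in the paper replace the squared distances implicit in `XL` with unsquared ones, which is why they need the iterative algorithm described in the abstract instead of this one-shot eigensolve.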

IJCAI Conference 2017 Conference Paper

Locality Preserving Projections for Grassmann manifold

  • Boyue Wang
  • Yongli Hu
  • Junbin Gao
  • Yanfeng Sun
  • Haoran Chen
  • Muhammad Ali
  • Baocai Yin

Learning on Grassmann manifold has become popular in many computer vision tasks, with the strong capability to extract discriminative information for imagesets and videos. However, such learning algorithms, particularly on high-dimensional Grassmann manifold, always involve significantly high computational cost, which seriously limits the applicability of learning on Grassmann manifold in wider areas. In this research, we propose an unsupervised dimensionality reduction algorithm on Grassmann manifold based on the Locality Preserving Projections (LPP) criterion. LPP is a commonly used dimensionality reduction algorithm for vector-valued data, aiming to preserve the local structure of data in the dimension-reduced space. The strategy is to construct a mapping from a higher-dimensional Grassmann manifold onto a relatively low-dimensional one with more discriminative capability. The proposed method can be optimized as a basic eigenvalue problem. The performance of our proposed method is assessed on several classification and clustering tasks, and the experimental results show its clear advantages over other Grassmann-based algorithms.

AAAI Conference 2016 Conference Paper

Product Grassmann Manifold Representation and Its LRR Models

  • Boyue Wang
  • Yongli Hu
  • Junbin Gao
  • Yanfeng Sun
  • Baocai Yin

It is a challenging problem to cluster multi- and high-dimensional data with complex intrinsic properties and nonlinear manifold structure. The recently proposed subspace clustering method, Low Rank Representation (LRR), shows attractive performance on data clustering, but it generally operates on data in Euclidean spaces. In this paper, we intend to cluster complex high-dimensional data with multiple varying factors. We propose a novel representation, namely Product Grassmann Manifold (PGM), to represent such data. Additionally, we discuss the geometry metric of the manifold and extend the conventional LRR model in Euclidean space onto PGM, thus constructing a new LRR model. Experimental results on several clustering tasks show that the proposed method obtains superior accuracy compared with clustering methods on manifolds or in conventional Euclidean spaces.