Arrow Research search

Author name cluster

Yujun Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
2 author rows

Possible papers

10

AAAI Conference 2025 Conference Paper

Edge Contrastive Learning: An Augmentation-Free Graph Contrastive Learning Model

  • Yujun Li
  • Hongyuan Zhang
  • Yuan Yuan

Graph contrastive learning (GCL) aims to learn representations from unlabeled graph data in a self-supervised manner and has developed rapidly in recent years. However, edge-level contrasts are not well explored by most existing GCL methods, which regard edges only as auxiliary information while updating node features. One of the primary obstacles to edge-based GCL is its heavy computational burden. To tackle this issue, we propose a model that can efficiently learn edge features for GCL, namely Augmentation-Free Edge Contrastive Learning (AFECL), to achieve edge-edge contrast. AFECL requires no augmentation and consists of two parts. First, we design a novel edge feature generation method in which edge features are computed by concatenating the embeddings of their connected nodes. Second, an edge contrastive learning scheme is developed in which edges connecting the same nodes are defined as positive pairs and other edges as negative pairs. Experimental results show that, compared with recent state-of-the-art GCL methods and even some supervised GNNs, AFECL achieves SOTA performance on link prediction and on semi-supervised node classification with extremely scarce labels.
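The two steps the abstract describes can be sketched in a few lines. This is a minimal illustration inferred from the abstract, not the authors' released implementation: the function names, the InfoNCE-style loss, and the reading of "edges connecting the same nodes" as edges sharing an endpoint are all assumptions.

```python
import numpy as np

def edge_features(Z, edges):
    # Edge feature = concatenation of the two endpoint node embeddings,
    # as described in the abstract.
    return np.concatenate([Z[edges[:, 0]], Z[edges[:, 1]]], axis=1)

def edge_contrastive_loss(E, edges, tau=0.5):
    # InfoNCE-style loss over edge features. Edges sharing an endpoint are
    # treated as positive pairs (one plausible reading of the abstract);
    # all other edge pairs act as negatives.
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    sim = np.exp(E @ E.T / tau)
    np.fill_diagonal(sim, 0.0)
    # Positive mask: edge pairs that share at least one endpoint.
    share = (edges[:, None, 0] == edges[None, :, 0]) | \
            (edges[:, None, 0] == edges[None, :, 1]) | \
            (edges[:, None, 1] == edges[None, :, 0]) | \
            (edges[:, None, 1] == edges[None, :, 1])
    np.fill_diagonal(share, False)
    pos = (sim * share).sum(axis=1)
    return -np.log(pos / sim.sum(axis=1) + 1e-12).mean()
```

Because the edge features come straight from node embeddings, no graph augmentation is needed, which is the efficiency point the abstract emphasizes.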

ICML Conference 2025 Conference Paper

Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape

  • Tao Li
  • Zhengbao He
  • Yujun Li
  • Yasheng Wang
  • Lifeng Shang
  • Xiaolin Huang

Fine-tuning large-scale pre-trained models is prohibitively expensive in terms of computation and memory costs. Low-Rank Adaptation (LoRA), a popular Parameter-Efficient Fine-Tuning (PEFT) method, offers an efficient solution by optimizing only low-rank matrices. Despite recent progress in improving LoRA's performance, the relationship between the LoRA optimization space and the full parameter space is often overlooked. A solution that appears flat in the loss landscape of the LoRA space may still exhibit sharp directions in the full parameter space, potentially compromising generalization. We introduce Flat-LoRA, which aims to identify a low-rank adaptation situated in a flat region of the full parameter space. Instead of adopting the well-established sharpness-aware minimization approach, which incurs significant computation and memory overheads, we employ a Bayesian expectation loss objective to preserve training efficiency. Further, we design a refined strategy for generating random perturbations to enhance performance and carefully manage memory overhead using random seeds. Experiments across diverse tasks, including mathematical reasoning, coding abilities, dialogue generation, instruction following, and text-to-image generation, demonstrate that Flat-LoRA improves both in-domain and out-of-domain generalization. Code is available at https://github.com/nblt/Flat-LoRA.
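The seed trick mentioned at the end of the abstract can be illustrated generically: the random perturbation tensor is never stored, only the integer seed that regenerates it. This is a sketch of the general idea, not the released Flat-LoRA code; `flat_step` and `grad_fn` are hypothetical names.

```python
import numpy as np

def flat_step(W, grad_fn, sigma=0.01, seed=0):
    # Perturb the weights in place; only the integer seed is kept,
    # not the full noise tensor, so memory overhead stays O(1).
    rng = np.random.default_rng(seed)
    W += sigma * rng.standard_normal(W.shape)
    g = grad_fn(W)  # gradient of the loss at the perturbed point
    # Restore the weights by regenerating the identical noise from the seed.
    rng = np.random.default_rng(seed)
    W -= sigma * rng.standard_normal(W.shape)
    return g
```

Averaging such gradients over random seeds approximates an expected (smoothed) loss, which favors flat regions without storing any extra weight-sized buffers.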

NeurIPS Conference 2025 Conference Paper

RidgeLoRA: Matrix Ridge Enhanced Low-Rank Adaptation of Large Language Models

  • Junda Zhu
  • Jun Ai
  • Yujun Li
  • Yichun Yin
  • Yasheng Wang
  • Lifeng Shang
  • Qun Liu

As one of the state-of-the-art parameter-efficient fine-tuning (PEFT) methods, Low-Rank Adaptation (LoRA) enables model optimization with reduced computational cost through trainable low-rank matrices. However, the low-rank constraint tends to reduce representation ability, leading to suboptimal performance. To break this limitation, we propose RidgeLoRA, a lightweight LoRA-like method that combines a novel architecture with matrix-ridge-enhanced full-rank approximation to match the performance of full-rank training, while eliminating the need for high memory and a large number of parameters to restore the rank of the matrices. We provide a rigorous mathematical derivation proving that RidgeLoRA has a better upper bound on representation ability than vanilla LoRA. Furthermore, extensive experiments across multiple domains demonstrate that RidgeLoRA outperforms other LoRA variants and can even match or surpass full-rank training.

JBHI Journal 2024 Journal Article

Multi-Contrast Complementary Learning for Accelerated MR Imaging

  • Bangjun Li
  • Weifeng Hu
  • Chun-Mei Feng
  • Yujun Li
  • Zhi Liu
  • Yong Xu

Thanks to its powerful ability to depict high-resolution anatomical information, magnetic resonance imaging (MRI) has become an essential non-invasive scanning technique in clinical practice. However, excessive acquisition time often leads to degraded image quality and psychological discomfort among subjects, hindering its further popularization. Besides reconstructing images from the undersampled protocol itself, multi-contrast MRI protocols offer promising solutions by leveraging additional morphological priors for the target modality. Nevertheless, previous multi-contrast techniques mainly adopt a simple fusion mechanism that inevitably ignores valuable knowledge. In this work, we propose a novel multi-contrast complementary information aggregation network named MCCA, aiming to fully exploit the available complementary representations to reconstruct the undersampled modality. Specifically, a multi-scale feature fusion mechanism is introduced to incorporate complementary, transferable knowledge into the target modality. Moreover, a hybrid convolution-transformer block is developed to extract global and local context dependencies simultaneously, combining the advantages of CNNs with the merits of Transformers. Compared to existing MRI reconstruction methods, the proposed method demonstrates its superiority through extensive experiments on different datasets under different acceleration factors and undersampling patterns.

NeurIPS Conference 2022 Conference Paper

Finding Second-Order Stationary Points in Nonconvex-Strongly-Concave Minimax Optimization

  • Luo Luo
  • Yujun Li
  • Cheng Chen

We study the smooth minimax optimization problem $\min_{\bf x}\max_{\bf y} f({\bf x}, {\bf y})$, where $f$ is $\ell$-smooth, strongly-concave in ${\bf y}$ but possibly nonconvex in ${\bf x}$. Most existing works focus on finding the first-order stationary point of the function $f({\bf x}, {\bf y})$ or its primal function $P({\bf x})\triangleq \max_{\bf y} f({\bf x}, {\bf y})$, but few of them focus on achieving the second-order stationary point, which is essential to nonconvex problems. In this paper, we propose a novel approach for minimax optimization, called Minimax Cubic Newton (MCN), which can find an ${\mathcal O}\left(\varepsilon, \kappa^{1.5}\sqrt{\rho\varepsilon}\right)$-second-order stationary point of $P({\bf x})$ with ${\mathcal O}\left(\kappa^{1.5}\sqrt{\rho}\varepsilon^{-1.5}\right)$ calls to second-order oracles and $\tilde{\mathcal O}\left(\kappa^{2}\sqrt{\rho}\varepsilon^{-1.5}\right)$ calls to first-order oracles, where $\kappa$ is the condition number and $\rho$ is the Lipschitz constant of the Hessian of $f({\bf x}, {\bf y})$. In addition, we propose an inexact variant of MCN for high-dimensional problems to avoid calling the expensive second-order oracles. Instead, our method solves the cubic sub-problem inexactly via gradient descent and matrix Chebyshev expansion. This strategy still obtains the desired approximate second-order stationary point with high probability but requires only $\tilde{\mathcal O}\left(\kappa^{1.5}\ell\varepsilon^{-2}\right)$ Hessian-vector oracle calls and $\tilde{\mathcal O}\left(\kappa^{2}\sqrt{\rho}\varepsilon^{-1.5}\right)$ first-order oracle calls. To the best of our knowledge, this is the first work to consider the non-asymptotic convergence behavior of finding second-order stationary points for minimax problems without convex-concave assumptions.

JBHI Journal 2022 Journal Article

HarMI: Human Activity Recognition Via Multi-Modality Incremental Learning

  • Xiao Zhang
  • Hongzheng Yu
  • Yang Yang
  • Jingjing Gu
  • Yujun Li
  • Fuzhen Zhuang
  • Dongxiao Yu
  • Zhaochun Ren

Nowadays, with the development of various sensors in smartphones and wearable devices, human activity recognition (HAR) has been widely researched and has numerous applications in healthcare, smart cities, etc. Many techniques based on hand-crafted feature engineering or deep neural networks have been proposed for sensor-based HAR. However, these existing methods usually recognize activities offline, meaning all data must be collected before training, occupying large-capacity storage space. Moreover, once offline model training is finished, the trained model cannot recognize new activities without retraining from scratch, at a high cost in time and space. In this paper, we propose a multi-modality incremental learning model, called HarMI, with continuous learning ability. The proposed HarMI model can start training quickly with little storage space and easily learn new activities without storing previous training data. In detail, we first adopt an attention mechanism to align heterogeneous sensor data with different frequencies. In addition, to overcome catastrophic forgetting in incremental learning, HarMI utilizes elastic weight consolidation and canonical correlation analysis from a multi-modality perspective. Extensive experiments on two public datasets demonstrate that HarMI achieves superior performance compared with several state-of-the-art methods.

JBHI Journal 2020 Journal Article

MRLN: Multi-Task Relational Learning Network for MRI Vertebral Localization, Identification, and Segmentation

  • Ranran Zhang
  • Xiaoyan Xiao
  • Zhi Liu
  • Yujun Li
  • Shuo Li

Magnetic resonance imaging (MRI) vertebral localization, identification, and segmentation are important steps in the automatic analysis of spines. Due to the similar appearances of vertebrae, their accurate segmentation, localization, and identification remain challenging. Previous methods solved the three tasks independently, ignoring the intrinsic correlation among them. In this paper, we propose a multi-task relational learning network (MRLN) that utilizes both the relationships between vertebrae and the relevance of the three tasks. A dilated convolution group is used to expand the receptive field, and an LSTM (Long Short-Term Memory) network learns prior knowledge of the order relationship between vertebral bodies. In the decoder stages of the segmentation and localization tasks, we introduce a co-attention module to learn correlation information: localization-guided segmentation attention (LGSA) and segmentation-guided localization attention (SGLA). Learning two tasks simultaneously, together with the correlation between them, not only avoids overfitting to a single task but also lets the tasks correct each other. To avoid cumbersome weight adjustment among task loss functions, we formulate a novel XOR loss that provides a direct evaluation criterion for the localization relationship between semantic location regression and semantic segmentation. The method was evaluated on a dataset that includes multiple MRI modalities (T1 and T2) and various fields of view. Experimental results demonstrate that both the co-attention module and the XOR loss outperform the most recent state of the art.

AAAI Conference 2016 Conference Paper

Accelerating Random Kaczmarz Algorithm Based on Clustering Information

  • Yujun Li
  • Kaichun Mo
  • Haishan Ye

The Kaczmarz algorithm is an efficient iterative algorithm for solving overdetermined consistent systems of linear equations. In each update step, Kaczmarz chooses the hyperplane defined by an individual equation and projects the current estimate of the exact solution onto that hyperplane to obtain a new estimate. Many variants of the Kaczmarz algorithm have been proposed with better hyperplane-selection rules. Using properties of randomly sampled data in high-dimensional space, we propose an accelerated algorithm based on clustering information that improves block Kaczmarz and Kaczmarz via the Johnson-Lindenstrauss lemma. Additionally, we theoretically demonstrate a convergence improvement over the block Kaczmarz algorithm.
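For context, the baseline update this paper accelerates, randomized Kaczmarz with rows sampled proportionally to their squared norms (the classic Strohmer-Vershynin scheme), can be sketched as follows; the clustering-based acceleration itself is not reproduced here.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    # Solve a consistent system Ax = b by repeatedly projecting the current
    # estimate onto the hyperplane of one randomly chosen equation a_i . x = b_i.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    row_norms = (A ** 2).sum(axis=1)
    probs = row_norms / row_norms.sum()  # sample rows by squared norm
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        # Orthogonal projection of x onto {z : a_i . z = b_i}.
        x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x
```

Each step touches only one row of `A`, which is what makes Kaczmarz attractive for very large systems and what block and clustering-based variants aim to speed up further.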