Author name cluster

Hui Yu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

15 papers

2 author rows

JBHI Journal 2026 Journal Article

An Explainable Molecular Token Estimation Method for Knowledge-aware Drug-Drug Interaction Prediction

Hui Yu
Chao Song
Jiahao Yuan
Xinkun Li
Xiao Zhang
Yang Yang
Zhe Yu
Jian-Yu Shi

In molecular representation learning (MRL), tokens (e.g., atoms, motifs, and fingerprints) are the basic elements used to represent molecules. It is common practice to use various tokens to enhance the expressive power of Graph Neural Networks (GNNs) on molecular graphs. Although prior GNN-based methods employing tokens achieve promising performance in drug-drug interaction (DDI) prediction, the influence of the token on the expressiveness of molecular embedding models remains underexplored. To bridge the gap, we provide an axiomatic definition of MRL from a frequency domain perspective, revealing that the model's performance is closely related to the number of tokens and deriving a theoretical upper bound of likelihood-based model convergence. Building on these insights, we propose SimMotifPro, a simple yet efficient motif-based method, for DDI prediction. Specifically, SimMotifPro uses a variant of the DeeperGCN encoder and builds a motif-motif knowledge graph to capture motif interconnections. A Motif Ranker module is also introduced to decouple learned representations and differentiate the contributions of selected motifs. Empirically, we demonstrate that SimMotifPro adheres to the properties demonstrated in our theoretical upper bound and validate the general applicability of our theory across different methods. Furthermore, our approach achieves state-of-the-art performance on various benchmarks for DDI prediction. Our code and checkpoints are available at https://github.com/siriusong/sim_motif_pro.

Details DOI

NeurIPS Conference 2025 Conference Paper

Attribution-Driven Adaptive Token Pruning for Transformers

Yaoyao Yan
Hui Yu
Weizhi Xu

Transformers have been widely adopted in natural language processing, computer vision, and other domains due to their exceptional performance across a variety of tasks. However, the computational cost of Transformers is prohibitively high, particularly when handling long input sequences, significantly increasing both training and inference time. Although various token pruning methods have been proposed to reduce the computational burden of Transformers, most approaches overlook critical differences in sequences in terms of length and complexity, leading to suboptimal compression efficiency. In this paper, we propose AD-TP, an Attribution-Driven Adaptive Token Pruning method designed to retain only the most informative tokens. We analyze the performance of using accumulated attention values to measure token importance and find that attention values do not accurately reflect the actual contribution of each token to text understanding. Additionally, we observe significant variations in the length and complexity of different sequences within the dataset. Based on these insights, we adopt Integrated Gradients to evaluate token importance and introduce a lightweight adaptive token retainer module that dynamically generates pruning configurations for each input sequence. In addition, we incorporate both teacher supervision and self-supervised learning objectives to enhance the training efficiency, accuracy, and robustness of the model. Experiments conducted on GLUE, SQuAD, and 20News demonstrate that AD-TP outperforms state-of-the-art token pruning and model compression methods in both accuracy and computational efficiency. On GLUE, AD-TP reduces FLOPs by an average of 7.8× while improving performance by 0.6%.
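The abstract above names Integrated Gradients as the token-importance measure. The following is a minimal sketch of that general technique, not the AD-TP implementation: the toy model, its weights, the all-zero baseline, and the helper names `integrated_gradients` and `keep_top_k` are all assumptions made for illustration. Each "token" here is a single scalar; in a real Transformer the attribution would be computed per embedding dimension and then summed per token.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 50,
) -> List[float]:
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    attribution_i = (x_i - b_i) * mean over the path of d f / d x_i,
    where the path runs in a straight line from the baseline b to x.
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            total[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def keep_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest |score| entries, in input order."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy model (an assumption): f(x) = sum_i w_i * x_i^2.
WEIGHTS = [0.1, 2.0, 0.5, 1.5]


def toy_model(x: Sequence[float]) -> float:
    return sum(w * xi * xi for w, xi in zip(WEIGHTS, x))


def toy_grad(x: Sequence[float]) -> List[float]:
    # Analytic gradient of toy_model: d f / d x_i = 2 * w_i * x_i.
    return [2.0 * w * xi for w, xi in zip(WEIGHTS, x)]


tokens = [1.0, 1.0, -2.0, 0.5]
baseline = [0.0, 0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_grad, tokens, baseline)
kept = keep_top_k(attributions, k=2)
```

A useful sanity check is the completeness property of Integrated Gradients: the attributions should sum to `toy_model(tokens) - toy_model(baseline)`. A fixed `k` is used here only for brevity; the abstract describes a learned module that chooses the pruning configuration per input sequence.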

PDF Details

NeurIPS Conference 2025 Conference Paper

EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation

Chao Song
Zhiyuan Liu
Han Huang
Liang Wang
Qiong Wang
Jian-Yu Shi
Hui Yu
Yihang Zhou

Designing enzyme backbones with substrate-specific functionality is a critical challenge in computational protein engineering. Current generative models excel in protein design but face limitations in binding data, substrate-specific control, and flexibility for de novo enzyme backbone generation. To address this, we introduce EnzyBind, a dataset with 11,100 experimentally validated enzyme-substrate pairs specifically curated from PDBbind. Building on this, we propose EnzyControl, a method that enables functional and substrate-specific control in enzyme backbone generation. Our approach generates enzyme backbones conditioned on MSA-annotated catalytic sites and their corresponding substrates, which are automatically extracted from curated enzyme-substrate data. At the core of EnzyControl is EnzyAdapter, a lightweight, modular component integrated into a pretrained motif-scaffolding model, allowing it to become substrate-aware. A two-stage training paradigm further refines the model's ability to generate accurate and functional enzyme structures. Experiments show that our EnzyControl achieves the best performance across structural and functional metrics on EnzyBind and EnzyBench benchmarks, with particularly notable improvements of 13% in designability and 13% in catalytic efficiency compared to the baseline models. The code is released at https://github.com/Vecteur-libre/EnzyControl.

PDF Details

JBHI Journal 2025 Journal Article

Mesh Regression Based Shape Enhancement Operator Designed for Organ Segmentation

Yuanyuan Xu
Hui Yu
Jiliu Zhou
Yan Liu

Organ delineation is critical for diagnosis and treatment planning and has therefore attracted considerable attention. Recently, neural network-based methods have yielded accurate segmentation metrics such as the Dice coefficient. However, they face the problem of indistinct boundaries, since segmentation is usually modeled as a pixel classification task that ignores anatomical priors. Inspired by the fact that anatomical information is an essential prior for doctors in organ segmentation, this paper proposes a mesh regression-based shape enhancement operator. This operator models the refinement of segmentation masks as a mesh vertex regression task, enabling the model to refine the segmentation contours from the perspective of segmentation targets rather than purely from a pixel perspective. The proposed operator starts from the coarse segmentation masks produced by any segmentation model. By representing the mesh with the fast point feature histogram of its vertices, the displacement of each vertex is predicted by a graph convolutional neural network. Once the coordinate displacements are obtained, the mesh is evolved by moving its vertices. The operator is plug-and-play and can cooperate with any backbone segmentation model. The constructed two-stage segmentation pipeline is capable of refining organ segmentation results based on geometrical characteristics of target appearance. Validation has been performed on two publicly accessible datasets to delineate the pancreas and liver. Results show that the proposed shape enhancement operator can significantly improve segmentation performance, demonstrating its effectiveness and application prospects.

Details DOI

EAAI Journal 2025 Journal Article

Modelling and scheduling distributed assembly permutation flow-shops using reinforcement learning-based evolutionary algorithms

Bohan Qiu
Kaizhou Gao
Hui Yu
Ali Sadollah

The distributed assembly permutation flowshop scheduling problem (DAPFSP) has aroused extensive interest in both academia and industry due to its wide range of applications in manufacturing systems. However, the existing literature on this topic is comparatively insufficient. This study investigates the problem with the total flowtime (TF) criterion by developing three evolutionary algorithms (EAs) and their reinforcement learning (RL) based variants. First, two heuristic rules are proposed to initialize a diverse population. Second, six elaborated local search operators are incorporated into the basic EAs, considering the problem’s characteristics. Then, RL algorithms, e.g., Q-Learning and Sarsa, are employed to select high-quality local search operators during the algorithms' iterations to improve their performance. Three problem-characteristic-based RL strategies are designed for mapping the RL algorithms and local search operators. Finally, detailed calibration according to 81 large-scale instances is performed to illustrate the effectiveness of the proposed algorithms for solving the DAPFSP.

Details DOI

EAAI Journal 2024 Journal Article

Ensemble meta-heuristics and Q-learning for staff dissatisfaction constrained surgery scheduling and rescheduling

Hui Yu
Kai-Zhou Gao
Naiqi Wu
Ponnuthurai Nagaratnam Suganthan

In this study, we investigate the multi-objective surgery scheduling and rescheduling problems, considering medical staff dissatisfaction and fuzzy surgery time. Rescheduling is activated when emergency patients arrive. First, a multi-objective mathematical model is established for maximizing the average patient satisfaction, and minimizing the fuzzy maximum completion time and total medical cost, simultaneously. Second, five meta-heuristics are employed and improved to solve the concerned problems. Five heuristic rules are developed to improve the diversity and quality of initial solutions. For improving the performance of meta-heuristics, six local search operators are designed and two Q-learning-based strategies are developed to select optimal ones intelligently. Finally, 29 instances with different scales are used to verify the performance of the proposed algorithms. Compared with the basic meta-heuristics, the average performance of the algorithms with the second Q-learning-based strategy is improved by 62.5%, 62.1%, 50%, 70.7%, and 70.7%, respectively. Through the Friedman test, the asymptotic significance values of both metrics (0.034 and 0.000) are less than 0.05, indicating that there is a significant performance gap among the five algorithms with the second Q-learning-based strategy. The average rank values of the teaching-learning-based optimization with the second Q-learning strategy are 3.7069 and 2.0690 for the two metrics, which are better than those of the compared algorithms.
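The abstract above describes using Q-learning to choose among six local search operators. The following is a minimal sketch of that general idea using the standard tabular Q-learning update; it is not the authors' strategy. The class name `OperatorSelector`, the two-state encoding (whether the previous operator improved the solution), the reward of 1.0 for an improvement, and all hyperparameter values are assumptions for illustration.

```python
import random
from typing import List


class OperatorSelector:
    """Tabular Q-learning over (state, local-search-operator) pairs."""

    def __init__(
        self,
        n_states: int,
        n_operators: int,
        alpha: float = 0.1,    # learning rate
        gamma: float = 0.9,    # discount factor
        epsilon: float = 0.1,  # exploration probability
        seed: int = 0,
    ) -> None:
        self.q: List[List[float]] = [[0.0] * n_operators for _ in range(n_states)]
        self.alpha = alpha
        self.gamma = gamma
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self, state: int) -> int:
        """Epsilon-greedy choice of an operator index for the given state."""
        row = self.q[state]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(row))
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state: int, action: int, reward: float, next_state: int) -> None:
        """Standard Q-learning update toward reward + gamma * max Q(next_state, .)."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Hypothetical encoding: state 0 = last operator did not improve, state 1 = it did.
selector = OperatorSelector(n_states=2, n_operators=6, epsilon=0.0)
# Suppose operator 3 improved the schedule, so it earns reward 1.0 and moves to state 1.
selector.update(state=0, action=3, reward=1.0, next_state=1)
best = selector.choose(0)
```

Inside a meta-heuristic loop, `choose` would be called once per iteration to pick the operator to apply, and `update` would be called after evaluating the resulting schedule. Setting `epsilon=0.0` here makes the choice purely greedy so the example is deterministic.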

Details DOI

JBHI Journal 2024 Journal Article

Gradient-Guided Network With Fourier Enhancement for Glioma Segmentation in Multimodal 3D MRI

Zhongzhou Zhang
Hui Yu
Zhongxian Wang
Zhiwen Wang
Jingfeng Lu
Yan Liu
Yi Zhang

Glioma segmentation is a crucial task in computer-aided diagnosis, requiring precise discrimination between lesions and normal tissue at the pixel level. Popular methods neglect crucial edge information, leading to inaccurate contour delineation. Moreover, global information has been proven beneficial for segmentation. The feature representations extracted by convolution neural networks often struggle with local-related information owing to the limited receptive fields. To address these issues, we propose a novel edge-aware segmentation network that incorporates a dual-path gradient-guided training strategy with Fourier edge-enhancement for precise glioma segmentation, a.k.a. GFNet. First, we introduce a Dual-path Gradient-guided Training strategy (DGT) based on a Siamese network guiding the optimizing direction of one path by the gradient from the other path. DGT pays attention to the indistinguishable pixels with large weight-updating gradient, such as the pixels near the boundary, to guide the network training, addressing hard samples. Second, to further perceive the edge information, we derive a Fourier Edge-enhancement Module (FEM) to augment feature edges with high-frequency representations from the spectral domain, providing global information and edge details. Extensive experiments on public glioma segmentation datasets, BraTS2020 and Medical Segmentation Decathlon (MSD) glioma and prostate segmentation, demonstrate that GFNet achieves competitive performance compared to other state-of-the-art methods, both qualitatively and quantitatively.

Details DOI

AAAI Conference 2024 Conference Paper

Live and Learn: Continual Action Clustering with Incremental Views

Xiaoqiang Yan
Yingtao Gan
Yiqiao Mao
Yangdong Ye
Hui Yu

Multi-view action clustering leverages the complementary information from different camera views to enhance the clustering performance. Although existing approaches have achieved significant progress, they assume all camera views are available in advance, which is impractical when camera views arrive incrementally over time. Besides, learning the invariant information among multiple camera views is still a challenging issue, especially in continual learning scenarios. To address these problems, we propose a novel continual action clustering (CAC) method, which is capable of learning action categories in a continual learning manner. To be specific, we first devise a category memory library, which captures and stores the learned categories from historical views. Then, as a new camera view arrives, we only need to maintain a consensus partition matrix, which can be updated by leveraging the incoming new camera view rather than keeping all of them. Finally, a three-step alternate optimization is proposed, in which the category memory library and consensus partition matrix are optimized. The experimental results on 6 realistic multi-view action collections demonstrate the excellent clustering performance and time/space efficiency of CAC compared with 15 state-of-the-art baselines.

PDF Details DOI

EAAI Journal 2023 Journal Article

LACN: A lightweight attention-guided ConvNeXt network for low-light image enhancement

Saijie Fan
Wei Liang
Derui Ding
Hui Yu

Images captured under low-light conditions usually have poor visual quality, and hence greatly reduce the accuracy of subsequent tasks such as image segmentation and detection. In the low-light image enhancement task, noise in the dark areas is generally amplified while the images’ brightness is enhanced. It should be pointed out that many deep learning methods cannot effectively suppress the noise at this stage and capture important feature information. To address the above problem, this paper proposes a Lightweight Attention-guided ConvNeXt Network (LACN) for low-light image enhancement. A novel Attention ConvNeXt Module (ACM) is first proposed by introducing a parameter-free attention module (i.e., SimAM) into the ConvNeXt backbone network. Then, a nontrivial lightweight network LACN based on a multi-attention mechanism is established through stacking two ACMs and fusing their features. In what follows, an improved hybrid attention mechanism, Selective Kernel Attention Module (SKAM), is adopted to effectively extract both global and local information. Such a module realizes the evaluation of lighting conditions for the whole image and the adaptive adjustment of the receptive field. Finally, through the feature fusion module, the features of different stages are aggregated to improve the ability of the network to retain color information. Numerous experiments on low-light image enhancement are implemented via comparison with other state-of-the-art methods. Experiments show that the proposed method significantly improves the brightness and contrast of low-illumination images, preserves color information, and suppresses the generation of noise after image brightening.
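The abstract above builds on SimAM, a parameter-free attention module. The sketch below shows the per-channel computation as SimAM is commonly implemented; it is an illustration of that module in isolation and not the LACN code. The function name `simam_channel`, the list-of-lists layout for a single feature-map channel, and the default regularizer `e_lambda=1e-4` are assumptions.

```python
import math
from typing import List

Channel = List[List[float]]


def simam_channel(channel: Channel, e_lambda: float = 1e-4) -> Channel:
    """Apply SimAM-style parameter-free attention to one H x W channel.

    Each pixel is scaled by sigmoid(d / (4 * (var + e_lambda)) + 0.5), where
    d is its squared deviation from the channel mean and var is the sum of
    squared deviations divided by (H * W - 1).
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1
    if n < 1:
        raise ValueError("channel needs at least two pixels")
    mean = sum(values) / len(values)
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n
    denom = 4.0 * (variance + e_lambda)
    return [
        [v / (1.0 + math.exp(-(d / denom + 0.5))) for v, d in zip(row, drow)]
        for row, drow in zip(channel, sq_dev)
    ]


# A flat channel: every pixel receives the same weight, sigmoid(0.5).
flat = simam_channel([[2.0, 2.0], [2.0, 2.0]])
# A channel with one distinctive pixel: that pixel receives a larger weight.
spiky = simam_channel([[0.0, 0.0], [0.0, 4.0]])
```

Because the weight depends only on channel statistics, the module adds no learnable parameters, which is consistent with the lightweight design the abstract emphasizes. In a full network the same computation would run over every channel of a batched tensor.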

Details DOI

NeurIPS Conference 2023 Conference Paper

Learning Dictionary for Visual Attention

Yingjie Liu
Xuan Liu
Hui Yu
Xuan Tang
Xian Wei

Recently, the attention mechanism has shown outstanding competence in capturing global structure information and long-range relationships within data, thus enhancing the performance of deep vision models on various computer vision tasks. In this work, we propose a novel dictionary learning-based attention (Dic-Attn) module, which models this issue as a decomposition and reconstruction problem with the sparsity prior, inspired by sparse coding in the human visual perception system. The proposed Dic-Attn module decomposes the input into a dictionary and corresponding sparse representations, allowing for the disentanglement of underlying nonlinear structural information in visual data and the reconstruction of an attention embedding. By applying transformation operations in the spatial and channel domains, the module dynamically selects the dictionary's atoms and sparse representations. Finally, the updated dictionary and sparse representations capture the global contextual information and reconstruct the attention maps. The proposed Dic-Attn module is designed with plug-and-play compatibility, allowing for integration into deep attention encoders. Our approach offers an intuitive and elegant means to exploit the discriminative information from data, promoting visual attention construction. Extensive experimental results on various computer vision tasks, e.g., image and point cloud classification, validate that our method achieves promising performance, and shows a strong competitive comparison with state-of-the-art attention methods.

PDF Details

JBHI Journal 2023 Journal Article

SemiMAR: Semi-Supervised Learning for CT Metal Artifact Reduction

Tao Wang
Hui Yu
Zhiwen Wang
Hu Chen
Yan Liu
Jingfeng Lu
Yi Zhang

Metal artifacts lead to CT imaging quality degradation. With the success of deep learning (DL) in medical imaging, a number of DL-based supervised methods have been developed for metal artifact reduction (MAR). Nonetheless, fully-supervised MAR methods based on simulated data do not perform well on clinical data due to the domain gap. Although this problem can be avoided in an unsupervised way to a certain degree, severe artifacts cannot be well suppressed in clinical practice. Recently, semi-supervised MAR methods have gained wide attention due to their ability to narrow the domain gap and improve MAR performance on clinical data. However, these methods typically require large model sizes, posing challenges for optimization. To address this issue, we propose a novel semi-supervised MAR framework. In our framework, only the artifact-free parts are learned, and the artifacts are inferred by subtracting these clean parts from the metal-corrupted CT images. Our approach leverages a single generator to execute all complex transformations, thereby reducing the model's scale and preventing overlap between the clean parts and the artifacts. To recover more tissue details, we distill the knowledge from the advanced dual-domain MAR network into our model in both image domain and latent feature space. The latent space constraint is achieved via contrastive learning. We also evaluate the impact of different generator architectures by investigating several mainstream deep learning-based MAR backbones. Our experiments demonstrate that the proposed method competes favorably with several state-of-the-art semi-supervised MAR techniques in both qualitative and quantitative aspects.

Details DOI

NeurIPS Conference 2022 Conference Paper

Geodesic Self-Attention for 3D Point Clouds

Zhengyu Li
Xuan Tang
Zihao Xu
Xihao Wang
Hui Yu
Mingsong Chen
Xian Wei

Due to its outstanding competence in capturing long-range relationships, the self-attention mechanism has achieved remarkable progress in point cloud tasks. Nevertheless, point cloud objects often have complex non-Euclidean spatial structures, with the behavior changing dynamically and unpredictably. Most current self-attention modules rely heavily on the dot product multiplication in Euclidean space, which cannot capture internal non-Euclidean structures of point cloud objects, especially the long-range relationships along the curve of the implicit manifold surface represented by point cloud objects. To address this problem, in this paper, we introduce a novel metric on the Riemannian manifold to capture the long-range geometrical dependencies of point cloud objects to replace traditional self-attention modules, namely, the Geodesic Self-Attention (GSA) module. Our approach achieves state-of-the-art performance compared to point cloud Transformers on object classification, few-shot classification and part segmentation benchmarks.

PDF Details

JBHI Journal 2021 Journal Article

PCA-Based Multi-Wavelength Photoplethysmography Algorithm for Cuffless Blood Pressure Measurement on Elderly Subjects

Jing Liu
Shirong Qiu
Ningqi Luo
Sze-Kei Lau
Hui Yu
Timothy Kwok
Yuan-Ting Zhang
Ni Zhao

The prevalence of hypertension has made blood pressure (BP) measurement one of the most wanted functions in wearable devices for convenient and frequent self-assessment of health conditions. The widely adopted principle for cuffless BP monitoring is based on arterial pulse transit time (PTT), which is measured with electrocardiography and photoplethysmography (PPG). To achieve cuffless BP monitoring with more compact wearable electronics, we have previously conceived a multi-wavelength PPG (MWPPG) strategy to perform BP estimation from arteriolar PTT, requiring only a single sensing node. However, challenges remain in decoding the compounded MWPPG signals consisting of both heterogeneous physiological information and motion artifact (MA). In this work, we proposed an improved MWPPG algorithm based on principal component analysis (PCA) which matches the statistical decomposition results with the arterial pulse and capillary pulse. The arteriolar PTT is calculated accordingly as the phase shift based on the entire waveforms, instead of local peak lag time, to enhance the feature robustness. Meanwhile, the PCA-derived MA component is employed to identify and exclude the MA-contaminated segments. To evaluate the new algorithm, we performed a comparative experiment (N = 22) with a cuffless MWPPG measurement device and used double-tube auscultatory BP measurement as a reference. The results demonstrate the accuracy improvement enabled by the PCA-based operations on MWPPG signals, yielding errors of 1.44 ± 6.89 mmHg for systolic blood pressure and -1.00 ± 6.71 mmHg for diastolic blood pressure. In conclusion, the proposed PCA-based method can improve the performance of MWPPG in wearable medical devices for cuffless BP measurement.

Details DOI

YNIMG Journal 2012 Journal Article

Abnormal insula functional network is associated with episodic memory decline in amnestic mild cognitive impairment

Chunming Xie
Feng Bai
Hui Yu
Yongmei Shi
Yonggui Yuan
Gang Chen
Wenjun Li
Guangyu Chen

Abnormalities of functional connectivity in the default mode network (DMN) recently have been reported in patients with amnestic mild cognitive impairment (aMCI), Alzheimer's disease (AD) or other psychiatric diseases. As such, these abnormalities may be epiphenomena instead of playing a causal role in AD progression. To date, few studies have investigated specific brain networks, which extend beyond the DMN involved in the early AD stages, especially in aMCI. The insula is one site affected by early pathological changes in AD and is a crucial hub of the human brain networks. Currently, we explored the contribution of the insula networks to cognitive performance in aMCI patients. Thirty aMCI and 26 cognitively normal (CN) subjects participated in this study. Intrinsic connectivity of the insula networks was measured, using the resting-state functional connectivity fMRI approach. We examined the differential connectivity of insula networks between groups, and the neural correlation between the altered insula networks connectivity and the cognitive performance in aMCI patients and CN subjects, respectively. Insula subregional volumes were also investigated. AMCI subjects, when compared to CN subjects, showed significantly reduced right posterior insula volumes, cognitive deficits and disrupted intrinsic connectivity of the insula networks. Specifically, decreased intrinsic connectivity was primarily located in the frontal–parietal network and the cingulo-opercular network, including the anterior prefrontal cortex (aPFC), anterior cingulate cortex, operculum, inferior parietal cortex and precuneus. Increased intrinsic connectivity was primarily situated in the visual–auditory pathway, which included the posterior superior temporal gyrus and middle occipital gyrus. Conjunction analysis was performed; and significantly decreased intrinsic connectivity in the overlapping regions of the anterior and posterior insula networks, including the bilateral aPFC, left dorsolateral prefrontal cortex, dorsomedial prefrontal cortex, and anterior temporal pole was found. Furthermore, the disrupted intrinsic connectivity was associated with episodic memory (EM) deficits in the aMCI patients and not in the CN subjects. These findings demonstrated that the functional integration of the insula networks plays an important role in the EM process. They provided new insight into the neural mechanism underlying the memory deficits in aMCI patients.

Details DOI

ICRA Conference 2011 Conference Paper

An adaptive knee joint exoskeleton based on biological geometries

Dong-hai Wang
Jiajie Guo
Kok-Meng Lee
Canjun Yang
Hui Yu

This paper presents a dynamic model of a knee joint interacting with a two-link exoskeleton for investigating the effects of different exoskeleton designs on internal joint forces. The closed kinematic chain of the leg and exoskeleton has a significant effect on the joint forces in the knee. A bio-joint model is used to capture this effect by relaxing a commonly made assumption that approximates a knee joint as a perfect engineering pin-joint in exoskeleton design. Based on knowledge of knee-joint kinematics, an adaptive knee-joint exoskeleton has been designed by incorporating different kinematic components (such as a pin, slider and cam profile). This design potentially eliminates the negative effects associated with the closed leg/exoskeleton kinematic chain on a human knee. An investigation of the flexion motion of an artificial human knee joint is presented to compare the performance of five exoskeleton designs against the case with no exoskeleton. Analytical results that estimate internal forces using the dynamic model (based on the properties of a knee joint) agree well with the experiments. These studies lead to an adaptive mechanism with a slider/cam as an alternative to pin joints for the exoskeleton, and illustrate the application of the model for designing an adaptive mechanism that minimizes internal joint forces due to human-exoskeleton interaction.

Details