Arrow Research search

Author name cluster

Zhuliang Yu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

JBHI Journal 2026 Journal Article

Efficient Sleep Staging With Bayesian Uncertainty-Guided Active Learning

  • Tianyou Yu
  • Rui Huang
  • Fei Wang
  • Jun Zhang
  • Wei Wu
  • Zhuliang Yu
  • Yuanqing Li
  • Jun Xiao

Automated sleep staging is essential for large-scale and home-based sleep monitoring; however, in routine clinical practice, sleep annotation remains largely dependent on experienced experts performing time-consuming and labor-intensive manual scoring. Existing automatic systems often struggle to adapt reliably to new subjects, limiting their clinical adoption and reinforcing the reliance on expert review. This creates a strong demand for adaptive and efficient sleep staging systems that can substantially reduce annotation workload while preserving expert-level accuracy. We propose BayesSleepNet, a novel framework that integrates Bayesian uncertainty quantification with active learning for adaptive sleep staging. BayesSleepNet employs principled Bayesian modeling by placing distributions over network weights and performing Monte Carlo sampling at inference, enabling explicit quantification of model (epistemic) uncertainty. These uncertainty estimates drive a two-stage sample selection strategy that first fine-tunes the model using representative epochs and subsequently prioritizes persistently uncertain samples for expert review. Across four public sleep datasets, BayesSleepNet consistently improves performance, by 7.60% in accuracy, 8.27% in macro-F1, and 0.104 in Cohen's $\kappa$, while requiring manual annotation of only 20% of data from new subjects. Despite its adaptive learning capability, BayesSleepNet remains computationally lightweight, using substantially fewer parameters than representative high-capacity state-of-the-art models. These results demonstrate the clinical promise of uncertainty-aware active learning as a practical and cost-efficient paradigm for semi-automated sleep staging. Code is available at https://github.com/yuty2009/bayesugal.
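The Monte Carlo sampling step described above can be approximated in a few lines; the sketch below assumes MC dropout as the Bayesian approximation and ranks new-subject epochs by predictive entropy. The model, tensor shapes, and selection fraction are illustrative assumptions, not details taken from the paper.

    import torch

    def mc_epistemic_uncertainty(model, eeg_epochs, n_samples=30):
        """Monte Carlo sampling at inference: keep the stochastic (dropout) layers
        active and score each epoch by the entropy of the averaged predictions."""
        model.train()  # illustrative MC-dropout stand-in for weight distributions
        with torch.no_grad():
            probs = torch.stack([torch.softmax(model(eeg_epochs), dim=-1)
                                 for _ in range(n_samples)])        # (S, N, classes)
        mean_probs = probs.mean(dim=0)                               # (N, classes)
        entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
        return mean_probs, entropy                                   # high entropy -> query expert

Epochs with persistently high entropy across fine-tuning rounds are the ones an active-learning loop would send for manual scoring, which is how the annotation budget can stay near the reported 20%.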

JBHI Journal 2026 Journal Article

MB-STFormer: A Multi-Band Spectral-Temporal Transformer with Efficient Attention for Enhanced EEG-Based Fatigue Detection

  • Ke Liu
  • Lilong Sun
  • Wenlong Wang
  • Zhenghui Gu
  • Zhuliang Yu
  • Wei Wu

Accurate detection of driver fatigue is critical for preventing traffic accidents. Although electroencephalogram (EEG) signals provide a robust physiological indicator of fatigue, effectively capturing their intricate spatiotemporal-spectral dynamics poses significant challenges. In this paper, we propose MB-STFormer, a novel deep neural network designed for EEG-based fatigue detection, which systematically integrates neurophysiological priors into deep feature learning. The proposed MB-STFormer employs a multi-branch frequency-aware module to extract spatiotemporal features from EEG signals, with each branch dedicated to a distinct frequency sub-band. By leveraging adaptive temporal convolution kernel sizes tailored to each sub-band, the model adeptly captures the inherent rhythmic patterns and temporal dynamics unique to different frequency components. Additionally, we introduce an Efficient Additive Attention mechanism to aggregate global contextual information, thereby addressing the over-smoothing of subtle yet critical features often encountered with conventional transformer self-attention mechanisms. Extensive experiments conducted on three publicly available datasets demonstrate that MB-STFormer achieves state-of-the-art performance while maintaining superior interpretability and generalizability. The proposed framework offers a promising solution for real-world fatigue monitoring systems.
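A minimal sketch of the multi-branch, frequency-aware idea in the abstract, assuming one temporal convolution branch per EEG sub-band with a band-specific kernel length; the channel counts and kernel sizes are assumptions, and the Efficient Additive Attention stage is omitted for brevity.

    import torch
    import torch.nn as nn

    class MultiBandTemporalConv(nn.Module):
        """One branch per EEG sub-band, each with its own temporal kernel length,
        so slow rhythms get long kernels and fast rhythms get short ones."""
        def __init__(self, in_ch=32, out_ch=16, band_kernels=(63, 31, 15, 7)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in band_kernels)

        def forward(self, x):                      # x: (batch, channels, time)
            return torch.cat([b(x) for b in self.branches], dim=1)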

YNIMG Journal 2026 Journal Article

VSSI2p-Net: Physics-guided deep unfolding with ℓ2,p-norm and variation sparsity for EEG source imaging

  • Luhua Wang
  • Jun Zhang
  • Zhenghui Gu
  • Ke Liu
  • Wei Wu
  • Tianyou Yu
  • Zhuliang Yu
  • Yuanqing Li

Electroencephalogram (EEG) source imaging (ESI) is highly underdetermined, which poses a long-standing challenge in neuroimaging. Traditional methods typically rely on predefined priors to constrain the solution space; however, the need for manual parameter adjustments often makes it difficult to achieve optimal integration of prior information. Although recent deep learning methods can automatically update parameters in a data-driven manner, their black-box characteristics lead to a lack of interpretability and the need for extensive training sets. To integrate the advantages of these two types of methods, we propose a novel neural network model based on deep unfolding, called variation sparse source imaging network (VSSI2p-Net). Specifically, we introduce variation sparsity and ℓ2,p-norm (0 < p < 1) regularization into the model of the ESI problem and utilize the Alternating Direction Method of Multipliers (ADMM) to iteratively solve this model. Furthermore, by mapping the iterative process into a neural network structure, the proposed VSSI2p-Net can optimize all parameters, including the critical p in the ℓ2,p-norm and the variation sparsity operator, in an end-to-end manner with a reasonably sized training set. In this way, VSSI2p-Net achieves more flexible prior information integration while retaining the interpretability of traditional methods, so that a more accurate and efficient solution for ESI can be obtained. We compared the performance of VSSI2p-Net with several traditional baseline methods and state-of-the-art deep learning methods on synthetic and real datasets. The results show that VSSI2p-Net significantly outperforms existing methods in source localization accuracy, spatial range estimation, and imaging speed across various source configurations.
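Stated in the usual ESI notation, a plausible form of the model that the unfolded network solves is

    $\min_{S}\;\tfrac{1}{2}\|B - L S\|_F^2 \;+\; \lambda\,\|V S\|_{2,p}^{p},\qquad 0 < p < 1,$

where $B$ is the scalp EEG, $L$ the lead-field matrix, $S$ the cortical sources, $V$ a variation (difference) operator, and $\|X\|_{2,p}^{p}=\sum_i(\sum_j x_{ij}^2)^{p/2}$. ADMM introduces an auxiliary variable $Z = VS$ and alternates simple sub-updates, and unfolding turns each iteration into a network layer whose $\lambda$, $p$, and transform parameters are learned. These symbols and the exact placement of the operators are assumptions for illustration, not the paper's precise formulation.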

JBHI Journal 2025 Journal Article

ADMM-ESINet: A Deep Unrolling Network for EEG Extended Source Imaging

  • Ke Liu
  • Hang Jiang
  • Hu Yang
  • Jun Zhang
  • Zhenghui Gu
  • Zhuliang Yu
  • Yu Zhang
  • Bin Xiao

Electroencephalography (EEG) source imaging (ESI) methods aim to reconstruct cortical sources from scalp EEG signals, a crucial task for understanding the normal brain as well as brain disorders. Traditional model-driven ESI methods face challenges in real-time reconstruction, while deep neural network (DNN)-based ESI methods often struggle with generalization to new data. To address these issues, we propose ADMM-ESINet, a novel deep unfolding neural network for robust and efficient reconstruction of EEG extended sources. ADMM-ESINet leverages a structured sparsity constraint within a regularization framework and employs the Alternating Direction Method of Multipliers (ADMM) to achieve iterative solutions. By unrolling the ADMM algorithm into a cascaded network architecture, ADMM-ESINet effectively integrates prior knowledge, enabling end-to-end, real-time ESI. Crucially, both the regularization parameters and the spatial transform operator are learned directly from the training data. Numerical results demonstrate that ADMM-ESINet surpasses traditional DNN-based methods in generalization ability and accurately reconstructs the location, extent, and temporal dynamics of extended sources, establishing ADMM-ESINet as a promising method for real-time ESI.
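As a rough sketch of what one unrolled ADMM layer might look like, the snippet below uses a learnable regularization weight, penalty parameter, and spatial transform, with a soft-threshold proximal step; the specific operators and step size are illustrative assumptions, not the paper's exact updates.

    import torch
    import torch.nn as nn

    class ADMMLayer(nn.Module):
        """One unrolled ADMM iteration for a structured-sparsity ESI model.
        rho, lam, and the spatial transform W are learned from training data."""
        def __init__(self, n_sources):
            super().__init__()
            self.rho = nn.Parameter(torch.tensor(1.0))
            self.lam = nn.Parameter(torch.tensor(0.1))
            self.W = nn.Parameter(torch.eye(n_sources))      # learnable transform operator

        def forward(self, S, Z, U, L, B, step=0.1):
            # S-update: gradient step on the data term plus the augmented penalty
            grad = L.T @ (L @ S - B) + self.rho * self.W.T @ (self.W @ S - Z + U)
            S = S - step * grad
            # Z-update: soft-thresholding (the proximal step of an l1-style penalty)
            V = self.W @ S + U
            Z = torch.sign(V) * torch.clamp(V.abs() - self.lam / self.rho, min=0.0)
            # dual (multiplier) update
            U = U + self.W @ S - Z
            return S, Z, U

Stacking several such layers and training them end to end is what provides the learned prior knowledge and real-time inference described above.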

IROS Conference 2025 Conference Paper

Cross-Activity sEMG-Driven Joint Angle Estimation via Hybrid Attention Fusion: Bridging Traditional Features and Deep Spatial Representations

  • Zhimin Tang
  • Xiaoyan Deng
  • Yinke Wen
  • Xi Han
  • Jiatong Wu
  • Zhuliang Yu

The growing prevalence of stroke necessitates advanced lower-limb exoskeleton control. This paper proposes HybridFusionAtt, a novel model for continuous joint angle estimation using surface electromyography (sEMG). Unlike conventional approaches, our framework uniquely integrates traditional time-domain features with CNN-extracted high-dimensional spatial features through an attention mechanism, where traditional features dynamically guide feature fusion as attention queries. The model was validated using data collected from eight participants performing four activities of daily living (walking, stair climbing, stair descending, and obstacle crossing). The proposed model achieves average R² values for knee and hip joint angle prediction of 0.8682 (walking), 0.8482 (obstacle crossing), 0.9294 (stair climbing), and 0.8676 (stair descending). Experimental results show that the proposed model significantly outperforms traditional LSTM and CNN-LSTM models in terms of accuracy and robustness, particularly in handling non-periodic actions such as obstacle crossing. The model achieves high performance by effectively fusing features and adaptively focusing on key features, enabling it to maintain robustness even under noisy conditions and significant individual differences. This demonstrates the model’s broad application potential, especially in rehabilitation and prosthetic control systems.
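The attention-query fusion step can be illustrated with a small module in which the handcrafted time-domain feature vector queries the CNN feature tokens; the dimensions, head count, and two-joint output below are assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class HybridAttentionFusion(nn.Module):
        """Handcrafted sEMG features act as the attention query over
        CNN-derived spatial feature tokens; a linear head regresses joint angles."""
        def __init__(self, feat_dim=64, n_heads=4, n_joints=2):
            super().__init__()
            self.attn = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
            self.head = nn.Linear(feat_dim, n_joints)      # e.g. knee and hip angles

        def forward(self, handcrafted, cnn_feats):
            # handcrafted: (batch, 1, feat_dim); cnn_feats: (batch, tokens, feat_dim)
            fused, _ = self.attn(query=handcrafted, key=cnn_feats, value=cnn_feats)
            return self.head(fused.squeeze(1))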

JBHI Journal 2025 Journal Article

DMSACNN: Deep Multiscale Attentional Convolutional Neural Network for EEG-Based Motor Decoding

  • Ke Liu
  • Xin Xing
  • Tao Yang
  • Zhuliang Yu
  • Bin Xiao
  • Guoyin Wang
  • Wei Wu

Objective: Accurate decoding of electroencephalogram (EEG) signals has become more significant for the brain-computer interface (BCI). Specifically, motor imagery and motor execution (MI/ME) tasks enable the control of external devices by decoding EEG signals during imagined or real movements. However, accurately decoding MI/ME signals remains a challenge due to the limited utilization of temporal information and ineffective feature selection methods. Methods: This paper introduces DMSACNN, an end-to-end deep multiscale attention convolutional neural network for MI/ME-EEG decoding. DMSACNN incorporates a deep multiscale temporal feature extraction module to capture temporal features at various levels. These features are then processed by a spatial convolutional module to extract spatial features. Finally, a local and global feature fusion attention module is utilized to combine local and global information and extract the most discriminative spatiotemporal features. Main results: DMSACNN achieves impressive accuracies of 78.20%, 96.34% and 70.90% for hold-out analysis on the BCI-IV-2a, High Gamma and OpenBMI datasets, respectively, outperforming most of the state-of-the-art methods. Conclusion and significance: These results highlight the potential of DMSACNN in robust BCI applications. Our proposed method provides a valuable solution to improve the accuracy of MI/ME-EEG decoding, which can pave the way for more efficient and reliable BCI systems.
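A compact way to picture the "temporal features at various levels" is a stack of temporal convolutions whose intermediate outputs are all retained for later fusion; the depth, width, and kernel length below are assumptions.

    import torch
    import torch.nn as nn

    class DeepMultiscaleTemporal(nn.Module):
        """Stacked temporal convolutions; keeping every intermediate output yields
        features with progressively larger receptive fields (shallow to deep)."""
        def __init__(self, in_ch=22, width=16, depth=3, kernel=15):
            super().__init__()
            blocks, ch = [], in_ch
            for _ in range(depth):
                blocks.append(nn.Sequential(
                    nn.Conv1d(ch, width, kernel, padding=kernel // 2),
                    nn.BatchNorm1d(width), nn.ELU()))
                ch = width
            self.blocks = nn.ModuleList(blocks)

        def forward(self, x):                   # x: (batch, channels, time)
            feats = []
            for block in self.blocks:
                x = block(x)
                feats.append(x)                 # multi-level temporal features
            return feats                        # later fused by spatial conv + attention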

IJCAI Conference 2025 Conference Paper

Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance

  • Yufeng Wang
  • Jinwu Hu
  • Ziteng Huang
  • Kunyang Lin
  • Zitian Zhang
  • Peihao Chen
  • Yu Hu
  • Qianyue Wang

Open-domain dialogue systems aim to generate natural and engaging conversations, providing significant practical value in real applications such as social robotics and personal assistants. The advent of large language models (LLMs) has greatly advanced this field by improving context understanding and conversational fluency. However, existing LLM-based dialogue systems often fall short in proactively understanding the user's chatting preferences and guiding conversations toward user-centered topics. This lack of user-oriented proactivity can lead users to feel unappreciated, reducing their satisfaction and willingness to continue the conversation in human-computer interactions. To address this issue, we propose a User-oriented Proactive Chatbot (UPC) to enhance the user-oriented proactivity. Specifically, we first construct a critic to evaluate this proactivity inspired by the LLM-as-a-judge strategy. Given the scarcity of high-quality training data, we then employ the critic to guide dialogues between the chatbot and user agents, generating a corpus with enhanced user-oriented proactivity. To ensure the diversity of the user backgrounds, we introduce the ISCO-800, a diverse user background dataset for constructing user agents. Moreover, considering the communication difficulty varies among users, we propose an iterative curriculum learning method that trains the chatbot from easy-to-communicate users to more challenging ones, thereby gradually enhancing its performance. Experiments demonstrate that our proposed training method is applicable to different LLMs, improving user-oriented proactivity and attractiveness in open-domain dialogues. Code and appendix are available at github.com/wang678/LLM-UPC.
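One way to read the pipeline is as a critic-filtered, curriculum-ordered data-generation loop; the sketch below is a hypothetical outline in which simulate_dialogue and critic_score are stand-in callables, not the paper's API.

    def build_proactive_corpus(simulate_dialogue, critic_score, user_agents, threshold=0.7):
        """Keep only simulated chatbot-user dialogues that the LLM-as-a-judge critic
        scores as sufficiently user-oriented, visiting easier users first (curriculum)."""
        corpus = []
        for user in sorted(user_agents, key=lambda u: u["difficulty"]):
            dialogue = simulate_dialogue(user)           # chatbot talks to the user agent
            if critic_score(dialogue) >= threshold:      # critic guides what enters the corpus
                corpus.append(dialogue)
        return corpus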

JBHI Journal 2025 Journal Article

M3D: Manifold-Based Domain Adaptation With Dynamic Distribution for Non-Deep Transfer Learning in Cross-Subject and Cross-Session EEG-Based Emotion Recognition

  • Ting Luo
  • Jing Zhang
  • Yingwei Qiu
  • Li Zhang
  • Yaohua Hu
  • Zhuliang Yu
  • Zhen Liang

Emotion decoding using Electroencephalography (EEG)-based affective brain-computer interfaces (aBCIs) is crucial for affective computing but is hindered by EEG’s non-stationarity, individual variability, and the high cost of large-scale labeled data. Deep learning-based approaches, while effective, require substantial computational resources and large datasets, limiting their practicality. To address these challenges, we propose Manifold-based Domain Adaptation with Dynamic Distribution (M3D), a lightweight non-deep transfer learning framework. M3D includes four main modules: manifold feature transformation, dynamic distribution alignment, classifier learning, and ensemble learning. The data undergoes a transformation onto an optimal Grassmann manifold space, enabling dynamic alignment of the source and target domains. This process prioritizes both marginal and conditional distributions according to their significance, ensuring enhanced adaptation efficiency across various types of data. In the classifier learning, the principle of structural risk minimization is integrated to develop robust classification models. This is complemented by dynamic distribution alignment, which refines the classifier iteratively. Additionally, the ensemble learning module aggregates the classifiers obtained at different stages of the optimization process, which leverages the diversity of the classifiers to enhance the overall prediction accuracy. The proposed M3D framework is evaluated on three benchmark EEG emotion recognition datasets using two validation protocols (cross-subject single-session and cross-subject cross-session), as well as on a clinical EEG dataset of Major Depressive Disorder (MDD). Experimental results demonstrate that M3D outperforms traditional non-deep learning methods, achieving an average improvement of 6.67%, while achieving deep learning-comparable performance with significantly lower data and computational requirements. These findings highlight the potential of M3D to enhance the practicality and applicability of aBCIs in real-world scenarios.
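The dynamic weighting of marginal versus conditional discrepancies usually takes the form

    $D(\mathcal{D}_s,\mathcal{D}_t)\;\approx\;(1-\mu)\,d(P_s,P_t)\;+\;\mu\sum_{c=1}^{C} d\big(Q_s^{(c)},Q_t^{(c)}\big),\qquad \mu\in[0,1],$

where $d(\cdot,\cdot)$ is a distribution distance (e.g. MMD) computed in the Grassmann-manifold feature space, $P$ denotes marginal and $Q^{(c)}$ class-conditional distributions, and $\mu$ is estimated from the data so that whichever discrepancy matters more dominates the alignment. This is the standard dynamic-distribution form; M3D's exact estimator of $\mu$ may differ.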

NeurIPS Conference 2025 Conference Paper

Open-World Drone Active Tracking with Goal-Centered Rewards

  • Haowei Sun
  • Jinwu Hu
  • Zhirui Zhang
  • Haoyuan Tian
  • Xinze Xie
  • Yufeng Wang
  • Xiaohua Xie
  • Yun Lin

Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations, providing a more practical solution for effective tracking in dynamic environments. However, accurate Drone Visual Active Tracking using reinforcement learning remains challenging due to the absence of a unified benchmark and the complexity of open-world environments with frequent interference. To address these issues, we pioneer a systematic solution. First, we propose DAT, the first open-world drone active air-to-ground tracking benchmark. It encompasses 24 city-scale scenes, featuring targets with human-like behaviors and high-fidelity dynamics simulation. DAT also provides a digital twin tool for unlimited scene generation. Additionally, we propose a novel reinforcement learning method called GC-VAT, which aims to improve the performance of drone tracking targets in complex scenarios. Specifically, we design a Goal-Centered Reward to provide precise feedback across viewpoints to the agent, enabling it to expand perception and movement range through unrestricted perspectives. Inspired by curriculum learning, we introduce a Curriculum-Based Training strategy that progressively enhances the tracking performance in complex environments. In addition, experiments in simulation and on real-world images demonstrate the superior performance of GC-VAT, achieving a Tracking Success Rate of approximately 72% on the simulator. The benchmark and code are available at https://github.com/SHWplus/DAT_Benchmark.
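Since the abstract only names the Goal-Centered Reward, the snippet below is purely illustrative: a reward that peaks when the target stays near a preferred standoff distance and near the center of the drone's field of view. The distances, angles, and weights are all assumptions.

    import numpy as np

    def goal_centered_reward(target_xy, drone_xy, yaw, d_opt=8.0, fov=np.pi / 3):
        """Illustrative goal-centered reward: keep a good standoff distance and
        keep the target centered in the field of view."""
        offset = np.asarray(target_xy, dtype=float) - np.asarray(drone_xy, dtype=float)
        dist = np.linalg.norm(offset)
        bearing = np.arctan2(offset[1], offset[0]) - yaw
        bearing = np.abs((bearing + np.pi) % (2 * np.pi) - np.pi)   # wrap to [0, pi]
        r_dist = np.exp(-((dist - d_opt) ** 2) / (2 * d_opt))       # stay near d_opt
        r_view = max(0.0, 1.0 - bearing / (fov / 2))                # stay near view center
        return 0.5 * r_dist + 0.5 * r_view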

AAAI Conference 2025 Conference Paper

Self-Correcting Robot Manipulation via Gaussian-Splatted Foresight

  • Shaohui Pan
  • Yong Xu
  • Ruotao Xu
  • Zihan Zhou
  • Si Wu
  • Zhuliang Yu

Language-conditioned robotic manipulation in unstructured environments presents significant challenges for intelligent robotic systems. However, due to partial observation or imprecise action prediction, failure may be unavoidable for learned policies. Moreover, operational failures can lead to the robotic arm entering an untrained state, potentially causing destructive results. Consequently, the ability to detect and self-correct failures is crucial for the development of practical robotic systems. To address this challenge, we propose a foresight-driven failure detection and self-correction module for robot manipulation. By leveraging 3D Gaussian Splatting, we represent the current scene with multiple Gaussians. Subsequently, we train a prediction network to forecast the Gaussian representation of future scenes conditioned on planned actions. Failure is detected when the predicted future significantly deviates from the real observation after action execution. In such cases, the end-effector rolls back to the previous action to avoid an untrained state. Integrating this approach with the PerACT framework, we develop a self-correcting robot manipulation policy. Evaluations on ten RLBench tasks with 166 variations demonstrate the superior performance of the proposed method, which outperforms state-of-the-art methods by 12.0% success rate on average.
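The failure test itself reduces to comparing the foresight against what was actually observed; the sketch below assumes both scenes are packed into tensors of matching shape and uses a simple mean-squared deviation with a hand-set threshold, an illustrative simplification of the paper's comparison.

    import torch

    def detect_failure(predicted_gaussians, observed_gaussians, threshold=0.05):
        """Flag a failure when the scene predicted for the planned action deviates
        too much from the scene fitted after execution; on failure, the policy
        rolls the end-effector back to the previous action."""
        deviation = torch.mean((predicted_gaussians - observed_gaussians) ** 2).item()
        return deviation > threshold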

JBHI Journal 2024 Journal Article

MSVTNet: Multi-Scale Vision Transformer Neural Network for EEG-Based Motor Imagery Decoding

  • Ke Liu
  • Tao Yang
  • Zhuliang Yu
  • Weibo Yi
  • Hong Yu
  • Guoyin Wang
  • Wei Wu

Objective: Transformer-based neural networks have been applied to electroencephalography (EEG) decoding for motor imagery (MI). However, most networks focus on applying the self-attention mechanism to extract global temporal information, while the cross-frequency coupling features between different frequencies have been neglected. Additionally, effectively integrating different neural networks poses challenges for the advanced design of decoding algorithms. Methods: This study proposes a novel end-to-end Multi-Scale Vision Transformer Neural Network (MSVTNet) for MI-EEG classification. MSVTNet first extracts local spatio-temporal features at different filtered scales through convolutional neural networks (CNNs). Then, these features are concatenated along the feature dimension to form local multi-scale spatio-temporal feature tokens. Finally, Transformers are utilized to capture cross-scale interaction information and global temporal correlations, providing more distinguishable feature embeddings for classification. Moreover, auxiliary branch loss is leveraged for intermediate supervision to ensure the effective integration of CNNs and Transformers. Results: The performance of MSVTNet was assessed through subject-dependent (session-dependent and session-independent) and subject-independent experiments on three MI datasets, i.e., the BCI competition IV 2a, 2b and OpenBMI datasets. The experimental results demonstrate that MSVTNet achieves state-of-the-art performance in all analyses. Conclusion: MSVTNet shows superiority and robustness in enhancing MI decoding performance.
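A minimal sketch of the token-forming stage, assuming a few convolutional branches with different kernel lengths whose outputs are concatenated along the feature dimension and fed to a Transformer encoder; the dimensions, strides, and layer counts are assumptions, and the auxiliary branch losses are omitted.

    import torch
    import torch.nn as nn

    class MultiScaleTokens(nn.Module):
        """Multi-scale CNN branches produce local spatio-temporal features that are
        concatenated feature-wise into tokens for a Transformer encoder."""
        def __init__(self, in_ch=22, branch_dim=16, kernels=(15, 31, 63), n_heads=4):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv1d(in_ch, branch_dim, k, stride=4, padding=k // 2) for k in kernels)
            d_model = branch_dim * len(kernels)
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)

        def forward(self, x):                          # x: (batch, channels, time)
            tokens = torch.cat([b(x) for b in self.branches], dim=1)   # (batch, d_model, T')
            return self.encoder(tokens.transpose(1, 2))                # (batch, T', d_model)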

YNIMG Journal 2019 Journal Article

Grouped sparse Bayesian learning for voxel selection in multivoxel pattern analysis of fMRI data

  • Zhenfu Wen
  • Tianyou Yu
  • Zhuliang Yu
  • Yuanqing Li

Multivoxel pattern analysis (MVPA) methods have been widely applied in recent years to classify human brain states in functional magnetic resonance imaging (fMRI) data analysis. Voxel selection plays an important role in MVPA studies not only because it can improve decoding accuracy but also because it is useful for understanding brain functions. There are many voxel selection methods that have been proposed in fMRI literature. However, most of these methods either overlook the structure information of fMRI data or require additional cross-validation procedures to determine the hyperparameters of the models. In the present work, we proposed a voxel selection method for binary brain decoding called group sparse Bayesian logistic regression (GSBLR). This method utilizes the group sparse property of fMRI data by using a grouped automatic relevance determination (GARD) as a prior for model parameters. All the parameters in the GSBLR can be estimated automatically, thereby avoiding additional cross-validation. Experimental results based on two publicly available fMRI datasets and simulated datasets demonstrate that GSBLR achieved better classification accuracies and yielded more stable solutions than several state-of-the-art methods.
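The grouped automatic relevance determination prior mentioned above has the standard form

    $p(\mathbf{w}\mid\boldsymbol{\alpha})=\prod_{g=1}^{G}\mathcal{N}\!\big(\mathbf{w}_g\mid\mathbf{0},\,\alpha_g^{-1}\mathbf{I}\big),$

where group $g$ collects the logistic-regression weights of the voxels in one group and $\alpha_g$ is a group-level precision estimated by evidence maximization; groups whose precision grows unboundedly are pruned, which is what makes the voxel selection automatic and removes the need for cross-validation. The grouping scheme and inference details stated here are the generic ARD form rather than the paper's exact derivation.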