Arrow Research search

Author name cluster

Xiaodong Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
2 author rows

Possible papers

JBHI Journal 2026 Journal Article

Mamba-Enhanced Diffusion Model for Perception-Aware Blind Super-Resolution of Magnetic Resonance Imaging

  • Xiaoqiang Zhao
  • Xiaodong Yang
  • Zhaoyang Song

High-resolution magnetic resonance imaging (HR MRI) can provide accurate and rich information for doctors to better detect subtle lesions, delineate tumor boundaries, evaluate small anatomical structures, and assess early-stage pathological changes that might be obscured in lower resolution images. However, acquiring HR MRI images often requires prolonged scanning time, which causes the patient physical and mental discomfort. Even slight patient movement may produce motion artifacts and blur the resulting MRI image, affecting the accuracy of clinical diagnosis. To tackle these problems, we propose a novel method, Mamba-enhanced Diffusion Model (MDM) for perception-aware blind super-resolution of Magnetic Resonance Imaging, which includes two important components: a kernel noise estimator and an SR reconstructor. Specifically, we propose a Perception-aware Blur Kernel Noise estimator (PBKN estimator), which takes advantage of the diffusion model to estimate the blur kernel from low-resolution (LR) images. Meanwhile, we construct a novel progressive feature reconstructor, which takes the estimated blur kernel and the content information of LR images as prior knowledge to reconstruct more accurate SR MRI images using a diffusion model. Moreover, we design a novel Semantic Information Fusion Mamba (SIF-Mamba) module for the SR reconstruction task. SIF-Mamba is specifically designed in the progressive feature reconstructor to capture the global context of MRI images and improve the feature reconstruction. Extensive experiments demonstrate that our proposed MDM achieves better SR reconstruction results than several outstanding methods.

JBHI Journal 2025 Journal Article

DC-ASTGCN: EEG Emotion Recognition Based on Fusion Deep Convolutional and Adaptive Spatio-Temporal Graph Convolutional Networks

  • Xiaodong Yang
  • Zhengping Zhu
  • Guangkang Jiang
  • Dandan Wu
  • Aijun He
  • Jun Wang

Thanks to advancements in artificial intelligence and brain-computer interface (BCI) research, emotion recognition techniques based on electroencephalogram (EEG) have recently attracted increasing attention. The complexity of EEG data poses a challenge when it comes to accurately classifying emotions by integrating time, frequency, and spatial domain features. To address this challenge, this paper proposes a fusion model called DC-ASTGCN, which combines the strengths of a deep convolutional neural network (DCNN) and an adaptive spatio-temporal graph convolutional neural network (ASTGCN) to comprehensively analyze and understand EEG signals. The DCNN focuses on extracting frequency-domain and local spatial features from EEG signals to identify brain region activity patterns, while the ASTGCN, with its spatio-temporal attention mechanism and adaptive brain topology layer, reveals the functional connectivity features between brain regions in different emotional states. This integration significantly enhances the model's ability to understand and recognize emotional states. Extensive experiments conducted on the DEAP and SEED datasets demonstrate that the DC-ASTGCN model outperforms existing state-of-the-art methods in terms of emotion recognition accuracy.

JBHI Journal 2025 Journal Article

MDD2DG-IRA: Multivariate Degree Distribution to Dynamic Graph With Inter-Channel Relevance Attention Mechanism for Multi-Channel Myocardial Infarction ECG Analysis

  • Xiaodong Yang
  • Guangkang Jiang
  • Zhengping Zhu
  • Dandan Wu
  • Aijun He
  • Jun Wang

We introduced a novel methodology, Multivariate Degree Distribution to Dynamic Graph (MDD2DG) with an Inter-channel Relevance Attention (IRA) mechanism, to analyze multi-channel Electrocardiogram (ECG) signals and explore signal connections across different channels. Our methodology comprises three main steps. First, multi-channel cardiac signals are transformed into multi-channel visual graphs to extract crucial degree distribution features. Then, degree distributions are mapped into dynamic graphs using a neural network with an IRA mechanism. After that, critical features are extracted within dynamic graphs using a Graph Convolutional Neural Network (GCNN), and classification is subsequently performed using a multilayer perceptron. In this model, a method of multi-scale position embedding was introduced, which significantly enhanced the processing efficiency of the model by providing a simpler yet sufficiently effective feature representation. Compared to traditional complex network methods, our approach replaces fixed formula-calculated features with dynamic graph models, resulting in improved recognition accuracy. In the experiments, we achieved 99.94% classification accuracy for distinguishing ECG signals from the five distinct locations (AMI, ASMI, ALMI, IMI and ILMI) with myocardial infarction (MI) as well as those of the healthy controls (HC). This work contributes to the analysis of complex physiological signals in the field of multi-channel ECG sequences, and provides a robust approach with promising implications for improving clinical medicine and the early detection of cardiac diseases.
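
The first step described in this abstract (signal to graph to degree distribution) can be illustrated with a minimal sketch. It assumes that the "visual graphs" are natural visibility graphs, which the abstract does not confirm, and the function names are hypothetical. It is not the authors' implementation and it omits the IRA mechanism, the dynamic graphs, and the GCNN entirely.

```python
from collections import Counter


def visibility_degrees(series):
    """Degree of each sample in the natural visibility graph of one channel.

    Assumption: samples are evenly spaced, so the index serves as the time axis.
    """
    n = len(series)
    degrees = [0] * n
    for a in range(n):
        for b in range(a + 1, n):
            # Samples a and b are connected if every sample between them lies
            # strictly below the straight line joining (a, y_a) and (b, y_b).
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                degrees[a] += 1
                degrees[b] += 1
    return degrees


def degree_distribution(degrees):
    """Map each observed degree k to the fraction of nodes with that degree."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: counts[k] / total for k in sorted(counts)}


def multichannel_degree_distributions(channels):
    """One degree distribution per channel (hypothetical helper)."""
    return [degree_distribution(visibility_degrees(ch)) for ch in channels]
```

For a toy channel such as `[1, 3, 2, 4]`, `visibility_degrees` yields one degree per sample, and `degree_distribution` turns those degrees into the per-channel feature that a downstream model could consume. The quadratic pair loop is chosen for readability, not speed.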

NeurIPS Conference 2025 Conference Paper

MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition

  • Hao Zhang
  • Zhan Zhuang
  • Xuehao Wang
  • Xiaodong Yang
  • Yu Zhang

Human Activity Recognition (HAR) with wearable sensors is challenged by limited interpretability, which significantly impacts cross-dataset generalization. To address this challenge, we propose Motion-Primitive Transformer (MoPFormer), a novel self-supervised framework that enhances interpretability by tokenizing inertial measurement unit signals into semantically meaningful motion primitives and leverages a Transformer architecture to learn rich temporal representations. MoPFormer comprises two stages. The first stage is to partition multi-channel sensor streams into short segments and quantize them into discrete "motion primitive" codewords, while the second stage enriches those tokenized sequences through a context-aware embedding module and then processes them with a Transformer encoder. The proposed MoPFormer can be pre-trained using a masked motion-modeling objective that reconstructs missing primitives, enabling it to develop robust representations across diverse sensor configurations. Experiments on six HAR benchmarks demonstrate that MoPFormer not only outperforms state-of-the-art methods but also successfully generalizes across multiple datasets. More importantly, the learned motion primitives significantly enhance both interpretability and cross-dataset performance by capturing fundamental movement patterns that remain consistent across similar activities, regardless of dataset origin.
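
The first stage described in this abstract (segment the stream, then quantize each segment into a discrete codeword) can be sketched as nearest-codeword lookup. The abstract does not say how the codebook is obtained or which distance is used, so the fixed codebook, the squared Euclidean distance, and all function names below are assumptions made for illustration only.

```python
def segment(stream, length):
    """Split a list of per-timestep samples into non-overlapping windows.

    A trailing partial window is dropped.
    """
    return [stream[i:i + length] for i in range(0, len(stream) - length + 1, length)]


def flatten(window):
    """Concatenate the channel values of every sample in a window."""
    return [value for sample in window for value in sample]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def quantize(stream, codebook, length):
    """Replace each window by the index of its nearest codebook entry."""
    tokens = []
    for window in segment(stream, length):
        vector = flatten(window)
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(vector, codebook[k]),
        )
        tokens.append(nearest)
    return tokens


# Hypothetical two-channel stream and a hand-written two-entry codebook.
stream = [(0.0, 0.1), (0.1, 0.0), (1.0, 0.9), (0.9, 1.1), (0.0, 0.0)]
codebook = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
tokens = quantize(stream, codebook, length=2)
```

The resulting token sequence is what a second stage would embed and pass to an encoder; in practice the codebook would be learned rather than written by hand.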

ICLR Conference 2024 Conference Paper

Benign Oscillation of Stochastic Gradient Descent with Large Learning Rate

  • Miao Lu
  • Beining Wu
  • Xiaodong Yang
  • Difan Zou

In this work, we theoretically investigate the generalization properties of neural networks (NN) trained by stochastic gradient descent (SGD) with large learning rates. Under such a training regime, our finding is that, the oscillation of the NN weights caused by SGD with large learning rates turns out to be beneficial to the generalization of the NN, potentially improving over the same NN trained by SGD with small learning rates that converges more smoothly. In view of this finding, we call such a phenomenon “benign oscillation”. Our theory towards demystifying such a phenomenon builds upon the feature learning perspective of deep learning. Specifically, we consider a feature-noise data generation model that consists of (i) weak features which have a small $\ell_2$-norm and appear in each data point; (ii) strong features which have a large $\ell_2$-norm but appear only in a certain fraction of all data points; and (iii) noise. We prove that NNs trained by oscillating SGD with a large learning rate can effectively learn the weak features in the presence of those strong features. In contrast, NNs trained by SGD with a small learning rate can only learn the strong features but make little progress in learning the weak features. Consequently, when it comes to the new testing data points that consist of only weak features, the NN trained by oscillating SGD with a large learning rate can still make correct predictions, while the NN trained by SGD with a small learning rate could not. Our theory sheds light on how large learning rate training benefits the generalization of NNs. Experimental results demonstrate our findings on the phenomenon of “benign oscillation”.

IJCAI Conference 2024 Conference Paper

FedES: Federated Early-Stopping for Hindering Memorizing Heterogeneous Label Noise

  • Bixiao Zeng
  • Xiaodong Yang
  • Yiqiang Chen
  • Zhiqi Shen
  • Hanchao Yu
  • Yingwei Zhang

Federated learning (FL) facilitates collaborative model training across distributed clients while maintaining privacy. Federated noisy label learning (FNLL) is more challenging because of data inaccessibility and noise heterogeneity. Existing works primarily assume clients are either noisy or clean, which may lack the flexibility to adapt to diverse label noise across different clients, especially when entirely clean or noisy clients are not the majority. To address this, we propose a general noise-robust federated learning framework called Federated Early-Stopping (FedES), which adaptively updates critical parameters of each local model based on their noise rates, thereby avoiding overfitting to noisy labels. FedES is composed of two stages: federated noise estimation and parameter-adaptive local updating & global aggregation. We introduce a signed distance based on local and global gradients during a federated round to estimate clients' noise rates without requiring additional information. Based on this measure, we employ various degrees of early-stopping during local updating on the clients, and further, a noise-aware global aggregation is employed to achieve noise-robust learning. Extensive experiments conducted on varying synthetic and real-world label noise demonstrate the superior performance of FedES over the state-of-the-art methods.

ICML Conference 2024 Conference Paper

Network Tight Community Detection

  • Jiayi Deng
  • Xiaodong Yang
  • Jun Yu
  • Jun Liu
  • Zhaiming Shen
  • Danyang Huang
  • Huimin Cheng

Conventional community detection methods often categorize all nodes into clusters. However, the presumed community structure of interest may only be valid for a subset of nodes (named “tight nodes”), while the rest of the network may consist of noninformative “scattered nodes”. For example, a protein-protein network often contains proteins that do not belong to specific biological functional modules but are involved in more general processes, or act as bridges between different functional modules. Forcing each of these proteins into a single cluster introduces unwanted biases and obscures the underlying biological implication. To address this issue, we propose a tight community detection (TCD) method to identify tight communities excluding scattered nodes. The algorithm enjoys a strong theoretical guarantee of tight node identification accuracy and is scalable for large networks. The superiority of the proposed method is demonstrated by various synthetic and real experiments.

JBHI Journal 2023 Journal Article

ICL-Net: Global and Local Inter-Pixel Correlations Learning Network for Skin Lesion Segmentation

  • Weiwei Cao
  • Gang Yuan
  • Qi Liu
  • Chengtao Peng
  • Jing Xie
  • Xiaodong Yang
  • Xinye Ni
  • Jian Zheng

Skin lesion segmentation is a fundamental procedure in computer-aided melanoma diagnosis. However, due to the diverse shape, variable size, blurry boundary, and noise interference of lesion regions, existing methods may struggle with the challenge of inconsistency within classes and indiscrimination between classes. In view of this, we propose a novel method to learn and model inter-pixel correlations from both global and local aspects, which can increase inter-class variances and intra-class similarities. Specifically, under the encoder-decoder architecture, we first design a pyramid transformer inter-pixel correlations (PTIC) module, aiming at capturing the non-local context information of different levels and further exploring the global pixel-level relationship to deal with the large variance of shape and size. Further, we devise a local neighborhood metric learning (LNML) module to strengthen the local semantic correlations learning capability and increase the separability between classes in the feature space. These two modules can complementarily strengthen the feature representation capability via exploiting the inter-pixel semantic correlations, thus further improving intra-class consistency and inter-class variance. Comprehensive experiments are performed on public skin lesion segmentation datasets: ISIC 2018, ISIC 2016, and PH2, and experimental results demonstrate that the proposed method achieves better segmentation performance than other state-of-the-art methods.

ICML Conference 2023 Conference Paper

Transcendental Idealism of Planner: Evaluating Perception from Planning Perspective for Autonomous Driving

  • Weixin Li
  • Xiaodong Yang

Evaluating the performance of perception modules in autonomous driving is one of the most critical tasks in developing the complex intelligent system. While module-level unit test metrics adopted from traditional computer vision tasks are feasible to some extent, it remains far less explored to measure the impact of perceptual noise on the driving quality of autonomous vehicles in a consistent and holistic manner. In this work, we propose a principled framework that provides a coherent and systematic understanding of the impact an error in the perception module imposes on an autonomous agent’s planning that actually controls the vehicle. Specifically, the planning process is formulated as expected utility maximisation, where all input signals from upstream modules jointly provide a world state description, and the planner strives for the optimal action by maximising the expected utility determined by both world states and actions. We show that, under practical conditions, the objective function can be represented as an inner product between the world state description and the utility function in a Hilbert space. This geometric interpretation enables a novel way to analyse the impact of noise in world state estimation on planning and leads to a universal metric for evaluating perception. The whole framework resembles the idea of transcendental idealism in the classical philosophical literature, which gives the name to our approach.
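
The inner-product representation mentioned in this abstract can be written out as a short sketch. The symbols are assumptions introduced here, not notation from the paper: $p$ is the planner's belief density over world states $s \in \mathcal{S}$, $u(s,a)$ is the utility of action $a$ in state $s$, and both $p$ and $u(\cdot,a)$ are taken to be square-integrable so that the $L^2$ inner product exists.

```latex
\begin{aligned}
a^{*} &= \arg\max_{a \in \mathcal{A}} \; \mathbb{E}_{s \sim p}\bigl[u(s,a)\bigr]
       = \arg\max_{a \in \mathcal{A}} \int_{\mathcal{S}} p(s)\, u(s,a)\, \mathrm{d}s
       = \arg\max_{a \in \mathcal{A}} \; \bigl\langle p,\, u(\cdot,a) \bigr\rangle_{L^{2}(\mathcal{S})}, \\[4pt]
\bigl|\langle \tilde{p},\, u(\cdot,a) \rangle - \langle p,\, u(\cdot,a) \rangle\bigr|
      &= \bigl|\langle \tilde{p} - p,\, u(\cdot,a) \rangle\bigr|
      \le \lVert \tilde{p} - p \rVert_{L^{2}} \, \lVert u(\cdot,a) \rVert_{L^{2}} .
\end{aligned}
```

Under these assumptions, a perception error that shifts the belief from $p$ to $\tilde{p}$ changes the expected utility of each action by the projection of $\tilde{p} - p$ onto $u(\cdot,a)$, and the second line bounds that change by the Cauchy-Schwarz inequality. Whether the paper's metric takes exactly this form is not stated in the abstract.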

AAAI Conference 2022 Short Paper

Class-Wise Adaptive Self Distillation for Federated Learning on Non-IID Data (Student Abstract)

  • Yuting He
  • Yiqiang Chen
  • Xiaodong Yang
  • Yingwei Zhang
  • Bixiao Zeng

Federated learning (FL) enables multiple clients to collaboratively train a globally generalized model while keeping local data decentralized. A key challenge in FL is to handle the heterogeneity of data distributions among clients. The local model will shift the global feature when fitting local data, which results in forgetting the global knowledge. Following the idea of knowledge distillation, the global model’s prediction can be utilized to help local models preserve the global knowledge in FL. However, when the global model hasn’t converged completely, its predictions tend to be less reliable on certain classes, which may result in distillation misleading the local models. In this paper, we propose a class-wise adaptive self distillation (FedCAD) mechanism to ameliorate this problem. We design class-wise adaptive terms to soften the influence of distillation loss according to the global model’s performance on each class and therefore avoid the misleading. Experiments show that our method outperforms other state-of-the-art FL algorithms on benchmark datasets.

TIST Journal 2022 Journal Article

CLC: A Consensus-based Label Correction Approach in Federated Learning

  • Bixiao Zeng
  • Xiaodong Yang
  • Yiqiang Chen
  • Hanchao Yu
  • Yingwei Zhang

Federated learning (FL) is a novel distributed learning framework where multiple participants collaboratively train a global model without sharing any raw data to preserve privacy. However, data quality may vary among the participants, the most typical issue being label noise. Incorrect labels would significantly damage the performance of the global model. In FL, the inaccessibility of raw data makes this issue more challenging. Previously published studies are limited to using a task-specific benchmark-trained model to evaluate the relevance between the benchmark dataset in the server and the local one on the participants’ side. However, such approaches have failed to exploit the cooperative nature of FL itself and are not practical. This paper proposes a Consensus-based Label Correction approach (CLC) in FL, which tries to correct the noisy labels using the developed consensus method among the FL participants. The consensus-defined class-wise information is used to identify the noisy labels and correct them with pseudo-labels. Extensive experiments are conducted on several public datasets in various settings. The experimental results demonstrate the advantage over the state-of-the-art methods. The link to the source code is https://github.com/bixiao-zeng/CLC.git.

JBHI Journal 2021 Journal Article

3D Context-Aware Convolutional Neural Network for False Positive Reduction in Clustered Microcalcifications Detection

  • Jian Zheng
  • Haotian Sun
  • Shandong Wu
  • Ke Jiang
  • Yunsong Peng
  • Xiaodong Yang
  • Fan Zhang
  • Ming Li

False positives (FPs) reduction is indispensable for clustered microcalcifications (MCs) detection in digital breast tomosynthesis (DBT), since there might be excessive false candidates in the detection stage. Considering that DBT volume has an anisotropic resolution, we proposed a novel 3D context-aware convolutional neural network (CNN) to reduce FPs, which consists of a 2D intra-slice feature extraction branch and a 3D inter-slice feature fusion branch. In particular, 3D anisotropic convolutions were designed to learn representations from DBT volumes and inter-slice information fusion is only performed on the feature map level, which could avoid the influence of anisotropic resolution of DBT volume. The proposed method was evaluated on a large-scale Chinese women population of 877 cases with 1754 DBT volumes and compared with 8 related methods. Experimental results show that the proposed network achieved the best performance with an accuracy of 92.68% for FPs reduction with an AUC of 97.65%, and the FPs are 0.0512 per DBT volume at a sensitivity of 90%. This also proved that making full use of 3D contextual information of DBT volume can improve the performance of the classification algorithm.

JBHI Journal 2021 Journal Article

Circulating Nurse Assistant: Non-Contact Body Centric Gesture Recognition Towards Reducing Iatrogenic Contamination

  • Nan Zhao
  • Xiaodong Yang
  • Zhiya Zhang
  • Muhammad Bilal Khan

Iatrogenic contamination causes serious health threats to both patients and healthcare staff. Contact operation is an important transmission route for nosocomial infection. Reducing direct contact during medical treatment can reduce nosocomial infection quickly and effectively. Scientific and technological progress in the 5G era brings new solutions to the problem of iatrogenic contamination. We conducted experiments at 27 GHz and 37 GHz to achieve contactless gesture recognition through the bornprint of body centric channel. The original channel S-parameters can achieve 82% (27 GHz) and 89% (37 GHz) basic recognition accuracy through simple statistical analysis. Basic switch recognition and multi-gesture selection recognition can meet the common operation requirements of circulating nurses, greatly reducing contact operations and reducing the probability of cross-contamination. Fully physically isolated body centric channel gesture sensing provides a new entry point for reducing iatrogenic contamination.

IJCAI Conference 2020 Conference Paper

Bridging Cross-Tasks Gap for Cognitive Assessment via Fine-Grained Domain Adaptation

  • Yingwei Zhang
  • Yiqiang Chen
  • Hanchao Yu
  • Zeping Lv
  • Qing Li
  • Xiaodong Yang

Discriminating pathologic cognitive decline from the expected decline of normal aging is an important research topic for elderly care and health monitoring. However, most cognitive assessment methods only work when data distributions of the training set and testing set are consistent. Enabling existing cognitive assessment models to adapt to the data in new cognitive assessment tasks is a significant challenge. In this paper, we propose a novel domain adaptation method, namely the Fine-Grained Adaptation Random Forest (FAT), to bridge the cognitive assessment gap when the data distribution is changed. FAT is composed of two essential parts: 1) an information gain based model evaluation strategy (IGME) and 2) a domain adaptation tree growing mechanism (DATG). IGME is used to evaluate every individual tree, and DATG is used to transfer the source model to the target domain. To evaluate the performance of FAT, we conduct experiments in real clinical environments. Experimental results demonstrate that FAT is significantly more accurate and efficient compared with other state-of-the-art methods.

AAAI Conference 2020 Conference Paper

Instance-Wise Dynamic Sensor Selection for Human Activity Recognition

  • Xiaodong Yang
  • Yiqiang Chen
  • Hanchao Yu
  • Yingwei Zhang
  • Wang Lu
  • Ruizhe Sun

Human Activity Recognition (HAR) is an important application of smart wearable/mobile systems for many human-centric problems such as healthcare. The multi-sensor synchronous measurement has shown better performance for HAR than a single sensor. However, the multi-sensor setting increases the costs of data transmission, computation and energy. Therefore, the efficient sensor selection to balance recognition accuracy and sensor cost is the critical challenge. In this paper, we propose an Instance-wise Dynamic Sensor Selection (IDSS) method for HAR. Firstly, we formalize this problem as minimizing both activity classification loss and sensor number by dynamically selecting a sparse subset for each instance. Then, IDSS solves the above minimization problem via Markov Decision Process whose policy for sensor selection is learned by exploiting the instance-wise states using Imitation Learning. In order to optimize the parameters of the activity classification model and the sensor selection policy, an algorithm named Mutual DAgger is proposed to alternatively enhance their learning process. To evaluate the performance of IDSS, we conduct experiments on three real-world HAR datasets. The experimental results show that IDSS can effectively reduce the overall sensor number without losing accuracy and outperforms the state-of-the-art methods regarding the combined measurement of accuracy and sensor number.

NeurIPS Conference 2019 Conference Paper

Dancing to Music

  • Hsin-Ying Lee
  • Xiaodong Yang
  • Ming-Yu Liu
  • Ting-Chun Wang
  • Yu-Ding Lu
  • Ming-Hsuan Yang
  • Jan Kautz

Dancing to music is an instinctive move by humans. Learning to model the music-to-dance generation process is, however, a challenging problem. It requires significant efforts to measure the correlation between music and dance as one needs to simultaneously consider multiple aspects, such as style and beat of both music and dance. Additionally, dance is inherently multimodal and various following movements of a pose at any moment are equally likely. In this paper, we propose a synthesis-by-analysis learning framework to generate dance from music. In the top-down analysis phase, we decompose a dance into a series of basic dance units, through which the model learns how to move. In the bottom-up synthesis phase, the model learns how to compose a dance by combining multiple basic dancing movements seamlessly according to input music. Experimental qualitative and quantitative results demonstrate that the proposed method can synthesize realistic, diverse, style-consistent, and beat-matching dances from music.

FormaliSE Conference 2019 Conference Paper

Parallelizable reachability analysis algorithms for feed-forward neural networks

  • Hoang-Dung Tran
  • Patrick Musau
  • Diego Manzanas Lopez
  • Xiaodong Yang
  • Luan Viet Nguyen
  • Weiming Xiang
  • Taylor T. Johnson

Artificial neural networks (ANNs) have displayed considerable utility in a wide range of applications such as image processing, character and pattern recognition, self-driving cars, evolutionary robotics, and non-linear system identification and control. While ANNs are able to carry out complicated tasks efficiently, they are susceptible to unpredictable and errant behavior due to irregularities that emanate from their complex non-linear structure. As a result, there have been reservations about incorporating them into safety-critical systems. In this paper, we present a reachability analysis method for feed-forward neural networks (FNNs) that employ rectified linear units (ReLUs) as activation functions. The crux of our approach relies on three reachable-set computation algorithms, namely exact schemes, lazy-approximate schemes, and mixing schemes. The exact scheme computes an exact reachable set for an FNN, while the lazy-approximate and mixing schemes generate an over-approximation of the exact reachable set. All schemes are designed to run efficiently on parallel platforms to reduce the computation time and enhance the scalability. Our methods are implemented in a toolbox called NNV and are evaluated using a set of benchmarks that consist of realistic neural networks with sizes that range from tens to a thousand neurons. Notably, NNV successfully computes and visualizes the exact reachable sets of the real-world ACAS Xu deep neural networks (DNNs), which are a variant of a family of novel airborne collision detection systems known as the ACAS System X, using a representation of tens to hundreds of polyhedra.

JBHI Journal 2018 Journal Article

Wandering Pattern Sensing at S-Band

  • Xiaodong Yang
  • Syed Aziz Shah
  • Aifeng Ren
  • Nan Zhao
  • Dou Fan
  • Fangming Hu
  • Masood Ur Rehman
  • Karen M. von Deneen

Increasing prevalence of dementia has posed several challenges for care-givers. Patients suffering from dementia often display wandering behavior due to boredom or memory loss. It is considered to be one of the challenging conditions to manage and understand. Traits of dementia patients can compromise their safety, causing serious injuries. This paper presents an investigation into the design and evaluation of wandering scenarios with patients suffering from dementia using an S-band sensing technique. This frequency band is the wireless channel commonly used to monitor and characterize different scenarios including random, lapping, and pacing movements in an indoor environment. Wandering patterns are characterized depending on the received amplitude and phase information that measures the disturbance caused in the ideal radio signal. A secondary analysis using a support vector machine is used to classify the three patterns. The results show that the proposed technique achieves classification accuracy of up to 90% and has good potential for healthcare applications.

JBHI Journal 2015 Journal Article

Sparsity-Inspired Nonparametric Probability Characterization for Radio Propagation in Body Area Networks

  • Xiaodong Yang
  • Shuyuan Yang
  • Qammer Hussain Abbasi
  • Zhiya Zhang
  • Aifeng Ren
  • Wei Zhao
  • Akram Alomainy

Parametric probability models are common references for channel characterization. However, the limited number of samples and uncertainty of the propagation scenario affect the characterization accuracy of parametric models for body area networks. In this paper, we propose a sparse nonparametric probability model for body area wireless channel characterization. The path loss and root-mean-square delay, which are significant wireless channel parameters, can be learned from this nonparametric model. A comparison with available parametric models shows that the proposed model is very feasible for the body area propagation environment and can be seen as a significant supplement to parametric approaches.