Author name cluster

Jun Yao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers

2 author rows

EAAI Journal 2026 Journal Article

Active detection-based concept drift adaptive approach for coal mill condition monitoring

Jian Xu
Yuguang Niu
Ming Du
Jun Yao
Guoxiong Zhu

The coal mill plays an important role in coal-fired power plants, as its operational condition affects overall efficiency and reliability. Consequently, condition monitoring of coal mills has received increasing attention. However, existing data-driven methods typically assume stationary data distributions and lack the adaptability required to handle concept drift arising from equipment aging, operational fluctuations, and fuel variations. To address these challenges, a novel active detection-based concept drift adaptive approach is proposed for long-term condition monitoring of coal mills. Instead of relying on periodic retraining, the proposed approach employs an active detection mechanism to autonomously initiate updates upon the identification of drift. This approach consists of three main components: feature extraction, drift detection, and model adaptation. Specifically, the dual-channel graph temporal convolutional network is designed to extract robust spatio-temporal features. The sliding window-based drift detection method is proposed to dynamically adjust detection thresholds and identify drift points. In addition, the complementary learning system-based incremental learning method integrates both short-term and long-term memory mechanisms with a triple-trigger update strategy to ensure timely and reliable model adaptation. Validated on real-world data from a coal-fired power unit, the proposed approach accurately issued two early fault warnings without false alarms over a continuous 102-day monitoring period. Importantly, the model required only seven updates during this period, indicating its ability to adapt effectively to evolving data distributions with minimal intervention. These results indicate that the proposed approach offers a practical and reliable solution for intelligent operation and maintenance in industrial settings.

YNIMG Journal 2026 Journal Article

Associations between glymphatic dysfunction, white matter injury, and cognitive decline in Parkinson’s disease

Jun Yao
Yuting Xia
Song’an Shang
Ting Huang
Youyong Tian
Wei Gao
Yan Gu
Yu-Chen Chen

Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor and nonmotor symptoms, with cognitive impairment significantly affecting patients' quality of life. This study aimed to investigate the relationships among glymphatic dysfunction, white matter (WM) injury and cognitive decline in PD patients. Seventy PD patients and 82 healthy controls (HCs) underwent clinical evaluations and magnetic resonance imaging (MRI) scans. Key metrics included the diffusion tensor imaging-along the perivascular space (DTI-ALPS) index, choroid plexus volume (CPV), white matter free water (WM-FW), and peak width of skeletonized mean diffusivity (PSMD). Statistical analyses included correlation analyses, mediation analysis and receiver operating characteristic (ROC) analyses. PD patients exhibited lower DTI-ALPS index (p < 0.001) and higher CPV (p < 0.001), WM-FW (p = 0.010), and PSMD (p < 0.001) compared to the HCs. The DTI-ALPS index was negatively correlated with WM-FW (r = -0.612, p < 0.001) and PSMD (r = -0.484, p < 0.001), whereas CPV was positively correlated with both (r = 0.613, p < 0.001; r = 0.540, p < 0.001). The DTI-ALPS index correlated positively with Montreal Cognitive Assessment (MoCA) score (ρ = 0.471, p < 0.001). CPV (ρ = -0.421, p = 0.002), WM-FW (ρ = -0.296, p = 0.029), and PSMD (ρ = -0.273, p = 0.044) correlated negatively with the MoCA score. Mediation analysis suggested that DTI-ALPS and CPV may be involved in the association between white matter injury and cognition, but the independent role of each individual indicator was not confirmed. Diagnostic performance evaluations indicated that the PSMD best predicted PD individually (AUC = 0.836), with the integrated four-biomarker model performing best (AUC = 0.855). These findings highlight the correlation between glymphatic function, WM integrity, and cognition in PD patients, supporting the use of these neuroimaging biomarkers for early diagnosis and monitoring of cognitive decline.

EAAI Journal 2026 Journal Article

Informer-enhanced digital twin framework for forced draft fans status forecasting in coal-fired power units

Guoxiong Zhu
Xiaoning Zhang
Jian Xu
Kaijun Jiang
Jun Yao
Qinghua Wang
Ming Du
Yuguang Niu

Forced draft fans are critical auxiliary equipment in coal-fired power plants, and their operational status directly affects combustion stability and energy efficiency. Under flexible operation conditions, these systems exhibit strong non-stationary dynamics and gradual operational drift, posing significant challenges to the long-term reliability of forecasting models. Although data-driven approaches have been widely adopted, most existing methods rely on static training strategies and lack mechanisms to adapt to evolving operating conditions during deployment. To address this limitation, this study proposes an adaptive forecasting-oriented digital twin framework for forced draft fan status prediction. A long-sequence time-series model is employed as the predictive backbone, while an error-driven incremental updating mechanism is introduced to mitigate performance degradation under operational drift. A robust dynamic threshold based on median absolute deviation is designed to detect prediction deviation and trigger model updating only when necessary. In addition, a temporally structured sampling strategy is developed to preserve representative data characteristics while reducing retraining cost. Experiments based on real operational data demonstrate that the proposed framework achieves high prediction accuracy in short-term forecasting tasks and maintains stable performance under extended prediction horizons. The results indicate that the proposed adaptive deployment strategy enhances long-term reliability compared with conventional static forecasting models, providing practical support for intelligent monitoring and predictive maintenance of industrial auxiliary equipment.
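
The abstract above mentions a dynamic threshold based on median absolute deviation (MAD) that triggers model updating only when prediction error deviates. A minimal sketch of that general idea follows, assuming a window of recent absolute prediction errors; the function names, the multiplier `k`, and the 1.4826 scaling constant are illustrative assumptions, not details taken from the paper.

```python
from statistics import median


def mad_threshold(errors, k=3.0):
    """Return median + k * scaled MAD of a window of prediction errors.

    `k` is a hypothetical sensitivity parameter; 1.4826 rescales the MAD so
    it is comparable to a standard deviation for Gaussian-distributed errors.
    """
    med = median(errors)
    mad = median([abs(e - med) for e in errors])
    return med + k * 1.4826 * mad


def needs_update(recent_errors, new_error, k=3.0):
    """Flag a model update only when the new error exceeds the threshold."""
    return new_error > mad_threshold(recent_errors, k)


window = [1.0, 1.2, 0.9, 1.1, 1.0]   # recent absolute prediction errors
print(needs_update(window, 1.3))     # small deviation: no update
print(needs_update(window, 2.0))     # large deviation: update
```

Because the median and MAD are insensitive to a few extreme values, one outlier in the window barely moves the threshold, which is the usual reason to prefer them over mean and standard deviation in this kind of trigger.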

ICML Conference 2025 Conference Paper

FlatQuant: Flatness Matters for LLM Quantization

Yuxuan Sun
Ruikang Liu
Haoli Bai
Han Bao
Kang Zhao
Yuening Li
Jiaxin Hu
Xianzhi Yu

Recently, quantization has been widely used for the compression and acceleration of large language models (LLMs). Due to the outliers in LLMs, it is crucial to flatten weights and activations to minimize quantization error with equally spaced quantization points. Prior research explores various pre-quantization transformations to suppress outliers, such as per-channel scaling and Hadamard transformation. However, we observe that these transformed weights and activations can still exhibit steep and dispersed distributions. In this paper, we propose FlatQuant (Fast and Learnable Affine Transformation), a new post-training quantization approach that enhances the flatness of weights and activations. Our approach identifies optimal affine transformations for each linear layer, calibrated in hours via a lightweight objective. To reduce runtime overhead of affine transformation, we apply Kronecker product with two lightweight matrices, and fuse all operations in FlatQuant into a single kernel. Extensive experiments demonstrate that FlatQuant establishes a new state-of-the-art benchmark for quantization. For example, it achieves less than 1% accuracy drop for W4A4 quantization on the LLaMA-3-70B model, surpassing SpinQuant by 7.5%. Additionally, it provides up to 2.3x prefill speedup and 1.7x decoding speedup compared to the FP16 model. Code is available at: https://github.com/ruikangliu/FlatQuant.
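
The abstract above says a Kronecker product of two lightweight matrices reduces the runtime overhead of the affine transformation. The sketch below illustrates the standard identity that makes this possible: multiplying a length n1·n2 row vector by P1 ⊗ P2 equals reshaping it to an n1×n2 matrix X and computing P1ᵀ X P2. All names and the example matrices are illustrative assumptions; this is not the FlatQuant kernel.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kron(a, b):
    """Kronecker product: entry ((i, j), (k, l)) is a[i][k] * b[j][l]."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def apply_dense(x, p1, p2):
    """Multiply row vector x by the full (n1*n2) x (n1*n2) matrix."""
    return matmul([x], kron(p1, p2))[0]


def apply_factored(x, p1, p2):
    """Same result without ever forming the large matrix."""
    n2 = len(p2)
    x_mat = [x[i:i + n2] for i in range(0, len(x), n2)]   # row-major reshape
    y_mat = matmul(matmul(transpose(p1), x_mat), p2)
    return [v for row in y_mat for v in row]


p1 = [[1, 2], [3, 4]]
p2 = [[0, 1, 0], [1, 0, 2], [2, 1, 1]]
x = [1, 2, 3, 4, 5, 6]
print(apply_dense(x, p1, p2) == apply_factored(x, p1, p2))
```

The dense path costs on the order of (n1·n2)² multiplications per vector, while the factored path costs about n1·n2·(n1 + n2), which is why two small factors are cheaper than one full-size transformation.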

IJCAI Conference 2025 Conference Paper

GSDNet: Revisiting Incomplete Multimodality-Diffusion Emotion Recognition from the Perspective of Graph Spectrum

Yuntao Shou
Jun Yao
Tao Meng
Wei Ai
Cen Chen
Keqin Li

Multimodal Emotion Recognition (MER) combines technologies from multiple fields (e.g., computer vision, natural language processing, and audio signal processing), aiming to infer an individual's emotional state by analyzing information from different sources (i.e., video, audio, and text). Compared with single modality, by fusing complementary semantic information from different modalities, the model can obtain more robust knowledge representation. However, the modality missing problem limits the performance of MERC in practical scenarios. Recent work has achieved impressive performance on modality completion using graph neural networks and diffusion models, respectively. This inspires us to combine these two dimensions in the completion network to obtain more powerful representation capabilities. However, we argue that directly running a full-rank score-based diffusion model on the entire graph adjacency matrix space may adversely affect the learning process of the diffusion model. This is because the model assumes a direct relationship between each pair of nodes and ignores local structural features and sparse connections between nodes, thereby significantly reducing the quality of the generated data. Based on the above ideas, we propose a novel Graph Spectral Diffusion Network (GSDNet), which utilizes a low-rank score-based diffusion model to map Gaussian noise to the graph spectral distribution space of missing modalities and recover the missing data according to its original distribution. Extensive experiments have demonstrated that GSDNet achieves state-of-the-art emotion recognition performance in various modality loss scenarios.

YNIMG Journal 2025 Journal Article

Predominant white matter microstructural changes over gray matter in tinnitus brain

Qianhui Xu
Tingting Chai
Jun Yao
Chunhua Xing
Xiaomin Xu
Xindao Yin
Fei Zhao
Richard Salvi

INTRODUCTION: To explore microstructure changes across brain white matter and gray matter in tinnitus patients and their effect on neuropsychological performance. METHODS: The cross-sectional study used Multi-shell Diffusion Weighted Imaging data and neuropsychological assessment from 48 tinnitus patients and 48 healthy controls. Microstructural features across white matter and gray matter based on Diffusion Tensor Imaging (DTI) and Neurite Orientation Dispersion and Density Imaging (NODDI) model using Tract-Based Spatial Statistics (TBSS) and Gray Matter-Based Spatial Statistics (GBSS), as well as topological properties were derived from an advanced tractography model in subjects. Brain-neuropsychological performance correlations were analyzed. RESULTS: Tinnitus patients showed decreased axial diffusivity in forceps minor and right corticospinal tract, increased orientation dispersion in forceps minor, decreased connection strength between the right caudate and pericalcarine, right caudate and superior temporal lobe, and left putamen and cuneus. Global network efficiency and local network efficiency were significantly less in tinnitus patients while feeder connection strength was significantly less in tinnitus patients. The orientation dispersion value mediated the relationship between tinnitus status and Trail Making Test-Part B scores. However, no obvious microstructural changes in gray matter were observed. CONCLUSION: Leveraging multi-shell DWI data, the current study indicated that fiber disruption and internal connectivity organizational changes in brain white matter, rather than gray matter, were more susceptible in tinnitus patients. These microstructural changes in white matter could be associated with changes in cognitive function in tinnitus patients.

IJCAI Conference 2024 Conference Paper

Sparse Multi-Relational Graph Convolutional Network for Multi-type Object Trajectory Prediction

Jianhui Zhang
Jun Yao
Liqi Yan
Yanhong Xu
Zheng Wang

Object trajectory prediction is a hot research issue with wide applications in video surveillance and autonomous driving. Previous studies consider interaction sparsity mainly among pedestrians rather than among multiple types of objects, which bring new types of interactions and consequently superfluous ones. This paper proposes a Multi-type Object Trajectory Prediction (MOTP) method with a Sparse Multi-relational Graph Convolutional Network (SMGCN) and a novel multi-round Global Temporal Aggregation (GTA). MOTP introduces a novel adaptive sparsification and multi-scale division method to model interactions among multiple types of objects. It further incorporates a Sparse Multi-relational Temporal Graph to capture the temporal division of multi-type trajectories, along with a multi-round Global Temporal Aggregation (GTA) mechanism to mitigate error accumulation and enhance the trajectory prediction accuracy. The extensive evaluation on the ETH, UCY and SDD datasets shows that our method outperforms the typical state-of-the-art works by significant margins. Codes will be available at https://github.com/sounio/SMGCN.

EAAI Journal 2024 Journal Article

Theory guided Lagrange programming neural network for subsurface flow problems

Jian Wang
Xiaofeng Xue
Zhixue Sun
Jun Yao
El-Sayed M. El-Alfy
Kai Zhang
Witold Pedrycz
Jacek Mańdziuk

A deep learning model can perform efficient uncertainty quantification (UQ) for reservoir flow with uncertain model parameters, but usually requires large amounts of training data to ensure accuracy. However, the cost of obtaining large amounts of data is prohibitive, and the performance will deteriorate if sufficient training data is lacking. Alternatively, more interpretable neural networks with embedded physical laws have recently been used to solve partial differential equations as well as to solve UQ problems. This approach has received a lot of attention due to its low data volume requirements and its adherence to the laws of physics during the training process. In this paper, we propose a theory-guided framework based on a bilevel programming model with hard constraints to embed physical meaning in the model. Theory guided Lagrange programming neural network (TGLPNN) combines the method of Lagrange programming neural network approach where physical laws such as stochastic partial differential equations and boundary conditions are incorporated into the training process of a convolutional neural network. At the same time, the upper-level variables are iteratively optimized. The method based on Lagrange programming neural network inherently embeds physical laws in the network. Practical applications have shown that TGLPNN can provide higher prediction accuracy compared to state-of-the-art physics-driven methods and improved efficiency compared to numerical methods.

ICLR Conference 2023 Conference Paper

ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations

Xuyang Zhao
Tianqi Du
Yisen Wang 0001
Jun Yao
Weiran Huang 0001

Self-Supervised Learning (SSL) is a paradigm that leverages unlabeled data for model training. Empirical studies show that SSL can achieve promising performance in distribution shift scenarios, where the downstream and training distributions differ. However, the theoretical understanding of its transferability remains limited. In this paper, we develop a theoretical framework to analyze the transferability of self-supervised contrastive learning, by investigating the impact of data augmentation on it. Our results reveal that the downstream performance of contrastive learning depends largely on the choice of data augmentation. Moreover, we show that contrastive learning fails to learn domain-invariant features, which limits its transferability. Based on these theoretical insights, we propose a novel method called Augmentation-robust Contrastive Learning (ArCL), which guarantees to learn domain-invariant features and can be easily integrated with existing contrastive learning algorithms. We conduct experiments on several datasets and show that ArCL significantly improves the transferability of contrastive learning.

TMLR Journal 2023 Journal Article

Complementary Sparsity: Accelerating Sparse CNNs with High Accuracy on General-Purpose Computing Platforms

Kang Zhao
Yijun Tan
Kai Han
Ting Hu
Hanting Chen
Tao Yuan
Yunhe Wang
Jun Yao

Model sparsity is a promising approach to reducing parameters or FLOPs of convolutional neural networks (CNNs). Compared to unstructured or coarse-grained structured sparsity, fine-grained structured sparsity, e.g., N:M sparse pattern, can achieve a better balance between accuracy and efficiency on general computing platforms like CPUs and GPUs. In particular, the 2:4 sparsity can accelerate CNN inference by 2× speed and with negligible accuracy drop. However, N:M sparsity needs to be supported by GPU within specific hardware circuits and hardly achieves significant speedups on common GPUs. To accelerate CNNs with general-purposed computing resources and simultaneously retain the model accuracy as much as possible, this paper proposes complementary sparsity (CS). CS denotes that only one weight can be retained for weights spaced at the same distance. On the one hand, CS features high mask flexibility, which is naturally favorable to high model accuracy. Moreover, we propose a CS-specific sparse training method to improve CS-based CNNs' accuracy under high parameter sparsities (>75%). On the other hand, CS itself is memory-access balanced and robust to pattern hyperparameters, which can be utilized to speedup CS-based convolution computation on CPUs and common GPUs. We thus propose a CS convolution parallel computing algorithm that adapts to common GPUs without sparse tensor cores. Experimental results show that compared to other sparsity patterns, the proposed CS can achieve the optimal trade-off in terms of accuracy and latency for CPUs and common GPUs, respectively. Codes will be available at https://gitee.com/mindspore/models/tree/master/research/cv/CS.
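
For readers unfamiliar with the N:M pattern that the abstract above uses as its baseline, here is a minimal sketch of magnitude-based 2:4 pruning: in every group of four consecutive weights, the two with the largest magnitude are kept and the rest are zeroed. The function name and the magnitude criterion are illustrative assumptions; this shows the generic N:M pattern only, not the complementary sparsity (CS) pattern proposed in the paper.

```python
def nm_sparsify(weights, n=2, m=4):
    """Keep the n largest-magnitude weights in each group of m; zero the rest."""
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n entries with the largest absolute value.
        keep = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n]
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


row = [0.1, -0.9, 0.5, 0.05, 2.0, -0.3, 0.0, 1.5]
print(nm_sparsify(row))
```

Every group ends up with the same number of non-zeros, which is the regularity that hardware with sparse tensor cores exploits.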

NeurIPS Conference 2022 Conference Paper

Accelerating Sparse Convolution with Column Vector-Wise Sparsity

Yijun Tan
Kai Han
Kang Zhao
Xianzhi Yu
Zidong Du
Yunji Chen
Yunhe Wang
Jun Yao

Weight sparsity is a promising approach to reducing the model size and computation cost of convolutional neural networks (CNNs). Nevertheless, non-zero weights often distribute randomly in sparse CNN models, introducing enormous difficulty in obtaining actual speedup on common hardware (e.g., GPU) over their dense counterparts. Existing acceleration solutions either require hardware modifications for irregular memory access support or rely on a partially structured sparsity pattern. Neither of these methods is capable of achieving fruitful speedup on convolution layers. In this work, we propose an algorithm-software co-designed sparse convolution based on a novel out-vector-wise (OVW) sparse pattern. Building on the insight that vertical vector integrity can preserve continuous memory access in IM2COL, the OVW pattern treats a V×1 vector as an entirety. To reduce the error caused by sparsity, we propose an equivalent transformation process, i.e., clustering-based channel permutation, to gather similar rows together. Experimental evaluations demonstrate that our method achieves a 1.7× and 3.2× speedup over the SOTA solution and the dense convolution of ResNet50 on NVIDIA V100 at 75% sparsity, respectively, with only negligible accuracy loss. Moreover, compared to the SOTA solution that achieves speedups only on data with 60% sparsity or more, our method begins to obtain speedups on data with only 10% sparsity.

AAAI Conference 2021 Conference Paper

Commission Fee is not Enough: A Hierarchical Reinforced Framework for Portfolio Management

Rundong Wang
Hongxin Wei
Bo An
Zhouyan Feng
Jun Yao

Portfolio management via reinforcement learning is at the forefront of fintech research, which explores how to optimally reallocate a fund into different financial assets over the long term by trial-and-error. Existing methods are impractical since they usually assume each reallocation can be finished immediately and thus ignore the price slippage as part of the trading cost. To address these issues, we propose a hierarchical reinforced stock trading system for portfolio management (HRPM). Concretely, we decompose the trading process into a hierarchy of portfolio management over trade execution and train the corresponding policies. The high-level policy gives portfolio weights at a lower frequency to maximize the long-term profit and invokes the low-level policy to sell or buy the corresponding shares within a short time window at a higher frequency to minimize the trading cost. We train two levels of policies via a pre-training scheme and an iterative training scheme for data efficiency. Extensive experimental results in the U.S. market and the China market demonstrate that HRPM achieves significant improvement against many state-of-the-art approaches.

YNIMG Journal 2020 Journal Article

Coordination of multiple joints increases bilateral connectivity with ipsilateral sensorimotor cortices

Kevin B. Wilkins
Jun Yao

Although most activities of daily life require simultaneous coordination of both proximal and distal joints, motor preparation during such movements has not been well studied. Previous results for motor preparation have focused on hand/finger movements. For simple hand/finger movements, results have found that such movements typically evoke activity primarily in the contralateral motor cortices. However, increasing the complexity of the finger movements, such as during a distal sequential finger-pressing task, leads to additional recruitment of ipsilateral resources. It has been suggested that this involvement of the ipsilateral hemisphere is critical for temporal coordination of distal joints. The goal of the current study was to examine whether increasing simultaneous coordination of multiple joints (both proximal and distal) leads to a similar increase in coupling with ipsilateral sensorimotor cortices during motor preparation compared to a simple distal movement such as hand opening. To test this possibility, 12 healthy individuals participated in a high-density EEG experiment in which they performed either hand opening or simultaneous hand opening while lifting at the shoulder on a robotic device. We quantified within- and cross-frequency cortical coupling across the sensorimotor cortex for the two tasks using dynamic causal modeling. Both hand opening and simultaneous hand opening while lifting at the shoulder elicited coupling from secondary motor areas to primary motor cortex within the contralateral hemisphere exclusively in the beta band, as well as from ipsilateral primary motor cortex. However, increasing the task complexity by combining hand opening while lifting at the shoulder also led to an increase in cross-frequency coupling within the ipsilateral hemisphere including theta, beta, and gamma frequencies, as well as a change in the coupling frequency of the interhemispheric coupling between the primary motor and premotor cortices. 
These findings demonstrate that increasing the demand of joint coordination between proximal and distal joints leads to increases in communication with the ipsilateral hemisphere as previously observed in distal sequential finger tasks.

YNICL Journal 2015 Journal Article

Sensory cortical re-mapping following upper-limb amputation and subsequent targeted reinnervation: A case report

Jun Yao
Albert Chen
Todd Kuiken
Carolina Carmona
Julius Dewald

This case study demonstrates the change of sensory cortical representations of the residual parts of the arm in an individual who underwent a trans-humeral amputation and subsequent targeted reinnervation (TR). As a relatively new surgical technique, TR restores a direct neural connection from amputated sensorimotor nerves to specific target muscles. This method has been successfully applied to upper-limb and lower-limb amputees, and has shown effectiveness in regaining control signals via the newly re-innervated muscles. Correspondingly, recent study results have shown that motor representations for the missing limb move closer to their original locations following TR. Besides regaining motor control signals, TR also restores the sensation in the re-innervated skin areas. We therefore hypothesize that TR causes analogous cortical sensory remapping that may return closer to their original locations. In order to test this hypothesis, cortical activity in response to sensory-level electrical stimulation in different parts of the arm was studied longitudinally in one amputated individual before and up to 2 years after TR. Our results showed that 1) before TR, the cortical response to sensory electrical stimulation in the residual limb showed a diffuse bilateral pattern without a clear focus in either the time or spatial domain; and 2) 2 years after TR, the sensory map of the reinnervated median nerve reorganized, showing predominant activity over the contralateral S1 hand area as well as moderate activity over the ipsilateral S1. Therefore, this work provides new evidence for long-term sensory cortical plasticity in the human brain after TR.

YNICL Journal 2013 Journal Article

Cortical motor activity and reorganization following upper-limb amputation and subsequent targeted reinnervation

Albert Chen
Jun Yao
Todd Kuiken
Julius P.A. Dewald

Previous studies have postulated that the amount of brain reorganization following peripheral injuries may be correlated with negative symptoms or consequences. However, it is unknown whether restoring effective limb function may then be associated with further changes in the expression of this reorganization. Recently, targeted reinnervation (TR), a surgical technique that restores a direct neural connection from amputated sensorimotor nerves to new peripheral targets such as muscle, has been successfully applied to upper-limb amputees. It has been shown to be effective in restoring both peripheral motor and sensory functions via the reinnervated nerves as soon as a few months after the surgery. However, it was unclear whether TR could also restore normal cortical motor representations for control of the missing limb. To answer this question, we used high-density electroencephalography (EEG) to localize cortical activity related to cued motor tasks generated by the intact and missing limb. Using a case study of 3 upper-limb amputees, 2 of whom went through pre and post-TR experiments, we present unique quantitative evidence for the re-mapping of motor representations for the missing limb closer to their original locations following TR. This provides evidence that an effective restoration of peripheral function from TR can be linked to the return of more normal cortical expression for the missing limb. Therefore, cortical mapping may be used as a potential guide for monitoring rehabilitation following peripheral injuries.

YNIMG Journal 2009 Journal Article

Cortical overlap of joint representations contributes to the loss of independent joint control following stroke

Jun Yao
Albert Chen
Carolina Carmona
Julius P.A. Dewald

The loss of independent joint control in the paretic upper limb is a cardinal sign of movement disorders following stroke. However, the underlying neural mechanisms for such a loss following stroke are still largely unknown. In order to investigate the possible contribution of altered sensorimotor cortical activity to the loss of independent joint control, we measured electroencephalographic (EEG) and torque signals during the generation of static shoulder/elbow torques. We found significant increases in the overlap of shoulder and elbow joint representations at the cortical level in stroke subjects as compared to control subjects. Linear regression results demonstrated significant associations between the cortical overlap of joint representations and the degree of the loss of independent joint control. Therefore, we conclude that an increased overlap of cortical representations for shoulder and elbow contributes to the expression of the loss of independent shoulder/elbow control of the paretic upper limb in chronic hemiparetic stroke survivors.

YNIMG Journal 2005 Journal Article

Evaluation of different cortical source localization methods using simulated and experimental EEG data

Jun Yao
Julius P.A. Dewald
