Arrow Research search

Author name cluster

Yan Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

37 papers
2 author rows

Possible papers

37

JBHI Journal 2026 Journal Article

Radar HRV Monitoring With Physiological Prior Inspired Deep Neural Networks

  • Haoyu Wang
  • Jinbo Chen
  • Dongheng Zhang
  • Zhi Lu
  • Yang Hu
  • Qibin Sun
  • Yan Chen

Radar sensing has emerged as a promising solution for the contactless monitoring of Heart Rate Variability (HRV), a crucial indicator of the cardiovascular and autonomic nervous systems. However, due to signal noise and interference that easily obscure heartbeat details, along with variations in heartbeat across different physiological conditions, existing methods remain restricted to laboratory settings with healthy subjects and fail in real-world scenarios involving more complex physiological conditions. In this study, we propose a physiological prior-inspired deep learning framework for robust radar-based HRV monitoring. Specifically, we leverage the prior that internal heartbeats drive movements across the entire torso surface and design a hybrid deep neural network to model the spatio-temporal relationship between full-body radio reflections and heartbeats, effectively mitigating interference. Then, we incorporate the cardiac motion's self-similarity prior to establish a signal augmentation strategy, effectively remodeling the HRV distribution and enhancing performance across diverse physiological conditions. We build and validate our method on a large-scale dataset comprising 7,150 outpatients with complex physiological conditions in real-world scenarios. The experimental results demonstrate that our method achieves a mean IBI error of 19.21 ms, an RMSSD error of 16.23 ms, an SDSD error of 16.70 ms, and a pNN50 error of 7.28%. We further validate the performance by classifying five common cardiac conditions based on HRV results, demonstrating performance comparable to ECG-based methods. These results highlight the great potential of our approach for accurate, contactless HRV monitoring in real-world applications.
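The time-domain HRV metrics reported above (RMSSD, SDSD, pNN50) have standard definitions over a series of inter-beat intervals (IBIs); the sketch below illustrates them with made-up IBI values, and the function name is ours, not the paper's:

```python
import numpy as np

def hrv_metrics(ibi_ms):
    """Standard time-domain HRV metrics from inter-beat intervals (ms)."""
    ibi = np.asarray(ibi_ms, dtype=float)
    d = np.diff(ibi)                           # successive IBI differences
    rmssd = np.sqrt(np.mean(d ** 2))           # root mean square of successive differences
    sdsd = np.std(d, ddof=1)                   # standard deviation of successive differences
    pnn50 = 100.0 * np.mean(np.abs(d) > 50.0)  # % of |differences| exceeding 50 ms
    return rmssd, sdsd, pnn50

# Illustrative IBI sequence (ms), not data from the paper
rmssd, sdsd, pnn50 = hrv_metrics([800, 810, 790, 850, 800])
```

An estimator's RMSSD/SDSD/pNN50 error is then simply the gap between these values computed from radar-derived IBIs and from a reference ECG.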

AAAI Conference 2026 Conference Paper

Semi-Supervised Synthetic Data Generation with Fine-Grained Relevance Control for Short Video Search Relevance Modeling

  • Haoran Li
  • Zhiming Su
  • Junyan Yao
  • Enwei Zhang
  • Yang Ji
  • Yan Chen
  • Kan Zhou
  • Chao Feng

Synthetic data is widely adopted in embedding models to ensure diversity in training data distributions across dimensions such as difficulty, length, and language. However, existing prompt-based synthesis methods struggle to capture domain-specific data distributions, particularly in data-scarce domains, and often overlook fine-grained relevance diversity. In this paper, we present a Chinese short video dataset with 4-level relevance annotations, filling a critical resource void. Further, we propose a semi-supervised synthetic data pipeline where two collaboratively trained models generate domain-adaptive short video data with controllable relevance labels. Our method enhances relevance-level diversity by synthesizing samples for underrepresented intermediate relevance labels, resulting in a more balanced and semantically rich training dataset. Extensive offline experiments show that the embedding model trained on our synthesized data outperforms those using data generated by prompting or vanilla supervised fine-tuning (SFT). Moreover, we demonstrate that incorporating more diverse fine-grained relevance levels in training data enhances the model's sensitivity to subtle semantic distinctions, highlighting the value of fine-grained relevance supervision in embedding learning. In the search-enhanced recommendation pipeline of Douyin's dual-column scenario, online A/B testing showed that the proposed model increased click-through rate (CTR) by 1.45%, raised the Strong Relevance Ratio (SRR) by 4.9%, and improved the Image User Penetration Rate (IUPR) by 0.1054%.

JBHI Journal 2026 Journal Article

WGB-GLFI: A Novel Graph-Based Global-Local Feature Interaction Framework for Automated Seizure Detection

  • Xiang Li
  • Mingxing Zhu
  • Chuqi Yang
  • Ke Zhang
  • Xin Wang
  • Sunday Timothy Aboyeji
  • Fei Chen
  • Chen Yao

Epilepsy detection faces significant challenges due to unpredictable seizures, ranging from brief lapses of awareness to severe convulsions, posing risks to patients' safety and quality of life. In recent years, deep learning has become a mainstream approach in this field, leveraging advanced computational resources and EEG datasets. However, a key challenge remains: existing methods often lack unified spatial modeling and struggle to effectively handle local detailed features, limiting their accuracy and robustness. To address these issues, we propose the Weighted Graph Building Global-Local Feature Interaction (WGB-GLFI) framework, which integrates spatial connectivity and dynamic patterns through a Weighted Graph Building (WGB) module and a Global-Local Feature Interaction (GLFI) module. This approach comprehensively captures the dynamic spatial relationships during epileptic seizures and achieves seamless global-local feature integration, significantly enhancing seizure detection performance. Its effectiveness has been validated across multiple datasets, including CHB-MIT, Siena Scalp, and private datasets, where our model achieves accuracy rates of 99.28%, 99.21%, and 99.30%, respectively. The reliability and robustness of our framework provide epilepsy patients with faster and more dependable seizure detection, enabling timely intervention and improving patients' quality of life.

JBHI Journal 2026 Journal Article

WN-Sleep: Modeling Whole-Night Data for Improved Sleep Staging Classification

  • Fang Zhou
  • Zhi Lu
  • Zhi Wu
  • Gaohan Ye
  • Lingjie Shu
  • Yu Pu
  • Beilei Wang
  • Dong Zhang

Sleep staging, crucial for diagnosing sleep disorders, requires precise recognition of physiological signals within 30-second epochs, a task fundamentally different from managing long-term semantic dependencies in natural language processing (NLP). Our model aims to refine the integration of local and global features for more accurate sleep stage classification. Following the American Academy of Sleep Medicine (AASM) guidelines, it focuses on rigorous intra-epoch feature extraction to ensure reliable identification of sleep stages. Moreover, our approach incorporates a global perspective by analyzing whole-night data, which is essential for handling transitional periods and ambiguities. Existing sequential modeling techniques often overlook the unique requirements of sleep staging, leading to performance declines when epochs extend beyond approximately 200. Our model addresses this by structurally processing local and global information and carefully balancing detailed intra-epoch analysis with an overarching view of sleep cycles through a gating mechanism that selectively integrates long-term dependencies, optimizing the balance between local accuracy and global context. This approach represents a significant advancement over existing models, offering more accurate, reliable, and clinically relevant sleep staging. Extensive experiments on the SHHS, SleepEDF-20, and SleepEDF-78 datasets demonstrate that our method outperforms state-of-the-art approaches.
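The gating idea described above, selectively blending intra-epoch (local) features with whole-night (global) context, can be sketched generically; the sigmoid-gate formulation and random weights below are a minimal illustration, not the paper's architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(local_feat, global_feat, W, b):
    """Generic gating sketch: a sigmoid gate computed from both feature
    vectors decides, per dimension, how much local (intra-epoch) signal
    versus global (whole-night) context enters the fused representation."""
    g = sigmoid(np.concatenate([local_feat, global_feat]) @ W + b)
    return g * local_feat + (1.0 - g) * global_feat

rng = np.random.default_rng(0)
d = 8
local_feat = rng.normal(size=d)
global_feat = rng.normal(size=d)
W = rng.normal(size=(2 * d, d))   # illustrative learned parameters
b = np.zeros(d)
fused = gated_fusion(local_feat, global_feat, W, b)
```

Because the gate lies in (0, 1), each fused coordinate is a convex combination of the local and global values, which is what lets such a gate trade off local accuracy against global context.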

IROS Conference 2025 Conference Paper

Compact R-X-Y Stage and Dual-Finger Micromanipulator under Inverted Optical Microscope for Microassembly

  • Jichao Pang
  • Zhuo Chen 0006
  • Yan Chen
  • Yuke Li
  • Yunsheng Li
  • Qiang Huang 0002
  • Tatsuo Arai
  • Xiaoming Liu 0007

Microassembly plays an important role in fabricating complex structures from small basic components in industrial and biomedical fields. An inverted optical microscope can provide high-quality image feedback for microassembly with its continuously improving resolution. However, a compact stage capable of positioning and reorienting micro-objects while fitting within the limited space under an inverted optical microscope remains unavailable. This paper proposes a compact R-X-Y stage that can transport micro-objects over long distances in the X and Y directions and reorient them through 360-degree continuous rotation. Additionally, instead of the common arrangement of placing the rotational stage on the X-Y stage, we mount the thin X-Y stage on a rotational stage. Thus, after aligning the centers of the visual field and the rotational stage at the beginning, no visible micro-object moves out of the visual field during rotation. We further integrate the R-X-Y stage with the dual-finger micromanipulator and use them to assemble 2-D patterns and a complex 3-D micromachine. The obtained results and preliminary demonstration indicate that the proposed compact R-X-Y stage has great potential in assembling complex micromachines.

ICML Conference 2025 Conference Paper

Concurrent Reinforcement Learning with Aggregated States via Randomized Least Squares Value Iteration

  • Yan Chen
  • Qinxun Bai
  • Yiteng Zhang
  • Maria Dimakopoulou
  • Shi Dong
  • Qi Sun
  • Zhengyuan Zhou

Designing learning agents that explore efficiently in a complex environment has been widely recognized as a fundamental challenge in reinforcement learning. While a number of works have demonstrated the effectiveness of techniques based on randomized value functions on a single agent, it remains unclear, from a theoretical point of view, whether injecting randomization can help a society of agents concurrently explore an environment. The theoretical results established in this work offer an affirmative answer to this question. We adapt the concurrent learning framework to randomized least-squares value iteration (RLSVI) with aggregated state representation. We demonstrate polynomial worst-case regret bounds in both finite- and infinite-horizon environments. In both setups the per-agent regret decreases at an optimal rate of $\Theta\left(\frac{1}{\sqrt{N}}\right)$, highlighting the advantage of concurrent learning. Our algorithm exhibits significantly lower space complexity than Russo (2019) and Agrawal et al. (2021): we reduce the space complexity by a factor of $K$ while incurring only a $\sqrt{K}$ increase in the worst-case regret bound. Interestingly, our algorithm improves the worst-case regret bound of Russo (2019) by a factor of $H^{1/2}$, matching the improvement in Agrawal et al. (2021), but achieves this through a fundamentally different algorithmic enhancement and proof technique. Additionally, we conduct numerical experiments to demonstrate our theoretical findings.
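The per-agent $\Theta(1/\sqrt{N})$ rate quoted above follows the usual pooling intuition; the scaling argument below is an illustrative sketch, not the paper's actual derivation, and $C$ stands for an unspecified horizon- and state-dependent factor:

```latex
% If N agents pool their experience and the cumulative regret summed
% over all agents grows as \tilde{O}(\sqrt{N}\,C), then dividing by N
% gives the per-agent regret:
\frac{\mathrm{Regret}_{\mathrm{total}}(N)}{N}
  = \tilde{O}\!\left(\frac{\sqrt{N}\,C}{N}\right)
  = \tilde{O}\!\left(\frac{C}{\sqrt{N}}\right)
```

That is, each additional agent's data benefits every other agent, so the cost of exploration per agent shrinks as the society grows.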

IROS Conference 2025 Conference Paper

DMPBot: A high-speed, high-precision, omnidirectional, insect-scale piezoelectric robot

  • Yan Chen
  • Shu Chen
  • Zheyu Yang
  • Pengyu Liu
  • Sicheng Chen
  • Ziru Deng
  • Junqi An
  • Qiang Huang 0002

Microrobots have garnered significant attention due to their vast potential applications across various fields. Among various types of microrobots, piezoelectric robots stand out due to their exceptional motion accuracy, low power consumption, and simple structural design. This work introduces a novel piezoelectric microrobot, the Dual-Modal Piezoelectric Robot (DMPBot), which is fabricated with an innovative carbon fiber substrate through a heat-pressing process with a compact size of 6 mm × 9 mm × 1.1 mm and a weight of only 0.05 g. DMPBot can achieve both high-speed and high-precision motion in non-resonant mode, as well as omnidirectional movement by integrating non-resonant and resonant modes. In non-resonant mode, the robot can reach a speed of 33 mm/s (3.67 body lengths per second) and a sub-micron resolution of 0.4 μm by adjusting the applied signal. This work presents an analysis of the design, fabrication, and performance of DMPBot, focusing on its dynamic response, motion mechanisms, high-speed and high-precision motion, and omnidirectional movement capabilities. Experimental results validate the ability of DMPBot to perform high-speed, high-precision, and omnidirectional motion, demonstrating its promising potential in the field of micromanipulation.

AAAI Conference 2025 Conference Paper

Motion-adaptive Transformer for Event-based Image Deblurring

  • Senyan Xu
  • Zhijing Sun
  • Mingchen Zhong
  • Chengzhi Cao
  • Yidi Liu
  • Xueyang Fu
  • Yan Chen

Event cameras, which capture pixel-level brightness changes asynchronously, provide rich motion information that is often missed during traditional frame-based camera exposures, thereby offering fresh perspectives for motion deblurring. Although current approaches incorporate event intensity, they neglect essential spatial motion information. Unlike CNN architectures, Transformers excel at modeling long-range dependencies but struggle to establish relevant non-local connections in sparse events and fail to highlight significant interactions in dense images. To address these limitations, we introduce a Motion-Adaptive Transformer network (MAT) that utilizes spatial motion information to forge robust global connections. The core design is an Adaptive Motion Mask Predictor (AMMP) that identifies key motion regions, guiding the Motion-Sparse Attention (MSA) to eliminate irrelevant event tokens and enabling the Motion-Aware Attention (MAA) to focus on relevant ones, thereby enhancing long-range dependency modeling. Additionally, we elaborately design a Cross-Modal Intensity Gating mechanism that efficiently merges intensity data across modalities while minimizing parameter use. The learnable Expansion-Controlled Spatial Gating further optimizes the transmission of event features. Comprehensive testing confirms that our approach sets a new benchmark in image deblurring, surpassing previous methods by up to 0.60 dB on the GoPro dataset and 1.04 dB on the HS-ERGB dataset, and achieving an average improvement of 0.52 dB across two real-world datasets.

NeurIPS Conference 2025 Conference Paper

RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility

  • Haoyu He
  • Haozheng Luo
  • Yan Chen
  • Qi Wang

Predicting human mobility is inherently challenging due to complex long-range dependencies and multi-scale periodic behaviors. To address this, we introduce RHYTHM (Reasoning with Hierarchical Temporal Tokenization for Human Mobility), a unified framework that leverages large language models (LLMs) as general-purpose spatio-temporal predictors and trajectory reasoners. Methodologically, RHYTHM employs temporal tokenization to partition each trajectory into daily segments and encode them as discrete tokens with hierarchical attention that captures both daily and weekly dependencies, thereby quadratically reducing the sequence length while preserving cyclical information. Additionally, we enrich token representations by adding pre-computed prompt embeddings for trajectory segments and prediction targets via a frozen LLM, and feeding these combined embeddings back into the LLM backbone to capture complex interdependencies. Computationally, RHYTHM keeps the pretrained LLM backbone frozen, yielding faster training and lower memory usage. We evaluate our model against state-of-the-art methods using three real-world datasets. Notably, RHYTHM achieves a 2.4% improvement in overall accuracy, a 5.0% increase on weekends, and a 24.6% reduction in training time. Code is publicly available at https://github.com/he-h/rhythm.

AAAI Conference 2025 Conference Paper

Sharper Error Bounds in Late Fusion Multi-view Clustering with Eigenvalue Proportion Optimization

  • Liang Du
  • Henghui Jiang
  • Xiaodong Li
  • Yiqing Guo
  • Yan Chen
  • Feijiang Li
  • Peng Zhou
  • Yuhua Qian

Multi-view clustering (MVC) aims to integrate complementary information from multiple views to enhance clustering performance. Late Fusion Multi-View Clustering (LFMVC) has shown promise by synthesizing diverse clustering results into a unified consensus. However, current LFMVC methods struggle with noisy and redundant partitions and often fail to capture high-order correlations across views. To address these limitations, we present a novel theoretical framework for analyzing the generalization error bounds of multiple kernel k-means, leveraging local Rademacher complexity and principal eigenvalue proportions. Our analysis establishes a convergence rate of O(1/n), significantly improving upon the existing rate in the order of O(sqrt(k/n)). Building on this insight, we propose a low-pass graph filtering strategy within a multiple linear K-means framework to mitigate noise and redundancy, further refining the principal eigenvalue proportion and enhancing clustering accuracy. Experimental results on benchmark datasets confirm that our approach outperforms state-of-the-art methods in clustering performance and robustness.

JBHI Journal 2024 Journal Article

A Real-Time Hand Gesture Recognition System for Low-Latency HMI via Transient HD-SEMG and In-Sensor Computing

  • Haomeng Qiu
  • Zhitao Chen
  • Yan Chen
  • Chaojie Yang
  • Sihan Wu
  • Fanglin Li
  • Longhan Xie

In real-time human-machine interaction (HMI) applications, hand gesture recognition (HGR) requires high accuracy with low latency. Surface electromyography (sEMG), a physiological electrical signal reflecting muscle activation, is extensively used in HMI. Recently, transient sEMG, generated during gesture transitions, has been employed in HGR to achieve lower observational latency than steady-state sEMG. However, the use of long feature windows (up to 200 ms) still makes it less desirable for low-latency HMI. In addition, most studies have relied on remote computing, where remote data processing and large data transfers result in high computation and network latency. In this paper, we propose a method leveraging transient high-density sEMG (HD-sEMG) and in-sensor computing to achieve low-latency HGR. An sEMG contrastive convolution network (sCCN) was proposed for HGR. The mean absolute value and its average integration were used to train the sCCN in a contrastive learning manner. In addition, all signal acquisition, data processing, and pattern recognition processes were deployed within the designed sensor for in-sensor computing. Compared to the state-of-the-art study using multi-channel 200-ms transient sEMG, our method achieved a comparable HGR accuracy of 0.963 with a 58% lower observational latency of only 84 ms. In-sensor computing achieves a fourfold lower computation latency of 3 ms and significantly reduces the network latency to 2 ms. The proposed method offers a promising approach to low-latency HGR without compromising accuracy, facilitating real-time HMI in biomedical applications such as prostheses, exoskeletons, virtual reality, and video games.

AAAI Conference 2024 Conference Paper

A Unified Knowledge Transfer Network for Generalized Category Discovery

  • Wenkai Shi
  • Wenbin An
  • Feng Tian
  • Yan Chen
  • Yaqiang Wu
  • Qianying Wang
  • Ping Chen

Generalized Category Discovery (GCD) aims to recognize both known and novel categories in an unlabeled dataset by leveraging another labeled dataset with only known categories. Without considering knowledge transfer from known to novel categories, current methods usually perform poorly on novel categories due to the lack of corresponding supervision. To mitigate this issue, we propose a unified Knowledge Transfer Network (KTN), which solves two obstacles to knowledge transfer in GCD. First, the mixture of known and novel categories in unlabeled data makes it difficult to identify transfer candidates (i.e., samples with novel categories). For this, we propose an entropy-based method that leverages knowledge in the pre-trained classifier to differentiate known and novel categories without requiring extra data or parameters. Second, the lack of prior knowledge of novel categories presents challenges in quantifying semantic relationships between categories to decide the transfer weights. For this, we model different categories with prototypes and treat their similarities as transfer weights that quantify the semantic relationships between categories. On the basis of these two treatments, we transfer knowledge from known to novel categories by conducting pre-adjustment of logits and post-adjustment of labels for transfer candidates based on the transfer weights between different categories. With the weighted adjustment, KTN can generate more accurate pseudo-labels for unlabeled data, which helps to learn more discriminative features and boost model performance on novel categories. Extensive experiments show that our method outperforms state-of-the-art models on all evaluation metrics across multiple benchmark datasets. Furthermore, different from previous clustering-based methods that can only work offline with abundant data, KTN can be conveniently deployed online with faster inference speed. Code and data are available at https://github.com/yibai-shi/KTN.
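The prototype-and-similarity scheme described above (a prototype per category, similarities used as transfer weights) can be illustrated with a toy sketch; the softmax-over-cosine weighting below is our assumption for illustration, not necessarily KTN's exact formula:

```python
import numpy as np

def prototypes(features, labels, num_classes):
    """Class prototypes as the mean feature vector of each class."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def transfer_weights(novel_proto, known_protos):
    """Illustrative transfer weights: softmax over cosine similarities
    between a novel-category prototype and known-category prototypes."""
    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    sims = unit(known_protos) @ unit(novel_proto)
    e = np.exp(sims - sims.max())        # numerically stable softmax
    return e / e.sum()

# Toy 2-D features for two known classes
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
known = prototypes(feats, labels, 2)
w = transfer_weights(np.array([0.8, 0.2]), known)   # a novel-class prototype
```

A novel-class prototype lying closer to known class 0 receives a larger weight from class 0, which is how similarity can steer how much knowledge each known category contributes.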

ICRA Conference 2024 Conference Paper

Development of a 3-RRS Micromanipulator Based on Origami-Inspired Spherical Joint

  • Haoqi Han
  • Xiaoming Liu 0007
  • Yan Chen
  • Hao Pang
  • Xiaoqing Tang
  • Dan Liu 0009
  • Qiang Huang 0002
  • Tatsuo Arai

In recent years, micromanipulation technology has achieved extensive applications in industry and the life sciences. Improving the precision and bandwidth of micromanipulators while simultaneously reducing size, weight, and cost poses significant challenges to existing micromanipulator design and fabrication methods. Here, we propose a 3-RRS micromanipulator with an origami-inspired spherical joint based on the PC-MEMS process, aiming for miniaturization and cost-effectiveness. The spherical joint allows rotations of approximately 140° around the x-axis, 140° around the y-axis, and 20° around the z-axis. The micromanipulator weighs 0.8 g, measures 16 mm × 16 mm × 22 mm, and has a workspace of 0.7 mm³. The end platform of the micromanipulator can be equipped with various effectors to accomplish different kinds of tasks. Experimental results validated its high precision and bandwidth, exhibiting its potential to perform intricate micromanipulation tasks.

NeurIPS Conference 2024 Conference Paper

Flipped Classroom: Aligning Teacher Attention with Student in Generalized Category Discovery

  • Haonan Lin
  • Wenbin An
  • Jiahao Wang
  • Yan Chen
  • Feng Tian
  • Mengmeng Wang
  • Guang Dai
  • Qianying Wang

Recent advancements have shown promise in applying traditional Semi-Supervised Learning strategies to the task of Generalized Category Discovery (GCD). Typically, this involves a teacher-student framework in which the teacher imparts knowledge to the student to classify categories, even in the absence of explicit labels. Nevertheless, GCD presents unique challenges, particularly the absence of priors for new classes, which can lead to the teacher's misguidance and unsynchronized learning with the student, culminating in suboptimal outcomes. In our work, we delve into why traditional teacher-student designs falter in generalized category discovery as compared to their success in closed-world semi-supervised learning. We identify inconsistent pattern learning as the crux of this issue and introduce FlipClass—a method that dynamically updates the teacher to align with the student's attention, instead of maintaining a static teacher reference. Our teacher-attention-update strategy refines the teacher's focus based on student feedback, promoting consistent pattern recognition and synchronized learning across old and new classes. Extensive experiments on a spectrum of benchmarks affirm that FlipClass significantly surpasses contemporary GCD methods, establishing new standards for the field.

IJCAI Conference 2024 Conference Paper

Learning-Based Tracking-before-Detect for RF-Based Unconstrained Indoor Human Tracking

  • Zhi Wu
  • Dongheng Zhang
  • Zixin Shang
  • Yuqin Yuan
  • Hanqin Gong
  • Binquan Wang
  • Zhi Lu
  • Yadong Li

Existing efforts on human tracking using wireless signals are primarily focused on constrained scenarios with only a few individuals in empty spaces. However, in practical unconstrained scenarios with severe interference and attenuation, accurate multi-person tracking has been intractable. In this paper, we propose NeuralTBD, which combines the capability of deep models with the Tracking-Before-Detect (TBD) methodology to achieve accurate human tracking. TBD is a classical tracking methodology from signal processing that accumulates measurements in the time domain to distinguish target traces from interference; however, it relies on handcrafted shape/motion models, impeding its efficacy in complex indoor scenarios. To tackle this challenge, we build an end-to-end learning-based TBD framework that leverages the advanced modeling capabilities of deep models to significantly enhance TBD performance. To evaluate NeuralTBD, we collect an RF-based tracking dataset in unconstrained scenarios, encompassing 4 million annotated radar frames with up to 19 individuals acting in 6 different scenarios. NeuralTBD achieves a 70% improvement in performance over conventional TBD methods. To our knowledge, this is the first attempt to deal with RF-based unconstrained human tracking. The code and dataset will be released.
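Classical TBD, accumulating per-frame measurement energy along feasible motion paths before declaring a detection, can be shown with a minimal dynamic-programming sketch on a 1-D grid (a textbook-style toy with a hypothetical motion constraint, not NeuralTBD itself):

```python
import numpy as np

def tbd_accumulate(meas, max_step=1):
    """Classical Tracking-Before-Detect sketch: accumulate measurement
    energy over time along feasible paths on a 1-D grid, where a target
    may move at most `max_step` cells per frame. High accumulated score
    at the final frame indicates a consistent target trace."""
    T, X = meas.shape
    score = meas[0].copy()
    for t in range(1, T):
        # Best accumulated score reachable from the previous frame
        best_prev = np.array([
            score[max(0, x - max_step): x + max_step + 1].max()
            for x in range(X)
        ])
        score = meas[t] + best_prev
    return score

# Weak target moving right one cell per frame on an otherwise empty grid
meas = np.zeros((4, 6))
for t in range(4):
    meas[t, t] = 1.0
final = tbd_accumulate(meas)
```

The accumulated score peaks at the cell where the consistent trace ends, even though each single frame's return is weak; learning-based TBD replaces the handcrafted motion constraint with a trained model.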

JBHI Journal 2024 Journal Article

Robust Epileptic Seizure Detection Based on Biomedical Signals Using an Advanced Multi-View Deep Feature Learning Approach

  • Ijaz Ahmad
  • Zhenzhen Liu
  • Lin Li
  • Inam Ullah
  • Sunday Timothy Aboyeji
  • Xin Wang
  • Oluwarotimi Williams Samuel
  • Guanglin Li

Epilepsy is a neurological disorder characterized by abnormal neuronal discharges that manifest as life-threatening seizures. These are often monitored via EEG signals, a key aspect of biomedical signal processing (BSP). Accurate epileptic seizure (ES) detection significantly depends on the precise identification of key EEG features, which requires a deep understanding of the data's intrinsic domain. Therefore, this study presents an Advanced Multi-View Deep Feature Learning (AMV-DFL) framework based on machine learning (ML) technology to enhance the detection of relevant EEG signal features for ES. Our method initially applies a fast Fourier transform (FFT) on EEG data for traditional frequency domain feature (TFD-F) extraction and directly incorporates time domain (TD) features from the raw EEG signals, establishing a comprehensive traditional multi-view feature (TMV-F) set. Deep features are subsequently extracted autonomously from optimal layers of one-dimensional convolutional neural networks (1D CNN), resulting in multi-view deep features (MV-DF) integrating both time and frequency domains. A multi-view forest (MV-F), an interpretable rule-based advanced ML classifier, is used to construct a robust, generalized classification. Tree-based SHAP explainable artificial intelligence (T-XAI) is incorporated to interpret and explain the underlying rules. Experimental results confirm our method's superiority, surpassing models using TMV-FL and single-view deep features (SV-DF) by 4% and outperforming other state-of-the-art methods by an average of 3% in classification accuracy. The AMV-DFL approach aids clinicians in identifying EEG features indicative of ES, potentially discovering novel biomarkers, and improving diagnostic capabilities in epilepsy management.
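The FFT-based frequency-domain feature step can be illustrated with a minimal band-power sketch; the delta/theta/alpha/beta band edges below are the conventional EEG bands and may differ from the paper's exact feature set:

```python
import numpy as np

def band_powers(eeg_window, fs, bands=((0.5, 4), (4, 8), (8, 13), (13, 30))):
    """FFT-based frequency-domain features: summed spectral power in the
    conventional delta/theta/alpha/beta EEG bands for one signal window."""
    freqs = np.fft.rfftfreq(len(eeg_window), d=1.0 / fs)
    power = np.abs(np.fft.rfft(eeg_window)) ** 2
    return np.array([power[(freqs >= lo) & (freqs < hi)].sum()
                     for lo, hi in bands])

fs = 128
t = np.arange(fs * 2) / fs          # a 2-second window
x = np.sin(2 * np.pi * 10 * t)      # pure 10 Hz tone -> alpha band
feats = band_powers(x, fs)
```

In a full pipeline, vectors like `feats` (one per channel and window) would form the frequency-domain view, concatenated with raw time-domain features to build the multi-view input.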

NeurIPS Conference 2024 Conference Paper

Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing

  • Haonan Lin
  • Yan Chen
  • Jiahao Wang
  • Wenbin An
  • Mengmeng Wang
  • Feng Tian
  • Yong Liu
  • Guang Dai

Text-guided diffusion models have significantly advanced image editing, enabling high-quality and diverse modifications driven by text prompts. However, effective editing requires inverting the source image into a latent space, a process often hindered by prediction errors inherent in DDIM inversion. These errors accumulate during the diffusion process, resulting in inferior content preservation and edit fidelity, especially with conditional inputs. We address these challenges by investigating the primary contributors to error accumulation in DDIM inversion and identify the singularity problem in traditional noise schedules as a key issue. To resolve this, we introduce the Logistic Schedule, a novel noise schedule designed to eliminate singularities, improve inversion stability, and provide a better noise space for image editing. This schedule reduces noise prediction errors, enabling more faithful editing that preserves the original content of the source image. Our approach requires no additional retraining and is compatible with various existing editing methods. Experiments across eight editing tasks demonstrate the Logistic Schedule's superior performance in content preservation and edit fidelity compared to traditional noise schedules, highlighting its adaptability and effectiveness. The project page is available at https://lonelvino.github.io/SYE/.
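Only the abstract is shown, but the appeal of a logistic-shaped schedule can be sketched: a logistic curve for the cumulative signal rate stays strictly inside (0, 1), avoiding the endpoint values where inversion terms degenerate. The parameterization below is purely illustrative and is not the paper's Logistic Schedule:

```python
import numpy as np

def logistic_alpha_bar(T, k=10.0, t0=0.5):
    """Illustrative logistic-shaped cumulative signal rate alpha_bar(t).
    Schedules that reach exactly 1 at t=0 or 0 at t=T make ratios such as
    sqrt(1 - alpha_bar) / sqrt(alpha_bar) degenerate at the endpoints;
    a logistic curve remains strictly within (0, 1) everywhere.
    k (steepness) and t0 (midpoint) are assumed values, not the paper's."""
    t = np.linspace(0.0, 1.0, T)
    return 1.0 / (1.0 + np.exp(k * (t - t0)))

ab = logistic_alpha_bar(1000)
```

The schedule decreases monotonically from near 1 (mostly signal) to near 0 (mostly noise) without ever touching either endpoint.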

AAAI Conference 2024 Conference Paper

Transfer and Alignment Network for Generalized Category Discovery

  • Wenbin An
  • Feng Tian
  • Wenkai Shi
  • Yan Chen
  • Yaqiang Wu
  • Qianying Wang
  • Ping Chen

Generalized Category Discovery (GCD) is a crucial real-world task that aims to recognize both known and novel categories from an unlabeled dataset by leveraging another labeled dataset with only known categories. Despite the improved performance on known categories, current methods perform poorly on novel categories. We attribute the poor performance to two reasons: biased knowledge transfer between labeled and unlabeled data and noisy representation learning on the unlabeled data. The former leads to unreliable estimation of learning targets for novel categories and the latter hinders models from learning discriminative features. To mitigate these two issues, we propose a Transfer and Alignment Network (TAN), which incorporates two knowledge transfer mechanisms to calibrate the biased knowledge and two feature alignment mechanisms to learn discriminative features. Specifically, we model different categories with prototypes and transfer the prototypes in labeled data to correct model bias towards known categories. On the one hand, we pull instances with known categories in unlabeled data closer to these prototypes to form more compact clusters and avoid boundary overlap between known and novel categories. On the other hand, we use these prototypes to calibrate noisy prototypes estimated from unlabeled data based on category similarities, which allows for more accurate estimation of prototypes for novel categories that can be used as reliable learning targets later. After knowledge transfer, we further propose two feature alignment mechanisms to acquire both instance- and category-level knowledge from unlabeled data by aligning instance features with both augmented features and the calibrated prototypes, which can boost model performance on both known and novel categories with less noise. Experiments on three benchmark datasets show that our model outperforms SOTA methods, especially on novel categories. 
Theoretical analysis is provided for an in-depth understanding of our model. Our code and data are available at https://github.com/Lackel/TAN.

IJCAI Conference 2023 Conference Paper

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation

  • Yanrui Du
  • Jing Yan
  • Yan Chen
  • Jing Liu
  • Sendong Zhao
  • Qiaoqiao She
  • Hua Wu
  • Haifeng Wang

Recent research has revealed that deep neural networks often take dataset biases as a shortcut to make decisions rather than understand tasks, leading to failures in real-world applications. In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution of training data. In particular, we define a word that highly co-occurs with a specific label as a biased word, and an example containing a biased word as a biased example. Our analysis shows that biased examples are easier for models to learn, while at prediction time biased words make a significantly higher contribution to the models' predictions, and models tend to assign predicted labels by over-relying on the spurious correlation between words and labels. To mitigate models' over-reliance on this shortcut (i.e., the spurious correlation), we propose a training strategy, Less-Learn-Shortcut (LLS): our strategy quantifies the biased degree of biased examples and down-weights them accordingly. Experimental results on Question Matching, Natural Language Inference and Sentiment Analysis tasks show that LLS is a task-agnostic strategy and can improve model performance on adversarial data while maintaining good performance on in-domain data.
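
A minimal sketch of the down-weighting idea, under stated assumptions: here the biased degree of a word is estimated as the empirical co-occurrence probability P(label | word), and an example's biased degree is the maximum over its words. The function name, the exponent `gamma`, and these exact definitions are illustrative, not necessarily the paper's:

```python
from collections import Counter, defaultdict

def example_weights(examples, labels, gamma=1.0):
    """Down-weight examples whose words co-occur strongly with their own label.

    examples: list of token lists; labels: list of labels (same length).
    bias(w, y) = P(y | w) from co-occurrence counts; an example's biased
    degree is the max over its words, and its training weight shrinks as
    that degree grows.
    """
    word_label = defaultdict(Counter)
    for toks, y in zip(examples, labels):
        for w in set(toks):
            word_label[w][y] += 1
    weights = []
    for toks, y in zip(examples, labels):
        degrees = [word_label[w][y] / sum(word_label[w].values()) for w in set(toks)]
        bias = max(degrees) if degrees else 0.0
        # Small floor so no example is dropped entirely
        weights.append((1.0 - bias) ** gamma + 1e-6)
    return weights
```

The returned weights would then multiply each example's loss term during training, so strongly biased examples contribute less to the gradient.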

JBHI Journal 2023 Journal Article

Sleep Classification With Artificial Synthetic Imaging Data Using Convolutional Neural Networks

  • Lan Shi
  • Marianthie Wank
  • Yan Chen
  • Yibo Wang
  • Yachuan Liu
  • Emily C. Hector
  • Peter X.K. Song

Objective: We propose a new analytic framework, “Artificial Synthetic Imaging Data (ASID) Workflow,” for sleep classification from a wearable device comprising: 1) the creation of ASID from data collected by a non-invasive wearable device that permits real-time multi-modal physiological monitoring on heart rate (HR), 3-axis accelerometer, electrodermal activity, and skin temperature, denoted as “Temporal E4 Data” (TED) and 2) the use of an image classification supervised learning algorithm, convolutional neural network (CNN), to classify periods of sleep. Methods: We investigate ASID Workflow under 6 settings (3 data resolutions × 2 HR scenarios). Competing machine/deep learning classification algorithms, including logistic regression, support vector machine, random forest, k-nearest neighbors, and Long Short-Term Memory, are applied to TED as comparisons, termed “Competing Workflow.” Results: The ASID Workflow achieves excellent performance with mean weighted accuracy across settings of 94.7%, and is superior to the Competing Workflow with high and low resolution data regardless of the inclusion of HR modality. This superiority is maximized for low resolution data without HR. Additionally, CNN has a relatively low subject-wise test computational cost compared with competing algorithms. Conclusion: We demonstrate the utility of creating ASID from multi-modal physiological data and applying a preexisting image classification algorithm to achieve better classification accuracy. We shed light on the influence of data resolution and HR modality on the Workflow's performance. Significance: Applying CNN to ASID allows us to capture both temporal and spatial dependency among physiological variables and modalities by using 2D images' topological structure that competing algorithms fail to utilize.
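
One plausible way to turn a multi-modal sensor window into a 2D "image" for a CNN, as the ASID idea suggests, is to min-max scale each channel and tile channel rows so modalities sit adjacently along one image axis. This is only a hedged sketch of the general technique; the paper's actual ASID construction may differ, and the function name and `img_height` parameter are assumptions:

```python
import numpy as np

def window_to_image(window, img_height=64):
    """Convert one sensor window of shape (time, channels) into a 2D array.

    Each channel is min-max scaled to [0, 1], then repeated across rows so
    that a CNN can exploit spatial structure across modalities as well as
    temporal structure along the width axis.
    """
    t, c = window.shape
    lo = window.min(axis=0, keepdims=True)
    rng = window.max(axis=0, keepdims=True) - lo + 1e-8
    scaled = (window - lo) / rng                          # each channel in [0, 1]
    rows_per_channel = max(1, img_height // c)
    img = np.repeat(scaled.T, rows_per_channel, axis=0)   # (c * rows_per_channel, t)
    return img
```

The resulting array can be fed to any off-the-shelf image classifier, which is the appeal of the workflow: no bespoke time-series architecture is needed.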

NeurIPS Conference 2022 Conference Paper

Society of Agents: Regret Bounds of Concurrent Thompson Sampling

  • Yan Chen
  • Perry Dong
  • Qinxun Bai
  • Maria Dimakopoulou
  • Wei Xu
  • Zhengyuan Zhou

We consider the concurrent reinforcement learning problem where $n$ agents simultaneously learn to make decisions in the same environment by sharing experience with each other. Existing works in this emerging area have empirically demonstrated that Thompson sampling (TS) based algorithms provide a particularly attractive alternative for inducing cooperation, because each agent can independently sample a belief environment (and compute a corresponding optimal policy) from the joint posterior computed by aggregating all agents' data, which induces diversity in exploration among agents while still benefiting from the shared experience of all agents. However, theoretical guarantees in this area remain under-explored; in particular, no regret bound is known for TS based concurrent RL algorithms. In this paper, we fill this gap by considering two settings. In the first, we study the simple finite-horizon episodic RL setting, where TS is naturally adapted into the concurrent setup by having each agent sample from the current joint posterior at the beginning of each episode. We establish a $\tilde{O}(HS\sqrt{\frac{AT}{n}})$ per-agent regret bound, where $H$ is the horizon of the episode, $S$ is the number of states, $A$ is the number of actions, $T$ is the number of episodes and $n$ is the number of agents. In the second setting, we consider the infinite-horizon RL problem, where a policy is measured by its long-run average reward. Here, despite not having natural episodic breakpoints, we show that by a doubling-horizon schedule, we can adapt TS to the infinite-horizon concurrent learning setting to achieve a regret bound of $\tilde{O}(DS\sqrt{ATn})$, where $D$ is the standard notion of diameter of the underlying MDP and $T$ is the number of timesteps. Note that in both settings, the per-agent regret decreases at an optimal rate of $\Theta(\frac{1}{\sqrt{n}})$, which manifests the power of cooperation in concurrent RL.
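
The core mechanism (agents pooling data into one joint posterior while each sampling its own belief) is easiest to see in a Bernoulli bandit, a much simpler setting than the MDPs analyzed in the paper. The sketch below is illustrative only; all names and parameter values are assumptions:

```python
import numpy as np

def concurrent_ts(true_means, n_agents=4, horizon=500, seed=0):
    """Concurrent Thompson sampling on a Bernoulli bandit.

    All agents share one Beta posterior per arm, but each agent draws its own
    posterior sample every round, which keeps exploration diverse while every
    observation improves the shared posterior. Returns average per-pull reward.
    """
    rng = np.random.default_rng(seed)
    k = len(true_means)
    alpha = np.ones(k)
    beta = np.ones(k)                             # shared Beta(1, 1) priors
    total_reward = 0.0
    for _ in range(horizon):
        for _ in range(n_agents):
            theta = rng.beta(alpha, beta)         # independent belief sample per agent
            arm = int(np.argmax(theta))
            r = float(rng.random() < true_means[arm])
            alpha[arm] += r                       # every agent's data flows into
            beta[arm] += 1.0 - r                  # the one shared posterior
            total_reward += r
    return total_reward / (horizon * n_agents)
```

With more agents, the shared posterior concentrates faster per unit of wall-clock time, which is the intuition behind the $\Theta(1/\sqrt{n})$ per-agent regret improvement the abstract highlights.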

IJCAI Conference 2018 Conference Paper

IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

  • Qiangpeng Yang
  • Mengli Cheng
  • Wenmeng Zhou
  • Yan Chen
  • Minghui Qiu
  • Wei Lin

Incidental scene text detection, especially for multi-oriented text regions, is one of the most challenging tasks in many computer vision applications. Different from the common object detection task, scene text often suffers from a large variance of aspect ratio, scale, and orientation. To solve this problem, we propose a novel end-to-end scene text detector, IncepText, from an instance-aware segmentation perspective. We design a novel Inception-Text module and introduce deformable PSROI pooling to deal with multi-oriented text detection. Extensive experiments on ICDAR2015, RCTW-17, and MSRA-TD500 datasets demonstrate our method's superiority in terms of both effectiveness and efficiency. Our proposed method achieves the 1st place result on the ICDAR2015 challenge and state-of-the-art performance on the other datasets. Moreover, we have released our implementation as an OCR product which is available for public access.

AAAI Conference 2013 Conference Paper

From Interest to Function: Location Estimation in Social Media

  • Yan Chen
  • Jichang Zhao
  • Xia Hu
  • Xiaoming Zhang
  • Zhoujun Li
  • Tat-Seng Chua

Recent years have witnessed the tremendous development of social media, which attracts a vast number of Internet users. The high-dimension content generated by these users provides a unique opportunity to understand their behavior deeply. As one of the most fundamental topics, location estimation attracts more and more research effort. Different from the previous literature, we find that a user's location is strongly related to user interest. Based on this, we first build a detection model to mine user interest from short text. We then establish the mapping between location function and user interest before presenting an efficient framework to predict the user's location with convincing fidelity. Thorough evaluations and comparisons on an authentic data set show that our proposed model significantly outperforms state-of-the-art approaches. Moreover, the high efficiency of our model also guarantees its applicability in real-world scenarios.