Author name cluster

Yi Xiao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers

2 author rows

AAAI Conference 2026 Conference Paper

A Hybrid Space Model for Misaligned Multi-modality Image Fusion

Yi Xiao
Jia Wang
Zhu Liu
Di Wang
Jinyuan Liu
Risheng Liu

Infrared and visible image fusion aims to integrate complementary information, such as thermal saliency from infrared imagery and fine-grained texture details from visible imagery. However, real-world multi-modal misalignment and geometric deformation often introduce severe artifacts. Most existing methods focus on feature extraction within Euclidean space, thereby neglecting the inherent hierarchical structures embedded in multimodal representations. While Euclidean space excels at preserving local structural details and supporting efficient computation, hyperbolic space is naturally suited for modeling hierarchical relationships due to its geometric properties. Building upon these observations, this paper proposes a unified framework that jointly optimizes image registration and fusion through a dual-space architecture. This architecture synergistically combines the local fidelity of Euclidean geometry with the hierarchical modeling capability of hyperbolic geometry to enhance multimodal representation learning. Specifically, this paper introduces Hyperbolic Coupled Contrastive Learning Optimization (HCCLO), which aligns and optimizes the hierarchical structures of infrared and visible embeddings in hyperbolic space. Moreover, this paper designs a task-adaptive dual-space features fusion mechanism, which dynamically balances and fuses Euclidean local features with hyperbolic hierarchical representations, thereby improving adaptability for downstream tasks. Extensive experiments on misaligned multimodal datasets demonstrate that our method achieves state-of-the-art performance, while effectively capturing both spatial dependencies and hierarchical semantics.

AAAI Conference 2026 Conference Paper

Content-aware Information Compression and Selection for Whole Slide Image Analysis

Tingting Zheng
Hongxun Yao
Sicheng Zhao
Yi Xiao

Recent advances in multi-instance learning (MIL) have demonstrated impressive performance in whole slide image (WSI) analysis. However, current methods search for cues and draw conclusions from all instances or regions, resulting in excessive redundant computation and suboptimal representation quality due to interference from irrelevant and uninformative features. To address these issues, we propose CICS, an efficient and general framework that performs compact information compression and selection for high-efficiency WSI analysis. In particular, CICS features two key components: (1) context-aware compression (CAC), which partitions the instance space into sub-regions and applies learnable compression to discard irrelevant components, reducing computational complexity while facilitating information selection, and (2) global-proximity selective attention (GPSA), which cherry-picks the most informative representation with a proximity-assisted global dynamic selection strategy. Building upon these innovations, CICS forms a plug-and-play module that reduces computational complexity through compact instance representations while improving feature quality by preserving the most informative cues. Extensive experiments on six WSI classification and survival prediction datasets show that CICS consistently improves the performance of multiple representative MIL methods. It achieves accuracy gains of 2.5%, 7.7%, and 3.9% over the state-of-the-art Transformer-based TransMIL, Mamba-based MambaMIL, and graph-based WIKG methods, respectively, on the ESCA dataset.

JBHI Journal 2026 Journal Article

Piezoelectric Ceramic Sensor Array Based Obstructive Sleep Apnea Event Detection

Yi Liu
Zhengdong Li
Xiaomao Fan
Yingying Shao
Dikun Hu
Rong Huang
Yi Xiao
Boxuan Lv

Obstructive sleep apnea (OSA) is one of the major sleep disorders and has been demonstrated to be a high-risk factor for cardiovascular disease, hypertension, and motor vehicle accidents. Contactless pressure sensors are a promising way to monitor sleep conditions outside of the hospital. However, previous studies, mostly based on a limited number of sensors, are often subject to noise contamination and constrained by the sleeper's position relative to the pressure sensors. The acquired pressure signals are of poor quality or even lost, which makes them unsuitable for the downstream task of OSA event detection. To address this issue, we designed a sensitive piezoelectric ceramic sensor array (PCSA) by aligning sixteen sensors embedded into a mat covering the chest and abdomen area, which can capture the changes of weak pressure signals under a sleeping mattress with a thickness of up to 30 cm. Based on PCSA, we recruited 36 adult volunteers from the Peking Union Medical College Hospital and conducted a pilot study to acquire overnight pressure signals along with polysomnography recordings. Subsequently, we developed an automated OSA event detection method named DRFNet. The main advantage of DRFNet is that it captures time-domain and frequency-domain features from different views by fusing ResNet18 and DenseNet121 networks. Experimental results showed that DRFNet achieves 75.19% sensitivity, 87.78% specificity, and 81.48% accuracy, which is competitive with existing state-of-the-art methods. Combined with PCSA, it can potentially be deployed on an embedded device to provide contactless sleep monitoring in home settings.
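
For readers unfamiliar with the three metrics reported above, the following minimal sketch shows how sensitivity, specificity, and accuracy are conventionally derived from confusion-matrix counts. The function name and the counts are hypothetical illustrations; they are not taken from the paper or its data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # share of true events that were detected
    specificity = tn / (tn + fp)               # share of non-events correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all segments classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not data from the paper).
sens, spec, acc = binary_metrics(tp=3, fn=1, tn=7, fp=1)
print(f"{sens:.2%} {spec:.2%} {acc:.2%}")  # prints "75.00% 87.50% 83.33%"
```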

AAAI Conference 2026 Conference Paper

Semantics and Content Matter: Towards Multi-Prior Hierarchical Mamba for Image Deraining

Zhaocheng Yu
Kui Jiang
Junjun Jiang
Xianming Liu
Guanglu Sun
Yi Xiao

Rain significantly degrades the performance of computer vision systems, particularly in applications like autonomous driving and video surveillance. While existing deraining methods have made considerable progress, they often struggle with fidelity of semantic and spatial details. To address these limitations, we propose the Multi-Prior Hierarchical Mamba (MPHM) network for image deraining. This novel architecture synergistically integrates macro-semantic textual priors (CLIP) for task-level semantic guidance and micro-structural visual priors (DINOv2) for scene-aware structural information. To alleviate potential conflicts between heterogeneous priors, we devise a progressive Priors Fusion Injection (PFI) that strategically injects complementary cues at different decoder levels. Meanwhile, we equip the backbone network with an elaborate Hierarchical Mamba Module (HMM) to facilitate robust feature representation, featuring a Fourier-enhanced dual-path design that concurrently addresses global context modeling and local detail recovery. Comprehensive experiments demonstrate MPHM's state-of-the-art performance, achieving a 0.57 dB PSNR gain on the Rain200H dataset while delivering superior generalization on real-world rainy scenarios.

ICML Conference 2025 Conference Paper

Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling

Xiangxin Zhou
Mingyu Li
Yi Xiao
Jiahan Li
Dongyu Xue
Zaixiang Zheng
Jianzhu Ma
Quanquan Gu

Cyclic peptides offer inherent advantages in pharmaceuticals. For example, cyclic peptides are more resistant to enzymatic hydrolysis compared to linear peptides and usually exhibit excellent stability and affinity. Although deep generative models have achieved great success in linear peptide design, several challenges prevent the development of computational methods for designing diverse types of cyclic peptides. These challenges include the scarcity of 3D structural data on target proteins and associated cyclic peptide ligands, the geometric constraints that cyclization imposes, and the involvement of non-canonical amino acids in cyclization. To address the above challenges, we introduce CpSDE, which consists of two key components: AtomSDE, a generative structure prediction model based on harmonic SDE, and ResRouter, a residue type predictor. Utilizing a routed sampling algorithm that alternates between these two models to iteratively update sequences and structures, CpSDE facilitates the generation of cyclic peptides. By employing explicit all-atom and bond modeling, CpSDE overcomes existing data limitations and is proficient in designing a wide variety of cyclic peptides. Our experimental results demonstrate that the cyclic peptides designed by our method exhibit reliable stability and affinity.

ICLR Conference 2025 Conference Paper

Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows

Xiangxin Zhou
Yi Xiao
Haowei Lin
Xinheng He
Jiaqi Guan
Yang Wang 0103
Qiang Liu 0006
Feng Zhou

The dynamic nature of proteins, influenced by ligand interactions, is essential for comprehending protein function and progressing drug discovery. Traditional structure-based drug design (SBDD) approaches typically target binding sites with rigid structures, limiting their practical application in drug development. While molecular dynamics simulation can theoretically capture all the biologically relevant conformations, the transition rate is dictated by the intrinsic energy barrier between them, making the sampling process computationally expensive. To overcome the aforementioned challenges, we propose to use generative modeling for SBDD considering conformational changes of protein pockets. We curate a dataset of apo and multiple holo states of protein-ligand complexes, simulated by molecular dynamics, and propose a full-atom flow model (and a stochastic version), named DynamicFlow, that learns to transform apo pockets and noisy ligands into holo pockets and corresponding 3D ligand molecules. Our method uncovers promising ligand molecules and corresponding holo conformations of pockets. Additionally, the resultant holo-like states provide superior inputs for traditional SBDD approaches, playing a significant role in practical drug discovery.

NeurIPS Conference 2025 Conference Paper

LoRA-EnVar: Parameter-Efficient Hybrid Ensemble Variational Assimilation for Weather Forecasting

Yi Xiao
Hang Fan
Kun Chen
Ye Cao
Ben Fei
Wei Xue
Lei Bai

Accurate estimation of background error (i.e., forecast error) distribution is critical for effective data assimilation (DA) in numerical weather prediction (NWP). In state-of-the-art operational DA systems, it is common to account for the temporal evolution of background errors by employing hybrid methods, which blend a static climatological covariance with a flow-dependent ensemble-derived component. While effective to some extent, these methods typically assume Gaussian-distributed errors and rely heavily on hand-crafted covariance structures and domain expertise, limiting their ability to capture the complex, non-Gaussian nature of atmospheric dynamics. In this work, we propose LoRA-EnVar, a novel hybrid ensemble variational DA algorithm that integrates low-rank adaptation (LoRA) into a deep generative modeling framework. We first learn a climatological background error distribution using a variational autoencoder (VAE) trained on historical data. To incorporate flow-dependent uncertainty, we introduce LoRA modules that efficiently adapt the learned distribution in response to flow-dependent ensemble perturbations. Our approach supports online finetuning, enabling dynamic updates of the background error distribution without catastrophic forgetting. We validate LoRA-EnVar in high-resolution assimilation settings using the FengWu forecast model and simulated observations from ERA5 reanalysis. Experimental results show that LoRA-EnVar significantly improves assimilation accuracy over models assuming static background error distribution and achieves comparable or better performance than full finetuning while reducing the number of trainable parameters by three orders of magnitude. This demonstrates the potential of parameter-efficient adaptation for scalable, non-Gaussian DA in operational meteorology.
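
As background for the conventional hybrid methods mentioned in the abstract, one common textbook form of the blended covariance is sketched below. The symbols ($\beta$, $\mathbf{B}_{\mathrm{clim}}$, $\mathbf{B}_{\mathrm{ens}}$, $N$) are notation assumed here for illustration and are not taken from the paper; operational systems often use separate weights and add localization, which this sketch omits.

$$
\mathbf{B}_{\mathrm{hyb}} = (1-\beta)\,\mathbf{B}_{\mathrm{clim}} + \beta\,\mathbf{B}_{\mathrm{ens}},
\qquad
\mathbf{B}_{\mathrm{ens}} = \frac{1}{N-1}\sum_{i=1}^{N}\left(\mathbf{x}_i-\bar{\mathbf{x}}\right)\left(\mathbf{x}_i-\bar{\mathbf{x}}\right)^{\top},
\qquad 0 \le \beta \le 1,
$$

where $\mathbf{x}_1,\dots,\mathbf{x}_N$ are the ensemble forecast members and $\bar{\mathbf{x}}$ is their mean. Setting $\beta = 0$ recovers a purely static climatological covariance, and $\beta = 1$ a purely flow-dependent one.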

AAAI Conference 2025 Conference Paper

OODML: Whole Slide Image Classification Meets Online Pseudo-Supervision and Dynamic Mutual Learning

Tingting Zheng
Kui Jiang
Hongxun Yao
Yi Xiao
Zhongyuan Wang

Bag-label-based multi-instance learning (MIL) has demonstrated significant performance in whole slide image (WSI) analysis, particularly in pseudo-label-based learning schemes. However, due to inaccurate feature representation and interference, existing MIL methods often yield unreliable pseudo-labels, which spawn undesired predictions. To address these issues, we propose an Online Pseudo-Supervision and Dynamic Mutual Learning (OODML) framework that enhances pseudo-label generation and feature representation while exploring their mutual learning to improve bag-level prediction. Specifically, we design an Adaptive Memory Bank (AMB) to collect the most informative components of the current WSI. We also introduce a Self-Progressive Feature Fusion (SPFF) module that integrates label-related historical information from the AMB with current semantic variations, thereby enhancing the representation of pseudo-bag tokens. Furthermore, we propose a Decision Revision Pseudo-Label (DRPL) generation scheme to explore intrinsic connections between pseudo-bag representations and bag-label predictions, resulting in more reliable pseudo-label generation. To alleviate redundant and ambiguous representations, the class-wise prior of pseudo-label prediction is borrowed to facilitate label-related feature learning and to update the AMB, forming a mutual refinement between feature representation and pseudo-label generation. Additionally, a Dynamic Decision-Making (DDM) module is developed to harmonize explicit and implicit representations of bag information for more robust decision-making. Extensive experiments on four datasets demonstrate that our OODML surpasses the state-of-the-art by 3.3% and 6.9% on the CAMELYON16 and TCGA Lung datasets.

NeurIPS Conference 2025 Conference Paper

Spiking Meets Attention: Efficient Remote Sensing Image Super-Resolution with Attention Spiking Neural Networks

Yi Xiao
Qiangqiang Yuan
Kui Jiang
Wenke Huang
Qiang Zhang
Tingting Zheng
Chia-Wen Lin
Liangpei Zhang

Spiking neural networks (SNNs) are emerging as a promising alternative to traditional artificial neural networks (ANNs), offering biological plausibility and energy efficiency. Despite these merits, SNNs are frequently hampered by limited capacity and insufficient representation power, and they remain underexplored in remote sensing image (RSI) super-resolution (SR) tasks. In this paper, we first observe that spiking signals exhibit drastic intensity variations across diverse textures, highlighting an active learning state of the neurons. This observation motivates us to apply SNNs for efficient SR of RSIs. Inspired by the success of attention mechanisms in representing salient information, we devise the spiking attention block (SAB), a concise yet effective component that optimizes membrane potentials through inferred attention weights, which, in turn, regulates spiking activity for superior feature representation. Our key contributions include: 1) we bridge the independent modulation between temporal and channel dimensions, facilitating joint feature correlation learning, and 2) we access the global self-similar patterns in large-scale remote sensing imagery to infer spatial attention weights, incorporating effective priors for realistic and faithful reconstruction. Building upon SAB, we propose SpikeSR, which achieves state-of-the-art performance across various remote sensing benchmarks such as AID, DOTA, and DIOR, while maintaining high computational efficiency. Code of SpikeSR will be available at https://github.com/XY-boy/SpikeSR.

ICLR Conference 2025 Conference Paper

VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology

Yi Xiao
Qilong Jia
Kun Chen 0004
Lei Bai 0001
Wei Xue 0003

Data assimilation (DA) is an essential statistical technique for generating accurate estimates of a physical system's states by combining prior model predictions with observational data, especially in the realm of weather forecasting. Effectively modeling the prior distribution while adapting to diverse observational sources presents significant challenges for both traditional and neural network-based DA algorithms. This paper introduces VAE-Var, a novel neural network-based data assimilation algorithm aimed at 1) enhancing accuracy by capturing the non-Gaussian characteristics of the conditional background distribution $p(\mathbf{x}|\mathbf{x}_b)$, and 2) efficiently assimilating real-world observational data. VAE-Var utilizes a variational autoencoder to learn the background error distribution, with its decoder creating a variational cost function to optimize the analysis states. The advantages of VAE-Var include: 1) it maintains the framework of traditional variational assimilation, enabling it to accommodate various observation operators, particularly irregular observations; 2) it lessens the dependence on expert knowledge for constructing the background distribution, allowing for improved modeling of non-Gaussian structures; and 3) experimental results indicate that, when applied to the FengWu weather forecasting model, VAE-Var outperforms DiffDA and two traditional algorithms (interpolation and 3DVar) in terms of assimilation accuracy in sparse observational contexts, and is capable of assimilating real-world GDAS prepbufr observations over a year.
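
For context, the traditional variational framework that the abstract says VAE-Var maintains is usually written as the 3DVar cost function below. The notation ($\mathbf{B}$, $\mathbf{R}$, $\mathcal{H}$, $\mathbf{y}$) is the standard textbook one and is assumed here rather than taken from the paper.

$$
J(\mathbf{x}) = \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\top}\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_b)
+ \tfrac{1}{2}\,\bigl(\mathbf{y}-\mathcal{H}(\mathbf{x})\bigr)^{\top}\mathbf{R}^{-1}\bigl(\mathbf{y}-\mathcal{H}(\mathbf{x})\bigr),
$$

where $\mathbf{x}_b$ is the background state, $\mathbf{y}$ the observations, $\mathcal{H}$ the observation operator, and $\mathbf{B}$ and $\mathbf{R}$ the background and observation error covariances. One way a decoder could define such a cost, given purely as an illustrative assumption and not as the paper's exact formulation, is to optimize over a latent variable $\mathbf{z}$ with a standard normal prior and a hypothetical decoder $D$ that maps $\mathbf{z}$ to a background error:

$$
J(\mathbf{z}) = \tfrac{1}{2}\,\lVert\mathbf{z}\rVert^{2}
+ \tfrac{1}{2}\,\bigl(\mathbf{y}-\mathcal{H}(\mathbf{x}_b + D(\mathbf{z}))\bigr)^{\top}\mathbf{R}^{-1}\bigl(\mathbf{y}-\mathcal{H}(\mathbf{x}_b + D(\mathbf{z}))\bigr).
$$

Because $\mathcal{H}$ appears only inside the observation term, a formulation of this shape can accommodate different observation operators, including irregular ones.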

YNIMG Journal 2024 Journal Article

Dance of two brains: Interval subdivision in alternated condition enhances resistance to interference by others

Ruoyu Niu
Xiaodan Xu
Weicai Tang
Yi Xiao
Rixin Tang

The accomplishment of interpersonal sensorimotor synchronization is a challenging endeavor because it requires the achievement of a balance between accurate temporal control within individuals and smooth communication between them. This raises a critical question: How does the brain comprehend and process the perceptual information of others to guarantee accurate temporal control of action goals in a social context? A joint synchronization-continuation tapping task was conducted together with varying relative phases (0°/180°) and intervals of tempos (400 ms/800 ms/1600 ms) while neural data was collected using fNIRS (functional near-infrared spectroscopy). Individuals showed better behavioral performance and greater interpersonal brain synchronization (IBS) in the left dorsolateral prefrontal cortex at alternated condition (180° relative phase) compared to symmetric condition (0° relative phase), suggesting that the individual can better maintain behavioral performance and show improved IBS when the partner taps between the individual's gaps. Meanwhile, in most levels of alternated condition, IBS is inversely proportional to interference from partner, implying the counteraction of IBS against interference from others. In addition, when the interval of tempo was 1600 ms, behavioral performance showed a sharp decline, accompanied by a decrease in IBS, reflecting that IBS in SMS reflects effective information exchange between individuals rather than ineffective interference with each other. This study provides insight into the mechanisms underlying sensorimotor synchronization between individuals.

EAAI Journal 2024 Journal Article

Interpreting the influential factors in ship detention using a novel random forest algorithm considering dataset imbalance and uncertainty

Yi Xiao
Mengjie Jin
Guanqiu Qi
Wenming Shi
Kevin X. Li
Xianping Du

Port State Control inspects foreign ships in national ports to verify that ships' conditions and equipment obey international regulations and that the ships are crewed and operated in accordance with these regulations. Port State Control has proven useful in ensuring a “safer ship and cleaner ocean.” To support the effectiveness and efficiency of inspections, targeted ships should only be considered if they are at a high risk for accidents. The key factors for ship selection have been included in inspection regimes, but their combined effect on causing ship detention is unclear. Meanwhile, certain factors are characterized by data uncertainty that may influence inspection results and even the time window of an inspection. Furthermore, although tens of thousands of inspection data items are produced yearly, the probability of ship detention is around 3%. Therefore, a new uncertain random forest algorithm has been developed to address factor uncertainty and data imbalances. This algorithm generates rules for the relationships between the multi-factors and ship detention with high accuracy and robustness performance. Based on uncertain random forest models, the following three results are presented. First, the optimal data balancing strategy is a detention ratio of 30% rather than 50%, which could better balance inspection accuracy and efficiency. Second, data uncertainty influences the prediction probability of ship detention; as the uncertainty interval range increases, the prediction probability decreases. Third, the uncertain random forest algorithm generates Port State Control's association rules. Thus, this algorithm can help port authorities identify substandard vessels more efficiently.
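
To make the idea of balancing to a 30% detention ratio concrete, here is a minimal sketch that randomly undersamples the majority class until the minority class reaches a target share. Random undersampling is an assumption chosen for illustration; the abstract does not say which resampling technique the paper uses. The function name, record layout, and toy data are all hypothetical.

```python
import random


def undersample_to_ratio(records, is_minority, target_ratio, seed=0):
    """Randomly drop majority-class records until the minority class
    makes up roughly target_ratio of the returned sample."""
    if not 0 < target_ratio < 1:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    minority = [r for r in records if is_minority(r)]
    majority = [r for r in records if not is_minority(r)]
    # Solve n_min / (n_min + n_maj) = target_ratio for n_maj.
    wanted = round(len(minority) * (1 - target_ratio) / target_ratio)
    n_majority = min(len(majority), wanted)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = minority + rng.sample(majority, n_majority)
    rng.shuffle(sample)
    return sample


# Hypothetical toy data with a 3% detention rate (30 of 1000 inspections).
inspections = [{"detained": i < 30} for i in range(1000)]
balanced = undersample_to_ratio(inspections, lambda r: r["detained"], 0.30)
detained = sum(r["detained"] for r in balanced)
print(len(balanced), detained)  # prints "100 30"
```

All minority records are kept, so the resulting sample size is driven by the number of detentions; a lower target ratio such as 30% retains more of the majority class than a 50% split would.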

ICML Conference 2024 Conference Paper

Towards a Self-contained Data-driven Global Weather Forecasting Framework

Yi Xiao
Lei Bai 0001
Wei Xue 0003
Hao Chen 0045
Kun Chen 0004
Kang Chen
Tao Han 0002
Wanli Ouyang

Data-driven weather forecasting models are advancing rapidly, yet they rely on initial states (i.e., analysis states) typically produced by traditional data assimilation algorithms. Four-dimensional variational assimilation (4DVar) is one of the most widely adopted data assimilation algorithms in numerical weather prediction centers; it is accurate but computationally expensive. In this paper, we aim to couple the AI forecasting model, FengWu, with 4DVar to build a self-contained data-driven global weather forecasting framework, FengWu-4DVar. To achieve this, we propose an AI-embedded 4DVar algorithm that includes three components: (1) a 4DVar objective function embedded with the FengWu forecasting model and its error representation to enhance efficiency and accuracy; (2) a spherical-harmonic-transform-based (SHT-based) approximation strategy for capturing the horizontal correlation of background error; and (3) an auto-differentiation (AD) scheme for determining the optimal analysis fields. Experimental results show that under the ERA5 simulated observational data with varying proportions and noise levels, FengWu-4DVar can generate accurate analysis fields; remarkably, it has achieved stable self-contained global weather forecasts for an entire year for the first time, demonstrating its potential for real-world applications. Additionally, our framework is approximately 100 times faster than the traditional 4DVar algorithm under similar experimental conditions, highlighting its significant computational efficiency.
