Author name cluster

Xiaoli Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

1 author row

AAAI Conference 2026 Conference Paper

Multimodal Graph Representation Learning with Dynamic Information Pathways

Xiaobin Hong
Mingkai Lin
Xiaoli Wang
Chaoqun Wang
Wenzhong Li

Multimodal graphs, where nodes contain heterogeneous features such as images and text, are increasingly common in real-world applications. Effectively learning on such graphs requires both adaptive intra-modal message passing and efficient inter-modal aggregation. However, most existing approaches to multimodal graph learning are typically extended from conventional graph neural networks and rely on static structures or dense attention, which limit flexibility and expressive node embedding learning. In this paper, we propose a novel multimodal graph representation learning framework with Dynamic information Pathways (DiP). By introducing modality-specific pseudo nodes, DiP enables dynamic message routing within each modality via proximity-guided pseudo-node interactions and captures inter-modality dependence through efficient information pathways in a shared state space. This design achieves adaptive, expressive, and sparse message propagation across modalities with linear complexity. We conduct the link prediction and node classification tasks to evaluate performance and carry out full experimental analyses. Extensive experiments across multiple benchmarks demonstrate that DiP consistently outperforms baselines.


AAAI Conference 2026 Conference Paper

PRIMP: PRobabilistically-Informed Motion Primitives for Efficient Affordance Learning from Demonstration (Abstract Reprint)

Sipu Ruan
Weixiao Liu
Xiaoli Wang
Xin Meng
Gregory S. Chirikjian

This paper proposes a learning-from-demonstration (LfD) method using probability densities on the workspaces of robot manipulators. The method, named PRobabilistically-Informed Motion Primitives (PRIMP), learns the probability distribution of the end effector trajectories in the 6D workspace that includes both positions and orientations. It is able to adapt to new situations such as novel via points with uncertainty and a change of viewing frame. The method itself is robot-agnostic, in that the learned distribution can be transferred to another robot with the adaptation to its workspace density. Workspace-STOMP, a new version of the existing STOMP motion planner, is also introduced, which can be used as a post-process to improve the performance of PRIMP and any other reachability-based LfD method. The combination of PRIMP and Workspace-STOMP can further help the robot avoid novel obstacles that are not present during the demonstration process. The proposed methods are evaluated with several sets of benchmark experiments. PRIMP runs more than 5 times faster than existing state-of-the-art methods while generalizing trajectories more than twice as close to both the demonstrations and novel desired poses. They are then combined with our lab’s robot imagination method that learns object affordances, illustrating the applicability to learn tool use through physical experiments.


AAAI Conference 2025 Conference Paper

Global-Semantic Alignment Distillation for Partial Multi-view Classification

Xiaoli Wang
Anqi Huang
Yongli Wang
Guanzhou Ke
Xiaobin Hong
Jun Liu

Partial multi-view classification (PMvC) poses a significant challenge due to the incomplete nature of multi-view data, which complicates effective information fusion and accurate classification. Existing PMvC methods typically rely on heuristic evaluations of view informativeness to achieve global alignment for downstream classification tasks. However, these approaches suffer from two critical issues: information redundancy and semantic misalignment. The complexity of missing data not only leads to over-reliance on redundant or less informative views but also exacerbates semantic misalignment across views, making it difficult for existing methods to effectively capture and discriminate the class-related features. To address these issues, this work proposes a novel GLobal-semantic Alignment Distillation (GLAD) model for partial multi-view classification without requiring imputation. Our approach incorporates a self-distillation mechanism that enables the model to extract informative features and achieve global semantic alignment across views. The key insight of GLAD is leveraging labels as semantic anchors to guide the alignment of partial multi-view features. By integrating labels with extracted features via a cross-attention mechanism, we generate ideal embeddings that consistently capture global semantics across views. These embeddings then serve as intermediate supervision for distilling the student model, ensuring robust semantic alignment even with missing views. We further introduce a margin-aware weighting strategy to enhance the model's discriminative ability. Extensive experimental results validate the effectiveness and superiority of the proposed method, showcasing significant improvements in classification performance over existing techniques.


EAAI Journal 2024 Journal Article

Deep learning-based image analysis for in situ microscopic imaging of cell culture process

Xiaoli Wang
Guangzheng Zhou
Lipeng Liang
Yuan Liu
An Luo
Zhenguo Wen
Xue Zhong Wang

Mammalian cell culture is an important bioprocess that directly affects the quality and yield of biopharmaceuticals. Traditionally, condition monitoring of the operation is based on periodic sampling and off-line analysis, which is labor intensive, time consuming, and causes time delays. In this work, an in situ microscope is investigated for on-line real-time monitoring of the culture process of Chinese hamster ovary cells, with a focus on the deep learning-based Mask R-CNN algorithm for image analysis. The model is trained on 184 images with 183,040 cells using data augmentation methods and a transfer learning technique. Mask R-CNN segmented the clustered cells more effectively than the conventional method combining edge detection, intensity thresholding, and advanced watershed, as well as the multi-scale edge detection method. Its Dice score, accuracy, precision, sensitivity, F1 score, specificity, and relative volume difference reach 0.862, 0.945, 0.901, 0.827, 0.862, 0.977, and 0.082, respectively. The evolution of geometrical features of cells was further analyzed, including equivalent diameter, circularity, aspect ratio, and eccentricity. The result demonstrated the great potential of deep learning technology in analysis of on-line images for optimization and control of the cell culture process.


YNIMG Journal 2024 Journal Article

Microstate-based brain network dynamics distinguishing temporal lobe epilepsy patients: A machine learning approach

Zihan Wei
Xinpei Wang
Chao Liu
Yan Feng
Yajing Gan
Yuqing Shi
Xiaoli Wang
Yonghong Liu

Temporal lobe epilepsy (TLE) stands as the predominant adult focal epilepsy syndrome, characterized by dysfunctional intrinsic brain dynamics. However, the precise mechanisms underlying seizures in these patients remain elusive. Our study encompassed 116 TLE patients compared with 51 healthy controls. Employing microstate analysis, we assessed brain dynamic disparities between TLE patients and healthy controls, as well as between drug-resistant epilepsy (DRE) and drug-sensitive epilepsy (DSE) patients. We constructed dynamic functional connectivity networks based on microstates and quantified their spatial and temporal variability. Utilizing these brain network features, we developed machine learning models to discriminate between TLE patients and healthy controls, and between DRE and DSE patients. Temporal dynamics in TLE patients exhibited significant acceleration compared to healthy controls, along with heightened synchronization and instability in brain networks. Moreover, DRE patients displayed notably lower spatial variability in certain parts of microstate B, E and F dynamic functional connectivity networks, while temporal variability in certain parts of microstate E and G dynamic functional connectivity networks was markedly higher in DRE patients compared to DSE patients. The machine learning model based on these spatiotemporal metrics effectively differentiated TLE patients from healthy controls and discerned DRE from DSE patients. The accelerated microstate dynamics and disrupted microstate sequences observed in TLE patients mirror highly unstable intrinsic brain dynamics, potentially underlying abnormal discharges. Additionally, the presence of highly synchronized and unstable activities in brain networks of DRE patients signifies the establishment of stable epileptogenic networks, contributing to the poor responsiveness to antiseizure medications. 
The model based on spatiotemporal metrics demonstrated robust predictive performance, accurately distinguishing both TLE patients from healthy controls and DRE patients from DSE patients.


AAAI Conference 2020 Conference Paper

Reinforced Curriculum Learning on Pre-Trained Neural Machine Translation Models

Mingjun Zhao
Haijiang Wu
Di Niu
Xiaoli Wang

The competitive performance of neural machine translation (NMT) critically relies on large amounts of training data. However, acquiring high-quality translation pairs requires expert knowledge and is costly. Therefore, how to best utilize a given dataset of samples with diverse quality and characteristics becomes an important yet understudied question in NMT. Curriculum learning methods have been introduced to NMT to optimize a model’s performance by prescribing the data input order, based on heuristics such as the assessment of noise and difficulty levels. However, existing methods require training from scratch, while in practice most NMT models are pre-trained on big data already. Moreover, as heuristics, they do not generalize well. In this paper, we aim to learn a curriculum for improving a pre-trained NMT model by re-selecting influential data samples from the original training set and formulate this task as a reinforcement learning problem. Specifically, we propose a data selection framework based on Deterministic Actor-Critic, in which a critic network predicts the expected change of model performance due to a certain sample, while an actor network learns to select the best sample out of a random batch of samples presented to it. Experiments on several translation datasets show that our method can further improve the performance of NMT when original batch training reaches its ceiling, without using additional new training data, and significantly outperforms several strong baseline methods.
