Arrow Research search

Author name cluster

Xunde Dong

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
2 author rows

Possible papers (2)

AAAI 2025 · Conference Paper

MSE-Adapter: A Lightweight Plugin Endowing LLMs with the Capability to Perform Multimodal Sentiment Analysis and Emotion Recognition

  • Yang Yang
  • Xunde Dong
  • Yupeng Qiang

Current multimodal sentiment analysis (MSA) and emotion recognition in conversations (ERC) methods based on pre-trained language models exhibit two primary limitations: 1) once trained for MSA and ERC tasks, these pre-trained language models lose their original generalized capabilities; 2) they demand considerable computational resources. As the size of pre-trained language models continues to grow, training larger multimodal sentiment analysis models with previous approaches could incur unnecessary computational cost. In response to this challenge, we propose the Multimodal Sentiment Analysis and Emotion Recognition Adapter (MSE-Adapter), a lightweight and adaptable plugin. This plugin enables a large language model (LLM) to carry out MSA or ERC tasks with minimal computational overhead (introducing only approximately 2.6M to 2.8M trainable parameters on top of 6/7B-parameter models), while preserving the intrinsic capabilities of the LLM. In the MSE-Adapter, the Text-Guide-Mixer (TGM) module is introduced to establish explicit connections between non-textual and textual modalities through the Hadamard product. This allows non-textual modalities to better align with textual modalities at the feature level, promoting the generation of higher-quality pseudo tokens. Extensive experiments were conducted on four public English and Chinese datasets using consumer-grade GPUs and open-source LLMs (Qwen-1.8B, ChatGLM3-6B-base, and LLaMA2-7B) as the backbone. The results demonstrate the effectiveness of the proposed plugin.
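The Hadamard-product fusion the abstract describes can be sketched as follows. This is an illustrative reading only, not the authors' implementation: the function name, feature shapes, and example values are all hypothetical; the one faithful detail is that coupling happens via element-wise multiplication of a non-textual feature vector with a textual one.

```python
import numpy as np

def text_guide_mixer(text_feat, nontext_feat):
    """Hypothetical sketch of a Hadamard-product fusion step.

    Element-wise multiplication modulates the non-textual feature
    vector by the textual one, so the two modalities are coupled
    at the feature level before any pseudo-token generation.
    """
    assert text_feat.shape == nontext_feat.shape
    return text_feat * nontext_feat  # Hadamard (element-wise) product

# Toy vectors standing in for text and audio features
t = np.array([1.0, 0.5, 2.0])
a = np.array([0.2, 0.4, 0.1])
fused = text_guide_mixer(t, a)
```

The appeal of this operation is its cost: it adds no trainable parameters of its own and scales linearly in the feature dimension, which fits the paper's emphasis on a lightweight plugin.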

ECAI 2025 · Conference Paper

MV-TSF: A Novel Multi-View Teacher-Student Framework for Myocardial Infarction Localization

  • Yupeng Qiang
  • Xunde Dong
  • Xiuling Liu
  • Yang Yang 0136
  • Fei Hu
  • Rongjia Wang

Myocardial infarction (MI) is a prevalent and serious cardiovascular condition. The 12-lead electrocardiogram (ECG) is essential for MI diagnosis, as it reveals unique electrical patterns from different heart locations. Accurate localization and assessment of MI require a comprehensive analysis of ECG signals from multiple views. However, previous research has primarily analyzed the 12-lead ECG from a single view, neglecting the variations in MI localization across different leads. Therefore, this study proposes a Multi-View Teacher-Student Framework (MV-TSF) for MI localization, which integrates multi-view learning and knowledge distillation. MV-TSF consists of two sub-networks: a Multi-View Teacher network (MVT-net) and a Single-View Student network (SVS-net). MVT-net treats the 12-lead ECG as five distinct views based on the correspondence between different heart regions and leads, while SVS-net uses the 12-lead ECG as input. Both sub-networks employ a multi-layer convolutional neural network structure for varied-scale ECG feature extraction. Additionally, MVT-net introduces an effective method for merging feature vectors from different views. Through knowledge distillation, knowledge is transferred from MVT-net to SVS-net, resulting in a Distilled SVS-net (DSVS-net) with only 0.35M parameters. Experimental results on two multi-label datasets indicate that DSVS-net is highly competitive, demonstrating exceptional parameter efficiency, inference speed, and model performance.
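The teacher-to-student transfer mentioned above is typically realized with a soft-label distillation loss. The sketch below shows the generic temperature-softened KL-divergence form of such a loss; the temperature value, function names, and softmax formulation are assumptions for illustration, not details taken from the MV-TSF paper (which, being multi-label, may use a different per-lead formulation).

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T yields softer targets."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Generic soft-label distillation term: KL divergence between
    temperature-softened teacher and student distributions, scaled
    by T^2 so gradient magnitudes stay comparable across temperatures.
    Hypothetical sketch, not the paper's exact objective.
    """
    p = softmax(teacher_logits, T)   # teacher's soft targets
    q = softmax(student_logits, T)   # student's soft predictions
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T
```

When the student's logits match the teacher's, the loss is zero; any mismatch is penalized, which is how the compact 0.35M-parameter student can inherit behavior from the larger multi-view teacher.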