EAAI Journal 2026 Journal Article
A dual-stream regional feature learning and adaptive fusion method for electroencephalogram-based emotion recognition
- Yong Yang
- Wenhao Wang
- Kaibo Shi
- Yuanlun Xie
- Nan Zhou
- Shiping Wen
- Ming Zhu
- Badong Chen
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
EAAI Journal 2026 Journal Article
AAAI Conference 2026 Conference Paper
Interactive segmentation aims to delineate a user-specified target in an image by leveraging positive and negative clicks. While effective on natural images, existing methods often fail in remote sensing scenarios, where satellite imagery is characterized by ultra-high resolution, sparse object distribution, and significant scale variation. These factors hinder accurate segmentation of fine-grained targets like roads, buildings, and aircraft. To overcome these problems, we propose CrossCut, a novel interactive segmentation framework tailored for remote sensing imagery. Unlike previous approaches that either process the entire image or treat each patch independently, CrossCut enables simultaneous segmentation across multiple patches by propagating user click information to all patches. This design allows the model to fully utilize click guidance regardless of object location, effectively resolving the challenge of inter-patch information isolation. Furthermore, CrossCut supports flexible inference by allowing segmentation results from different patch configurations to be fused, enhancing both accuracy and robustness. Extensive evaluations across multiple remote sensing datasets demonstrate that CrossCut achieves state-of-the-art performance. Quantitative results and visualizations show that CrossCut significantly advances the field of interactive segmentation for remote sensing imagery.
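The abstract's "flexible inference" idea, fusing segmentation results from different patch configurations, can be illustrated with a minimal sketch. This is not CrossCut's actual implementation: `predict_patch` is a hypothetical stand-in for the per-patch model, and the fusion shown is simple per-pixel probability averaging over several tilings.

```python
import numpy as np

def tile_slices(h, w, patch):
    """Yield row/col slices covering an h x w image with a given patch size."""
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            yield slice(r, min(r + patch, h)), slice(c, min(c + patch, w))

def fuse_patch_predictions(image, predict_patch, patch_sizes):
    """Average per-pixel foreground probabilities over several patch tilings.

    `predict_patch` stands in for the per-patch segmentation model and must
    return a probability map with the same spatial shape as its input patch.
    """
    h, w = image.shape[:2]
    prob_sum = np.zeros((h, w), dtype=np.float64)
    count = np.zeros((h, w), dtype=np.float64)
    for patch in patch_sizes:
        for rs, cs in tile_slices(h, w, patch):
            prob_sum[rs, cs] += predict_patch(image[rs, cs])
            count[rs, cs] += 1
    return prob_sum / count  # fused probability map in [0, 1]
```

Averaging over tilings of different granularities is one plausible way to trade off fine detail (small patches) against global context (large patches); the paper's fusion rule may differ.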
JBHI Journal 2026 Journal Article
Automatic medical report generation (MRG) has advanced significantly with retrieval-augmented strategies. However, existing methods face two persistent challenges: 1) a heavy reliance on single-modal retrieval, which limits multimodal semantic capture and cross-modal alignment; and 2) a lack of reliable information control, leading to irrelevant noisy content and potential hallucinations. To address these limitations, we propose Uncertainty-aware Cross-modal Alignment and Refinement, named U-CAR, a unified framework that enhances both semantic integration and retrieval reliability. First, a cross-modal alignment module explicitly learns fine-grained correspondences between visual and textual representations, ensuring consistent semantics across modalities. This alignment guides the construction of dual-path retrieval-aware memory banks, with one in the visual domain and one in the textual domain, enabling retrieval to capture complementary cues from both modalities. Second, we design a cross-modal retrieval-augmented generation strategy that jointly attends to the retrieved visual and textual context, thereby enriching semantic coverage and reinforcing the integration of multi-modal evidence in the generated reports. In parallel, we introduce an uncertainty-aware refinement mechanism that quantifies generation confidence to adaptively determine the necessity of retrieval. Experiments on the IU X-Ray and MIMIC-CXR datasets demonstrate that U-CAR outperforms the current state-of-the-art methods, achieving a 9% improvement in CIDEr on IU X-Ray and a 4% gain in BLEU-4 on MIMIC-CXR. These results underscore U-CAR's effectiveness in generating accurate, coherent, and clinically relevant medical reports. Code is available at https://github.com/Zhounan1222/U-CAR/tree/main.
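The uncertainty-aware refinement idea, quantifying generation confidence to decide whether retrieval is needed, can be sketched in a minimal form. This is an illustrative sketch, not U-CAR's actual mechanism: it uses mean per-token entropy of the model's predictive distributions as the confidence signal, and the threshold value is an assumption.

```python
import numpy as np

def token_entropy(probs):
    """Shannon entropy (in nats) of each token's predictive distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def needs_retrieval(step_probs, threshold=1.0):
    """Gate retrieval on generation confidence: trigger retrieval only when
    the average per-token entropy exceeds a (hypothetical) threshold."""
    return float(token_entropy(np.asarray(step_probs)).mean()) > threshold
```

A confident decoder (peaked distributions) skips retrieval, while an uncertain one (near-uniform distributions) pulls in retrieved context, which matches the abstract's goal of avoiding irrelevant retrieved noise when the model is already sure.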
EAAI Journal 2025 Journal Article
NeurIPS Conference 2025 Conference Paper
Transferability estimation identifies the best pre-trained models for downstream tasks without incurring the high computational cost of full fine-tuning. This capability facilitates deployment and advances the pre-training and fine-tuning paradigm. However, existing methods often struggle to accurately assess transferability for emerging pre-trained models with diverse architectures, training strategies, and task alignments. In this work, we propose Implicit Transferability Modeling (ITM), a novel framework that implicitly models each model’s intrinsic transferability, coupled with a Divide-and-Conquer Variational Approximation (DVA) strategy to efficiently approximate embedding space evolution. This design enables generalization across a broader range of models and downstream tasks. Extensive experiments on a comprehensive benchmark—spanning extensive training regimes and a wider variety of model types—demonstrate that ITM consistently outperforms existing methods in terms of stability, effectiveness, and efficiency.
EAAI Journal 2025 Journal Article
ICLR Conference 2025 Conference Paper
Parameter Efficient Transfer Learning (PETL) excels in downstream classification fine-tuning with minimal computational overhead, demonstrating its potential within the pre-train and fine-tune paradigm. However, recent PETL methods consistently struggle when fine-tuning for semantic segmentation tasks, limiting their broader applicability. In this paper, we identify that fine-tuning for semantic segmentation requires larger parameter adjustments due to shifts in semantic perception granularity. Current PETL approaches are unable to effectively accommodate these shifts, leading to significant performance degradation. To address this, we introduce ProPETL, a novel approach that incorporates an additional midstream adaptation to progressively align pre-trained models for segmentation tasks. Through this process, ProPETL achieves state-of-the-art performance on most segmentation benchmarks and, for the first time, surpasses full fine-tuning on the challenging COCO-Stuff10k dataset. Furthermore, ProPETL demonstrates strong generalization across various pre-trained models and scenarios, highlighting its effectiveness and versatility for broader adoption in segmentation tasks. Code is available at: https://github.com/weeknan/ProPETL.
YNIMG Journal 2024 Journal Article
AAAI Conference 2022 Short Paper
Convolutional neural networks (CNNs) have been commonly applied in the area of Electroencephalography (EEG)-based Motor Imagery (MI) classification, significantly pushing the boundary of the state-of-the-art. In order to simultaneously decode the discriminative features and eliminate the negative effects of non-Gaussian noise and outliers in the motor imagery data, in this abstract, we propose a novel robust supervision signal, called Correntropy based Center Loss (CCL), for CNN training, which utilizes the correntropy induced distance as the objective measure. It is encouraging to see that the CNN model trained by the combination of softmax loss and CCL loss outperforms the state-of-the-art models on two public datasets.
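As a minimal sketch of the idea, a correntropy-induced center loss replaces the squared distance of the standard center loss with a bounded Gaussian-kernel distance, so outlier samples saturate instead of dominating the gradient. This is an illustrative NumPy version under that standard formulation, not the paper's exact implementation; the kernel bandwidth `sigma` is an assumed hyperparameter.

```python
import numpy as np

def correntropy_center_loss(features, labels, centers, sigma=1.0):
    """Correntropy-induced center loss: a bounded, outlier-robust
    alternative to the squared-distance center loss.

    `centers[k]` is the learned center of class k; each sample contributes
    1 - exp(-||x_i - c_{y_i}||^2 / (2 sigma^2)), which is at most 1.
    """
    diff = features - centers[labels]     # (N, D) residuals to class centers
    sq = (diff ** 2).sum(axis=1)          # squared Euclidean distances
    return float((1.0 - np.exp(-sq / (2 * sigma ** 2))).sum())
```

Because each term is bounded by 1, a gross outlier adds at most a constant to the loss, which is the robustness property the abstract attributes to the correntropy-induced distance.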
TCS Journal 2022 Journal Article
TIST Journal 2019 Journal Article
Short text analysis is a challenging task owing to the sparsity and limited semantics of short texts. The semantic extension approach learns the meaning of a short text by introducing external knowledge. However, because of the randomness of short text descriptions in microblogs, traditional extension methods cannot accurately mine the semantics suitable for the microblog theme. Therefore, we use the prominent and refined hashtag information in microblogs, as well as complex social relationships, to provide implicit guidance for the semantic extension of short text. Specifically, we design a deep hash model based on social and conceptual semantic extension, which consists of dual semantic extension and deep hashing representation. In the extension method, the short text is first conceptualized to construct a hashtag graph in the conceptual space. Then, the associated hashtags are generated by correlation calculation based on the integration of social relationships and concepts to extend the short text. In the deep hash model, we use the semantic hashing model to encode the abundant semantic features into a compact and meaningful binary encoding. Finally, extensive experiments demonstrate that our method can learn and represent short texts well by using more meaningful semantic signals. It can effectively enhance and guide the semantic analysis and understanding of short text in microblogs.
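The final step, encoding semantic features into compact binary codes, can be illustrated with a minimal sign-based binarization. The paper's hash layer is learned end-to-end; the functions below are only a hypothetical sketch of how continuous embeddings become binary codes compared by Hamming distance.

```python
import numpy as np

def binarize(embeddings):
    """Map continuous semantic embeddings to binary codes by sign."""
    return (np.asarray(embeddings) > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Number of differing bits between two binary codes; small distance
    means the underlying short texts are semantically close."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))
```

Binary codes make nearest-neighbor search over large microblog collections cheap, since Hamming distance reduces to bit operations, which is the usual motivation for semantic hashing.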