Arrow Research search

Author name cluster

Nan Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

AAAI Conference 2026 Conference Paper

CrossCut: Cross-Patch Aware Interactive Segmentation for Remote Sensing Images

  • Zheng Lin
  • Nan Zhou
  • Yuhan Wang
  • Bojian Zhang

Interactive segmentation aims to delineate a user-specified target in an image by leveraging positive and negative clicks. While effective on natural images, existing methods often fail in remote sensing scenarios, where satellite imagery is characterized by ultra-high resolution, sparse object distribution, and significant scale variation. These factors hinder accurate segmentation of fine-grained targets like roads, buildings, and aircraft. To overcome these problems, we propose CrossCut, a novel interactive segmentation framework tailored for remote sensing imagery. Unlike previous approaches that either process the entire image or treat each patch independently, CrossCut enables simultaneous segmentation across multiple patches by propagating user click information to all patches. This design allows the model to fully utilize click guidance regardless of object location, effectively resolving the challenge of inter-patch information isolation. Furthermore, CrossCut supports flexible inference by allowing segmentation results from different patch configurations to be fused, enhancing both accuracy and robustness. Extensive evaluations across multiple remote sensing datasets demonstrate that CrossCut achieves state-of-the-art performance. Quantitative results and visualizations show that CrossCut significantly advances the field of interactive segmentation for remote sensing imagery.
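The cross-patch click propagation and multi-configuration fusion described in this abstract can be sketched roughly as follows. This is a minimal numpy illustration, not CrossCut's implementation: the patch sizes, the disc-shaped dummy model, and the averaging fusion rule are all assumptions for demonstration.

```python
import numpy as np

def split_into_patches(image, patch=256):
    """Tile an image into non-overlapping patches, recording each patch's origin."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            patches.append(((y, x), image[y:y + patch, x:x + patch]))
    return patches

def propagate_clicks(clicks, origin):
    """Re-express global click coordinates in a patch's local frame.
    Clicks falling outside the patch are kept: every patch still
    receives the guidance, which is the point of cross-patch awareness."""
    oy, ox = origin
    return [(cy - oy, cx - ox, label) for cy, cx, label in clicks]

def dummy_patch_model(tile, local_clicks):
    """Placeholder segmenter: each positive click votes a disc of foreground."""
    th, tw = tile.shape[:2]
    out = np.zeros((th, tw))
    yy, xx = np.mgrid[0:th, 0:tw]
    for cy, cx, label in local_clicks:
        if label == 1:
            out[(yy - cy) ** 2 + (xx - cx) ** 2 < 40 ** 2] = 1.0
    return out

def segment_with_fusion(image, clicks, patch_sizes=(256, 512)):
    """Segment under several patch configurations and fuse the
    probability maps by averaging, then threshold to a binary mask."""
    h, w = image.shape[:2]
    fused = np.zeros((h, w))
    for p in patch_sizes:
        prob = np.zeros((h, w))
        for origin, tile in split_into_patches(image, p):
            oy, ox = origin
            th, tw = tile.shape[:2]
            prob[oy:oy + th, ox:ox + tw] = dummy_patch_model(
                tile, propagate_clicks(clicks, origin))
        fused += prob / len(patch_sizes)
    return fused > 0.5
```

The key design point the abstract highlights is visible here: every patch sees every click (in its own coordinate frame), so guidance is never isolated to the patch the user happened to click in.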

JBHI Journal 2026 Journal Article

Uncertainty-Aware Cross-Modal Retrieval for Medical Report Generation

  • Nan Zhou
  • Meng Liu
  • Linchao He
  • Mengting Luo
  • Yidi Chen
  • Yi Zhang
  • Ke Zou
  • Hu Chen

Automatic medical report generation (MRG) has advanced significantly with retrieval-augmented strategies. However, existing methods face two persistent challenges: 1) a heavy reliance on single-modal retrieval, which limits multimodal semantic capture and cross-modal alignment; and 2) a lack of reliable information control, leading to irrelevant, noisy content and potential hallucinations. To address these limitations, we propose Uncertainty-aware Cross-modal Alignment and Refinement, named U-CAR, a unified framework that enhances both semantic integration and retrieval reliability. First, a cross-modal alignment module explicitly learns fine-grained correspondences between visual and textual representations, ensuring consistent semantics across modalities. This alignment guides the construction of dual-path retrieval-aware memory banks, one in the visual domain and one in the textual domain, enabling retrieval to capture complementary cues from both modalities. Second, we design a cross-modal retrieval-augmented generation strategy that jointly attends to the retrieved visual and textual context, thereby enriching semantic coverage and reinforcing the integration of multimodal evidence in the generated reports. In parallel, we introduce an uncertainty-aware refinement mechanism that quantifies generation confidence to adaptively determine whether retrieval is necessary. Experiments on the IU X-Ray and MIMIC-CXR datasets demonstrate that U-CAR outperforms current state-of-the-art methods, achieving a 9% improvement in CIDEr on IU X-Ray and a 4% gain in BLEU-4 on MIMIC-CXR. These results underscore U-CAR's effectiveness in generating accurate, coherent, and clinically relevant medical reports. Code is available at https://github.com/Zhounan1222/U-CAR/tree/main.
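The uncertainty-aware retrieval gating mentioned above can be illustrated with a toy sketch. The entropy measure and the threshold below are assumptions chosen for demonstration, not values from the paper:

```python
import numpy as np

def token_entropy(probs):
    """Shannon entropy (in nats) of each token's predictive distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def needs_retrieval(step_probs, threshold=1.0):
    """Gate retrieval on generation confidence: if the mean per-token
    entropy of the draft exceeds `threshold`, the draft is deemed
    uncertain and the cross-modal retrieval branch would be consulted;
    otherwise retrieval is skipped to avoid injecting irrelevant noise."""
    return float(token_entropy(step_probs).mean()) > threshold
```

A peaked distribution (confident generation) falls below the threshold, while a near-uniform one (uncertain generation) triggers retrieval.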

NeurIPS Conference 2025 Conference Paper

Implicit Modeling for Transferability Estimation of Vision Foundation Models

  • Yaoyan Zheng
  • Huiqun Wang
  • Nan Zhou
  • Di Huang

Transferability estimation identifies the best pre-trained models for downstream tasks without incurring the high computational cost of full fine-tuning. This capability facilitates deployment and advances the pre-training and fine-tuning paradigm. However, existing methods often struggle to accurately assess transferability for emerging pre-trained models with diverse architectures, training strategies, and task alignments. In this work, we propose Implicit Transferability Modeling (ITM), a novel framework that implicitly models each model’s intrinsic transferability, coupled with a Divide-and-Conquer Variational Approximation (DVA) strategy to efficiently approximate embedding space evolution. This design enables generalization across a broader range of models and downstream tasks. Extensive experiments on a comprehensive benchmark—spanning extensive training regimes and a wider variety of model types—demonstrate that ITM consistently outperforms existing methods in terms of stability, effectiveness, and efficiency.
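To make the task concrete: a transferability estimator scores frozen pre-trained embeddings on target data without any fine-tuning. The sketch below is a deliberately simple nearest-centroid proxy, not ITM's implicit modeling or its variational approximation; it only illustrates what such a score computes and how candidate models would be ranked.

```python
import numpy as np

def transferability_score(features, labels):
    """Toy transferability proxy: fit a nearest-class-centroid classifier
    on frozen pre-trained features and score the model by its accuracy on
    the target data. Embeddings that already separate the downstream
    classes score higher, with no fine-tuning performed."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    pred = classes[d.argmin(axis=1)]
    return float((pred == labels).mean())

def rank_models(model_features, labels):
    """Rank candidate pre-trained models by the proxy score, best first."""
    scores = {name: transferability_score(f, labels)
              for name, f in model_features.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

The point of methods like ITM is to produce such rankings reliably across heterogeneous architectures and training strategies, where naive proxies like this one break down.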

ICLR Conference 2025 Conference Paper

Progressive Parameter Efficient Transfer Learning for Semantic Segmentation

  • Nan Zhou
  • Huiqun Wang
  • Yaoyan Zheng
  • Di Huang 0001

Parameter Efficient Transfer Learning (PETL) excels in downstream classification fine-tuning with minimal computational overhead, demonstrating its potential within the pre-train and fine-tune paradigm. However, recent PETL methods consistently struggle when fine-tuning for semantic segmentation tasks, limiting their broader applicability. In this paper, we identify that fine-tuning for semantic segmentation requires larger parameter adjustments due to shifts in semantic perception granularity. Current PETL approaches are unable to effectively accommodate these shifts, leading to significant performance degradation. To address this, we introduce ProPETL, a novel approach that incorporates an additional midstream adaptation to progressively align pre-trained models for segmentation tasks. Through this process, ProPETL achieves state-of-the-art performance on most segmentation benchmarks and, for the first time, surpasses full fine-tuning on the challenging COCO-Stuff10k dataset. Furthermore, ProPETL demonstrates strong generalization across various pre-trained models and scenarios, highlighting its effectiveness and versatility for broader adoption in segmentation tasks. Code is available at: https://github.com/weeknan/ProPETL.
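As background for the abstract's claims, a generic PETL module looks like the bottleneck adapter sketched below. This is not ProPETL's midstream adaptation, only the standard adapter pattern it builds on; the dimensions and zero-initialization scheme are common conventions, assumed here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class Adapter:
    """Bottleneck adapter: the only trainable weights in a PETL setup.
    Down-projects d -> r, applies a nonlinearity, up-projects back to d,
    and adds the result residually. With the up-projection zero-initialized,
    the module is the identity at the start of fine-tuning, leaving the
    frozen backbone's behavior unchanged."""
    def __init__(self, d, r):
        self.down = rng.normal(0.0, 0.02, (d, r))
        self.up = np.zeros((r, d))           # zero init => identity at start

    def __call__(self, x):
        h = np.maximum(x @ self.down, 0.0)   # ReLU bottleneck
        return x + h @ self.up               # residual connection

def trainable_fraction(d, r, n_layers, backbone_params):
    """Fraction of parameters actually updated during PETL fine-tuning."""
    return n_layers * 2 * d * r / backbone_params
```

For a ViT-B-sized backbone (~86M parameters) with rank-16 adapters in 12 layers, well under 1% of the parameters are trained, which is the efficiency PETL trades against the larger adjustments segmentation demands.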

AAAI Conference 2022 Short Paper

A Discriminative and Robust Feature Learning Approach for EEG-Based Motor Imagery Decoding (Student Abstract)

  • Xiuyu Huang
  • Nan Zhou
  • Kup-Sze Choi

Convolutional neural networks (CNNs) have been widely applied to Electroencephalography (EEG)-based Motor Imagery (MI) classification, significantly pushing the boundary of the state of the art. To learn discriminative features while eliminating the negative effects of non-Gaussian noise and outliers in motor imagery data, in this abstract we propose a novel robust supervision signal, called Correntropy-based Center Loss (CCL), for CNN training, which uses the correntropy-induced distance as its objective measure. Encouragingly, a CNN trained with the combination of softmax loss and CCL outperforms state-of-the-art models on two public datasets.
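The loss the abstract describes can be sketched as follows. This is a minimal numpy illustration of the correntropy-induced distance applied to center loss, under an assumed Gaussian kernel bandwidth; the paper's training details (kernel width, loss weighting, center updates) are not reproduced here.

```python
import numpy as np

def correntropy_center_loss(features, labels, centers, sigma=1.0):
    """Correntropy-based center loss (CCL) sketch: replaces the squared
    Euclidean distance of ordinary center loss with the correntropy-
    induced distance 1 - exp(-||x - c||^2 / (2 * sigma^2)).
    Each sample's contribution saturates at 1, so outliers and heavy-
    tailed noise cannot dominate the objective the way they do under
    a plain squared-distance center loss."""
    d2 = ((features - centers[labels]) ** 2).sum(axis=1)
    return float((1.0 - np.exp(-d2 / (2.0 * sigma ** 2))).mean())
```

A feature lying exactly on its class center contributes zero loss, while an extreme outlier contributes at most 1, which is the robustness property the abstract claims.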

TIST Journal 2019 Journal Article

Short Text Analysis Based on Dual Semantic Extension and Deep Hashing in Microblog

  • Wanqiu Cui
  • Junping Du
  • Dawei Wang
  • Xunpu Yuan
  • Feifei Kou
  • Liyan Zhou
  • Nan Zhou

Short text analysis is a challenging task owing to the sparsity and limited semantics of short texts. Semantic extension approaches learn the meaning of a short text by introducing external knowledge. However, because short text descriptions in microblogs are written freely, traditional extension methods cannot accurately mine semantics suited to the microblog theme. Therefore, we use the prominent and refined hashtag information in microblogs, together with complex social relationships, to provide implicit guidance for the semantic extension of short text. Specifically, we design a deep hash model based on social and conceptual semantic extension, which consists of dual semantic extension and deep hashing representation. In the extension stage, the short text is first conceptualized to construct a hashtag graph in concept space. Then, associated hashtags are generated by correlation calculation, integrating social relationships and concepts, to extend the short text. In the deep hash model, we use semantic hashing to encode the abundant semantic features into a compact and meaningful binary code. Finally, extensive experiments demonstrate that our method learns and represents short texts well by using more meaningful semantic signals, effectively enhancing and guiding the semantic analysis and understanding of short text in microblogs.
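The final hashing step the abstract describes, turning learned semantic features into compact binary codes and retrieving by Hamming distance, can be sketched as follows. The zero-threshold binarization and the toy embeddings are illustrative assumptions, not the paper's trained model:

```python
import numpy as np

def binarize(embeddings):
    """Semantic hashing step: threshold real-valued text embeddings at
    zero so each dimension contributes one bit of a compact binary code."""
    return (embeddings > 0).astype(np.uint8)

def hamming_rank(query_code, codes):
    """Rank database items by Hamming distance to the query's code.
    Comparing bits is cheap, which is what makes hashing-based
    retrieval practical at microblog scale."""
    dists = (codes != query_code).sum(axis=1)
    return np.argsort(dists, kind="stable"), dists
```

Semantically similar short texts (after extension) should map to nearby codes, so near-duplicates surface at the top of the ranking.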