Arrow Research search

Author name cluster

Kun Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
1 author row

Possible papers

AAAI Conference 2026 Conference Paper

SGAT: Learning Feature Matching with Singularity-enhanced Graph Attention Network

  • Yizhuo Zhang
  • Kun Sun
  • Chang Tang
  • Yuanyuan Liu
  • Xin Li

The task of image feature matching aims to establish correct correspondences between images from two different views. While approaches based on attention mechanisms have demonstrated remarkable advancements in image feature matching, they still encounter substantial limitations. Specifically, current graph attention network approaches face performance bottlenecks in complex scenarios, such as low-texture regions or occlusions. This limitation stems from the self-attention mechanism, which, when lacking effective guidance, can lead to divergent attention weights or incorrect focus on regions with low discriminability, resulting in matching failures in low-texture environments. Inspired by how humans focus on distinctive regions when performing cross-view matching, we enhance attention to singular points in images that are salient, unique and have high cross-view matching potential during information aggregation, thereby improving matching capability. To realize the aforementioned strategies, we develop a novel Singularity-enhanced Graph Attention Network (SGAT). SGAT leverages Co-potentiality and Multi-Scale Singularity as prior guidance, and designs a Singularity-aware Attention mechanism and a Co-potentiality Guided Attention mechanism, specifically enhancing the perception of singularity and matching potential during feature interaction. Experimental results on multiple datasets, including ScanNet1500, demonstrate that our method outperforms current state-of-the-art sparse matching methods. In particular, the improvement is most pronounced in complex scenarios such as low-texture environments, significantly enhancing the accuracy and robustness of image matching and its downstream tasks.

AAAI Conference 2026 Conference Paper

When Genes Speak: A Semantic-Guided Framework for Spatially Resolved Transcriptomics Data Clustering

  • Jiangkai Long
  • Yanran Zhu
  • Chang Tang
  • Kun Sun
  • Yuanyuan Liu
  • Xuesong Yan

Spatial transcriptomics enables gene expression profiling with spatial context, offering unprecedented insights into the tissue microenvironment. However, most computational models treat genes as isolated numerical features, ignoring the rich biological semantics encoded in their symbols. This prevents a truly deep understanding of critical biological characteristics. To overcome this limitation, we present SemST, a semantic-guided deep learning framework for spatial transcriptomics data clustering. SemST leverages Large Language Models (LLMs) to enable genes to "speak" through their symbolic meanings, transforming gene sets within each tissue spot into biologically informed embeddings. These embeddings are then fused with the spatial neighborhood relationships captured by Graph Neural Networks (GNNs), achieving a coherent integration of biological function and spatial structure. We further introduce the Fine-grained Semantic Modulation (FSM) module to optimally exploit these biological priors. The FSM module learns spot-specific affine transformations that empower the semantic embeddings to perform an element-wise calibration of the spatial features, thus dynamically injecting high-order biological knowledge into the spatial context. Extensive experiments on public spatial transcriptomics datasets show that SemST achieves state-of-the-art clustering performance. Crucially, the FSM module exhibits plug-and-play versatility, consistently improving the performance when integrated into other baseline methods.
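
The element-wise calibration described for the FSM module can be pictured with a minimal FiLM-style sketch. This is not the authors' implementation: the function names (`linear`, `modulate`), the weight matrices, and the toy dimensions are all hypothetical, and the abstract does not say how the affine parameters are actually produced.

```python
# Hypothetical sketch of semantic-guided affine modulation (FiLM-style).
# Names, shapes, and weights are illustrative assumptions, not SemST's code.

def linear(weights, bias, x):
    """Dense layer: `weights` holds one row per output dimension."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def modulate(spatial, semantic, w_gamma, b_gamma, w_beta, b_beta):
    """Calibrate one spot's spatial feature with its semantic embedding."""
    gamma = linear(w_gamma, b_gamma, semantic)  # spot-specific scale
    beta = linear(w_beta, b_beta, semantic)     # spot-specific shift
    # Element-wise affine transformation of the spatial feature.
    return [g * h + b for g, h, b in zip(gamma, spatial, beta)]


# Toy spot: 2-dimensional spatial feature, 3-dimensional semantic embedding.
spatial = [1.0, 2.0]
semantic = [1.0, 0.0, 1.0]
w_gamma = [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
b_gamma = [0.0, 0.0]
w_beta = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b_beta = [0.5, 0.0]

calibrated = modulate(spatial, semantic, w_gamma, b_gamma, w_beta, b_beta)
print(calibrated)  # [1.5, 5.0]
```

Because the scale and shift are computed from each spot's own semantic embedding, two spots with identical spatial features can still end up with different calibrated features.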

JBHI Journal 2025 Journal Article

An Improved Microbial Object Detection Method for Low-Contrast and Occluded Scenarios Based on SMA-YOLOv8s

  • Kun Sun
  • Zhenqiang Song
  • Jiaxing Zhang
  • Shiyu Liu
  • Yu Zhang
  • Qinghao Song
  • Qing Wu

Accurate detection and localization of microbial targets are critical for microbial trajectory tracking and analysis. However, microscopic microorganism images often exhibit low contrast and mutual occlusion between targets, which pose significant challenges for accurate microbial object detection because shallow-layer information is insufficiently distinguishable and occluded targets are inadequately represented. To address these issues, a novel method, SMA-YOLOv8s, is proposed for microbial object detection. Firstly, the traditional strided convolutions are replaced with SPD-Conv in the downsampling stages of YOLOv8s to retain shallow-layer information. Secondly, a feature fusion strategy that integrates Cascaded Group Attention with Scale Sequence Feature Fusion (CSFF) is proposed, which enriches contextual feature representation for better detection of occluded targets. Thirdly, the Wise-IoU loss function is employed to optimize bounding box regression, improving localization precision. Experimental evaluations on BCCD, CTMCv1, and a self-constructed microscopic microorganism dataset demonstrate that SMA-YOLOv8s achieves mAP50 scores of 95.5%, 90.3%, and 81.7%, respectively, surpassing baseline methods in overall performance. These results highlight the robustness and effectiveness of the proposed method in detecting microbial targets under low contrast and occlusion conditions.
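
As a rough illustration of why a space-to-depth step retains shallow-layer information that a strided convolution would discard, the sketch below rearranges each channel into sub-sampled copies instead of skipping pixels. It covers only the rearrangement; in SPD-Conv this step is followed by a non-strided convolution, which is omitted here. The function name `space_to_depth` and the output channel ordering are assumptions made for this example, not taken from the paper.

```python
# Minimal space-to-depth sketch on nested lists (no deep-learning framework).
# The output channel ordering is an assumption chosen for readability.

def space_to_depth(feature_map, scale=2):
    """Turn C channels of size H x W into C*scale^2 channels of H/scale x W/scale.

    Every input value is kept: spatial resolution is traded for channels.
    """
    out = []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        if height % scale or width % scale:
            raise ValueError("height and width must be divisible by scale")
        for dy in range(scale):
            for dx in range(scale):
                out.append([
                    [channel[y][x] for x in range(dx, width, scale)]
                    for y in range(dy, height, scale)
                ])
    return out


# One 4x4 channel becomes four 2x2 channels.
fmap = [[
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]]
sub_maps = space_to_depth(fmap)
print(len(sub_maps))  # 4
print(sub_maps[0])    # [[1, 3], [9, 11]]
```

A stride-2 convolution with a 1x1 kernel would see only the values in `sub_maps[0]`; the rearrangement keeps the other three quarters of the pixels available to the next layer.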

NeurIPS Conference 2025 Conference Paper

SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering

  • Ruimeng Liu
  • Xin Zou
  • Chang Tang
  • Xiao Zheng
  • Xingchen Hu
  • Kun Sun
  • Xinwang Liu

Existing multi-view clustering methods employ various strategies to address data-level sparsity and view-level dynamic fusion. However, we identify a critical yet overlooked issue: varying sparsity across views. Cross-view sparsity variations lead to encoding discrepancies, heightening sample-level semantic heterogeneity and making view-level dynamic weighting inappropriate. To tackle these challenges, we propose Adaptive Sparse Autoencoders for Multi-View Clustering (SparseMVC), a framework with three key modules. Initially, the sparse autoencoder probes the sparsity of each view and adaptively adjusts encoding formats via an entropy-matching loss term, mitigating cross-view inconsistencies. Subsequently, the correlation-informed sample reweighting module employs attention mechanisms to assign weights by capturing correlations between early-fused global and view-specific features, reducing encoding discrepancies and balancing contributions. Furthermore, the cross-view distribution alignment module aligns feature distributions during the late fusion stage, accommodating datasets with an arbitrary number of views. Extensive experiments demonstrate that SparseMVC achieves state-of-the-art clustering performance. Our framework advances the field by extending sparsity handling from the data level to the view level and mitigating the adverse effects of encoding discrepancies through sample-level dynamic weighting. The source code is publicly available at https://github.com/cleste-pome/SparseMVC.
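
To make "varying sparsity across views" concrete, the sketch below measures the fraction of zero entries in each view of a toy dataset. It is only an assumed, simplified stand-in for the probing step: the abstract does not specify how SparseMVC measures sparsity or how the entropy-matching loss is defined, and the names `sparsity` and `views` are hypothetical.

```python
# Illustrative per-view sparsity probe; not SparseMVC's actual procedure.

def sparsity(view, tol=0.0):
    """Fraction of entries in a samples-by-features matrix that are zero."""
    total = sum(len(row) for row in view)
    if total == 0:
        raise ValueError("view has no entries")
    zeros = sum(1 for row in view for value in row if abs(value) <= tol)
    return zeros / total


# Two views of the same two samples, with very different sparsity levels.
views = {
    "dense_view": [[0.2, 1.5], [3.1, 0.7]],
    "sparse_view": [[0.0, 0.0, 4.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
}
ratios = {name: sparsity(view) for name, view in views.items()}
print(ratios)  # {'dense_view': 0.0, 'sparse_view': 0.75}
```

A gap like the one between 0.0 and 0.75 is the kind of cross-view variation the abstract argues should influence how each view is encoded.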

IJCAI Conference 2025 Conference Paper

Spatially Resolved Transcriptomics Data Clustering with Tailored Spatial-scale Modulation

  • Yuang Xiao
  • Yanran Zhu
  • Chang Tang
  • Xiao Zheng
  • Yuanyuan Liu
  • Kun Sun
  • Xinwang Liu

Spatial transcriptomics, comprising spatial location and high-throughput gene expression information, provides revolutionary insights into disease discovery and cellular evolution. Spatial transcriptomic clustering, which pinpoints distinct spatial domains within tissues, reveals cellular interactions and enhances our understanding of the intricate architecture of tissues. Existing methods typically construct spatial graphs using a static radius based on spatial coordinates, which hinders the accurate identification of spatial domains and complicates the precise partitioning of boundary nodes within clusters. To address this issue, we introduce a novel spatially resolved transcriptomics data clustering network (TSstc). Specifically, we employ a tailored spatial-scale modulation approach that constructs different spatial graphs incrementally as the radius of the spatial domain expands, and we propose a Spatiality-Aware Sampling (SAS) strategy that aggregates node representations by considering the spatial dependencies between spots. We then use GCN encoders to learn a gene embedding from the gene graph and multiple spatial embeddings from the spatial graphs. During training, we incorporate cross-view correlation-based tailored spatial regularization constraints to preserve high-quality neighbor relationships across spatial embeddings at different scales. Finally, a zero-inflated negative binomial model is utilized to capture the global probability distribution of gene expression profiles. Extensive experimental results demonstrate that our approach surpasses existing state-of-the-art methods in clustering tasks and related downstream applications.
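
The contrast between a static radius and incrementally expanding radii can be sketched as follows. This is a minimal illustration under assumed details: the helper names `radius_graph` and `multi_scale_graphs`, the Euclidean distance, and the toy coordinates are not from the paper, and the SAS strategy and GCN encoders are not modeled.

```python
import math

# Illustrative multi-radius spatial graph construction; not the TSstc code.

def radius_graph(coords, radius):
    """Undirected edges (i, j), i < j, between spots at most `radius` apart."""
    edges = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= radius:
                edges.add((i, j))
    return edges


def multi_scale_graphs(coords, radii):
    """One spatial graph per radius, built from smallest to largest."""
    return {radius: radius_graph(coords, radius) for radius in sorted(radii)}


# Four spots on a toy tissue slide.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
graphs = multi_scale_graphs(coords, [1.0, 2.0, 3.0])
for radius, edges in graphs.items():
    print(radius, sorted(edges))
# 1.0 [(0, 1), (1, 2)]
# 2.0 [(0, 1), (0, 2), (1, 2)]
# 3.0 [(0, 1), (0, 2), (0, 3), (1, 2)]
```

Each larger radius yields a superset of the previous edges, so the isolated spot at index 3 only gains a neighbor at the largest scale; a single static radius would have to choose one of these neighborhoods for every spot.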

AAAI Conference 2020 Conference Paper

R²MRF: Defocus Blur Detection via Recurrently Refining Multi-Scale Residual Features

  • Chang Tang
  • Xinwang Liu
  • Xinzhong Zhu
  • En Zhu
  • Kun Sun
  • Pichao Wang
  • Lizhe Wang
  • Albert Zomaya

Defocus blur detection aims to separate the in-focus and out-of-focus regions in an image. Although defocus blur detection is attracting more and more attention due to its remarkable potential applications, several challenges remain for accurate detection, such as interference from background clutter, sensitivity to scale, and missing boundary details of defocus blur regions. In order to address these issues, we propose a deep neural network which Recurrently Refines Multi-scale Residual Features (R²MRF) for defocus blur detection. We first extract multi-scale deep features by utilizing a fully convolutional network. For each layer, we design a novel recurrent residual refinement branch embedded with multiple residual refinement modules (RRMs) to more accurately detect blur regions from the input image. Considering that the features from bottom layers are able to capture rich low-level features for detail preservation while the features from top layers are capable of characterizing the semantic information for locating blur regions, we aggregate the deep features from different layers to learn the residual between the intermediate prediction and the ground truth for each recurrent step in each residual refinement branch. Since the defocus degree is sensitive to image scales, we finally fuse the side output of each branch to obtain the final blur detection map. We evaluate the proposed network on two commonly used defocus blur detection benchmark datasets by comparing it with 11 other state-of-the-art methods. Extensive experimental results with ablation studies demonstrate that R²MRF consistently and significantly outperforms the competitors in terms of both efficiency and accuracy.