Author name cluster

Han Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers

2 author rows

EAAI Journal 2026 Journal Article

An advanced detector and dual-shortest distance intersection algorithm for navigation path extraction in complex orchards

Pengfei Lv
Jinlin Xue
Wenbo Wei
Shaohua Liu
Weiwei Gao
Han Sun
Hanzhao Miao
Weihao Wang

EAAI Journal 2026 Journal Article

Lightweight multi-classification pear fruit high-precision detection model in complex orchard scenes

Shaohua Liu
Jinlin Xue
Tianyu Zhang
Pengfei Lv
Tianxing Zhao
Han Sun
Ruikai Liu
Yihang Chen

AAAI Conference 2026 Conference Paper

RSPlace: Rotation Sensing Macro Placement via Bidirectional Tree Expansion

Tianyi Liu
Yaxin Xu
Lin Geng
Ningzhong Liu
Han Sun
Yu Wang

Macro placement is a crucial subproblem of chip design, focusing on determining the locations of numerous macros while minimizing multiple metrics. In recent years, reinforcement learning (RL) has gained traction as a favorable technique to improve placement performance. However, existing RL-based placers ignore the orientation of macros, resulting in the state space constrained to two-dimensional discrete coordinates and greatly restricting the exploration opportunities. To address this issue, we propose a novel macro placement method, RSPlace, which guides the bidirectional expansion of the global search tree to offer the RL agent more exploration opportunities, incorporating rotation into the RL-based macro placement solution for the first time. RSPlace intelligently determines the optimal rotation angle to maximize placement benefits by leveraging rotation sensing and placement perturbations. Extensive experiments demonstrate that taking the macro orientation into account substantially broadens the feasible locations and effectively reduces the half-perimeter wirelength (HPWL), thus ensuring that our approach significantly improves the optimization effect compared to the state-of-the-art method.

JBHI Journal 2025 Journal Article

A Novel Framework for Predicting Phage-Host Interactions via Host Specificity-Aware Graph Autoencoder

Zhen Xiao
Han Sun
Ankang Wei
Weizhong Zhao
Xingpeng Jiang

Due to the abuse of antibiotics, some pathogenic bacteria have developed resistance to most antibiotics, leading to the emergence of antibiotic-resistant superbugs. Therefore, researchers resort to phage therapy for bacterial infections. For phage therapy, the fundamental step is to accurately identify phage-host interactions. Although various methods have been proposed, the existing methods suffer from the following two shortcomings: 1) they fail to make full use of genetic information including both genome and protein sequence of phages; 2) host specificity of phages is not explicitly utilized when learning representations of phages and bacteria. In this paper, we present an efficient computational method called PHISGAE for predicting phage-host interactions, in which the host specificity is explicitly employed. Firstly, initial phage-phage connections are efficiently constructed via utilizing phage genome and protein sequence. Then, the refined heterogeneous network is derived by applying K-nearest neighbor strategy, keeping relatively more meaningful local semantics among phages and bacteria. Finally, a host specificity-aware graph autoencoder is proposed to learn high-quality representations of phages and bacteria for predicting phage-host interactions. Experimental results show that PHISGAE outperforms the state-of-the-art methods on predicting phage-host interactions at both species level and genus level (AUC values of 94.73% and 96.32%, respectively). Moreover, results of case study demonstrate that PHISGAE is able to identify candidate hosts with high probability for previously unseen phages identified from metagenomics, effectively predicting potential phage-host interactions in real-world applications.

IROS Conference 2025 Conference Paper

DCT-Diffusion: Depth Completion for Transparent Objects with Diffusion Denoising Approach

Zhenning Zhou
Weiqing Shen
Han Sun
Yizhao Wang
Qixin Cao

Transparent objects are common in industrial automation and daily life. However, accurate visual perception of these objects remains challenging due to their reflective and refractive properties. Most previous studies fail to capture contextual information or typically rely on regression-based methods at the decoder stage, suffering from overfitting and unsatisfactory object details. To overcome these limitations, we present a novel depth completion framework for transparent objects with diffusion denoising approach (DCT-Diffusion). First, we adopt a transformer-based encoder to globally learn the depth relationships from different parts of the input by modeling long-distance dependencies. Then, we propose to introduce the diffusion model to generate refined depth maps from random depth distribution. Through iterative refinement, our model can progressively enhance depth map details and achieves fine-grained performance. Lastly, a conditioned fusion module is developed, which utilizes encoder features as visual conditions and fuses them with the denoising block at each step using augmented attention. Extensive comparative studies and cross-domain experiments prove that the DCT-Diffusion outperforms previous methods and significantly improves the robustness and generalization ability. Moreover, visualization results further illustrate that our method can generate depth maps with more complete geometry and clearer boundaries, achieving satisfactory results.

ICLR Conference 2025 Conference Paper

DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation

Han Sun
Rui Gong
Ismail Nejjar
Olga Fink

Current unsupervised domain adaptation (UDA) methods for semantic segmentation typically assume identical class labels between the source and target domains. This assumption ignores the label-level domain gap, which is common in real-world scenarios, and limits their ability to identify finer-grained or novel categories without requiring extensive manual annotation. A promising direction to address this limitation lies in recent advancements in foundation models, which exhibit strong generalization abilities due to their rich prior knowledge. However, these models often struggle with domain-specific nuances and underrepresented fine-grained categories. To address these challenges, we introduce DynAlign, a two-stage framework that integrates UDA with foundation models to bridge both the image-level and label-level domain gaps. Our approach leverages prior semantic knowledge to align source categories with target categories that can be novel, more fine-grained, or named differently (e.g., vehicle to car, truck, bus). Foundation models are then employed for precise segmentation and category reassignment. To further enhance accuracy, we propose a knowledge fusion approach that dynamically adapts to varying scene contexts. DynAlign generates accurate predictions in a new target label space without requiring any manual annotations, allowing seamless adaptation to new taxonomies through either model retraining or direct inference. Experiments on the GTA $\rightarrow$ IDD and GTA $\rightarrow$ Mapillary benchmarks validate the effectiveness of our approach, achieving a significant improvement over existing methods. Our code is publicly available at https://github.com/hansunhayden/DynAlign.

EAAI Journal 2024 Journal Article

Detection of fruit tree diseases in natural environments: A novel approach based on stereo camera and deep learning

Han Sun
Jinlin Xue
Yue Song
Peixiao Wang
Yu Wen
Tianyu Zhang

JBHI Journal 2023 Journal Article

An Effective Model for Predicting Phage-Host Interactions Via Graph Embedding Representation Learning With Multi-Head Attention Mechanism

Yue Wang
Han Sun
Haodong Wang
Dandan Li
Weizhong Zhao
Xingpeng Jiang
Xianjun Shen

In the treatment of bacterial infectious diseases, overuse of antibiotics may lead to not only bacterial resistance to antibiotics but also dysbiosis of beneficial bacteria which are essential for maintaining normal human life activities. Instead, phage therapy, which invades and lyses specific pathogenic bacteria without affecting beneficial bacteria, becomes more and more popular to treat bacterial infectious diseases. For the effective phage therapy, it requires to accurately predict potential phage-host interactions from heterogeneous information network consisting of bacteria and phages. Although many models have been proposed for predicting phage-host interactions, most methods fail to consider fully the sparsity and unconnectedness of phage-host heterogeneous information network, deriving the undesirable performance on phage-host interactions prediction. To address the challenge, we propose an effective model called GERMAN-PHI for predicting Phage-Host Interactions via Graph Embedding Representation learning with Multi-head Attention mechaNism. In GERMAN-PHI, the multi-head attention mechanism is utilized to learn representations of phages and hosts from multiple perspectives of phage-host associations, addressing the sparsity and unconnectedness in phage-host heterogeneous information network. More specifically, a module of GAT with talking-heads is employed to learn representations of phages and bacteria, on which neural induction matrix completion is conducted to reconstruct the phage-host association matrix. Results of comprehensive experiments demonstrate that GERMAN-PHI performs better than the state-of-the-art methods on phage-host interactions prediction. In addition, results of case study for two high-risk human pathogens show that GERMAN-PHI can predict validated phages with high accuracy, and some potential or new associated phages are provided as well.

IROS Conference 2023 Conference Paper

PanelPose: A 6D Pose Estimation of Highly-Variable Panel Object for Robotic Robust Cockpit Panel Inspection

Han Sun
Peiyuan Ni
Zhiqi Li
Yizhao Wang
Xiaoxiao Zhu
Qixin Cao

In robotic cockpit inspection scenarios, the 6D pose of highly-variable panel objects is necessary. However, the buttons with different states on the panel cause the variable texture and point cloud, which confuses the traditional invariable object pose estimation method. The bottleneck is the variable texture and point cloud. To address this issue, we propose a simple yet effective method denoted as PanelPose that leverages synthetic data and edge-line features. Specifically, we extract edge and line features of RGB images and fuse these feature maps as a multi-feature fusion map (MFF Map) to focus on the shape features of panel objects. Moreover, we design an effective keypoint selection algorithm considering the shape information of panel objects, which simplifies keypoint localization for precise pose estimation. Finally, the panel object pose is estimated via PnP/RANSAC, refined by the multi-state template (MST) and multi-scale ICP. We experimentally show that state-of-the-art 6D pose estimation methods alone are not sufficient to solve the cockpit panel inspection task but that our method significantly improves the performance. In cockpit inspection scenarios, the panel localization error is less than 3 mm using our method. Code and data are available at https://github.com/sunhan1997/PaneIPose.

NeurIPS Conference 2023 Conference Paper

SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization

Hao Dong
Ismail Nejjar
Han Sun
Eleni Chatzi
Olga Fink

In real-world scenarios, achieving domain generalization (DG) presents significant challenges as models are required to generalize to unknown target distributions. Generalizing to unseen multi-modal distributions poses even greater difficulties due to the distinct properties exhibited by different modalities. To overcome the challenges of achieving domain generalization in multi-modal scenarios, we propose SimMMDG, a simple yet effective multi-modal DG framework. We argue that mapping features from different modalities into the same embedding space impedes model generalization. To address this, we propose splitting the features within each modality into modality-specific and modality-shared components. We employ supervised contrastive learning on the modality-shared features to ensure they possess joint properties and impose distance constraints on modality-specific features to promote diversity. In addition, we introduce a cross-modal translation module to regularize the learned features, which can also be used for missing-modality generalization. We demonstrate that our framework is theoretically well-supported and achieves strong performance in multi-modal DG on the EPIC-Kitchens dataset and the novel Human-Animal-Cartoon (HAC) dataset introduced in this paper. Our source code and HAC dataset are available at https://github.com/donghao51/SimMMDG.

EAAI Journal 2019 Journal Article

Web image annotation based on Tri-relational Graph and semantic context analysis

Jing Zhang
Ti Tao
Yakun Mu
Han Sun
Dongdong Li
Zhe Wang
