EAAI Journal 2026 Journal Article
An advanced detector and dual-shortest distance intersection algorithm for navigation path extraction in complex orchards
- Pengfei Lv
- Jinlin Xue
- Wenbo Wei
- Shaohua Liu
- Weiwei Gao
- Han Sun
- Hanzhao Miao
- Weihao Wang
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
EAAI Journal 2026 Journal Article
AAAI Conference 2026 Conference Paper
Macro placement is a crucial subproblem of chip design, focusing on determining the locations of numerous macros while minimizing multiple metrics. In recent years, reinforcement learning (RL) has gained traction as a favorable technique to improve placement performance. However, existing RL-based placers ignore the orientation of macros, which constrains the state space to two-dimensional discrete coordinates and greatly restricts the exploration opportunities. To address this issue, we propose a novel macro placement method, RSPlace, which guides the bidirectional expansion of the global search tree to offer the RL agent more exploration opportunities, incorporating rotation into the RL-based macro placement solution for the first time. RSPlace intelligently determines the optimal rotation angle to maximize placement benefits by leveraging rotation sensing and placement perturbations. Extensive experiments demonstrate that taking the macro orientation into account substantially broadens the feasible locations and effectively reduces the half-perimeter wirelength (HPWL), so that our approach significantly improves the optimization effect compared to the state-of-the-art method.
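As an illustration of the metric named in this abstract, the following is a minimal sketch of how half-perimeter wirelength is commonly computed, and of why a macro's orientation can change it. It is not RSPlace's implementation: the function names `hpwl` and `rotate_offset`, the pin names, and the toy coordinates are all assumptions introduced here for the example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def hpwl(nets: List[List[str]], pins: Dict[str, Point]) -> float:
    """Sum, over all nets, of the half-perimeter of each net's bounding box."""
    total = 0.0
    for net in nets:
        xs = [pins[name][0] for name in net]
        ys = [pins[name][1] for name in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def rotate_offset(dx: float, dy: float, quarter_turns: int) -> Point:
    """Rotate a pin offset about the macro centre by 90-degree steps (counter-clockwise)."""
    for _ in range(quarter_turns % 4):
        dx, dy = -dy, dx
    return dx, dy


# Hypothetical toy case: one macro centred at the origin with a single pin,
# connected to a fixed pin "b" elsewhere on the die.
centre = (0.0, 0.0)
fixed = {"b": (0.0, 3.0)}
for turns in range(4):
    dx, dy = rotate_offset(2.0, 0.0, turns)
    pins = dict(fixed, a=(centre[0] + dx, centre[1] + dy))
    print(turns * 90, hpwl([["a", "b"]], pins))
```

In this toy case the same macro location yields different wirelengths depending on orientation, which is the extra degree of freedom the abstract describes.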
JBHI Journal 2025 Journal Article
Due to the abuse of antibiotics, some pathogenic bacteria have developed resistance to most antibiotics, leading to the emergence of antibiotic-resistant superbugs. Therefore, researchers resort to phage therapy for bacterial infections. For phage therapy, the fundamental step is to accurately identify phage-host interactions. Although various methods have been proposed, the existing methods suffer from the following two shortcomings: 1) they fail to make full use of genetic information, including both the genome and protein sequences of phages; 2) host specificity of phages is not explicitly utilized when learning representations of phages and bacteria. In this paper, we present an efficient computational method called PHISGAE for predicting phage-host interactions, in which the host specificity is explicitly employed. Firstly, initial phage-phage connections are efficiently constructed using phage genome and protein sequences. Then, the refined heterogeneous network is derived by applying a K-nearest neighbor strategy, keeping relatively more meaningful local semantics among phages and bacteria. Finally, a host specificity-aware graph autoencoder is proposed to learn high-quality representations of phages and bacteria for predicting phage-host interactions. Experimental results show that PHISGAE outperforms the state-of-the-art methods on predicting phage-host interactions at both species level and genus level (AUC values of 94.73% and 96.32%, respectively). Moreover, results of a case study demonstrate that PHISGAE is able to identify candidate hosts with high probability for previously unseen phages identified from metagenomics, effectively predicting potential phage-host interactions in real-world applications.
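The K-nearest neighbor refinement step mentioned in this abstract can be pictured with a generic sketch: given pairwise similarity scores, keep only each node's k strongest connections. This is not PHISGAE's code. The function name `knn_edges`, the similarity matrix, and the choice to treat edges as undirected are assumptions made for illustration.

```python
from typing import List, Set, Tuple


def knn_edges(sim: List[List[float]], k: int) -> Set[Tuple[int, int]]:
    """Keep, for every node, edges to its k most similar other nodes.

    Edges are stored as undirected (low index, high index) pairs, so an edge
    survives if either endpoint ranks the other among its top k.
    """
    edges: Set[Tuple[int, int]] = set()
    for i, row in enumerate(sim):
        candidates = [j for j in range(len(row)) if j != i]
        # Sort neighbours by descending similarity and keep the first k.
        nearest = sorted(candidates, key=lambda j: row[j], reverse=True)[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return edges


# Hypothetical similarity scores between three phages.
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
print(sorted(knn_edges(similarity, k=1)))
```

With k=1 the weak link between the first and third phage is dropped, while each node retains its strongest neighbour.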
IROS Conference 2025 Conference Paper
Transparent objects are common in industrial automation and daily life. However, accurate visual perception of these objects remains challenging due to their reflective and refractive properties. Most previous studies fail to capture contextual information or typically rely on regression-based methods at the decoder stage, suffering from overfitting and unsatisfactory object details. To overcome these limitations, we present a novel depth completion framework for transparent objects with a diffusion denoising approach (DCT-Diffusion). First, we adopt a transformer-based encoder to globally learn the depth relationships from different parts of the input by modeling long-distance dependencies. Then, we introduce the diffusion model to generate refined depth maps from a random depth distribution. Through iterative refinement, our model can progressively enhance depth map details and achieve fine-grained performance. Lastly, a conditioned fusion module is developed, which utilizes encoder features as visual conditions and fuses them with the denoising block at each step using augmented attention. Extensive comparative studies and cross-domain experiments show that DCT-Diffusion outperforms previous methods and significantly improves robustness and generalization ability. Moreover, visualization results further illustrate that our method can generate depth maps with more complete geometry and clearer boundaries.
ICLR Conference 2025 Conference Paper
Current unsupervised domain adaptation (UDA) methods for semantic segmentation typically assume identical class labels between the source and target domains. This assumption ignores the label-level domain gap, which is common in real-world scenarios, and limits their ability to identify finer-grained or novel categories without requiring extensive manual annotation. A promising direction to address this limitation lies in recent advancements in foundation models, which exhibit strong generalization abilities due to their rich prior knowledge. However, these models often struggle with domain-specific nuances and underrepresented fine-grained categories. To address these challenges, we introduce DynAlign, a two-stage framework that integrates UDA with foundation models to bridge both the image-level and label-level domain gaps. Our approach leverages prior semantic knowledge to align source categories with target categories that can be novel, more fine-grained, or named differently (e.g., vehicle to car, truck, bus). Foundation models are then employed for precise segmentation and category reassignment. To further enhance accuracy, we propose a knowledge fusion approach that dynamically adapts to varying scene contexts. DynAlign generates accurate predictions in a new target label space without requiring any manual annotations, allowing seamless adaptation to new taxonomies through either model retraining or direct inference. Experiments on the GTA $\rightarrow$ IDD and GTA $\rightarrow$ Mapillary benchmarks validate the effectiveness of our approach, achieving a significant improvement over existing methods. Our code is publicly available at https://github.com/hansunhayden/DynAlign.
EAAI Journal 2024 Journal Article
JBHI Journal 2023 Journal Article
In the treatment of bacterial infectious diseases, overuse of antibiotics may lead not only to bacterial resistance to antibiotics but also to dysbiosis of beneficial bacteria, which are essential for maintaining normal human life activities. Instead, phage therapy, which invades and lyses specific pathogenic bacteria without affecting beneficial bacteria, is becoming increasingly popular for treating bacterial infectious diseases. Effective phage therapy requires accurately predicting potential phage-host interactions from a heterogeneous information network consisting of bacteria and phages. Although many models have been proposed for predicting phage-host interactions, most methods fail to fully consider the sparsity and unconnectedness of the phage-host heterogeneous information network, resulting in poor performance on phage-host interaction prediction. To address this challenge, we propose an effective model called GERMAN-PHI for predicting Phage-Host Interactions via Graph Embedding Representation learning with Multi-head Attention mechaNism. In GERMAN-PHI, the multi-head attention mechanism is utilized to learn representations of phages and hosts from multiple perspectives of phage-host associations, addressing the sparsity and unconnectedness in the phage-host heterogeneous information network. More specifically, a module of GAT with talking-heads is employed to learn representations of phages and bacteria, on which neural induction matrix completion is conducted to reconstruct the phage-host association matrix. Results of comprehensive experiments demonstrate that GERMAN-PHI performs better than the state-of-the-art methods on phage-host interaction prediction. In addition, results of a case study for two high-risk human pathogens show that GERMAN-PHI can predict validated phages with high accuracy, and some potential or new associated phages are provided as well.
IROS Conference 2023 Conference Paper
In robotic cockpit inspection scenarios, the 6D pose of highly variable panel objects is necessary. However, buttons in different states on the panel cause the texture and point cloud to vary, which confuses traditional pose estimation methods designed for invariable objects. To address this bottleneck, we propose a simple yet effective method denoted as PanelPose that leverages synthetic data and edge-line features. Specifically, we extract edge and line features of RGB images and fuse these feature maps as a multi-feature fusion map (MFF Map) to focus on the shape features of panel objects. Moreover, we design an effective keypoint selection algorithm considering the shape information of panel objects, which simplifies keypoint localization for precise pose estimation. Finally, the panel object pose is estimated via PnP/RANSAC, refined by the multi-state template (MST) and multi-scale ICP. We experimentally show that state-of-the-art 6D pose estimation methods alone are not sufficient to solve the cockpit panel inspection task but that our method significantly improves the performance. In cockpit inspection scenarios, the panel localization error is less than 3 mm using our method. Code and data are available at https://github.com/sunhan1997/PaneIPose.
NeurIPS Conference 2023 Conference Paper
In real-world scenarios, achieving domain generalization (DG) presents significant challenges as models are required to generalize to unknown target distributions. Generalizing to unseen multi-modal distributions poses even greater difficulties due to the distinct properties exhibited by different modalities. To overcome the challenges of achieving domain generalization in multi-modal scenarios, we propose SimMMDG, a simple yet effective multi-modal DG framework. We argue that mapping features from different modalities into the same embedding space impedes model generalization. To address this, we propose splitting the features within each modality into modality-specific and modality-shared components. We employ supervised contrastive learning on the modality-shared features to ensure they possess joint properties and impose distance constraints on modality-specific features to promote diversity. In addition, we introduce a cross-modal translation module to regularize the learned features, which can also be used for missing-modality generalization. We demonstrate that our framework is theoretically well-supported and achieves strong performance in multi-modal DG on the EPIC-Kitchens dataset and the novel Human-Animal-Cartoon (HAC) dataset introduced in this paper. Our source code and HAC dataset are available at https://github.com/donghao51/SimMMDG.
EAAI Journal 2019 Journal Article