Arrow Research search

Author name cluster

Yao Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

Sample Weighted Incomplete Multimodal Clustering Based on Graph Coarsening Label Extraction

  • Zhenjiao Liu
  • Xue Xiao
  • Yao Chen
  • Jiao Xue
  • Shubin Ma
  • Liang Zhao

Multimodal data is typically collected through heterogeneous sensors and processing pipelines. However, due to variations in acquisition environments, device capabilities, and feature extraction methods, such data often suffers from incompleteness and inconsistent quality across modalities. To address these challenges, prior studies have explored modality selection and data completion strategies to improve information fusion. Nevertheless, these approaches face two main limitations: (1) they struggle to simultaneously ensure computational efficiency for large-scale graph data and maintain structural and semantic consistency across heterogeneous modality graphs; and (2) most of them operate at the modality level and fail to capture fine-grained, sample-specific quality variations. To overcome these issues, we propose a novel clustering framework, Sample Weighted Incomplete Multimodal Clustering Based on Graph Coarsening Label Extraction (IMC-GCSW). The proposed method introduces a graph coarsening-based label extraction strategy. It significantly reduces the computational cost of multimodal graph processing, while preserving key node information and local topological structures. Furthermore, a quality-aware sample weighting strategy is designed to enable fine-grained modeling of modality-specific data quality, allowing the model to dynamically suppress the influence of low-quality modalities on individual samples. Experiments on both general-purpose datasets and the Fructus Aurantii Disease and Pest Datasets demonstrate that the proposed method exhibits superior performance and strong adaptability in handling multimodal data with incompleteness and quality inconsistency.
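
To illustrate the idea of sample-level (rather than modality-level) weighting described above, here is a minimal sketch. It is not the IMC-GCSW implementation: the function name `fuse_sample`, the softmax weighting, and the assumption that a per-modality quality score is already available for each sample are all hypothetical choices made for illustration.

```python
import math

def fuse_sample(embeddings, quality, available):
    """Fuse one sample's per-modality embeddings into a single vector.

    embeddings: list of vectors, one per modality (entry may be None if missing)
    quality:    hypothetical per-modality quality scores for THIS sample
    available:  list of booleans, False where the modality is missing
    """
    idx = [m for m, ok in enumerate(available) if ok]
    if not idx:
        raise ValueError("sample has no available modality")

    # Softmax over the available modalities only (assumed weighting scheme);
    # subtracting the max keeps the exponentials numerically stable.
    top = max(quality[m] for m in idx)
    exps = {m: math.exp(quality[m] - top) for m in idx}
    total = sum(exps.values())
    weights = {m: e / total for m, e in exps.items()}

    # Weighted sum: low-quality modalities contribute less for this sample,
    # and missing modalities contribute nothing.
    dim = len(embeddings[idx[0]])
    return [sum(weights[m] * embeddings[m][d] for m in idx) for d in range(dim)]
```

Because the weights are computed per sample, a modality that is noisy for one sample can still dominate the fused representation of another sample where it is reliable.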

AAAI Conference 2025 Conference Paper

Local Conditional Controlling for Text-to-Image Diffusion Models

  • Yibo Zhao
  • Liang Peng
  • Yang Yang
  • Zekai Luo
  • Hengjia Li
  • Yao Chen
  • Zheng Yang
  • Xiaofei He

Diffusion models have exhibited impressive prowess in the text-to-image task. Recent methods add image-level structure controls, e.g., edge and depth maps, to manipulate the generation process together with text prompts to obtain desired images. This controlling process operates globally on the entire image, which limits the flexibility of control regions. In this paper, we explore a novel and practical task setting: local control. It focuses on controlling a specific local region according to user-defined image conditions, while the remaining regions are conditioned only by the original text prompt. However, achieving this is non-trivial. The naive approach of directly adding local conditions may lead to the local control dominance problem, which forces the model to focus on the controlled region and neglect object generation in other regions. To mitigate this problem, we propose Regional Discriminate Loss to update the noised latents, aiming at enhanced object generation in non-control regions. Furthermore, the proposed Focused Token Response suppresses weaker attention scores which lack the strongest response to enhance object distinction and reduce duplication. Lastly, we adopt Feature Mask Constraint to reduce quality degradation in images caused by information differences across the local control region. All proposed strategies operate at the inference stage. Extensive experiments demonstrate that our method can synthesize high-quality images aligned with the text prompt under local control conditions.

JBHI Journal 2024 Journal Article

Improving Needle Tip Tracking and Detection in Ultrasound-Based Navigation System Using Deep Learning-Enabled Approach

  • Hui Che
  • Jiaxin Qin
  • Yao Chen
  • Zihan Ji
  • Yibo Yan
  • Jing Yang
  • Qi Wang
  • Chaofeng Liang

Ultrasound-guided percutaneous interventions have numerous advantages over traditional techniques. Accurate needle placement in the target anatomy is crucial for successful intervention, and reliable visual information is essential to achieve this. However, previous studies have revealed several challenges, such as the variability in needle echogenicity and the common misalignment of the ultrasound beam and the needle. Advanced techniques have been developed to optimize needle visualization, including hardware-based and image-processing-based methods. This paper proposes a novel strategy of integrating ultrasound-based deep learning approaches into an optical navigation system to enhance needle visualization and improve tip positioning accuracy. Both the tracking and detection algorithms are optimized utilizing optical tracking information. The information is introduced into the tracking network to define the search patch update strategy and form a trajectory reference to correct tracking results. In the detection network, the original image is processed according to the needle insertion position and current position given by the optical localization system to locate a coarse region, and the depth-score criterion is adopted to optimize detection results. Extensive experiments demonstrate that our approach achieves promising tip tracking and detection performance with tip localization errors of 1.11 $\pm$ 0.59 mm and 1.17 $\pm$ 0.70 mm, respectively. Moreover, we establish a paired dataset consisting of ultrasound images and their corresponding spatial tip coordinates acquired from the optical tracking system and conduct real puncture experiments to verify the effectiveness of the proposed methods. Our approach significantly improves needle visualization and provides physicians with visual guidance for posture adjustment.

IROS Conference 2023 Conference Paper

Trajectory-Based SLAM for Indoor Mobile Robots with Limited Sensing Capabilities

  • Yao Chen
  • Jeremias Rodriguez
  • Arman Karimian
  • Benjamin Pheil
  • Jose Franco
  • Renaud Moser
  • Read Sandström
  • Scott Lenser

In this paper we introduce a novel SLAM system for 2-D indoor environments that relies only on limited sensing. Our fully autonomous system uses only the trajectory of the robot around walls and objects in the environment as landmarks and is capable of robust and long-term exploration and mapping of a broad range of household floor plans. Rank-deficient and full-rank factors are created when the robot observes existing trajectory-based landmarks, and they are filtered and added in a pose graph, which is optimized periodically. The mission space is mapped by efficient adaptive local mapping algorithms. The proposed SLAM system has been extensively tested in various scenarios, and experimental results show its robustness and accuracy.

ICRA Conference 2020 Conference Paper

Geometric Pretraining for Monocular Depth Estimation

  • Kaixuan Wang
  • Yao Chen
  • Hengkai Guo
  • Linfu Wen
  • Shaojie Shen

ImageNet-pretrained networks have been widely used in transfer learning for monocular depth estimation. These pretrained networks are trained with classification losses for which only semantic information is exploited while spatial information is ignored. However, both semantic and spatial information are important for per-pixel depth estimation. In this paper, we design a novel self-supervised geometric pretraining task that is tailored for monocular depth estimation using uncalibrated videos. The designed task decouples the structure information from input videos by a simple yet effective conditional autoencoder-decoder structure. Using almost unlimited videos from the internet, networks are pretrained to capture a variety of structures of the scene and can be easily transferred to depth estimation tasks using calibrated images. Extensive experiments are used to demonstrate that the proposed geometric-pretrained networks perform better than ImageNet-pretrained networks in terms of accuracy, few-shot learning and generalization ability. Using existing learning methods, geometric-transferred networks achieve new state-of-the-art results by a large margin. The pretrained networks will be open source soon.

ICRA Conference 2020 Conference Paper

GPO: Global Plane Optimization for Fast and Accurate Monocular SLAM Initialization

  • Sicong Du
  • Hengkai Guo
  • Yao Chen
  • Yilun Lin 0002
  • Xiangbing Meng
  • Linfu Wen
  • Fei-Yue Wang 0001

Initialization is essential to monocular Simultaneous Localization and Mapping (SLAM) problems. This paper focuses on a novel initialization method for monocular SLAM based on planar features. The algorithm starts by homography estimation in a sliding window. It then proceeds to a global plane optimization (GPO) to obtain camera poses and the plane normal. 3D points can be recovered using planar constraints without triangulation. The proposed method fully exploits the plane information from multiple frames and avoids the ambiguities in homography decomposition. We validate our algorithm on the collected chessboard dataset against baseline implementations and present extensive analysis. Experimental results show that our method outperforms the fine-tuned baselines in both accuracy and real-time performance.
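
The statement that 3D points can be recovered from planar constraints without triangulation can be illustrated with a standard ray-plane intersection. This is a sketch under stated assumptions, not the paper's GPO code: it assumes a pinhole camera with intrinsics `fx, fy, cx, cy`, a plane expressed in the camera frame as `normal · X = distance`, and the function name `backproject_to_plane` is hypothetical.

```python
def backproject_to_plane(pixel, fx, fy, cx, cy, normal, distance):
    """Return the 3D point (camera frame) where the pixel's ray hits the plane.

    The plane is assumed to satisfy normal . X = distance in the camera frame.
    """
    u, v = pixel
    # Viewing ray through the pixel under the assumed pinhole model.
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)

    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane")

    # Scale the ray so that the resulting point satisfies the plane equation.
    scale = distance / denom
    return tuple(scale * r for r in ray)
```

Once the plane parameters and camera pose are known, each observed pixel yields a 3D point from a single view, which is why no two-view triangulation step is needed for points on the plane.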

IROS Conference 2020 Conference Paper

Resultant Radius of Curvature of Stylet-and-Tube Steerable Needles Based on the Mechanical Properties of the Soft Tissue, and the Needle

  • Fan Yang 0101
  • Mahdieh Babaiasl
  • Yao Chen
  • Jow-Lian Ding
  • John P. Swensen

Steerable needles have been widely researched in recent years, and they have multiple potential roles in the medical area. Their flexibility and capability of avoiding obstacles allow steerable needles to be applied in biopsy, drug delivery and other medical applications that require a high degree of freedom and control accuracy. The Radius of Curvature (ROC) of the needle during insertion into soft tissue is an important parameter for evaluating the efficacy and steerability of these flexible needles. For our Fracture-directed Stylet-and-Tube Steerable Needles, it is important to find a relationship among the resultant insertion ROC, the pre-set wire shape and the Young’s Modulus of the soft tissue to characterize this class of steerable needles. In this paper, an approach is provided for obtaining the resultant ROC using the mechanical properties of the stylet and the tissue. A finite element analysis is also conducted to support the reliability of the model. This work sets the foundation for other researchers to predict the insertion ROC based on the mechanical properties of the needle and of the soft tissue into which it is inserted.