Arrow Research search

Author name cluster

Shan An

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

9

JBHI Journal 2026 Journal Article

MBE-UNet: Multi-Branch Boundary Enhanced U-Net for Ultrasound Segmentation

  • Qing Qin
  • Ziwei Lin
  • Guangyuan Gao
  • Chunxiao Han
  • Ruofan Wang
  • Yingmei Qin
  • Shanshan Li
  • Shan An

Accurately capturing object areas in medical images is crucial for the clinical diagnosis and treatment of diseases. Due to the inherent low contrast and blurry edges of ultrasound images, most existing CNN-based methods yield unsatisfactory segmentation results, making ultrasound image segmentation a challenging task. This paper introduces a novel multi-branch boundary-enhanced network (MBE-UNet) for automatic ultrasound image segmentation. The method can accurately segment targets and delineate boundaries simultaneously using a multi-branch network. First, a global pyramid attention module (GPAM) is designed to capture multi-scale contextual information. Second, we embed a boundary cascade module (BCM) in the main branch to ensure the network focuses on edge information flow and generates relatively desirable boundaries. Finally, a boundary feature fusion module (BFM) is used to integrate boundary and region information, obtaining a boundary-enhanced region map. The visual results and quantitative analysis demonstrate that the proposed MBE-UNet outperforms classical segmentation networks on three publicly available ultrasound datasets.
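The boundary feature fusion step can be illustrated with a minimal, framework-free sketch: a region probability map is reinforced wherever the boundary map responds. The fusion rule below (`r + r * b`, clamped to 1.0) and the name `fuse_boundary_region` are illustrative assumptions, not the paper's actual BFM.

```python
def fuse_boundary_region(region, boundary):
    """Fuse a region map with a boundary map, boosting responses near edges.

    Both inputs are 2-D lists of floats in [0, 1]. The rule r + r * b
    (clamped to 1.0) is a simplified stand-in for the paper's BFM.
    """
    fused = []
    for r_row, b_row in zip(region, boundary):
        fused.append([min(1.0, r + r * b) for r, b in zip(r_row, b_row)])
    return fused
```

In this toy rule, pixels with no boundary response pass through unchanged, while pixels on a strong boundary have their region score amplified, which is one plausible way to read "integrate boundary and region information".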

IROS Conference 2025 Conference Paper

HyperGraph ROS: An Open-Source Robot Operating System for Hybrid Parallel Computing based on Computational HyperGraph

  • Shufang Zhang
  • Jiazheng Wu
  • Jiacheng He
  • Kaiyi Wang
  • Shan An

This paper presents HyperGraph ROS, an open-source robot operating system that unifies intra-process, inter-process, and cross-device computation into a computational hypergraph for efficient message passing and parallel execution. In order to optimize communication, HyperGraph ROS dynamically selects the optimal communication mechanism while maintaining a consistent API. For intra-process messages, Intel-TBB Flow Graph is used with C++ pointer passing, which ensures zero memory copying and instant delivery. Meanwhile, inter-process and cross-device communication seamlessly switch to ZeroMQ. When a node receives a message from any source, it is immediately activated and scheduled for parallel execution by Intel-TBB. The computational hypergraph consists of nodes represented by TBB flow graph nodes and edges formed by TBB pointer-based connections for intra-process communication, as well as ZeroMQ links for inter-process and cross-device communication. This structure enables seamless distributed parallelism. Additionally, HyperGraph ROS provides ROS-like utilities such as a parameter server, a coordinate transformation tree, and visualization tools. Evaluation in diverse robotic scenarios demonstrates significantly higher transmission and throughput efficiency compared to ROS 2. Our work is available at https://github.com/wujiazheng2020a/hyper_graph_ros.
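The transport-selection idea (zero-copy delivery in-process, serialized transport across processes or devices) can be sketched in a few lines of Python. This is a conceptual stand-in only: `Dispatcher`, `subscribe`, and `publish` are hypothetical names, `pickle` stands in for ZeroMQ serialization, and plain object references stand in for TBB pointer passing.

```python
import pickle

class Dispatcher:
    """Sketch of per-subscriber transport selection: subscribers in the same
    process receive the message object by reference (zero copy), while remote
    subscribers receive a serialized copy. Illustrative, not the real API."""

    def __init__(self, process_id):
        self.process_id = process_id
        self.subscribers = []  # list of (process_id, callback)

    def subscribe(self, process_id, callback):
        self.subscribers.append((process_id, callback))

    def publish(self, message):
        for pid, callback in self.subscribers:
            if pid == self.process_id:
                # In-process: hand over the same object, no copy is made.
                callback(message)
            else:
                # Cross-process: simulate a serialized wire transfer.
                callback(pickle.loads(pickle.dumps(message)))
```

A local subscriber observes the identical object (`is` holds), while a remote one observes an equal but distinct copy, which mirrors the zero-copy vs. serialized split the abstract describes.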

ICML Conference 2025 Conference Paper

TinyMIG: Transferring Generalization from Vision Foundation Models to Single-Domain Medical Imaging

  • Chuang Liu
  • Hongyan Xu 0002
  • Yichao Cao
  • Xiu Su
  • Zhe Qu
  • Tianfa Li
  • Shan An
  • Haogang Zhu

Medical imaging faces significant challenges in single-domain generalization (SDG) due to the diversity of imaging devices and the variability among data collection centers. To address these challenges, we propose TinyMIG, a framework designed to transfer generalization capabilities from vision foundation models to medical imaging SDG. TinyMIG aims to enable lightweight specialized models to mimic the strong generalization capabilities of foundation models in terms of both global feature distribution and local fine-grained details during training. Specifically, for global feature distribution, we propose a Global Distribution Consistency Learning strategy that mimics the prior distributions of the foundation model layer by layer. For local fine-grained details, we further design a Localized Representation Alignment method, which promotes semantic alignment and generalization distillation between the specialized model and the foundation model. These mechanisms collectively enable the specialized model to achieve robust performance in diverse medical imaging scenarios. Extensive experiments on large-scale benchmarks demonstrate that TinyMIG, with extremely low computational cost, significantly outperforms state-of-the-art models, showcasing its superior SDG capabilities. All the code and model weights will be publicly available.
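The abstract does not give the form of the Global Distribution Consistency objective, but its layer-by-layer distribution mimicry can be sketched as matching per-layer feature moments. The mean/variance mismatch below is an assumed, simplified instantiation, not TinyMIG's actual loss.

```python
def moment_mismatch(student_feats, teacher_feats):
    """Layer-wise distribution mismatch: sum over layers of the squared
    differences between feature means and variances. A simplified stand-in
    for a distribution-consistency objective between a lightweight student
    and a foundation-model teacher (assumption, not the paper's loss)."""
    def mean(xs):
        return sum(xs) / len(xs)

    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    loss = 0.0
    for s, t in zip(student_feats, teacher_feats):
        loss += (mean(s) - mean(t)) ** 2 + (var(s) - var(t)) ** 2
    return loss
```

Driving such a mismatch to zero layer by layer forces the student's intermediate feature statistics toward the teacher's, which is one concrete reading of "mimics the prior distributions of the foundation model layer by layer".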

IROS Conference 2023 Conference Paper

An Open-Source Robotic Chinese Chess Player

  • Shan An
  • Guangfu Che
  • Jinghao Guo
  • Yanjie Xu
  • Guoxin Wang
  • Konstantinos A. Tsintotas
  • Fukai Zhang
  • Junjie Ye 0004

Consumer robots can accompany children as they grow up, improving their abilities while playing and entertaining. This paper presents an open-source, practical, low-cost robotic Chinese chess player. The proposed system includes an elaborate mechanical structure, a simple kinematic solution, a novel robot operating system, and real-time, accurate chess recognition. Regarding its mechanical design, it combines a magnetic structure with a mechanical cam drive, and the overall system has just three servo motors. At the same time, its control strategy is simple and effective. Furthermore, a lightweight robot message communication mechanism, entitled TinyROS, is developed for embedded chips with limited computing resources. Concerning the recognition process, our CNN-based object detector locates the chess pieces and achieves accurate identification. As a result, our robotic Chinese chess player is well crafted and easy to promote at large scale while improving users' chess skills. Aiming to facilitate future consumer robot research and popularize consumer robots, the mechanical and software design and the TinyROS protocol are open-sourced at https://github.com/Star-Robot/chinese-chess-robot.

ICRA Conference 2021 Conference Paper

Deep Balanced Learning for Long-tailed Facial Expressions Recognition

  • Hongxiang Gao
  • Shan An
  • Jianqing Li 0002
  • Chengyu Liu 0001

The analysis of facial expressions is a complex and challenging problem. Most research on automated Facial Expression Recognition (FER) is based on deep learning networks and rarely considers data imbalance. This paper addresses the long-tail distribution problem in large-scale in-the-wild datasets. Inspired by continual learning methods, we first construct multiple subsets by randomly sampling from head classes and up-sampling tail classes. A pre-trained backbone is then introduced to learn general weights in a repeated train-prune fashion. Our approach then trains a new classifier on the union of the previously preserved parameters using the gradual-prune technique, achieving better performance without adding extra parameters. The results show that the independent training of classifiers is a contributing factor. We conduct this experiment with several classic networks, proving its effectiveness in training a deep network on an imbalanced dataset. Given the limited performance of current FER, we further explore obstacles arising from the images themselves and find that domain knowledge affects recognition accuracy. Code is available at https://github.com/Epicghx/FER.
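The multi-subset construction step (sub-sample head classes, up-sample tail classes) can be sketched directly. The function below is a generic illustration of that resampling idea; the name `build_balanced_subset` and the fixed per-class budget are assumptions, not the paper's exact procedure.

```python
import random

def build_balanced_subset(samples_by_class, per_class, seed=0):
    """Construct one balanced subset: randomly sub-sample head classes
    (those with at least `per_class` samples) and up-sample tail classes
    with replacement. A simplified sketch of a long-tail resampling step."""
    rng = random.Random(seed)
    subset = {}
    for cls, samples in samples_by_class.items():
        if len(samples) >= per_class:
            # Head class: draw without replacement.
            subset[cls] = rng.sample(samples, per_class)
        else:
            # Tail class: draw with replacement to reach the budget.
            subset[cls] = [rng.choice(samples) for _ in range(per_class)]
    return subset
```

Repeating this with different seeds yields the multiple subsets the abstract mentions, each class-balanced despite the skewed source distribution.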

IROS Conference 2021 Conference Paper

Real-Time Monocular Human Depth Estimation and Segmentation on Embedded Systems

  • Shan An
  • Fangru Zhou
  • Mei Yang
  • Haogang Zhu
  • Changhong Fu 0001
  • Konstantinos A. Tsintotas

Estimating a scene’s depth to achieve collision avoidance against moving pedestrians is a crucial and fundamental problem in the robotic field. This paper proposes a novel, low-complexity network architecture for fast and accurate human depth estimation and segmentation in indoor environments, aimed at applications on resource-constrained platforms (including battery-powered aerial, micro-aerial, and ground vehicles) with a monocular camera as the primary perception module. Following the encoder-decoder structure, the proposed framework consists of two branches, one for depth prediction and another for semantic segmentation. Moreover, network structure optimization is employed to improve its forward inference speed. Exhaustive experiments on three self-generated datasets prove our pipeline’s capability to execute in real time, achieving higher frame rates than contemporary state-of-the-art frameworks (114.6 frames per second on an NVIDIA Jetson Nano GPU with TensorRT) while maintaining comparable accuracy.

ICRA Conference 2021 Conference Paper

Vanishing Point Aided LiDAR-Visual-Inertial Estimator

  • Peng Wang
  • Zheng Fang 0001
  • Shibo Zhao
  • Yongnan Chen
  • Ming Zhou
  • Shan An

In this paper, we propose a vanishing point aided LiDAR-Visual-Inertial estimator to achieve real-time, low-drift and robust pose estimation. The proposed method is composed of three sequential modules, namely an IMU-aided vanishing point (VP) detection module, a voxel-map based feature depth association module, and a visual inertial fixed-lag smoother module. The IMU-aided VP detection module detects feature points, line segments and vanishing points to establish robust correspondences in successive frames. In particular, we propose a 1-line RANSAC method to provide stable VP hypotheses and a polar grid to accelerate VP hypothesis validation. After that, we propose a novel voxel-map based feature depth association method to retrieve depth and assign it to visual features efficiently. Finally, the visual inertial fixed-lag smoother module jointly minimizes the error terms. Experiments show that our method outperforms state-of-the-art visual-inertial odometry and LiDAR-visual estimators in both indoor and outdoor environments.
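The RANSAC-based vanishing point search can be sketched with homogeneous line coordinates, where the intersection of two lines is their cross product. Note the hedge in the code: this sketch draws hypotheses from random 2-line samples, whereas the paper's 1-line RANSAC forms a hypothesis from a single line plus an IMU prior, which is not modeled here.

```python
import math
import random

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def line_through(p, q):
    """Homogeneous line through two 2-D points."""
    return _cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def point_line_dist(p, line):
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / math.hypot(a, b)

def ransac_vanishing_point(lines, iters=200, thresh=2.0, seed=0):
    """Estimate a vanishing point as the pairwise line intersection supported
    by the most lines. Hypotheses come from random 2-line samples; the
    paper's 1-line RANSAC with an IMU prior is not reproduced here."""
    rng = random.Random(seed)
    best_vp, best_support = None, -1
    for _ in range(iters):
        l1, l2 = rng.sample(lines, 2)
        vp = _cross(l1, l2)
        if abs(vp[2]) < 1e-9:  # near-parallel pair: intersection at infinity
            continue
        vp = (vp[0] / vp[2], vp[1] / vp[2])
        support = sum(point_line_dist(vp, l) < thresh for l in lines)
        if support > best_support:
            best_vp, best_support = vp, support
    return best_vp, best_support
```

Given three line segments concurrent at a point plus one outlier, the search recovers the concurrent point with a support of three, which is the basic consensus mechanism the VP detection module builds on.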

IJCAI Conference 2020 Conference Paper

Transductive Relation-Propagation Network for Few-shot Learning

  • Yuqing Ma
  • Shihao Bai
  • Shan An
  • Wei Liu
  • Aishan Liu
  • Xiantong Zhen
  • Xianglong Liu

Few-shot learning, which aims to learn novel concepts from a few labeled examples, is a challenging problem of great practical value. To accomplish this task, one should concentrate on revealing the accurate relations of the support-query pairs. We propose a transductive relation-propagation graph neural network (TRPN) to explicitly model and propagate such relations across support-query pairs. Our TRPN treats the relation of each support-query pair as a graph node, named a relational node, and resorts to the known relations between support samples, including both intra-class commonality and inter-class uniqueness, to guide the relation propagation in the graph, generating discriminative relation embeddings for support-query pairs. A pseudo relational node is further introduced to propagate the query characteristics, and a fast yet effective transductive learning strategy is devised to fully exploit the relation information among different queries. To the best of our knowledge, this is the first work that explicitly takes the relations of support-query pairs into consideration in few-shot learning, which may offer a new way to solve the few-shot learning problem. Extensive experiments conducted on several benchmark datasets demonstrate that our method significantly outperforms a variety of state-of-the-art few-shot learning methods.
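The relational-node construction can be sketched structurally: one node per support-query pair, with edges between nodes that share a query, weighted by whether their support samples belong to the same class. This sketch shows only the graph structure under assumed signed weights (+1 for intra-class commonality, -1 for inter-class uniqueness); it does not implement TRPN's learned propagation.

```python
from itertools import product

def build_relation_graph(support, queries):
    """Build one relational node per support-query pair and connect nodes
    that share a query: weight +1 if their support samples have the same
    class (intra-class commonality), -1 otherwise (inter-class uniqueness).
    A structural sketch of TRPN's graph, not its learned model."""
    labels = dict(support)  # support id -> class label
    nodes = [(s_id, q_id) for (s_id, _), q_id in product(support, queries)]
    edges = {}
    for i, (s1, q1) in enumerate(nodes):
        for j, (s2, q2) in enumerate(nodes):
            if i < j and q1 == q2:
                edges[(i, j)] = 1 if labels[s1] == labels[s2] else -1
    return nodes, edges
```

With two "cat" supports and one "dog" support against a single query, the two cat-pair nodes are linked positively while cat-dog pairs are linked negatively, which is the signed guidance the abstract describes.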

IROS Conference 2019 Conference Paper

Fast and Incremental Loop Closure Detection Using Proximity Graphs

  • Shan An
  • Guangfu Che
  • Fangru Zhou
  • Xianglong Liu 0001
  • Xin Ma
  • Yu Chen

Visual loop closure detection, which can be considered an image retrieval task, is an important problem in SLAM (Simultaneous Localization and Mapping) systems. The frequently used bag-of-words (BoW) models can achieve high precision and moderate recall, but their time and memory costs do not satisfy the requirements of mobile robot applications well. In this paper, we propose a novel loop closure detection framework titled FILD (Fast and Incremental Loop closure Detection), which focuses on online, incremental graph vocabulary construction for fast loop closure detection. The global and local features of frames are extracted using a Convolutional Neural Network (CNN) and SURF on the GPU, guaranteeing extremely fast extraction speeds. The graph vocabulary construction is based on a type of proximity graph, the Hierarchical Navigable Small World (HNSW) graph, which is modified to suit this specific application. In addition, this process is coupled with a novel strategy for real-time geometrical verification, which keeps only binary hash codes and significantly reduces memory usage. Extensive experiments on several publicly available datasets show that the proposed approach achieves fairly good recall at 100% precision compared to other state-of-the-art methods. The source code can be downloaded at https://github.com/AnshanTJU/FILD for further studies.
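The memory-saving idea of keeping only binary hash codes can be sketched with Hamming-distance matching: codes are compact integers, and candidate matches between two frames are found by nearest Hamming distance. The greedy matcher below is a generic illustration (names and the `max_dist` cutoff are assumptions), not the paper's exact verification pipeline, which would follow such matching with a geometric check.

```python
def hamming(a, b):
    """Hamming distance between two binary hash codes stored as ints."""
    return bin(a ^ b).count("1")

def match_hash_codes(codes_a, codes_b, max_dist=8):
    """Greedily match binary descriptor hash codes between two frames by
    nearest Hamming distance, keeping only matches within `max_dist`.
    Enough such matches would then feed a geometric consistency check."""
    matches = []
    for i, ca in enumerate(codes_a):
        best_j, best_d = None, max_dist + 1
        for j, cb in enumerate(codes_b):
            d = hamming(ca, cb)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j, best_d))
    return matches
```

Because each descriptor is reduced to a small integer and compared with a single XOR plus popcount, this style of verification keeps the per-frame memory footprint tiny, which is the trade-off the abstract highlights.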