Arrow Research search

Author name cluster

Gui-Bin Bian

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

18 papers
2 author rows

Possible papers

18

IROS Conference 2025 Conference Paper

Dynamic Action Localization and Recognition for Intelligent Perception of Surgical Robots

  • Yaqin Peng
  • Gui-Bin Bian
  • Zhen Li 0049
  • Ruichen Ma
  • Qiang Ye

Robot-assisted surgery has significantly advanced surgical precision, yet the development of autonomous surgical robots remains hindered by their limited understanding of complex surgical actions. Current systems lack the ability to effectively perceive and interpret intricate surgical relationships, which restricts their capability to assist surgeons in dynamic surgical environments. To overcome these challenges, a novel self-supervised learning method for surgical action recognition is proposed, aimed at enhancing the understanding of surgical actions. The method introduces a dynamic masking with attention-based action localization module that focuses the model on the critical spatial regions where actions occur, enabling surgical view guidance for intelligent surgical robots while extracting key features. Moreover, a graph-enhanced adaptive feature selection module is employed to assign relevance to features and capture the temporal relationships between adjacent frames. Long Short-Term Memory is utilized to model long-term dependencies across video sequences, while multi-view contrastive learning facilitates the extraction of discriminative features from both masked and unmasked sequences. Experimental results demonstrate a 3.4% improvement in Average Precision and an Area Under the Receiver Operating Characteristic Curve of 92.9% on the Neuro67 dataset for surgical action recognition. The method enables dynamic adjustment of the surgical view, achieving surgical visual navigation. These advancements contribute to the development of intelligent and autonomous surgical robots capable of assisting surgeons in complex and dynamic surgical settings.

IROS Conference 2025 Conference Paper

High-Precision Tracking of Time-Varying Trajectories for Microsurgical Robots in Constrained Environments

  • Yu-Peng Zhai
  • Gui-Bin Bian
  • Zhen Li 0049
  • Qiang Ye
  • Tian-Qi Deng
  • Ming-Yang Zhang
  • Pan Fu
  • Wen-Hao He

This research addresses the challenge of achieving high-precision tracking of time-varying trajectories under nonlinear disturbances and motion constraints in microsurgical robots. A hybrid control framework integrating fuzzy adaptive sliding mode control with radial basis function neural networks is proposed. This framework dynamically adjusts the sliding mode gain to suppress high-frequency jitter and compensate for unmodeled disturbances such as joint friction and tissue contact forces. Experiments conducted on a self-developed microscopic ophthalmic robot platform demonstrated that the trajectory tracking error was reduced to 1.1 μm, representing improvements of 85.9%, 76.1%, and 66.7% over PID control, sliding mode control, and non-singular fast terminal sliding mode control, respectively. The tracking delay was 19 milliseconds. In experiments on living pigs with central retinal artery occlusion, the system successfully performed intravascular injection, with a maximum error of 3.97 μm. Through optimization via fuzzy logic and neural networks, this solution achieves micron-level precision and robustness, effectively suppressing both high-frequency control noise and low-frequency environmental disturbances, and ensures the accuracy and safety of the microsurgical robot.

IROS Conference 2025 Conference Paper

Implicit Disparity-Blur Alignment for Fast and Precise Autofocus in Robotic Microsurgical Imaging

  • Pan Fu
  • Zhen Li
  • Ming-Yang Zhang
  • Yu-Peng Zhai
  • Junzheng Wang
  • Wen-Hao He
  • Gui-Bin Bian

Creating an intelligent surgical environment requires not only advanced robotic systems but also optimized microscopic imaging. However, autofocus remains a fundamental challenge: current methods suffer from slow iterative processes or directional ambiguity, which compromises real-time performance. This paper presents an implicit disparity-blur alignment approach for robotic microsurgical autofocus, integrating stereo geometry's monotonic depth cues with defocus characteristics for rapid convergence. A novel physics-guided dual-stream network is developed to encode implicit depth representations through hierarchical cross-pathway feature fusion, enabling reliable focus prediction without explicit stereo matching in blur-degraded regions. An ROI-aware attention module is proposed to dynamically optimize focus-critical regions, coupled with learnable physics-guided kernel learning for precise Z-offset estimation. The approach achieves a top directional accuracy of 94.85% and a single-pass focus error of 0.20 mm with an inference time of 53 ms on a surgical dataset, outperforming state-of-the-art methods by reducing iteration count by 22.8% and inference time by 51.8%. An intelligent robotic microscope prototype is developed, and ex vivo tests validate its ability to enable fast and precise multi-region focusing for microsurgeries.

IROS Conference 2025 Conference Paper

Spatiotemporal Motion Prediction of Intraocular Microsurgical Robot in Non-Visible Regions

  • Ya-Wen Deng
  • Zhen Li 0049
  • Qiang Ye
  • Yu-Peng Zhai
  • Weihong Yu
  • Zhangguo Yu
  • Gui-Bin Bian

In intraocular microsurgery with minute operational scales, instruments pass through non-visible regions of the anterior segment, where robot-assisted surgery, which heavily relies on visual perception, fails to determine the instrument's attitude relative to the eyeball. This compromises surgical flexibility, increases risks, and hinders autonomous surgery development. Therefore, a framework for predicting instrument trajectories in non-visible regions during robot-assisted microsurgery has been proposed to mitigate the risks of retinal and lens injuries caused by blind operations and enhance surgical procedures' intelligence and autonomy. First, a lightweight reconstruction of the anterior segment environment is performed under controlled knowledge guidance to construct a global map. Second, the tip position of the surgical instrument is detected through multi-sensor fusion, enabling the perception of instrument-environment interactions under visual constraints. Based on this, a long short-term spatiotemporal aggregation algorithm for instrument trajectory prediction is proposed, which enhances surgical safety by providing high-precision predictions of the instrument tip's motion trajectory. Experiments show that the framework achieved a 0.0435 mm average prediction error in non-visible regions, corresponding to 0.03% of the region in a single dimension and 7.25% of the surgical instrument's diameter. This significantly enhances the precision of robot-assisted surgery under visual constraints and provides robust technical support for safe, intelligent, and autonomous intraocular robotic surgery.

ICRA Conference 2024 Conference Paper

A Hybrid Admittance Control Algorithm for Automatic Robotic Cranium-Milling

  • Chen Qian 0006
  • Zhen Li 0049
  • Qiang Ye
  • Pei-Cong Ge
  • Jizong Zhao
  • Gui-Bin Bian

Prior robot-assisted cranium-milling studies only considered controlling the force in the skull's vertical direction and neglected the milling cutter's feed force. Additionally, achieving stable force control in multiple directions is challenging for robots due to the uneven skull surface. Here, a hybrid admittance control algorithm incorporating model-free adaptive nonlinear force control and fuzzy control is proposed to accomplish effective automatic cranium-milling tasks. First, a purely data-driven model-free adaptive control method based on partial-form dynamic linearization is used to control the feed force. Second, fuzzy control minimizes the total error of both the vertical and feed forces by adaptively adjusting the milling cutter's velocity and position. Forty-two ex vivo animal skull-milling experiments conducted with the automatic robotic cranium-milling system indicate that, with the proposed control algorithm, the force error percentage can be maintained below 5.0% within 3 s, and the maximal root mean square error percentages for the vertical and feed forces are 1.85% and 1.94%, respectively. Moreover, no instances of dura mater damage were observed, and the robotic system exhibited a high level of autonomy, performing the skull-milling task with minimal human involvement throughout the entire experiment. The results suggest the potential for advancing the intelligence level of neurosurgery in the future.

IROS Conference 2024 Conference Paper

Design and Modeling of a Thin-walled Multi-segment Continuum Robotic Bronchoscope

  • Gui-Bin Bian
  • Ming-Yang Zhang
  • Qiang Ye
  • Han Ren
  • Yu-Peng Zhai
  • Ruichen Ma
  • Zhen Li 0049

Cable-driven continuum robots in bronchoscopic procedures hold immense potential to revolutionize the diagnosis and treatment of lung cancer. However, robotic bronchoscopes in current studies are typically large and inflexible. This article therefore introduces a novel cable-driven continuum robotic bronchoscopy system with a modular design separating the actuation and operation ends. A continuum structure with a dual-segment notched flexible skeleton, featuring a wall thickness of 0.45 mm, has been designed to perform bending movements exceeding 190°. This enhances flexibility and increases the spatial capacity of the working channels. A kinematic model was developed that integrates the actuation force and the mechanical characteristics of the driving cables for error compensation, estimating the correlation between the displacement of the driving cables and the position of the continuum robot's end-effector. Verification showed that the root mean square error (RMSE) of the end-effector position is 2.57 mm, which accounts for 4.8% of the continuum's length. A prototype of the robotic bronchoscopy system was built, and its performance and potential applications in bronchoscopic intervention surgeries were validated through in vivo pig intervention experiments.

ICRA Conference 2024 Conference Paper

Procedure Recognition by Knowledge-Driven Segmentation in Robotic-Assisted Vitreoretinal Surgery

  • Zhen Li 0049
  • Ya-Wen Deng
  • Qiang Ye
  • Weihong Yu
  • Haoxiang Qi
  • Yaliang Liu
  • Zhangguo Yu
  • Gui-Bin Bian

Internal limiting membrane (ILM) peeling is a vital vitreoretinal surgery procedure. However, because the membrane is just 1-2 micrometers thick and its density and adhesion vary, the difficulty of manipulation exceeds the physiological limits of human perception and operation. Surgical robots are characterized by high precision and stability; however, navigating intricate intraocular environments and handling minuscule high-precision areas remain enormous challenges, including uneven lighting, field-of-view loss, and motion blur. This paper proposes a perception method named Multimodal Surgical Process Recognition based on Domain Knowledge and Segmentation (MSPR-DKS), designed to address these challenges and provide input for the precise control of robots. Moreover, a comprehensive dataset focused on ILM peeling during macular hole surgeries was established. Experimental results underscore the efficacy of this approach, with segmentation accuracies exceeding 99.27% for instruments and macular holes and an average accuracy of 98.97% in recognizing surgical processes. This study paves the way for leveraging domain knowledge and image segmentation to improve robot-assisted manipulation of soft tissues in ophthalmology.

IROS Conference 2023 Conference Paper

Automated Key Action Detection for Closed Reduction of Pelvic Fractures by Expert Surgeons in Robot-Assisted Surgery

  • Mingzhang Pan
  • Ya-Wen Deng
  • Zhen Li 0049
  • Yuan Chen
  • Xiao-Lan Liao
  • Gui-Bin Bian

Pelvic fractures are among the most serious traumas in orthopedics, and the technical proficiency and expertise of the surgical team strongly influence the quality of reduction results. With the advancement of information technology and robotics, robot-assisted pelvic fracture reduction surgery is expected to reduce the impact of surgeon inexperience and improve the accuracy and stability of pelvic reduction. However, this requires the robot to detect key surgeon actions from time-series data, enabling it to independently perceive the surgical status, predict the surgeon's intentions, assess the demonstrated level of professional competence, and track the progress of the surgery. Therefore, a multi-task deep learning architecture is proposed, which incorporates a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) model along with tri-modality fusion and feature extraction techniques. The proposed framework aims to achieve key action detection in closed reduction operations for pelvic fractures. Subsequently, a trimodal fine-grained dataset was constructed, wherein 29, 32, and 14 labels were marked on flexion, position, and pressure data for 14 key closed reduction actions. The experimental results show that the correct detection rate of closed reduction actions is 92.3%, significantly higher than that of commonly used recognition algorithms. This work provides a method for the robot to learn the surgeon's professional knowledge, provides a basis for motion perception during the operation, and contributes to the autonomy of robot-assisted closed reduction surgery for pelvic fractures.

JBHI Journal 2022 Journal Article

Space Squeeze Reasoning and Low-Rank Bilinear Feature Fusion for Surgical Image Segmentation

  • Zhen-Liang Ni
  • Gui-Bin Bian
  • Zhen Li
  • Xiao-Hu Zhou
  • Rui-Qi Li
  • Zeng-Guang Hou

Surgical image segmentation is critical for surgical robot control and computer-assisted surgery. In the surgical scene, the local features of objects are highly similar, and the illumination interference is strong, which makes surgical image segmentation challenging. To address the above issues, a bilinear squeeze reasoning network is proposed for surgical image segmentation. In it, the space squeeze reasoning module is proposed, which adopts height pooling and width pooling to squeeze global contexts in the vertical and horizontal directions, respectively. The similarity between each horizontal position and each vertical position is calculated to encode long-range semantic dependencies and establish the affinity matrix. The feature maps are also squeezed from both the vertical and horizontal directions to model channel relations. Guided by channel relations, the affinity matrix is expanded to the same size as the input features. It captures long-range semantic dependencies from different directions, helping address the local similarity issue. Besides, a low-rank bilinear fusion module is proposed to enhance the model’s ability to recognize similar features. This module is based on the low-rank bilinear model to capture the inter-layer feature relations. It integrates the location details from low-level features and semantic information from high-level features. Various semantics can be represented more accurately, which effectively improves feature representation. The proposed network achieves state-of-the-art performance on cataract image segmentation dataset CataSeg and robotic image segmentation dataset EndoVis 2018.
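
As a rough illustration of the low-rank bilinear idea behind the fusion module described above, the sketch below fuses two feature vectors through a shared rank-d bottleneck, z = Pᵀ(tanh(Ux) ∘ tanh(Vy)). This is the generic low-rank bilinear model, not the paper's actual module; all matrices, dimensions, and values here are invented for the example.

```python
import math

# Generic low-rank bilinear fusion: z = P^T (tanh(U x) * tanh(V y)).
# U, V project the two inputs into a shared rank-d space; the Hadamard
# product supplies the bilinear interaction; P maps it to the output.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def low_rank_bilinear_fusion(x, y, U, V, P):
    """Fuse vectors x and y; U is d x len(x), V is d x len(y), P is d x out."""
    hx = [math.tanh(t) for t in matvec(U, x)]   # low-rank projection of x
    hy = [math.tanh(t) for t in matvec(V, y)]   # low-rank projection of y
    h = [a * b for a, b in zip(hx, hy)]         # Hadamard (bilinear) interaction
    out_dim = len(P[0])
    return [sum(P[i][j] * h[i] for i in range(len(h))) for j in range(out_dim)]

# Toy example: fuse a 3-d "low-level" feature with a 2-d "high-level" feature.
x = [0.5, -1.0, 0.25]
y = [1.0, 2.0]
U = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]   # d = 2
V = [[0.5, 0.1], [0.2, -0.3]]
P = [[1.0, 0.0], [0.0, 1.0]]
z = low_rank_bilinear_fusion(x, y, U, V, P)
print(z)
```

The rank-d bottleneck is what keeps a bilinear interaction between two feature maps affordable: the full bilinear form would need a len(x) × len(y) × out parameter tensor, while the factored form needs only the three small matrices.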

ECAI Conference 2020 Conference Paper

A Lightweight Recurrent Attention Network for Real-Time Guidewire Segmentation and Tracking in Interventional X-Ray Fluoroscopy

  • Yan-Jie Zhou
  • Xiao-Liang Xie
  • Gui-Bin Bian
  • Zeng-Guang Hou

In endovascular surgery and cardiology, interventional therapy is currently the treatment of choice for most patients. Robust guidewire detection in 2D X-ray fluoroscopy can greatly assist physicians in interventional therapy. Nevertheless, this task often comes with the challenge of extreme foreground-background class imbalance caused by the guidewire's slender structure compared with other interventional tools. To address this challenge, a novel efficient network architecture, termed Fast Recurrent Attention Network (FRA-Net), is proposed for fully automatic mono-guidewire and dual-guidewire segmentation and tracking. The main contributions of the proposed network are threefold: 1) We propose a novel attention module that improves model sensitivity to guidewire pixels without requiring complicated heuristics. 2) We design a recurrent convolutional layer that ensures better feature representation. 3) Focal Loss is reinforced to better address the problems of extreme class imbalance and misclassified examples. Quantitative and qualitative evaluation on various datasets demonstrates that the proposed network significantly outperforms simpler baselines as well as the best previously published result for this task, achieving state-of-the-art performance. To the best of our knowledge, this is the first end-to-end approach capable of real-time segmentation and tracking of mono-guidewires and dual-guidewires in 2D X-ray fluoroscopy.

ICRA Conference 2020 Conference Paper

A Multilayer-Multimodal Fusion Architecture for Pattern Recognition of Natural Manipulations in Percutaneous Coronary Interventions

  • Xiao-Hu Zhou
  • Xiao-Liang Xie
  • Zhen-Qiu Feng
  • Zeng-Guang Hou
  • Gui-Bin Bian
  • Rui-Qi Li
  • Zhen-Liang Ni
  • Shi-Qi Liu 0004

Increasingly used robotic systems can provide precise delivery and reduce X-ray radiation to medical staff in percutaneous coronary interventions (PCI), but the natural manipulations of interventionalists are forgone in most robot-assisted procedures. It is therefore necessary to explore natural manipulations in order to design more advanced human-robot interfaces (HRI). In this study, a multilayer-multimodal fusion architecture is proposed to recognize six typical subpatterns of guidewire manipulations in conventional PCI. Synchronously acquired multimodal behaviors from ten subjects are used as the inputs of the fusion architecture. Six classification-based and two rule-based fusion algorithms are evaluated for performance comparison. Experimental results indicate that multimodal fusion brings significant accuracy improvement over single-modal schemes. Furthermore, the proposed architecture achieves an overall accuracy of 96.90%, much higher than that of a single-layer recognition architecture (92.56%). These results indicate the potential of the proposed method to facilitate the development of HRI for robot-assisted PCI.

ICRA Conference 2020 Conference Paper

Attention-Guided Lightweight Network for Real-Time Segmentation of Robotic Surgical Instruments

  • Zhen-Liang Ni
  • Gui-Bin Bian
  • Zeng-Guang Hou
  • Xiao-Hu Zhou
  • Xiao-Liang Xie
  • Zhen Li 0049

The real-time segmentation of surgical instruments plays a crucial role in robot-assisted surgery. However, it is still challenging to deploy deep learning models for real-time segmentation of surgical instruments due to their high computational costs and slow inference speed. In this paper, we propose an attention-guided lightweight network (LWANet) that can segment surgical instruments in real time. LWANet adopts an encoder-decoder architecture, where the encoder is the lightweight network MobileNetV2 and the decoder consists of depthwise separable convolution, an attention fusion block, and transposed convolution. Depthwise separable convolution is used as the basic unit to construct the decoder, reducing the model size and computational costs. The attention fusion block captures global contexts and encodes semantic dependencies between channels to emphasize target regions, contributing to locating the surgical instrument. Transposed convolution is performed to upsample feature maps and acquire refined edges. LWANet segments surgical instruments in real time at low computational cost: with 960×544 inputs, its inference speed reaches 39 fps with only 3.39 GFLOPs, and its model size is small, with only 2.06 M parameters. The proposed network is evaluated on two datasets. It achieves state-of-the-art performance of 94.10% mean IOU on Cata7 and sets a new record on EndoVis 2017 with a 4.10% increase in mean IOU.
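
The efficiency argument for a depthwise-separable decoder can be checked with a quick parameter count: a standard k×k convolution needs k·k·c_in·c_out weights, while a depthwise k×k convolution followed by a 1×1 pointwise convolution needs only k·k·c_in + c_in·c_out. The layer sizes below are illustrative, not LWANet's actual configuration.

```python
# Parameter count of a standard conv vs. a depthwise separable conv (bias omitted).
# Channel sizes here are made-up examples, not taken from LWANet.

def standard_conv_params(c_in, c_out, k):
    """k x k convolution: one k x k x c_in filter per output channel."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel) + 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out

c_in, c_out, k = 256, 128, 3
std = standard_conv_params(c_in, c_out, k)        # 3*3*256*128 = 294912
sep = depthwise_separable_params(c_in, c_out, k)  # 3*3*256 + 256*128 = 35072
print(std, sep, round(std / sep, 1))              # roughly an 8x parameter reduction
```

The same factorization reduces multiply-accumulate operations by a similar ratio, which is where figures like 3.39 GFLOPs for a full segmentation network become plausible.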

IJCAI Conference 2020 Conference Paper

BARNet: Bilinear Attention Network with Adaptive Receptive Fields for Surgical Instrument Segmentation

  • Zhen-Liang Ni
  • Gui-Bin Bian
  • Guan-An Wang
  • Xiao-Hu Zhou
  • Zeng-Guang Hou
  • Xiao-Liang Xie
  • Zhen Li
  • Yu-Han Wang

Surgical instrument segmentation is crucial for computer-assisted surgery. Unlike common object segmentation, it is more challenging due to the large illumination variation and scale variation in surgical scenes. In this paper, we propose a bilinear attention network with adaptive receptive fields to address these two issues. To deal with illumination variation, the bilinear attention module models global contexts and semantic dependencies between pixels by capturing second-order statistics. With them, semantic features in challenging areas can be inferred from their neighbors, and the distinction between various semantics can be boosted. To adapt to scale variation, our adaptive receptive field module aggregates multi-scale features and selects receptive fields adaptively. Specifically, it models the semantic relationships between channels to choose feature maps with appropriate scales, changing the receptive field of subsequent convolutions. The proposed network achieves the best performance, 97.47% mean IoU, on Cata7. It also takes first place on EndoVis 2017, exceeding the second place by 10.10% mean IoU.

AAAI Conference 2020 Conference Paper

Pyramid Attention Aggregation Network for Semantic Segmentation of Surgical Instruments

  • Zhen-Liang Ni
  • Gui-Bin Bian
  • Guan-An Wang
  • Xiao-Hu Zhou
  • Zeng-Guang Hou
  • Hua-Bin Chen
  • Xiao-Liang Xie

Semantic segmentation of surgical instruments plays a critical role in computer-assisted surgery. However, specular reflection and scale variation of instruments are likely to occur in the surgical environment, undesirably altering visual features of instruments such as color and shape. These issues make semantic segmentation of surgical instruments more challenging. In this paper, a novel network, the Pyramid Attention Aggregation Network, is proposed to aggregate multi-scale attentive features for surgical instruments. It contains two critical modules: a Double Attention Module and a Pyramid Upsampling Module. Specifically, the Double Attention Module includes two attention blocks (i.e., a position attention block and a channel attention block), which model semantic dependencies between positions and channels by capturing joint semantic information and global contexts, respectively. The attentive features generated by the Double Attention Module can distinguish target regions, contributing to solving the specular reflection issue. Moreover, the Pyramid Upsampling Module extracts local details and global contexts by aggregating multi-scale attentive features. It learns the shape and size features of surgical instruments in different receptive fields and thus addresses the scale variation issue. The proposed network achieves state-of-the-art performance on various datasets, including a new record of 97.10% mean IOU on Cata7, and comes first in the MICCAI EndoVis Challenge 2017 with a 9.90% increase in mean IOU.

ICRA Conference 2019 Conference Paper

A GPU Based Parallel Genetic Algorithm for the Orientation Optimization Problem in 3D Printing

  • Zhishuai Li
  • Gang Xiong 0001
  • Xipeng Zhang
  • Zhen Shen 0004
  • Can Luo
  • Xiuqin Shang
  • Xisong Dong
  • Gui-Bin Bian

The choice of model orientation is a very important issue in Additive Manufacturing (AM). In this paper, the model orientation problem is formulated as a multi-objective optimization problem over the building time, the surface quality, and the supporting area. The problem is then converted into a single-objective optimization by linear weighting of the objectives. A Genetic Algorithm (GA) is used to solve the optimization problem, and the GA process is parallelized and implemented on GPU. Experimental results show that when dealing with complex models in AM, the GPU-based GA can speed up the process by about 50 times compared with a CPU-only implementation, significantly reducing optimization time while maintaining solution quality.
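
The linear-weighted conversion of the multi-objective problem into a single objective, plus a basic GA loop, can be sketched as follows. The objective functions, weights, and GA parameters are toy stand-ins (the paper's actual models of build time, surface quality, and support area are not reproduced here), and the sketch is serial rather than GPU-parallel.

```python
import math
import random

# Toy sketch: scalarize three orientation objectives with fixed weights, then
# minimize the weighted sum over a single orientation angle with a simple GA.
# Objectives, weights, and GA settings are invented for illustration only.

def fitness(theta, weights=(0.5, 0.3, 0.2)):
    """Linear-weighted sum of three toy objectives of orientation angle theta."""
    build_time = 1.0 + math.sin(theta) ** 2            # stand-in for build time
    roughness = 0.5 * (1.0 + math.cos(3.0 * theta))    # stand-in for surface quality
    support = abs(math.sin(theta / 2.0))               # stand-in for support area
    return weights[0] * build_time + weights[1] * roughness + weights[2] * support

def genetic_minimize(n_pop=40, n_gen=60, seed=0):
    rng = random.Random(seed)
    pop = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_pop)]
    for _ in range(n_gen):
        pop.sort(key=fitness)
        elite = pop[: n_pop // 4]                        # truncation selection
        children = []
        while len(elite) + len(children) < n_pop:
            a, b = rng.sample(elite, 2)
            child = 0.5 * (a + b) + rng.gauss(0.0, 0.1)  # crossover + mutation
            children.append(child % (2.0 * math.pi))
        pop = elite + children
    return min(pop, key=fitness)

best = genetic_minimize()
print(best, fitness(best))
```

Each individual's fitness evaluation is independent of the others, which is precisely what makes the population loop amenable to GPU parallelization.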

IROS Conference 2019 Conference Paper

Path Planning for Surgery Robot with Bidirectional Continuous Tree Search and Neural Network

  • Rui-Jian Huang
  • Gui-Bin Bian
  • Chen Xin 0003
  • Zhen Li 0049
  • Zeng-Guang Hou

To solve the thorny issue of real-time path planning for surgical robots in uncertain environments, a novel algorithm named bidirectional continuous tree search (BCTS) is proposed. Most partially observable Markov decision process (POMDP) planners address the challenges of unknown environments with discrete states, observations, and actions, and thus fail to automate the operative procedure. The BCTS method addresses this by handling POMDPs in continuous state, observation, and action spaces. The proposed approach uses a bidirectional search structure intended to greatly improve computational efficiency. Meanwhile, a Bayesian optimization (BO) algorithm is used to dynamically sample promising actions while the belief tree is constructed. To speed up the BO process, the upper and lower bounds of the optimal action values, given by the fast informed bound (FIB) and point-based value iteration (PBVI), limit the search scope. In addition, an optimal path planning generator, a radial basis function neural network (RBFNN), is applied to obtain a smoother trajectory. Finally, a simulation of glaucoma surgery was carried out to explore the best surgical approach. The results show that the introduced structure can effectively guide the surgical robot in performing surgical procedures and produces a real-time, smooth path.

ICRA Conference 2015 Conference Paper

Design and evaluation of a bio-inspired robotic hand for percutaneous coronary intervention

  • Zhen-Qiu Feng
  • Gui-Bin Bian
  • Xiao-Liang Xie
  • Zeng-Guang Hou
  • Jian-Long Hao

Percutaneous coronary interventions (PCI) require complex operating skills with interventional devices and expose surgeons to heavy X-ray radiation. Accurate delivery of the interventional devices and avoidance of radiation are especially important for surgeons. This paper presents a novel dedicated dual-finger robotic hand (DRH) and a console to assist surgeons in delivering interventional devices in PCIs. The system is designed in a master-slave fashion, which helps reduce surgeons' exposure to radiation. The mechanism of the DRH is bio-inspired, and its motions are kinematically decoupled. In PCI procedures, the accuracy of guidewire delivery and catheter tip placement has significant effects on the surgical results. The performance of the DRH in delivering the guidewire and the balloon/stent catheter was evaluated through three surgical manipulations. The results show that the DRH can deliver the guidewire and the balloon/stent catheter precisely.