IROS 2025 Conference Paper
Dynamic Action Localization and Recognition for Intelligent Perception of Surgical Robots
- Yaqin Peng
- Gui-Bin Bian
- Zhen Li
- Ruichen Ma
- Qiang Ye
Robot-assisted surgery has significantly advanced surgical precision, yet the development of autonomous surgical robots remains hindered by their limited understanding of complex surgical actions. Current systems cannot effectively perceive and interpret intricate surgical relationships, which restricts their ability to assist surgeons in dynamic surgical environments. To overcome these challenges, a novel self-supervised learning method for surgical action recognition is proposed to enhance the understanding of surgical actions. The method introduces a dynamic masking with attention-based action localization module that focuses the model on the critical spatial regions where actions occur, enabling surgical-view guidance for intelligent surgical robots while extracting key features. Moreover, a graph-enhanced adaptive feature selection module assigns relevance to features and captures the temporal relationships between adjacent frames. Long Short-Term Memory is used to model long-term dependencies across video sequences, while multi-view contrastive learning facilitates the extraction of discriminative features from both masked and unmasked sequences. Experimental results demonstrate a 3.4% improvement in Average Precision and an Area Under the Receiver Operating Characteristic Curve of 92.9% on the Neuro67 dataset for surgical action recognition. The method enables dynamic adjustment of the surgical view, achieving surgical visual navigation. These advancements contribute to the development of intelligent, autonomous surgical robots capable of assisting surgeons in complex and dynamic surgical settings.
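The two ideas named in the abstract that lend themselves to a compact illustration are attention-based dynamic masking and contrastive learning between the masked and unmasked views. The sketch below is a minimal NumPy toy of those two steps only, not the authors' implementation: the function names, tensor shapes, keep ratio, and temperature are all assumptions, and the graph-enhanced feature selection and LSTM stages are omitted.

```python
import numpy as np

def attention_mask(frames, attn, keep_ratio=0.5):
    """Dynamic masking (illustrative): keep only the spatial regions with the
    highest attention scores and zero out the rest.
    frames: (T, R, D) = T frames, R regions, D-dim features; attn: (T, R)."""
    T, R, D = frames.shape
    k = max(1, int(R * keep_ratio))
    masked = np.zeros_like(frames)
    for t in range(T):
        top = np.argsort(attn[t])[-k:]  # indices of the k most-attended regions
        masked[t, top] = frames[t, top]
    return masked

def info_nce(z1, z2, tau=0.1):
    """InfoNCE-style multi-view contrastive loss between per-frame features of
    the unmasked and masked views. z1, z2: (N, D); matched rows are positives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                         # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))               # pull matched pairs together

rng = np.random.default_rng(0)
frames = rng.normal(size=(4, 6, 8))  # 4 frames, 6 regions, 8-dim features
attn = rng.random(size=(4, 6))       # stand-in for learned attention scores
masked = attention_mask(frames, attn)
loss = info_nce(frames.mean(axis=1), masked.mean(axis=1))
```

In the actual method the masked and unmasked sequences would pass through the feature selection and LSTM stages before the contrastive loss; here the per-frame region average stands in for those learned features.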