Author name cluster

Dong He

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

2 author rows

IROS Conference 2025 Conference Paper

Autonomous Suturing Method for Robot-Assisted Minimally Invasive Surgery

Mei Feng
Haoju Li
Yao Li
Kun Yang
Dong He
Xiuquan Lu

Robot-assisted minimally invasive surgery is widely used because of its superior postoperative recovery outcomes. However, the workload for surgeons remains high. The development of autonomous suturing capabilities in surgical robots is poised to significantly reduce surgeon workload. In this study, we present a novel method for autonomous suturing using a minimally invasive surgical robot. We quantify the surgical suturing requirements and propose corresponding metrics for evaluating the suturing effect. We also dynamically adjust the stitch position to optimize the robot's autonomous suturing scheme. Furthermore, we employ particle swarm algorithms to optimize the grasping posture of the surgical instruments, enabling the robot to achieve optimal suture needle clamping. On the suturing parametric indices, our method matches the level of an expert operator when suturing two types of wounds: gauze and egg membrane. The experimental results show that the stitching effect of our autonomous suturing method is already close to that of surgeons using the same robot, and that it remains consistent across multiple sets of experiments. The method is currently deployed on our own surgical robot and can be generalized to other surgical robots, laying the foundation for surgical robots to achieve fully autonomous surgery.


EAAI Journal 2024 Journal Article

Digital-analog driven multi-scale transfer for smart bearing fault diagnosis

Wenbin Huang
Zixian Li
Xiaoxi Ding
Dong He
Qihang Wu
Jing Liu

Self-diagnosis and self-decision are crucial to smart bearings, for which intelligent and robust models should be built and deployed on the smart bearing chip for online operation at the edge. However, this process requires a large amount of labeled prior data to train the fault identification model. Although existing digital-analog driven transfer learning methods can realize fault identification with small samples, these algorithms mainly focus on reducing the difference between the two domains and do not form a complete, applicable method for smart bearing fault diagnosis. To address these issues, a digital-analog driven multi-scale transfer (DaD-MsT) method is proposed for smart bearing fault diagnosis. Unlike conventional methods, it is achieved through end-side and edge-side cooperation, and the effect of transfer diagnosis is further improved by the proposed deep branch transfer network (DBTN) model. First, the smart bearing dynamic model is established, and the dynamic model response is obtained for use as source domain data on the end side. Then, the DBTN model is used to realize more effective digital-analog driven transfer learning. Finally, the trained model is deployed on the edge chip of the smart bearing for real-time fault identification and parameter fine-tuning. Experiments and comparisons verify the effectiveness of the proposed method in the case of small-sample data. Additionally, an online edge intelligent diagnosis system is built to illustrate the method's applicability to smart bearing intelligent diagnosis in practice.


IROS Conference 2024 Conference Paper

LiDAR-camera Online Calibration by Representing Local Feature and Global Spatial Context

SeongJoo Moon
Sebin Lee
Dong He
Sung-Eui Yoon

LiDAR-camera calibration plays a crucial role in autonomous driving. However, operation-induced factors such as physical vibrations and temperature variations degrade the pre-deployment calibration accuracy, leading to deteriorated environmental perception performance. Recent recalibration methods have achieved online calibration without a target board by leveraging the relative attributes of LiDAR and camera. We propose a novel framework for LiDAR-camera online calibration which employs a Transformer network to learn crucial interactions between cameras and LiDAR sensors. Additionally, our framework design enables effective calibration by utilizing correspondence point information between the two sensors. This allows the utilization of global spatial context and achieves high performance by integrating information across modalities. Experimental results indicate that our method demonstrates superior performance compared to state-of-the-art benchmarks.


NeurIPS Conference 2024 Conference Paper

Task Me Anything

Jieyu Zhang
Weikai Huang
Zixian Ma
Oscar Michel
Dong He
Tanmay Gupta
Wei-Chiu Ma
Ali Farhadi

Benchmarks for large multimodal language models (MLMs) now serve to simultaneously assess the general capabilities of models instead of evaluating a specific capability. As a result, when developers want to identify which models to use for their application, they are overwhelmed by the number of benchmarks and remain uncertain about which benchmark's results are most reflective of their specific use case. This paper introduces Task-Me-Anything, a benchmark generation engine which produces a benchmark tailored to a user's needs. Task-Me-Anything maintains an extendable taxonomy of visual assets and can programmatically generate a vast number of task instances. Additionally, it algorithmically addresses user queries regarding MLM performance efficiently within a computational budget. It contains 113K images, 10K videos, 2K 3D object assets, over 365 object categories, 655 attributes, and 335 relationships. It can generate 500M image/video question-answering pairs, which focus on evaluating MLM perceptual capabilities. Task-Me-Anything reveals critical insights: open-source MLMs excel in object and attribute recognition but lack spatial and temporal understanding; each model exhibits unique strengths and weaknesses; larger models generally perform better, though exceptions exist; and GPT-4o demonstrates challenges in recognizing rotating/moving objects and distinguishing colors.
