ICRA 2025

TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image

Conference Paper Accepted Paper Artificial Intelligence · Robotics

Abstract

Manipulating transparent objects is challenging because their reflection and refraction properties considerably hinder accurate estimation of their 3D shapes. To address this, we propose TransDiff, a single-view RGB-D-based depth completion framework that leverages Denoising Diffusion Probabilistic Models (DDPM) to achieve material-agnostic object grasping in desktop settings. Specifically, we use features extracted from RGB images, including semantic segmentation, edge maps, and normal maps, to condition the depth map generation process. Our method learns an iterative denoising process that transforms a random depth distribution into a depth map, guided by initially refined depth information, which yields more accurate depth estimation in scenes containing transparent objects. Additionally, we propose a novel training method to better align the noisy depth and the RGB image features, which serve as conditions for refining the depth estimate step by step. Finally, we use an improved inference process to accelerate the denoising procedure. Through comprehensive experimental validation, we demonstrate that our method significantly outperforms the baselines on both synthetic and real-world benchmarks with acceptable inference time. A demo of our method is available at: https://wang-haoxiao.github.io/TransDiff/
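
To illustrate the kind of conditioned iterative denoising the abstract describes, the following is a minimal sketch of standard DDPM ancestral sampling applied to a depth map. It is not the TransDiff implementation: the function names, the linear noise schedule, the step count, and the `cond` dictionary are all assumptions made for illustration, and `toy_noise_predictor` is a hypothetical stand-in for a trained network that would also consume the segmentation, edge, and normal features. The paper's accelerated inference process is not reproduced here.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    # Linear variance schedule (an assumed, commonly used DDPM choice).
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_noise_predictor(noisy_depth, cond, t, alpha_bars):
    # Hypothetical stand-in for a trained network eps_theta(x_t, cond, t).
    # It treats cond["refined_depth"] as the clean depth and returns the
    # noise that would explain noisy_depth under that assumption.
    clean = cond["refined_depth"]
    return (noisy_depth - np.sqrt(alpha_bars[t]) * clean) / np.sqrt(1.0 - alpha_bars[t])

def sample_depth(cond, shape, num_steps=50, seed=0):
    # Reverse process: start from Gaussian noise and denoise step by step.
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(shape)
    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, cond, t, alpha_bars)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Add noise with variance beta_t on every step except the last.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(shape)
        else:
            x = mean
    return x

if __name__ == "__main__":
    # Depth values here are in arbitrary normalized units.
    refined = np.full((4, 4), 0.5)
    depth = sample_depth({"refined_depth": refined}, refined.shape)
    print(np.abs(depth - refined).max())
```

Because the toy predictor is an oracle built from the conditioning depth, the final step returns that depth up to floating-point error. A learned predictor would instead have to infer the noise from the noisy depth and the RGB-derived features.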

Authors

Keywords

  • Training
  • Accuracy
  • Three-dimensional displays
  • Depth measurement
  • Noise reduction
  • Grasping
  • Transforms
  • Feature extraction
  • Reflection
  • Noise measurement
  • RGB-D Images
  • Transparent Objects
  • Single RGB-D Image
  • Semantic Segmentation
  • RGB Images
  • Depth Map
  • Depth Information
  • Depth Estimation
  • Noisy Images
  • Normal Map
  • Neural Network
  • Root Mean Square Error
  • Mean Square Error
  • Diffusion Process
  • Object Detection
  • Mean Absolute Error
  • Global Optimization
  • Point Cloud
  • Reversible Process
  • Diffusion Model
  • Surface Normals
  • Visual Conditions
  • Geometric Information
  • Latent Space
  • Depth Camera
  • Iterative Refinement
  • Inference Step
  • Scene Details
  • Robot Manipulator
  • Object Depth

Context

Venue: IEEE International Conference on Robotics and Automation
Archive span: 1984-2025
Indexed papers: 30179
Paper id: 994823065922730865