IROS 2024

DAP: Diffusion-based Affordance Prediction for Multi-modality Storage

Conference Paper · Accepted Paper · Artificial Intelligence · Robotics

Abstract

Solving storage problems, where objects must be accurately placed into containers with precise orientations and positions, presents a distinct challenge that extends beyond traditional rearrangement tasks. The difficulty stems primarily from the need for fine-grained 6D manipulation and from the inherent multi-modality of the solution space: multiple viable goal configurations exist for the same storage container. Existing methods either struggle with this multi-modality or require computation-intensive training. We present Diffusion-based Affordance Prediction (DAP), a pipeline for the multi-modal object storage problem. DAP takes a two-step approach: it first identifies a placeable region on the container and then computes the precise relative pose between the object and that region. Our experiments on the RPDiff benchmark demonstrate that DAP outperforms the current state-of-the-art method, RPDiff, in both task performance and training efficiency. Further experiments show DAP’s data efficiency in real-world applications, an advancement over existing simulation-driven approaches. Our contribution addresses a gap in robotic manipulation research by offering a solution that is both computationally efficient and capable of handling real-world variability. Code and supplementary material are available at https://github.com/changhaonan/DPS.git.
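
The multi-modality issue and the two-step structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1D container, the names `SLOT_CENTERS`, `SLOT_HALF_WIDTH`, `is_valid_placement`, `mean_regression_prediction`, and `two_step_prediction` are all hypothetical, and plain uniform sampling stands in for the learned diffusion models. It only shows why averaging over several valid goals can yield an invalid placement, while first committing to one region and then predicting a pose relative to it does not.

```python
import random

# Hypothetical toy container: two slots along one axis, each a valid goal.
SLOT_CENTERS = [-1.0, 1.0]
SLOT_HALF_WIDTH = 0.2


def is_valid_placement(x):
    """A placement is valid if it falls inside any slot."""
    return any(abs(x - c) <= SLOT_HALF_WIDTH for c in SLOT_CENTERS)


def mean_regression_prediction():
    """A single-output regressor trained with squared error on both
    modes tends toward their average, which lies between the slots."""
    return sum(SLOT_CENTERS) / len(SLOT_CENTERS)


def two_step_prediction(rng):
    """Toy stand-in for a two-step pipeline (assumption: uniform
    sampling replaces the learned models).
    Step 1: choose one placeable region.
    Step 2: predict an offset relative to that region."""
    region = rng.choice(SLOT_CENTERS)
    # Keep the offset well inside the slot to avoid boundary effects.
    offset = rng.uniform(-0.5 * SLOT_HALF_WIDTH, 0.5 * SLOT_HALF_WIDTH)
    return region + offset


if __name__ == "__main__":
    rng = random.Random(0)
    print("mean prediction valid:", is_valid_placement(mean_regression_prediction()))
    samples = [two_step_prediction(rng) for _ in range(5)]
    print("two-step samples valid:", all(is_valid_placement(s) for s in samples))
```

In this toy setting the averaged prediction lands at the midpoint between the two slots, outside both, whereas every two-step sample stays inside one of them.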

Keywords

  • Training
  • Codes
  • Affordances
  • Pipelines
  • Pose estimation
  • Containers
  • Prediction algorithms
  • Diffusion models
  • Computational efficiency
  • Intelligent robots
  • Training Efficiency
  • Relative Pose
  • Prediction Pipeline
  • Storage Problems
  • Multimodal Problems
  • Denoising
  • Bimodal
  • Point Cloud
  • Bounding Box
  • Diffusion Model
  • Days Of Training
  • Coordinates Of Points
  • Semantic Segmentation
  • Target Object
  • Labeled Data
  • Robotic Arm
  • Instance Segmentation
  • Dishwasher
  • Transformer Architecture
  • Conditional Variational Autoencoder
  • Point Cloud Registration
  • Neural Field
  • Pose Prediction
  • Point Cloud Segmentation
  • Gaussian Noise
  • Training Objective
  • Attention Mechanism

Context

Venue
IEEE/RSJ International Conference on Intelligent Robots and Systems
Archive span
1988-2025
Indexed papers
26578
Paper id
1135221680222286823