IROS 2025

Learning Generalizable 3D Manipulation With 10 Demonstrations

Conference Paper Accepted Paper Artificial Intelligence · Robotics

Abstract

Learning robust and generalizable manipulation skills from few demonstrations remains a key challenge in robotics, with broad applications in industrial automation and service robotics. Although recent imitation learning methods have achieved impressive results, they often require large amounts of demonstration data and struggle to generalize across spatial variations. In this work, we propose a framework that learns 3D manipulation policies from only 10 demonstrations while generalizing robustly to unseen spatial configurations through semantic-guided perception and spatially equivariant policy learning. Our framework consists of two key modules: a Semantic Guided Perception module, which extracts task-aware 3D representations from RGB-D inputs using semantic priors, and a Spatial Generalized Decision module, which implements a diffusion-based policy that preserves spatial equivariance through denoising. Central to our framework is a spatially equivariant training strategy, which adapts 2D data augmentation principles to 3D manipulation by maintaining gripper-object spatial relationships during trajectory augmentation. We validate our framework through extensive experiments on both simulation benchmarks and real-world robotic systems. Our method achieves a significant improvement in success rates over state-of-the-art approaches on a series of challenging tasks, particularly under large object pose variations. This work shows strong potential to advance efficient and generalizable manipulation skill learning in real-world applications.
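
The abstract describes trajectory augmentation that maintains gripper-object spatial relationships. One way to read that idea is sketched below, under stated assumptions: a single rigid transform is sampled and applied to both the scene point cloud and every gripper pose in a demonstration, so the object seen from the gripper frame is unchanged. The function names (`random_se3`, `augment_demo`, `points_in_gripper_frame`), the restriction to yaw rotation plus planar translation, and the choice to transform all points rather than only object points are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def random_se3(max_angle_rad, max_shift, rng):
    """Sample a rigid transform: rotation about z plus an x-y shift.

    Restricting to yaw and planar translation is an assumption made
    for this sketch, not something stated in the abstract.
    """
    theta = rng.uniform(-max_angle_rad, max_angle_rad)
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:2, 3] = rng.uniform(-max_shift, max_shift, size=2)
    return T


def augment_demo(points, gripper_poses, T):
    """Apply the same rigid transform to the points and the trajectory.

    points:        (N, 3) scene points in the world frame.
    gripper_poses: (K, 4, 4) homogeneous gripper poses in the world frame.
    """
    R, t = T[:3, :3], T[:3, 3]
    new_points = points @ R.T + t
    # matmul broadcasts (4, 4) @ (K, 4, 4) -> (K, 4, 4)
    new_poses = T @ gripper_poses
    return new_points, new_poses


def points_in_gripper_frame(points, pose):
    """Express world-frame points in the frame of one gripper pose."""
    R, t = pose[:3, :3], pose[:3, 3]
    # Row-vector form of R^T (p - t)
    return (points - t) @ R
```

Because the points and the poses move together, `points_in_gripper_frame` returns the same values before and after augmentation, which is the invariant this sketch is meant to illustrate. A policy trained on such pairs sees new absolute configurations while the relative geometry it must act on stays fixed.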

Authors

Keywords

  • Training
  • Three-dimensional displays
  • Service robots
  • Imitation learning
  • Semantics
  • Noise reduction
  • Benchmark testing
  • Data augmentation
  • Trajectory
  • Intelligent robots
  • 3D Manipulation
  • Spatial Variation
  • Equivalency
  • Real-world Systems
  • 3D Representation
  • Policy Learning
  • Object Pose
  • Improve Success Rates
  • Benchmark Simulation
  • Robotics Challenge
  • Training Data
  • Task Performance
  • Reference Frame
  • Simulation Experiments
  • Point Cloud
  • Bounding Box
  • Diffusion Model
  • RGB-D Images
  • Manipulation Tasks
  • 3D Rotation
  • Rotation Vector
  • Foundation Model
  • Cartesian Position
  • 3D Translation
  • Background Points
  • Real-world Experiments
  • Robotic Arm

Context

Venue: IEEE/RSJ International Conference on Intelligent Robots and Systems
Archive span: 1988-2025
Indexed papers: 26578
Paper id: 428687899449588792