RAPTR: Radar-based 3D Pose Estimation using Transformer

Sorachi Kato; Ryoma Yataka; Pu Wang; Pedro Miraldo; Takuya Fujihashi; Petros Boufounos

NeurIPS 2025

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

Radar-based indoor 3D human pose estimation has typically relied on fine-grained 3D keypoint labels, which are costly to obtain, especially in complex indoor settings involving clutter, occlusions, or multiple people. In this paper, we propose RAPTR (RAdar Pose esTimation using tRansformer), which is trained under weak supervision using only 3D BBox and 2D keypoint labels, both considerably easier and more scalable to collect. RAPTR is characterized by a two-stage pose decoder architecture with a pseudo-3D deformable attention that enhances pose and joint queries with multi-view radar features: a pose decoder estimates initial 3D poses with a 3D template loss designed to utilize the 3D BBox labels and mitigate depth ambiguities, and a joint decoder refines the initial poses with 2D keypoint labels and a 3D gravity loss. Evaluated on two indoor radar datasets, RAPTR outperforms existing methods, reducing joint position error by 34.3% on HIBER and 76.9% on MMVR. Our implementation is available at https://github.com/merlresearch/radar-pose-transformer.
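
To make the reported numbers concrete, the following is a minimal sketch of how a joint position error and a relative reduction could be computed. It assumes the error is the mean Euclidean distance between predicted and ground-truth 3D joints, which is a common convention but is not spelled out in the abstract. The function names `mean_joint_position_error` and `relative_reduction` are hypothetical and are not taken from the RAPTR repository.

```python
import numpy as np


def mean_joint_position_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints.

    Assumed convention (not confirmed by the abstract): both inputs have
    shape (num_poses, num_joints, 3) and share the same units.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Per-joint distance, then average over all joints and poses.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error
```

Under this reading, a 34.3% reduction means the new error is 65.7% of the baseline error on the same dataset; the absolute error values themselves are not given in the abstract.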
