
AAAI 2025

Multi-view Consistent 3D Panoptic Scene Understanding

Conference Paper
AAAI Technical Track on Computer Vision V

Abstract

3D panoptic scene understanding seeks to create novel view images with 3D-consistent panoptic segmentation, which is crucial for many vision and robotics applications. Mainstream methods (e.g., Panoptic Lifting) directly use machine-generated 2D panoptic segmentation masks as training labels. However, these generated masks often exhibit multi-view inconsistencies, leading to ambiguities during the optimization process. To address this, we present Multi-view Consistent 3D Panoptic Scene Understanding (MVC-PSU), featuring two key components: 1) Probabilistic Semantic Aligner, which associates semantic information of corresponding pixels across multiple views by probabilistic alignment to ensure that predicted panoptic segmentation masks are consistent across different views. 2) Geometric Consistency Enforcer, which uses multi-view projection and monocular depth consistency to ensure that the geometry of the reconstructed scene is accurate and consistent across different views. Experimental results demonstrate that the proposed MVC-PSU surpasses state-of-the-art methods on the ScanNet, Replica, and HyperSim datasets.
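The abstract gives no implementation details for the Probabilistic Semantic Aligner, only that it aligns the semantic predictions of corresponding pixels across views. As a rough, hypothetical illustration of what such a cross-view semantic consistency objective could look like (the function names and the symmetric-KL choice are assumptions, not the authors' actual formulation), a minimal NumPy sketch:

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-8):
    """Symmetric KL divergence between per-pixel class distributions.

    p, q: arrays of shape (N, C) holding softmax class probabilities for
    N corresponding pixels observed in two different views.
    """
    p = np.clip(p, eps, 1.0)  # avoid log(0)
    q = np.clip(q, eps, 1.0)
    kl_pq = np.sum(p * np.log(p / q), axis=-1)
    kl_qp = np.sum(q * np.log(q / p), axis=-1)
    return 0.5 * (kl_pq + kl_qp)

def cross_view_semantic_loss(probs_a, probs_b):
    """Mean divergence over matched pixels: low when the views agree."""
    return float(np.mean(symmetric_kl(probs_a, probs_b)))

# Matched pixels whose class distributions agree incur near-zero loss,
# while multi-view inconsistency (the ambiguity the paper targets) is
# penalized. Example with 3 classes:
agree = np.array([[0.9, 0.05, 0.05]])
conflict = np.array([[0.05, 0.9, 0.05]])
loss_same = cross_view_semantic_loss(agree, agree)      # ~0
loss_diff = cross_view_semantic_loss(agree, conflict)   # large
```

Minimizing such a term over pairs of corresponding pixels would push the machine-generated 2D masks toward the multi-view consistency that MVC-PSU enforces; the paper's actual probabilistic alignment may differ substantially.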

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
AAAI Conference on Artificial Intelligence
Archive span
1980-2026
Indexed papers
28718
Paper id
400944356148274335