
AAAI 2025

Multi-view Consistent 3D Panoptic Scene Understanding

Conference Paper
AAAI Technical Track on Computer Vision V

Abstract

3D panoptic scene understanding seeks to create novel view images with 3D-consistent panoptic segmentation, which is crucial for many vision and robotics applications. Mainstream methods (e.g., Panoptic Lifting) directly use machine-generated 2D panoptic segmentation masks as training labels. However, these generated masks often exhibit multi-view inconsistencies, leading to ambiguities during the optimization process. To address this, we present Multi-view Consistent 3D Panoptic Scene Understanding (MVC-PSU), featuring two key components: 1) Probabilistic Semantic Aligner, which associates semantic information of corresponding pixels across multiple views by probabilistic alignment to ensure that predicted panoptic segmentation masks are consistent across different views. 2) Geometric Consistency Enforcer, which uses multi-view projection and monocular depth consistency to ensure that the geometry of the reconstructed scene is accurate and consistent across different views. Experimental results demonstrate that the proposed MVC-PSU surpasses state-of-the-art methods on the ScanNet, Replica, and HyperSim datasets.
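The abstract gives no implementation details for the Probabilistic Semantic Aligner, only that it aligns the semantic predictions of corresponding pixels across views. As a rough, hypothetical illustration of what such a cross-view semantic consistency objective could look like (the function names and the symmetric-KL choice are assumptions, not the authors' actual formulation), a minimal NumPy sketch:

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-8):
    """Symmetric KL divergence between per-pixel class distributions.

    p, q: arrays of shape (N, C) holding softmax class probabilities for
    N corresponding pixels observed in two different views.
    """
    p = np.clip(p, eps, 1.0)  # avoid log(0)
    q = np.clip(q, eps, 1.0)
    kl_pq = np.sum(p * np.log(p / q), axis=-1)
    kl_qp = np.sum(q * np.log(q / p), axis=-1)
    return 0.5 * (kl_pq + kl_qp)

def cross_view_semantic_loss(probs_a, probs_b):
    """Mean divergence over matched pixels: low when the views agree."""
    return float(np.mean(symmetric_kl(probs_a, probs_b)))

# Matched pixels whose class distributions agree incur near-zero loss,
# while multi-view inconsistency (the ambiguity the paper targets) is
# penalized. Example with 3 classes:
agree = np.array([[0.9, 0.05, 0.05]])
conflict = np.array([[0.05, 0.9, 0.05]])
loss_same = cross_view_semantic_loss(agree, agree)      # ~0
loss_diff = cross_view_semantic_loss(agree, conflict)   # large
```

Minimizing such a term over pairs of corresponding pixels would push the machine-generated 2D masks toward the multi-view consistency that MVC-PSU enforces; the paper's actual probabilistic alignment may differ substantially.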

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
AAAI Conference on Artificial Intelligence
Archive span
1980-2026
Indexed papers
28718
Paper id
400944356148274335