Arrow Research search

Author name cluster

Wangmeng Zuo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

54 papers
2 author rows

Possible papers

54

AAAI Conference 2026 Conference Paper

Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video

  • Renlong Wu
  • Zhilu Zhang
  • Mingyang Chen
  • Zifei Yan
  • Wangmeng Zuo

Recent 4D reconstruction methods have yielded impressive results but rely on sharp videos as supervision. However, motion blur often occurs in videos due to camera shake and object movement, and existing methods render blurry results when using such videos for reconstructing 4D models. Although a few approaches have attempted to address the problem, they struggle to produce high-quality results due to the inaccuracy of estimating continuous dynamic representations within the exposure time. Encouraged by recent work on 3D motion trajectory modeling using 3D Gaussian Splatting (3DGS), we take 3DGS as the scene representation and propose Deblur4DGS to obtain a high-quality 4D model from blurry monocular video. Specifically, we transform the estimation of continuous dynamic representations within an exposure time into exposure time estimation. Moreover, we introduce an exposure regularization term as well as multi-frame and multi-resolution consistency regularization terms to avoid trivial solutions. Furthermore, to better represent objects with large motion, we suggest blur-aware variable canonical Gaussians. Beyond novel-view synthesis, Deblur4DGS can be applied to improve blurry video from multiple perspectives, including deblurring, frame interpolation, and video stabilization. Extensive experiments on both synthetic and real-world data for the above four tasks show that Deblur4DGS outperforms state-of-the-art 4D reconstruction methods.

AAAI Conference 2026 Conference Paper

Perceptual Quality Assessment of 3D Gaussian Splatting: A Subjective Dataset and Prediction Metric

  • Zhaolin Wan
  • Yining Diao
  • Jingqi Xu
  • Hao Wang
  • Zhiyang Li
  • Xiaopeng Fan
  • Wangmeng Zuo
  • Debin Zhao

With the rapid advancement of 3D visualization, 3D Gaussian Splatting (3DGS) has emerged as a leading technique for real-time, high-fidelity rendering. While prior research has emphasized algorithmic performance and visual fidelity, the perceptual quality of 3DGS-rendered content, especially under varying reconstruction conditions, remains largely underexplored. In practice, factors such as viewpoint sparsity, limited training iterations, point downsampling, noise, and color distortions can significantly degrade visual quality, yet their perceptual impact has not been systematically studied. To bridge this gap, we present 3DGS-QA, the first subjective quality assessment dataset for 3DGS. It comprises 225 degraded reconstructions across 15 object types, enabling a controlled investigation of common distortion factors. Based on this dataset, we introduce a no-reference quality prediction model that directly operates on native 3D Gaussian primitives, without requiring rendered images or ground-truth references. Our model extracts spatial and photometric cues from the Gaussian representation to estimate perceived quality in a structure-aware manner. We further benchmark existing quality assessment methods, spanning both traditional and learning-based approaches. Experimental results show that our method consistently achieves superior performance, highlighting its robustness and effectiveness for 3DGS content evaluation. The dataset and code are made publicly available to facilitate future research in 3DGS quality assessment.

AAAI Conference 2026 Conference Paper

RefSTAR: Blind Face Image Restoration with Reference Selection, Transfer, and Reconstruction

  • Zhicun Yin
  • Junjie Chen
  • Ming Liu
  • Zhixin Wang
  • Fan Li
  • Renjing Pei
  • Xiaoming Li
  • Rynson W. H. Lau

Introducing high-quality references can largely alleviate the uncertainty in blind face image restoration tasks, yet the equivocal utilization of reference priors still makes it difficult to preserve the human identity well. We attribute the identity inconsistency to two deficiencies of existing reference-based face restoration methods, namely the inability to effectively determine which features need to be transferred, and the failure to preserve the structure and details of the selected features. This work mainly focuses on these two issues, and we present a novel blind face image restoration method that considers reference selection, transfer, and reconstruction (RefSTAR) to introduce proper features from reference images. Specifically, we construct a reference selection (RefSel) module, which can generate accurate masks to select reference features. For training the RefSel module, we construct a RefSel-HQ dataset through a mask generation pipeline, which contains annotated masks for 10,000 ground truth-reference pairs. To guarantee the exact introduction of selected reference features, a feature fusion paradigm is designed for reference feature transferring, and a Mask-Compatible Cycle-Consistency Loss is redesigned based on reference reconstruction to further ensure the presence of selected reference image features in the output image. Experiments on various backbone models demonstrate superior performance, showing better identity preservation ability and reference feature transfer quality.

NeurIPS Conference 2025 Conference Paper

BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset

  • Zhiheng Xi
  • Guanyu Li
  • Yutao Fan
  • Honglin Guo
  • Yufang Liu
  • Xiaoran Fan
  • Jiaqi Liu
  • Wangmeng Zuo

In this paper, we introduce BMMR, a large-scale bilingual, multimodal, multi-disciplinary reasoning dataset for the community to develop and evaluate large multimodal models (LMMs). BMMR comprises 100k university-level questions drawn from 300 UNESCO-defined subjects, spanning diverse formats—multiple-choice, fill-in-the-blank, and open-ended QA—and sourced from both print and digital media such as books, exams, and quizzes. All data are curated and filtered via a human-in-the-loop, automated, and scalable framework, and each instance is paired with a high-quality reasoning path. The dataset is organized into two parts: BMMR-Eval, which comprises 20k high-quality instances to comprehensively assess LMMs’ knowledge and reasoning across multiple disciplines in both Chinese and English; and BMMR-Train, which contains 80k instances to support further research and development, extending the current focus on mathematical reasoning to diverse disciplines and domains. In addition, we propose the process-based multi-discipline BMMR-Verifier for accurate and fine-grained evaluation of LMMs’ reasoning. Extensive experiments reveal that (i) even SOTA models leave substantial headroom on BMMR-Eval; (ii) reasoning models exhibit discipline bias and outperform LMMs only on specific subjects; (iii) open-source models still trail their proprietary counterparts; and (iv) fine-tuning on BMMR-Train narrows this gap. Additionally, we conduct reasoning-chain analyses using BMMR-Verifier and other in-depth studies, uncovering the challenges LMMs currently face in multidisciplinary reasoning. We will release the data and models, and we believe our work can offer valuable insights and contributions to the community.

AAAI Conference 2025 Conference Paper

DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors

  • Tianyu Huang
  • Haoze Zhang
  • Yihan Zeng
  • Zhilu Zhang
  • Hui Li
  • Wangmeng Zuo
  • Rynson W. H. Lau

Dynamic 3D interaction has been attracting a lot of attention recently. However, creating such 4D content remains challenging. One solution is to animate 3D scenes with physics-based simulation, which requires manually assigning precise physical properties to the object; otherwise, the simulated results become unnatural. Another solution is to learn the deformation of 3D objects with the distillation of video generative models, which, however, tends to produce 3D videos with small and discontinuous motions due to the inappropriate extraction and application of physics priors. In this work, to combine the strengths of the above two solutions and complement their shortcomings, we propose to learn the physical properties of a material field with video diffusion priors, and then utilize a physics-based Material-Point-Method (MPM) simulator to generate 4D content with realistic motions. In particular, we propose motion distillation sampling to emphasize video motion information during distillation. In addition, to facilitate the optimization, we further propose a KAN-based material field with frame boosting. Experimental results demonstrate that our method produces more realistic motions than state-of-the-art methods.

ICLR Conference 2025 Conference Paper

Exposure Bracketing Is All You Need For A High-Quality Image

  • Zhilu Zhang 0001
  • Shuohao Zhang
  • Renlong Wu
  • Zifei Yan
  • Wangmeng Zuo

It is highly desired but challenging to acquire high-quality photos with clear content in low-light environments. Although multi-image processing methods (using burst, dual-exposure, or multi-exposure images) have made significant progress in addressing this issue, they typically focus on specific restoration or enhancement problems, and do not fully explore the potential of utilizing multiple images. Motivated by the fact that multi-exposure images are complementary in denoising, deblurring, high dynamic range imaging, and super-resolution, we propose to utilize exposure bracketing photography to get a high-quality image by combining these tasks in this work. Due to the difficulty in collecting real-world pairs, we suggest a solution that first pre-trains the model with synthetic paired data and then adapts it to real-world unlabeled images. In particular, a temporally modulated recurrent network (TMRNet) and self-supervised adaptation method are proposed. Moreover, we construct a data simulation pipeline to synthesize pairs and collect real-world images from 200 nighttime scenarios. Experiments on both datasets show that our method performs favorably against the state-of-the-art multi-image processing ones. Code and datasets are available at https://github.com/cszhilu1998/BracketIRE.

ICLR Conference 2025 Conference Paper

Federated Residual Low-Rank Adaptation of Large Language Models

  • Yunlu Yan
  • Chun-Mei Feng 0001
  • Wangmeng Zuo
  • Rick Siow Mong Goh
  • Yong Liu 0026
  • Lei Zhu 0003

Low-Rank Adaptation (LoRA) presents an effective solution for federated fine-tuning of Large Language Models (LLMs), as it substantially reduces communication overhead. However, a straightforward combination of FedAvg and LoRA results in suboptimal performance, especially under data heterogeneity. We note that this stems from both intrinsic (i.e., constrained parameter space) and extrinsic (i.e., client drift) limitations, which hinder it from effectively learning global knowledge. In this work, we propose a novel Federated Residual Low-Rank Adaptation method, namely FRLoRA, to tackle the above two limitations. It directly sums the weights of the global model parameters with a residual low-rank matrix product (i.e., the weight change) during the global update step, and synchronizes this update for all local models. In this way, FRLoRA performs global updates in a higher-rank parameter space, enabling a better representation of complex knowledge structures. Furthermore, FRLoRA reinitializes the local low-rank matrices with the principal singular values and vectors of the pre-trained weights in each round to calibrate their inconsistent convergence, thereby mitigating client drift. Our extensive experiments demonstrate that FRLoRA consistently outperforms various state-of-the-art FL methods across nine different benchmarks in natural language understanding and generation under different FL scenarios.
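The two mechanisms named in the abstract (folding the averaged residual low-rank product into the global weights, and reinitializing local factors from the principal singular components of the pretrained weights) can be illustrated with a minimal single-layer sketch. The function names and shapes below are hypothetical simplifications, not the authors' implementation.

```python
import torch

def global_residual_update(w_global, lora_pairs):
    """Hypothetical sketch: fold the averaged residual low-rank product
    (B @ A, i.e. the weight change) directly into the global weights,
    so the global update happens in the full-rank parameter space."""
    delta = torch.stack([B @ A for (A, B) in lora_pairs]).mean(dim=0)
    return w_global + delta

def reinit_local_factors(w_pretrained, rank):
    """Hypothetical sketch: reinitialize local low-rank matrices from the
    principal singular values/vectors of the pretrained weight, so that all
    clients restart each round from a consistent subspace."""
    U, S, Vh = torch.linalg.svd(w_pretrained, full_matrices=False)
    B = U[:, :rank] * S[:rank].sqrt()              # (out, r)
    A = S[:rank].sqrt().unsqueeze(1) * Vh[:rank]   # (r, in)
    return A, B

# toy usage with a single 64x32 layer and two simulated clients
w0 = torch.randn(64, 32)
clients = [(torch.randn(4, 32) * 0.01, torch.randn(64, 4) * 0.01) for _ in range(2)]
w1 = global_residual_update(w0, clients)
A, B = reinit_local_factors(w0, rank=4)
print(w1.shape, A.shape, B.shape)
```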

NeurIPS Conference 2025 Conference Paper

LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents

  • Rui Li
  • Zixuan Hu
  • Wenxi Qu
  • Jinouwen Zhang
  • Zhenfei Yin
  • Sha Zhang
  • Xuantuo Huang
  • Hanqing Wang

Scientific embodied agents play a crucial role in modern laboratories by automating complex experimental workflows. Compared to typical household environments, laboratory settings impose significantly higher demands on perception of physical-chemical transformations and long-horizon planning, making them an ideal testbed for advancing embodied intelligence. However, its development has long been hampered by the lack of suitable simulators and benchmarks. In this paper, we address this gap by introducing LabUtopia, a comprehensive simulation and benchmarking suite designed to facilitate the development of generalizable, reasoning-capable embodied agents in laboratory settings. Specifically, it integrates i) LabSim, a high-fidelity simulator supporting multi-physics and chemically meaningful interactions; ii) LabScene, a scalable procedural generator for diverse scientific scenes; and iii) LabBench, a hierarchical benchmark spanning five levels of complexity from atomic actions to long-horizon mobile manipulation. LabUtopia supports 30 distinct tasks and includes more than 200 scene and instrument assets, enabling large-scale training and principled evaluation in high-complexity environments. We demonstrate that LabUtopia offers a powerful platform for advancing the integration of perception, planning, and control in scientific-purpose agents and provides a rigorous testbed for exploring the practical capabilities and generalization limits of embodied intelligence in future research. Project web page: https://rui-li023.github.io/labutopia-site/

NeurIPS Conference 2025 Conference Paper

MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM

  • Bowen Dong
  • Minheng Ni
  • Zitong Huang
  • Guanglei Yang
  • Wangmeng Zuo
  • Lei Zhang

Multimodal hallucination in multimodal large language models (MLLMs) restricts the correctness of MLLMs. However, multimodal hallucinations are multi-sourced and arise from diverse causes. Existing benchmarks fail to adequately distinguish between perception-induced hallucinations and reasoning-induced hallucinations. This failure constitutes a significant issue and hinders the diagnosis of multimodal reasoning failures within MLLMs. To address this, we propose the MIRAGE benchmark, which isolates reasoning hallucinations by constructing questions where input images are correctly perceived by MLLMs yet reasoning errors persist. MIRAGE introduces multi-granular evaluation metrics: accuracy, factuality, and LLM hallucination score for hallucination quantification. Our analysis reveals strong correlations between question types and specific hallucination patterns, particularly systematic failures of MLLMs in spatial reasoning involving complex relationships (e.g., complex geometric patterns across images). This highlights a critical limitation in the reasoning capabilities of current MLLMs and provides targeted insights for hallucination mitigation on specific types. To address these challenges, we propose Logos, a method that combines curriculum reinforcement fine-tuning to encourage models to generate logic-consistent reasoning chains by stepwise reducing learning difficulty, and collaborative hint inference to reduce reasoning complexity. Logos establishes a baseline on MIRAGE, and reduces the logical hallucinations in original base models. Link: https://bit.ly/25mirage

AAAI Conference 2025 Conference Paper

MV-VTON: Multi-View Virtual Try-On with Diffusion Models

  • Haoyu Wang
  • Zhilu Zhang
  • Donglin Di
  • Shiliang Zhang
  • Wangmeng Zuo

The goal of image-based virtual try-on is to generate an image of the target person naturally wearing the given clothing. However, existing methods solely focus on the frontal try-on using the frontal clothing. When the views of the clothing and person are significantly inconsistent, particularly when the person's view is non-frontal, the results are unsatisfactory. To address this challenge, we introduce Multi-View Virtual Try-On (MV-VTON), which aims to reconstruct the dressing results from multiple views using the given clothes. Given that single-view clothes provide insufficient information for MV-VTON, we instead employ two images, i.e., the frontal and back views of the clothing, to encompass the complete view as much as possible. Moreover, we adopt diffusion models that have demonstrated superior abilities to perform our MV-VTON. In particular, we propose a view-adaptive selection method where hard-selection and soft-selection are applied to the global and local clothing feature extraction, respectively. This ensures that the clothing features are roughly fit to the person's view. Subsequently, we suggest joint attention blocks to align and fuse clothing features with person features. Additionally, we collect an MV-VTON dataset MVG, in which each person has multiple photos with diverse views and poses. Experiments show that the proposed method not only achieves state-of-the-art results on the MV-VTON task using our MVG dataset, but also has superiority on the frontal-view virtual try-on task using the VITON-HD and DressCode datasets.

ICLR Conference 2025 Conference Paper

On the Importance of Language-driven Representation Learning for Heterogeneous Federated Learning

  • Yunlu Yan
  • Chun-Mei Feng 0001
  • Wangmeng Zuo
  • Salman H. Khan 0001
  • Yong Liu 0026
  • Lei Zhu 0003

Non-Independent and Identically Distributed (Non-IID) training data significantly challenge federated learning (FL), impairing the performance of the global model in distributed frameworks. Inspired by the superior performance and generalizability of language-driven representation learning in centralized settings, we explore its potential to enhance FL for handling non-IID data. Specifically, this paper introduces FedGLCL, a novel language-driven FL framework for image-text learning that uniquely integrates global language and local image features through contrastive learning, offering a new approach to tackle non-IID data in FL. FedGLCL redefines FL by avoiding separate local training models for each client. Instead, it uses contrastive learning to harmonize local image features with global textual data, enabling uniform feature learning across different local models. The utilization of a pre-trained text encoder in FedGLCL serves a dual purpose: it not only reduces the variance in local feature representations within FL by providing a stable and rich language context but also aids in mitigating overfitting, particularly to majority classes, by leveraging broad linguistic knowledge. Extensive experiments show that FedGLCL significantly outperforms state-of-the-art FL algorithms across different non-IID scenarios.
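The core idea described above, aligning each client's image features with frozen, globally shared text embeddings via a contrastive objective, can be sketched as follows. The encoders, class prompts, and temperature are placeholders, and this is a generic CLIP-style loss rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def global_language_contrastive_loss(image_feats, text_feats, labels, tau=0.07):
    """Hypothetical sketch of the core idea: pull each local image feature
    toward the (frozen) text embedding of its class and away from the others.
    image_feats: (B, D) from a client's image encoder; text_feats: (C, D)
    from a frozen pre-trained text encoder; labels: (B,) class indices."""
    image_feats = F.normalize(image_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    logits = image_feats @ text_feats.t() / tau   # (B, C) cosine similarities
    return F.cross_entropy(logits, labels)

# toy usage: 8 images, 10 classes, 512-d features
img = torch.randn(8, 512, requires_grad=True)
txt = torch.randn(10, 512)            # stands in for frozen text-encoder outputs
y = torch.randint(0, 10, (8,))
loss = global_language_contrastive_loss(img, txt, y)
loss.backward()
print(float(loss))
```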

NeurIPS Conference 2025 Conference Paper

Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning

  • Minheng Ni
  • Zhengyuan Yang
  • Linjie Li
  • Chung-Ching Lin
  • Kevin Lin
  • Wangmeng Zuo
  • Lijuan Wang

Recent advances in large language models have significantly improved textual reasoning through the effective use of Chain-of-Thought (CoT) and reinforcement learning. However, extending these successes to vision-language tasks remains challenging due to inherent limitations in text-only CoT, such as visual hallucinations and insufficient multimodal integration. In this paper, we introduce Point-RFT, a multimodal reasoning framework explicitly designed to leverage visually grounded CoT reasoning for visual document understanding. Our approach consists of two stages: First, we conduct format finetuning using a curated dataset of 71K diverse visual reasoning problems, each annotated with detailed, step-by-step rationales explicitly grounded to corresponding visual elements. Second, we employ reinforcement finetuning targeting visual document understanding. On ChartQA, our approach improves accuracy from 70.88% (format-finetuned baseline) to 90.04%, surpassing the 83.92% accuracy achieved by reinforcement finetuning relying solely on text-based CoT. The result shows that our grounded CoT is more effective for multimodal reasoning compared with the text-only CoT. Moreover, Point-RFT exhibits superior generalization capability across several out-of-domain visual document reasoning benchmarks, including CharXiv, PlotQA, IconQA, TabMWP, etc., and highlights its potential in complex real-world scenarios.

AAAI Conference 2025 Conference Paper

Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising

  • Junyi Li
  • Zhilu Zhang
  • Wangmeng Zuo

Blind-spot networks (BSN) have been prevalent neural architectures in self-supervised image denoising (SSID). However, most existing BSNs are built with convolution layers. Although transformers have shown the potential to overcome the limitations of convolutions in many image restoration tasks, the attention mechanisms may violate the blind-spot requirement, thereby restricting their applicability in BSN. To this end, we propose to analyze and redesign the channel and spatial attentions to meet the blind-spot requirement. Specifically, channel self-attention may leak the blind-spot information in multi-scale architectures, since the downsampling shuffles the spatial feature into channel dimensions. To alleviate this problem, we divide the channel into several groups and perform channel attention separately. For spatial self-attention, we apply an elaborate mask to the attention matrix to restrict and mimic the receptive field of dilated convolution. Based on the redesigned channel and window attentions, we build a Transformer-based Blind-Spot Network (TBSN), which shows strong local fitting and global perspective abilities. Furthermore, we introduce a knowledge distillation strategy that distills TBSN into smaller denoisers to improve computational efficiency while maintaining performance. Extensive experiments on real-world image denoising datasets show that TBSN largely extends the receptive field and exhibits favorable performance against state-of-the-art SSID methods.
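The masked spatial attention mentioned above can be illustrated generically: a boolean mask forbids each query position from attending to locations that would leak blind-spot information. The sketch below only blocks each pixel from attending to itself; the actual mask in TBSN mimics the receptive field of dilated convolutions and is more elaborate, and the identity projections are a simplification.

```python
import torch
import torch.nn.functional as F

def masked_spatial_attention(x, mask):
    """Generic sketch: single-head self-attention over flattened spatial
    positions, with a boolean mask that forbids attending to positions that
    would violate the blind-spot requirement.
    x: (N, D) tokens; mask: (N, N) True where attention is *disallowed*."""
    q, k, v = x, x, x                           # identity projections for brevity
    attn = (q @ k.t()) / (x.shape[-1] ** 0.5)   # scaled dot-product scores
    attn = attn.masked_fill(mask, float("-inf"))
    return F.softmax(attn, dim=-1) @ v

# toy usage on an 8x8 feature map with 16 channels
h = w = 8
tokens = torch.randn(h * w, 16)
self_mask = torch.eye(h * w, dtype=torch.bool)  # each pixel ignores itself
out = masked_spatial_attention(tokens, self_mask)
print(out.shape)   # torch.Size([64, 16])
```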

NeurIPS Conference 2025 Conference Paper

RoomEditor: High-Fidelity Furniture Synthesis with Parameter-Sharing U-Net

  • Zhenyi Lin
  • Xiaofan Ming
  • Qilong Wang
  • Dongwei Ren
  • Wangmeng Zuo
  • Qinghua Hu

Virtual furniture synthesis, a critical task in image composition, aims to seamlessly integrate reference objects into indoor scenes while preserving geometric coherence and visual realism. Despite its significant potential in home design applications, this field remains underexplored due to two major challenges: the absence of publicly available and ready-to-use benchmarks hinders reproducible research, and existing image composition methods fail to meet the stringent fidelity requirements for realistic furniture placement. To address these issues, we introduce RoomBench, a ready-to-use benchmark dataset for virtual furniture synthesis, comprising 7,298 training pairs and 895 testing samples across 27 furniture categories. Then, we propose RoomEditor, a simple yet effective image composition method that employs a parameter-sharing dual U-Net architecture, ensuring better feature consistency by sharing weights between dual branches. Technical analysis reveals that conventional dual-branch architectures generally suffer from inconsistent intermediate features due to independent processing of reference and background images. In contrast, RoomEditor enforces unified feature learning through shared parameters, thereby facilitating model optimization for robust geometric alignment and maintaining visual consistency. Experiments show our RoomEditor is superior to state-of-the-arts, while generalizing directly to diverse object synthesis in unseen scenes without task-specific fine-tuning. Our dataset and code are available at https://github.com/stonecutter-21/roomeditor.

AAAI Conference 2025 Conference Paper

VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models

  • Yabo Zhang
  • Yuxiang Wei
  • Xianhui Lin
  • Zheng Hui
  • Peiran Ren
  • Xuansong Xie
  • Wangmeng Zuo

Text-to-image diffusion models (T2I) have demonstrated unprecedented capabilities in creating realistic and aesthetic images. In contrast, text-to-video diffusion models (T2V) still lag far behind in frame quality and text alignment, owing to insufficient quality and quantity of training videos. In this paper, we introduce VideoElevator, a training-free and plug-and-play method, which elevates the performance of T2V using superior capabilities of T2I. Different from conventional T2V sampling (i.e., temporal and spatial modeling), VideoElevator explicitly decomposes each sampling step into temporal motion refining and spatial quality elevating. Specifically, temporal motion refining uses encapsulated T2V to enhance temporal consistency, followed by inverting to the noise distribution required by T2I. Then, spatial quality elevating harnesses inflated T2I to directly predict less noisy latent, adding more photo-realistic details. We have conducted experiments on extensive prompts under combinations of various T2V and T2I models. The results show that VideoElevator not only improves the performance of T2V baselines with foundational T2I, but also facilitates stylistic video synthesis with personalized T2I. Please watch all videos in the supplementary materials for a better view.

ICLR Conference 2025 Conference Paper

Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning

  • Minheng Ni
  • Yutao Fan
  • Lei Zhang 0006
  • Wangmeng Zuo

As large-scale models evolve, language instructions are increasingly utilized in multi-modal tasks. Due to human language habits, these instructions often contain ambiguities in real-world scenarios, necessitating the integration of visual context or common sense for accurate interpretation. However, even highly intelligent large models exhibit observable performance limitations on ambiguous instructions, where weak reasoning abilities of disambiguation can lead to catastrophic errors. To address this issue, this paper proposes Visual-O1, a multi-modal multi-turn chain-of-thought reasoning framework. It simulates human multi-modal multi-turn reasoning, providing instantial experience for highly intelligent models or empirical experience for generally intelligent models to understand ambiguous instructions. Unlike traditional methods that require models to possess high intelligence to understand long texts or perform lengthy complex reasoning, our framework does not notably increase computational overhead and is more general and effective, even for generally intelligent models. Experiments show that our method not only enhances the performance of models of different intelligence levels on ambiguous instructions but also improves their performance on general datasets. Our work highlights the potential of artificial intelligence to work like humans in real-world scenarios with uncertainty and ambiguity. We release our data and code at https://github.com/kodenii/Visual-O1.

AAAI Conference 2025 Conference Paper

VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

  • Chun-Mei Feng
  • Yang Bai
  • Tao Luo
  • Zhen Li
  • Salman Khan
  • Wangmeng Zuo
  • Rick Siow Mong Goh
  • Yong Liu

Although progress has been made in Composed Image Retrieval (CIR), we empirically find that a certain percentage of failure retrieval results are not consistent with their relative captions. To address this issue, this work provides a Visual Question Answering (VQA) perspective to boost the performance of CIR. The resulting VQA4CIR is a post-processing approach and can be directly plugged into existing CIR methods. Given the top-C retrieved images by a CIR method, VQA4CIR aims to decrease the adverse effect of the failure retrieval results being inconsistent with the relative caption. To find the retrieved images inconsistent with the relative caption, we resort to the "QA generation → VQA" self-verification pipeline. For QA generation, we suggest fine-tuning an LLM (e.g., LLaMA) to generate several pairs of questions and answers from each relative caption. We then fine-tune an LVLM (e.g., LLaVA) to obtain the VQA model. By feeding the retrieved image and question to the VQA model, one can find the images inconsistent with the relative caption when the answer by VQA is inconsistent with the answer in the QA pair. Consequently, the CIR performance can be boosted by modifying the ranks of inconsistently retrieved images. Experimental results show that our proposed method outperforms state-of-the-art CIR methods on the CIRR and Fashion-IQ datasets.
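The re-ranking step of the "QA generation → VQA" pipeline can be sketched abstractly. The callables below (`vqa_model`, the stub QA pairs, the exact-match check, and the demote-to-end policy) are placeholders for the fine-tuned LVLM and LLM-generated QA pairs, not the paper's implementation.

```python
def rerank_with_vqa(retrieved_images, qa_pairs, vqa_model, demote_to_end=True):
    """Hypothetical sketch of the post-processing step: keep the original
    ranking for images whose VQA answers agree with the caption-derived QA
    pairs, and push inconsistent images toward the end of the list.
    vqa_model(image, question) -> answer string."""
    consistent, inconsistent = [], []
    for image in retrieved_images:
        ok = all(vqa_model(image, q).strip().lower() == a.strip().lower()
                 for q, a in qa_pairs)
        (consistent if ok else inconsistent).append(image)
    return consistent + inconsistent if demote_to_end else consistent

# toy usage with a dummy VQA model that "answers" from the image's metadata
images = [{"id": 1, "color": "red"}, {"id": 2, "color": "blue"}]
qa = [("what color is the dress?", "blue")]
dummy_vqa = lambda img, q: img["color"]
print([img["id"] for img in rerank_with_vqa(images, qa, dummy_vqa)])  # [2, 1]
```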

ICLR Conference 2024 Conference Paper

ControlVideo: Training-free Controllable Text-to-video Generation

  • Yabo Zhang
  • Yuxiang Wei 0001
  • Dongsheng Jiang
  • Xiaopeng Zhang 0008
  • Wangmeng Zuo
  • Qi Tian 0001

Text-driven diffusion models have unlocked unprecedented abilities in image generation, whereas their video counterpart lags behind due to the excessive training cost. To avert the training burden, we propose a training-free ControlVideo to produce high-quality videos based on the provided text prompts and motion sequences. Specifically, ControlVideo adapts a pre-trained text-to-image model (i.e., ControlNet) for controllable text-to-video generation. To generate continuous videos without flickering, we propose an interleaved-frame smoother to smooth the intermediate frames. In particular, the interleaved-frame smoother splits the whole video into successive three-frame clips, and stabilizes each clip by updating the middle frame with the interpolation of the other two frames in latent space. Furthermore, a fully cross-frame interaction mechanism has been exploited to further enhance the frame consistency, while a hierarchical sampler is employed to produce long videos efficiently. Extensive experiments demonstrate that our ControlVideo outperforms the state-of-the-arts both quantitatively and qualitatively. It is worth noting that, thanks to the efficient designs, ControlVideo can generate both short and long videos within several minutes using one NVIDIA 2080Ti. Code and videos are available at https://github.com/YBYBZhang/ControlVideo.
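The interleaved-frame smoother described above can be illustrated with a minimal sketch: walk over successive three-frame clips and replace each middle frame with an interpolation of its neighbours in latent space. The simple averaging and the fixed clip offsets below are simplifications; the actual method alternates the interleaving across denoising steps.

```python
import torch

def interleaved_frame_smoother(latents):
    """Minimal sketch of the interleaved-frame smoothing idea: for successive
    three-frame clips, replace each middle frame with the interpolation
    (here: a plain average) of its two neighbours in latent space.
    latents: (T, C, H, W) video latents."""
    smoothed = latents.clone()
    for t in range(1, latents.shape[0] - 1, 2):      # middle frames of the clips
        smoothed[t] = 0.5 * (latents[t - 1] + latents[t + 1])
    return smoothed

# toy usage: 9 frames of 4x8x8 latents
video = torch.randn(9, 4, 8, 8)
print(interleaved_frame_smoother(video).shape)   # torch.Size([9, 4, 8, 8])
```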

AAAI Conference 2024 Conference Paper

Decoupled Textual Embeddings for Customized Image Generation

  • Yufei Cai
  • Yuxiang Wei
  • Zhilong Ji
  • Jinfeng Bai
  • Hu Han
  • Wangmeng Zuo

Customized text-to-image generation, which aims to learn user-specified concepts with a few images, has drawn significant attention recently. However, existing methods usually suffer from overfitting issues and entangle the subject-unrelated information (e.g., background and pose) with the learned concept, limiting the potential to compose the concept into new scenes. To address these issues, we propose DETEX, a novel approach that learns disentangled concept embeddings for flexible customized text-to-image generation. Unlike conventional methods that learn a single concept embedding from the given images, our DETEX represents each image using multiple word embeddings during training, i.e., a learnable image-shared subject embedding and several image-specific subject-unrelated embeddings. To decouple irrelevant attributes (i.e., background and pose) from the subject embedding, we further present several attribute mappers that encode each image as several image-specific subject-unrelated embeddings. To encourage these unrelated embeddings to capture the irrelevant information, we incorporate them with corresponding attribute words and propose a joint training strategy to facilitate the disentanglement. During inference, we only use the subject embedding for image generation, while selectively using image-specific embeddings to retain image-specified attributes. Extensive experiments demonstrate that the subject embedding obtained by our method can faithfully represent the target concept, while showing superior editability compared to the state-of-the-art methods. Our code will be available at https://github.com/PrototypeNx/DETEX.

NeurIPS Conference 2024 Conference Paper

Evaluation of Text-to-Video Generation Models: A Dynamics Perspective

  • Mingxiang Liao
  • Hannan Lu
  • Xinyu Zhang
  • Fang Wan
  • Tianyu Wang
  • Yuzhong Zhao
  • Wangmeng Zuo
  • Qixiang Ye

Comprehensive and constructive evaluation protocols play an important role when developing sophisticated text-to-video (T2V) generation models. Existing evaluation protocols primarily focus on temporal consistency and content continuity, yet largely ignore dynamics of video content. Such dynamics is an essential dimension measuring the visual vividness and the honesty of video content to text prompts. In this study, we propose an effective evaluation protocol, termed DEVIL, which centers on the dynamics dimension to evaluate T2V generation models, as well as improving existing evaluation metrics. In practice, we define a set of dynamics scores corresponding to multiple temporal granularities, and a new benchmark of text prompts under multiple dynamics grades. Upon the text prompt benchmark, we assess the generation capacity of T2V models, characterized by metrics of dynamics ranges and T2V alignment. Moreover, we analyze the relevance of existing metrics to dynamics metrics, improving them from the perspective of dynamics. Experiments show that DEVIL evaluation metrics enjoy up to about 90% consistency with human ratings, demonstrating the potential to advance T2V generation models.

NeurIPS Conference 2024 Conference Paper

Improved Generation of Adversarial Examples Against Safety-aligned LLMs

  • Qizhang Li
  • Yiwen Guo
  • Wangmeng Zuo
  • Hao Chen

Adversarial prompts (or, say, adversarial examples) generated using gradient-based methods exhibit outstanding performance in performing automatic jailbreak attacks against safety-aligned LLMs. Nevertheless, due to the discrete nature of texts, the input gradient of LLMs struggles to precisely reflect the magnitude of loss change that results from token replacements in the prompt, leading to limited attack success rates against safety-aligned LLMs, even in the white-box setting. In this paper, we explore a new perspective on this problem, suggesting that it can be alleviated by leveraging innovations inspired by transfer-based attacks that were originally proposed for attacking black-box image classification models. For the first time, we adapt the ideas of effective methods among these transfer-based attacks, i.e., the Skip Gradient Method and Intermediate Level Attack, to gradient-based adversarial prompt generation and achieve significant performance gains without introducing obvious computational cost. Meanwhile, by discussing the mechanisms behind the gains, new insights are drawn, and proper combinations of these methods are also developed. Our empirical results show that 87% of the query-specific adversarial suffixes generated by the developed combination can induce Llama-2-7B-Chat to produce output that exactly matches the target string on AdvBench. This match rate is 33% higher than that of a very strong baseline known as GCG, demonstrating advanced discrete optimization for adversarial prompt generation against LLMs. In addition, without introducing obvious cost, the combination achieves a >30% absolute increase in attack success rates compared with GCG when generating both query-specific (38% → 68%) and universal adversarial prompts (26.68% → 60.32%) for attacking the Llama-2-7B-Chat model on AdvBench. Code at: https://github.com/qizhangli/Gradient-based-Jailbreak-Attacks.

AAAI Conference 2024 Conference Paper

Learning Real-World Image De-weathering with Imperfect Supervision

  • Xiaohui Liu
  • Zhilu Zhang
  • Xiaohe Wu
  • Chaoyu Feng
  • Xiaotao Wang
  • Lei Lei
  • Wangmeng Zuo

Real-world image de-weathering aims at removing various undesirable weather-related artifacts. Owing to the impossibility of capturing image pairs concurrently, existing real-world de-weathering datasets often exhibit inconsistent illumination, position, and textures between the ground-truth images and the input degraded images, resulting in imperfect supervision. Such non-ideal supervision negatively affects the training process of learning-based de-weathering methods. In this work, we attempt to address the problem with a unified solution for various inconsistencies. Specifically, inspired by information bottleneck theory, we first develop a Consistent Label Constructor (CLC) to generate a pseudo-label as consistent as possible with the input degraded image while removing most weather-related degradation. In particular, multiple adjacent frames of the current input are also fed into CLC to enhance the pseudo-label. Then we combine the original imperfect labels and pseudo-labels to jointly supervise the de-weathering model by the proposed Information Allocation Strategy (IAS). During testing, only the de-weathering model is used for inference. Experiments on two real-world de-weathering datasets show that our method helps existing de-weathering models achieve better performance. Code is available at https://github.com/1180300419/imperfect-deweathering.

ICLR Conference 2024 Conference Paper

Self-Supervised High Dynamic Range Imaging with Multi-Exposure Images in Dynamic Scenes

  • Zhilu Zhang 0001
  • Haoyu Wang
  • Shuai Liu 0009
  • Xiaotao Wang
  • Lei Lei
  • Wangmeng Zuo

Merging multi-exposure images is a common approach for obtaining high dynamic range (HDR) images, with the primary challenge being the avoidance of ghosting artifacts in dynamic scenes. Recent methods have proposed using deep neural networks for deghosting. However, the methods typically rely on sufficient data with HDR ground-truths, which are difficult and costly to collect. In this work, to eliminate the need for labeled data, we propose SelfHDR, a self-supervised HDR reconstruction method that only requires dynamic multi-exposure images during training. Specifically, SelfHDR learns a reconstruction network under the supervision of two complementary components, which can be constructed from multi-exposure images and focus on HDR color as well as structure, respectively. The color component is estimated from aligned multi-exposure images, while the structure one is generated through a structure-focused network that is supervised by the color component and an input reference (e.g., medium-exposure) image. During testing, the learned reconstruction network is directly deployed to predict an HDR image. Experiments on real-world images demonstrate our SelfHDR achieves superior results against the state-of-the-art self-supervised methods, and comparable performance to supervised ones. Codes are available at https://github.com/cszhilu1998/SelfHDR.
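The "color component estimated from aligned multi-exposure images" can be illustrated with a classic weighted merge in the linear domain. The mid-tone weighting scheme and the toy exposures below are assumptions for illustration only; SelfHDR's actual construction of the color and structure supervision is more involved.

```python
import numpy as np

def merge_color_supervision(aligned_ldr, exposure_times):
    """Minimal sketch of building a color-supervision target from aligned
    multi-exposure images via a weighted merge of per-exposure radiance
    estimates.  aligned_ldr: list of (H, W, 3) arrays in [0, 1];
    exposure_times: list of floats."""
    num = np.zeros_like(aligned_ldr[0])
    den = np.zeros_like(aligned_ldr[0])
    for img, t in zip(aligned_ldr, exposure_times):
        w = 1.0 - np.abs(img - 0.5) * 2.0   # trust mid-tones, distrust clipped pixels
        num += w * (img / t)                # radiance estimate from this exposure
        den += w
    return num / np.maximum(den, 1e-6)

# toy usage with three synthetic exposures of the same scene
rng = np.random.default_rng(0)
scene = rng.random((16, 16, 3))
ldr = [np.clip(scene * t, 0, 1) for t in (0.25, 1.0, 4.0)]
hdr_target = merge_color_supervision(ldr, [0.25, 1.0, 4.0])
print(hdr_target.shape)
```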

ICLR Conference 2024 Conference Paper

Sentence-level Prompts Benefit Composed Image Retrieval

  • Yang Bai 0011
  • Xinxing Xu
  • Yong Liu 0026
  • Salman H. Khan 0001
  • Fahad Shahbaz Khan
  • Wangmeng Zuo
  • Rick Siow Mong Goh
  • Chun-Mei Feng 0001

Composed image retrieval (CIR) is the task of retrieving specific images by using a query that involves both a reference image and a relative caption. Most existing CIR models adopt the late-fusion strategy to combine visual and language features. Besides, several approaches have also been suggested to generate a pseudo-word token from the reference image, which is further integrated into the relative caption for CIR. However, these pseudo-word-based prompting methods have limitations when the target image encompasses complex changes to the reference image, e.g., object removal and attribute modification. In this work, we demonstrate that learning an appropriate sentence-level prompt for the relative caption (SPRC) is sufficient for achieving effective composed image retrieval. Instead of relying on pseudo-word-based prompts, we propose to leverage pretrained V-L models, e.g., BLIP-2, to generate sentence-level prompts. By concatenating the learned sentence-level prompt with the relative caption, one can readily use existing text-based image retrieval models to enhance CIR performance. Furthermore, we introduce both an image-text contrastive loss and a text prompt alignment loss to enforce the learning of suitable sentence-level prompts. Experiments show that our proposed method performs favorably against the state-of-the-art CIR methods on the Fashion-IQ and CIRR datasets.

ICLR Conference 2024 Conference Paper

TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields

  • Tianyu Huang
  • Yihan Zeng
  • Bowen Dong 0001
  • Hang Xu 0004
  • Songcen Xu
  • Rynson W. H. Lau
  • Wangmeng Zuo

Recent works learn 3D representation explicitly under text-3D guidance. However, limited text-3D data restricts the vocabulary scale and text control of generations. Generators may easily fall into a stereotype concept for certain text prompts, thus losing open-vocabulary generation ability. To tackle this issue, we introduce a conditional 3D generative model, namely TextField3D. Specifically, rather than using the text prompts as input directly, we suggest injecting dynamic noise into the latent space of given text prompts, i.e., Noisy Text Fields (NTFs). In this way, limited 3D data can be mapped to the appropriate range of textual latent space that is expanded by NTFs. To this end, an NTFGen module is proposed to model general text latent code in noisy fields. Meanwhile, an NTFBind module is proposed to align view-invariant image latent code to noisy fields, further supporting image-conditional 3D generation. To guide the conditional generation in both geometry and texture, multi-modal discrimination is constructed with a text-3D discriminator and a text-2.5D discriminator. Compared to previous methods, TextField3D includes three merits: 1) large vocabulary, 2) text consistency, and 3) low latency. Extensive experiments demonstrate that our method achieves a potential open-vocabulary 3D generation capability.

AAAI Conference 2024 Conference Paper

VQ-FONT: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization

  • Mingshuai Yao
  • Yabo Zhang
  • Xianhui Lin
  • Xiaoming Li
  • Wangmeng Zuo

Few-shot font generation is challenging, as it needs to capture the fine-grained stroke styles from a limited set of reference glyphs, and then transfer to other characters, which are expected to have similar styles. However, due to the diversity and complexity of Chinese font styles, the synthesized glyphs of existing methods usually exhibit visible artifacts, such as missing details and distorted strokes. In this paper, we propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement. Specifically, we pre-train a VQGAN to encapsulate the font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes. Furthermore, our VQ-Font leverages the inherent design of Chinese characters, where structure components such as radicals and character components are combined in specific arrangements, to recalibrate fine-grained styles based on references. This process improves the matching and fusion of styles at the structure level. Both modules collaborate to enhance the fidelity of the generated fonts. Experiments on a collected font dataset show that our VQ-Font outperforms the competing methods both quantitatively and qualitatively, especially in generating challenging styles. Our code is available at https://github.com/Yaomingshuai/VQ-Font.

ICLR Conference 2023 Conference Paper

ImaginaryNet: Learning Object Detectors without Real Images and Annotations

  • Minheng Ni
  • Zitong Huang
  • Kailai Feng
  • Wangmeng Zuo

Without the demand of training in reality, humans are capable of detecting a new category of object simply based on a language description of its visual characteristics. Empowering deep learning with this ability undoubtedly enables neural networks to handle complex vision tasks, e.g., object detection, without collecting and annotating real images. To this end, this paper introduces a novel and challenging learning paradigm, Imaginary-Supervised Object Detection (ISOD), where neither real images nor manual annotations are allowed for training object detectors. To resolve this challenge, we propose ImaginaryNet, a framework to synthesize images by combining a pretrained language model and a text-to-image synthesis model. Given a class label, the language model is used to generate a full description of a scene with a target object, and the text-to-image model is deployed to generate a photo-realistic image. With the synthesized images and class labels, weakly supervised object detection can then be leveraged to accomplish ISOD. By gradually introducing real images and manual annotations, ImaginaryNet can collaborate with other supervision settings to further boost detection performance. Experiments show that ImaginaryNet can (i) obtain about 75% performance in ISOD compared with the weakly supervised counterpart of the same backbone trained on real data, and (ii) significantly improve the baseline while achieving state-of-the-art or comparable performance by incorporating ImaginaryNet with other supervision settings. Our code will be publicly available at https://github.com/kodenii/ImaginaryNet.
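The data-synthesis pipeline described above (class label → language-model scene description → text-to-image rendering → image-level label for weakly supervised detection) can be sketched abstractly. The callables `language_model` and `text_to_image`, the prompt template, and the dictionary format are placeholders, not the released code.

```python
def synthesize_training_set(class_names, language_model, text_to_image, per_class=2):
    """Hypothetical sketch of the ISOD data pipeline: for each class, ask a
    language model for a full scene description containing the object, render
    it with a text-to-image model, and keep only the image-level class label
    (no boxes), as expected by weakly supervised detectors."""
    dataset = []
    for name in class_names:
        for _ in range(per_class):
            caption = language_model(f"Describe a realistic photo containing a {name}.")
            image = text_to_image(caption)
            dataset.append({"image": image, "image_level_label": name})
    return dataset

# toy usage with stub models standing in for the pretrained LM and T2I model
stub_lm = lambda prompt: prompt.replace("Describe", "A photo of")
stub_t2i = lambda caption: f"<rendered: {caption}>"
pairs = synthesize_training_set(["bicycle", "dog"], stub_lm, stub_t2i, per_class=1)
print(pairs[0])
```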

NeurIPS Conference 2023 Conference Paper

Improving Adversarial Transferability via Intermediate-level Perturbation Decay

  • Qizhang Li
  • Yiwen Guo
  • Wangmeng Zuo
  • Hao Chen

Intermediate-level attacks that attempt to perturb feature representations following an adversarial direction drastically have shown favorable performance in crafting transferable adversarial examples. Existing methods in this category are normally formulated with two separate stages, where a directional guide is required to be determined at first and the scalar projection of the intermediate-level perturbation onto the directional guide is enlarged thereafter. The obtained perturbation deviates from the guide inevitably in the feature space, and it is revealed in this paper that such a deviation may lead to sub-optimal attack. To address this issue, we develop a novel intermediate-level method that crafts adversarial examples within a single stage of optimization. In particular, the proposed method, named intermediate-level perturbation decay (ILPD), encourages the intermediate-level perturbation to be in an effective adversarial direction and to possess a great magnitude simultaneously. In-depth discussion verifies the effectiveness of our method. Experimental results show that it outperforms state-of-the-arts by large margins in attacking various victim models on ImageNet (+10.07% on average) and CIFAR-10 (+3.88% on average). Our code is at https://github.com/qizhangli/ILPD-attack.

AAAI Conference 2023 Conference Paper

Learning Single Image Defocus Deblurring with Misaligned Training Pairs

  • Yu Li
  • Dongwei Ren
  • Xinya Shu
  • Wangmeng Zuo

By adopting the popular pixel-wise loss, existing methods for defocus deblurring heavily rely on well-aligned training image pairs. Although training pairs of ground-truth and blurry images are carefully collected, e.g., in the DPDD dataset, misalignment is inevitable between training pairs, making existing methods possibly suffer from deformation artifacts. In this paper, we propose a joint deblurring and reblurring learning (JDRL) framework for single image defocus deblurring with misaligned training pairs. Generally, JDRL consists of a deblurring module and a spatially invariant reblurring module, by which the deblurred result can be adaptively supervised by the ground-truth image to recover sharp textures while maintaining spatial consistency with the blurry image. First, in the deblurring module, a bi-directional optical flow-based deformation is introduced to tolerate spatial misalignment between deblurred and ground-truth images. Second, in the reblurring module, the deblurred result is reblurred to be spatially aligned with the blurry image by predicting a set of isotropic blur kernels and weighting maps. Moreover, we establish a new single image defocus deblurring (SDD) dataset, further validating our JDRL and also benefiting future research. Our JDRL can be applied to boost defocus deblurring networks in terms of both quantitative metrics and visual quality on the DPDD, RealDOF and our SDD datasets.

ICLR Conference 2023 Conference Paper

LPT: Long-tailed Prompt Tuning for Image Classification

  • Bowen Dong 0001
  • Pan Zhou 0002
  • Shuicheng Yan
  • Wangmeng Zuo

For long-tailed classification tasks, most works often pretrain a big model on a large-scale (unlabeled) dataset, and then fine-tune the whole pretrained model for adapting to long-tailed data. Though promising, fine-tuning the whole pretrained model tends to suffer from high cost in computation and deployment of different models for different tasks, as well as weakened generalization capability for overfitting to certain features of long-tailed data. To alleviate these issues, we propose an effective Long-tailed Prompt Tuning (LPT) method for long-tailed classification tasks. LPT introduces several trainable prompts into a frozen pretrained model to adapt it to long-tailed data. For better effectiveness, we divide prompts into two groups: 1) a shared prompt for the whole long-tailed dataset to learn general features and to adapt a pretrained model into the target long-tailed domain; and 2) group-specific prompts to gather group-specific features for the samples which have similar features and also to empower the pretrained model with fine-grained discrimination ability. Then we design a two-phase training paradigm to learn these prompts. In the first phase, we train the shared prompt via conventional supervised prompt tuning to adapt a pretrained model to the desired long-tailed domain. In the second phase, we use the learnt shared prompt as query to select a small best matched set for a group of similar samples from the group-specific prompt set to dig the common features of these similar samples, and then optimize these prompts with a dual sampling strategy and the asymmetric Gaussian Clouded Logit loss. By only fine-tuning a few prompts while fixing the pretrained model, LPT can reduce training cost and deployment cost by storing a few prompts, and enjoys a strong generalization ability of the pretrained model. Experiments show that on various long-tailed benchmarks, with only about 1.1% extra trainable parameters, LPT achieves comparable or higher performance than previous whole model fine-tuning methods, and is more robust to domain-shift.

ICLR Conference 2023 Conference Paper

Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples

  • Qizhang Li
  • Yiwen Guo
  • Wangmeng Zuo
  • Hao Chen 0003

The transferability of adversarial examples across deep neural networks (DNNs) is the crux of many black-box attacks. Many prior efforts have been devoted to improving the transferability via increasing the diversity in inputs of some substitute models. In this paper, by contrast, we opt for the diversity in substitute models and advocate to attack a Bayesian model for achieving desirable transferability. Deriving from the Bayesian formulation, we develop a principled strategy for possible finetuning, which can be combined with many off-the-shelf Gaussian posterior approximations over DNN parameters. Extensive experiments have been conducted to verify the effectiveness of our method, on common benchmark datasets, and the results demonstrate that our method outperforms recent state-of-the-arts by large margins (roughly 19% absolute increase in average attack success rate on ImageNet), and, by combining with these recent methods, further performance gain can be obtained. Our code: https://github.com/qizhangli/MoreBayesian-attack.
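The idea of attacking a Bayesian substitute, drawing several weight samples around a posterior mean and aggregating their input gradients, can be sketched minimally. The isotropic Gaussian, the fixed `std` hyper-parameter, and the tiny classifier below are assumptions for illustration; the paper's posterior approximation and fine-tuning strategy are more principled.

```python
import torch
import torch.nn as nn

def bayesian_ensemble_gradient(model, mean_state, std, x, y, n_samples=4):
    """Minimal sketch: sample substitute weights from an isotropic Gaussian
    around the posterior mean and average the resulting input gradients,
    which can then drive any gradient-based attack step."""
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        sampled = {k: v + std * torch.randn_like(v) for k, v in mean_state.items()}
        model.load_state_dict(sampled)
        x_adv = x.clone().requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grads += torch.autograd.grad(loss, x_adv)[0]
    return grads / n_samples

# toy usage: a tiny linear classifier on 3x8x8 inputs
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x = torch.randn(2, 3, 8, 8)
y = torch.randint(0, 10, (2,))
g = bayesian_ensemble_gradient(net, {k: v.clone() for k, v in net.state_dict().items()},
                               std=0.01, x=x, y=y)
print(g.shape)
```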

EAAI Journal 2023 Journal Article

Progressive convolutional transformer for image restoration

  • Yecong Wan
  • Mingwen Shao
  • Yuanshuo Cheng
  • Deyu Meng
  • Wangmeng Zuo

In the past few years, convolutional neural networks (CNNs) have become the primary workhorse for image restoration tasks. However, the deficiency of modeling long-range dependencies due to the local computational property of convolution greatly limits the restoration performance. To overcome this limitation, we propose a novel multi-stage progressive convolutional Transformer to recursively restore the degraded images, termed PCformer, which enjoys a high capability for capturing local context and global dependencies with friendly computational cost. Specifically, each stage of PCformer is an asymmetric encoder-decoder network whose bottleneck is built upon a tailored Transformer block with convolution operations deployed to avoid any loss of local context. Both encoder and decoder are convolution-based modules, thus allowing the exploration of rich contextualized information for image recovery. Taking the low-resolution features encoded by the encoder as tokens input into the Transformer bottleneck guarantees that long-range pixel interactions are captured while reducing the computational burden. Meanwhile, we apply a gated module for filtering redundant information propagation between every two phases. In addition, long-range enhanced inpainting is further introduced to mine the ability of PCformer to exploit distant complementary features. Extensive experiments yield superior results and in particular establish new state-of-the-art results on several image restoration tasks, including deraining (+0.37 dB on Rain13K), denoising (+0.11 dB on DND), dehazing (+0.56 dB on I-HAZE), enhancement (+0.72 dB on SICE), and shadow removal (+0.65 RMSE on ISTD). The implementation code is available at https://github.com/Jeasco/PCformer.

ICLR Conference 2023 Conference Paper

Squeeze Training for Adversarial Robustness

  • Qizhang Li
  • Yiwen Guo
  • Wangmeng Zuo
  • Hao Chen 0003

The vulnerability of deep neural networks (DNNs) to adversarial examples has attracted great attention in the machine learning community. The problem is related to non-flatness and non-smoothness of normally obtained loss landscapes. Training augmented with adversarial examples (a.k.a., adversarial training) is considered as an effective remedy. In this paper, we highlight that some collaborative examples, nearly perceptually indistinguishable from both adversarial and benign examples yet show extremely lower prediction loss, can be utilized to enhance adversarial training. A novel method is therefore proposed to achieve new state-of-the-arts in adversarial robustness. Code: https://github.com/qizhangli/ST-AT.

AAAI Conference 2023 Conference Paper

Symmetry-Aware Transformer-Based Mirror Detection

  • Tianyu Huang
  • Bowen Dong
  • Jiaying Lin
  • Xiaohui Liu
  • Rynson W.H. Lau
  • Wangmeng Zuo

Mirror detection aims to identify the mirror regions in the given input image. Existing works mainly focus on integrating the semantic features and structural features to mine specific relations between mirror and non-mirror regions, or introducing mirror properties like depth or chirality to help analyze the existence of mirrors. In this work, we observe that a real object typically forms a loose symmetry relationship with its corresponding reflection in the mirror, which is beneficial in distinguishing mirrors from real objects. Based on this observation, we propose a dual-path Symmetry-Aware Transformer-based mirror detection Network (SATNet), which includes two novel modules: Symmetry-Aware Attention Module (SAAM) and Contrast and Fusion Decoder Module (CFDM). Specifically, we first adopt a transformer backbone to model global information aggregation in images, extracting multi-scale features in two paths. We then feed the high-level dual-path features to SAAMs to capture the symmetry relations. Finally, we fuse the dual-path features and refine our prediction maps progressively with CFDMs to obtain the final mirror mask. Experimental results show that SATNet outperforms both RGB and RGB-D mirror detection methods on all available mirror detection datasets.

NeurIPS Conference 2023 Conference Paper

Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly

  • Qizhang Li
  • Yiwen Guo
  • Wangmeng Zuo
  • Hao Chen

The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention due to the security risk of applying these models in real-world applications. Based on the transferability of adversarial examples, an increasing number of transfer-based methods have been developed to fool black-box DNN models whose architecture and parameters are inaccessible. Although tremendous effort has been exerted, a standardized benchmark for comparing these methods systematically, fairly, and practically is still lacking. Our investigation shows that the evaluation of some methods needs to be more reasonable and more thorough to verify their effectiveness, avoiding, for example, unfair comparisons and insufficient consideration of possible substitute/victim models. Therefore, we establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods. In this paper, we evaluate and compare them comprehensively on 10 popular substitute/victim models on ImageNet. New insights about the effectiveness of these methods are gained and guidelines for future evaluations are provided.

NeurIPS Conference 2022 Conference Paper

Self-Supervised Image Restoration with Blurry and Noisy Pairs

  • Zhilu Zhang
  • RongJian Xu
  • Ming Liu
  • Zifei Yan
  • Wangmeng Zuo

When taking photos in an environment with insufficient light, the exposure time and the sensor gain usually need to be chosen carefully to obtain images with satisfying visual quality. For example, images with high ISO usually have inescapable noise, while long-exposure ones may be blurry due to camera shake or object motion. Existing solutions generally seek a balance between noise and blur, and learn denoising or deblurring models under either full or self-supervision. However, real-world training pairs are difficult to collect, and self-supervised methods that rely merely on blurry or noisy images are limited in performance. In this work, we tackle this problem by jointly leveraging the short-exposure noisy image and the long-exposure blurry image for better image restoration. Such a setting is practically feasible because short-exposure and long-exposure images can be either acquired by two individual cameras or synthesized from a long burst of images. Moreover, the short-exposure images are hardly blurry, and the long-exposure ones have negligible noise. Their complementarity makes it feasible to learn a restoration model in a self-supervised manner. Specifically, the noisy images can be used as supervision for deblurring, while the sharp areas in the blurry images can be utilized as auxiliary supervision for self-supervised denoising. By learning in a collaborative manner, the deblurring and denoising tasks in our method can benefit each other. Experiments on synthetic and real-world images show the effectiveness and practicality of the proposed method. Codes are available at https://github.com/cszhilu1998/SelfIR.
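
The collaborative supervision described above can be summarized in two loss terms: the noisy short-exposure image serves as a (noisy) target for the deblurring output, while a sharpness mask restricts the blurry long-exposure image to the regions where it can supervise the denoising output. A minimal sketch, with the mask and the plain L1 losses as assumptions rather than the paper's exact objectives:

```python
# Hedged sketch of cross-supervised losses for a blurry/noisy image pair.
import torch
import torch.nn.functional as F

def selfir_style_losses(deblurred, denoised, noisy, blurry, sharp_mask):
    # Deblurring branch: the (nearly sharp) noisy image acts as a noisy target.
    loss_deblur = F.l1_loss(deblurred, noisy)
    # Denoising branch: only sharp regions of the blurry image are trusted.
    loss_denoise = (sharp_mask * (denoised - blurry).abs()).mean()
    return loss_deblur + loss_denoise
```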

IJCAI Conference 2022 Conference Paper

Self-supervised Learning and Adaptation for Single Image Dehazing

  • Yudong Liang
  • Bin Wang
  • Wangmeng Zuo
  • Jiaying Liu
  • Wenqi Ren

Existing deep image dehazing methods usually depend on supervised learning with a large number of hazy–clean image pairs, which are expensive or difficult to collect. Moreover, the dehazing performance of the learned model may deteriorate significantly when the training hazy–clean image pairs are insufficient and differ from the real hazy images encountered in applications. In this paper, we show that exploiting a large-scale training set and adapting to real hazy images are two critical issues in learning effective deep dehazing models. Under the depth guidance estimated by a well-trained depth estimation network, we leverage the conventional atmospheric scattering model to generate massive hazy–clean image pairs for the self-supervised pre-training of the dehazing network. Furthermore, self-supervised adaptation is presented to adapt the pre-trained network to real hazy images. A learning-without-forgetting strategy is also deployed in self-supervised adaptation by combining self-supervision and model adaptation via contrastive learning. Experiments show that our proposed method performs favorably against the state-of-the-art methods and is quite efficient, i.e., handling a 4K image in 23 ms. The codes are available at https://github.com/DongLiangSXU/SLAdehazing.
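
The synthetic pre-training pairs mentioned above come from the standard atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d), where d is the depth predicted by a pre-trained depth network. A minimal NumPy sketch (the β and airlight values here are arbitrary choices):

```python
# Hedged sketch of hazy-image synthesis via the atmospheric scattering model.
import numpy as np

def synthesize_hazy(clean, depth, beta=1.2, airlight=0.9):
    """clean: HxWx3 float image in [0,1]; depth: HxW depth map (larger = farther)."""
    t = np.exp(-beta * depth)[..., None]        # transmission map, HxWx1
    hazy = clean * t + airlight * (1.0 - t)     # I = J*t + A*(1 - t)
    return np.clip(hazy, 0.0, 1.0)
```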

NeurIPS Conference 2022 Conference Paper

Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks

  • Yabo Zhang
  • Mingshuai Yao
  • Yuxiang Wei
  • Zhilong Ji
  • Jinfeng Bai
  • Wangmeng Zuo

One-shot generative domain adaptation aims to transfer a pre-trained generator on one domain to a new domain using only one reference image. However, it remains very challenging for the adapted generator (i) to generate diverse images inherited from the pre-trained generator while (ii) faithfully acquiring the domain-specific attributes and styles of the reference image. In this paper, we present a novel one-shot generative domain adaptation method, i.e., DiFa, for diverse generation and faithful adaptation. For global-level adaptation, we leverage the difference between the CLIP embedding of the reference image and the mean embedding of source images to constrain the target generator. For local-level adaptation, we introduce an attentive style loss which aligns each intermediate token of an adapted image with its corresponding token of the reference image. To facilitate diverse generation, selective cross-domain consistency is introduced to select and retain domain-sharing attributes in the editing latent $\mathcal{W}+$ space to inherit the diversity of the pre-trained generator. Extensive experiments show that our method outperforms the state of the art both quantitatively and qualitatively, especially for cases with a large domain gap. Moreover, DiFa can easily be extended to zero-shot generative domain adaptation with appealing results.
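
The global-level constraint described above can be read as a directional loss in CLIP space: move adapted samples along the direction from the mean source embedding toward the reference embedding. The sketch below follows that reading; `clip_image_encoder` is an assumed stand-in for any CLIP image encoder, and the cosine-alignment form is an assumption rather than the authors' exact loss.

```python
# Hedged sketch of a CLIP-space directional constraint for one-shot adaptation.
import torch
import torch.nn.functional as F

def global_clip_direction_loss(adapted_imgs, source_imgs, reference_img,
                               clip_image_encoder):
    e_adapt = F.normalize(clip_image_encoder(adapted_imgs), dim=-1)
    e_src = F.normalize(clip_image_encoder(source_imgs), dim=-1)
    e_ref = F.normalize(clip_image_encoder(reference_img), dim=-1)
    # Target direction: reference embedding minus mean source embedding.
    target_dir = F.normalize(e_ref.mean(0) - e_src.mean(0), dim=-1)
    # Per-sample direction of the adapted images relative to the source mean.
    sample_dir = F.normalize(e_adapt - e_src.mean(0, keepdim=True), dim=-1)
    # Encourage cosine alignment with the target direction.
    return (1.0 - sample_dir @ target_dir).mean()
```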

NeurIPS Conference 2020 Conference Paper

Cross-Scale Internal Graph Neural Network for Image Super-Resolution

  • Shangchen Zhou
  • Jiawei Zhang
  • Wangmeng Zuo
  • Chen Change Loy

Non-local self-similarity in natural images has been well studied as an effective prior in image restoration. However, for single image super-resolution (SISR), most existing deep non-local methods (e.g., non-local neural networks) only exploit similar patches within the same scale of the low-resolution (LR) input image. Consequently, the restoration is limited to using same-scale information while neglecting potential high-resolution (HR) cues from other scales. In this paper, we explore the cross-scale patch recurrence property of natural images, i.e., similar patches tend to recur many times across different scales. This is achieved using a novel cross-scale internal graph neural network (IGNN). Specifically, we dynamically construct a cross-scale graph by searching k-nearest neighboring patches in the downsampled LR image for each query patch in the LR image. We then obtain the corresponding k HR neighboring patches in the LR image and aggregate them adaptively in accordance with the edge labels of the constructed graph. In this way, the HR information can be passed from the k HR neighboring patches to the LR query patch to help it recover more detailed textures. Besides, these internal image-specific LR/HR exemplars are also significant complements to the external information learned from the training dataset. Extensive experiments demonstrate the effectiveness of IGNN against state-of-the-art SISR methods, including existing non-local networks, on standard benchmarks.
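
The cross-scale search described above can be sketched as a patch-level k-nearest-neighbor lookup: query patches from the LR image are matched against patches of its 2x-downsampled version, and the matched positions index 2x-larger patches of the LR image itself, which act as internal HR exemplars. Patch size, stride, and the naive averaging below are assumptions; the actual IGNN aggregates adaptively along graph edges.

```python
# Hedged sketch of a cross-scale k-NN patch search and naive aggregation.
import torch
import torch.nn.functional as F

def cross_scale_knn(lr, k=3, p=4):
    # lr: (1, C, H, W); build the search space from its 2x-downsampled version.
    lr_down = F.interpolate(lr, scale_factor=0.5, mode='bicubic', align_corners=False)
    q = F.unfold(lr, kernel_size=p, stride=p)              # query patches from LR
    cand = F.unfold(lr_down, kernel_size=p, stride=1)      # candidate patches
    d = torch.cdist(q.transpose(1, 2), cand.transpose(1, 2))      # (1, Nq, Nc) distances
    idx = d.topk(k, dim=-1, largest=False).indices                # k nearest per query
    # "HR" exemplars: 2p x 2p patches of the original LR image, sampled on the
    # same grid as the downsampled candidates (stride 2 keeps positions aligned).
    hr_cand = F.unfold(lr, kernel_size=2 * p, stride=2)
    hr_sel = hr_cand[..., idx[0]]              # (1, C*(2p)^2, Nq, k)
    return hr_sel.mean(dim=-1)                 # naive average of the k HR exemplars

lr = torch.randn(1, 3, 32, 32)
print(cross_scale_knn(lr).shape)               # (1, C*(2p)^2, num_queries)
```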

NeurIPS Conference 2018 Conference Paper

Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation

  • Wenqi Ren
  • Jiawei Zhang
  • Lin Ma
  • Jinshan Pan
  • Xiaochun Cao
  • Wangmeng Zuo
  • Wei Liu
  • Ming-Hsuan Yang

In this paper, we present a deep convolutional neural network to capture the inherent properties of image degradation, which can handle different kernels and saturated pixels in a unified framework. The proposed neural network is motivated by the low-rank property of pseudo-inverse kernels. We first compute a generalized low-rank approximation for a large number of blur kernels, and then use separable filters to initialize the convolutional parameters in the network. Our analysis shows that the estimated decomposed matrices contain the most essential information of the input kernel, which enables the proposed network to handle various blurs in a unified framework and generate high-quality deblurring results. Experimental results on benchmark datasets with noise and saturated pixels demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
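
The low-rank initialization described above can be illustrated by forming a regularized pseudo-inverse of a blur kernel in the frequency domain and taking its truncated SVD, so the deconvolution filter becomes a sum of a few separable (outer-product) filters. Kernel size, regularization weight, and rank below are assumptions for the sketch:

```python
# Hedged sketch: low-rank separable approximation of a pseudo-inverse blur kernel.
import numpy as np

def separable_pseudo_inverse(kernel, size=31, reg=1e-2, rank=3):
    K = np.fft.fft2(kernel, s=(size, size))
    inv = np.conj(K) / (np.abs(K) ** 2 + reg)       # Wiener-style pseudo-inverse
    pinv_kernel = np.fft.fftshift(np.real(np.fft.ifft2(inv)))   # centered spatial filter
    U, S, Vt = np.linalg.svd(pinv_kernel)
    # Rank-r separable approximation: sum_i s_i * u_i v_i^T, usable to
    # initialize pairs of 1-D (separable) convolution filters.
    filters = [(S[i], U[:, i], Vt[i, :]) for i in range(rank)]
    approx = sum(s * np.outer(u, v) for s, u, v in filters)
    return filters, approx

kern = np.ones((5, 5)) / 25.0                        # simple box blur kernel
filters, approx = separable_pseudo_inverse(kern)
```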

NeurIPS Conference 2018 Conference Paper

Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks

  • Qilong Wang
  • Zilin Gao
  • Jiangtao Xie
  • Wangmeng Zuo
  • Peihua Li

In most existing deep convolutional neural networks (CNNs) for classification, global average (first-order) pooling (GAP) has become a standard module to summarize the activations of the last convolution layer as the final representation for prediction. Recent research shows that integrating higher-order pooling (HOP) methods clearly improves the performance of deep CNNs. However, both GAP and existing HOP methods assume unimodal distributions, which cannot fully capture the statistics of convolutional activations, limiting the representation ability of deep CNNs, especially for samples with complex contents. To overcome this limitation, this paper proposes a global Gated Mixture of Second-order Pooling (GM-SOP) method to further improve the representation ability of deep CNNs. To this end, we introduce a sparsity-constrained gating mechanism and propose a novel parametric SOP as the component of the mixture model. Given a bank of SOP candidates, our method can adaptively choose the top-K (K > 1) candidates for each input sample through the sparsity-constrained gating module, and performs a weighted sum of the outputs of the K selected candidates as the representation of the sample. The proposed GM-SOP can flexibly accommodate a large number of personalized SOP candidates in an efficient way, leading to richer representations. Deep networks with our GM-SOP can be trained end-to-end and have the potential to characterize complex, multi-modal distributions. The proposed method is evaluated on two large-scale image benchmarks (i.e., downsampled ImageNet-1K and Places365), and experimental results show that our GM-SOP is superior to its counterparts and achieves very competitive performance. The source code will be available at http://www.peihuali.org/GM-SOP.
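
The gating-and-mixing step described above amounts to scoring a bank of second-order pooling (SOP) candidates per sample, keeping only the top-K scores, and summing the selected candidates' outputs with softmax weights. In the sketch below, the SOP candidates are plain covariance poolings over 1x1-projected features, standing in for the paper's parametric SOP components; all sizes are assumptions.

```python
# Hedged sketch of a gated top-K mixture of second-order (covariance) poolings.
import torch
import torch.nn as nn

class GatedMixtureSOP(nn.Module):
    def __init__(self, channels=64, num_candidates=8, proj_dim=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(channels, num_candidates)
        # Each candidate projects channels to proj_dim before the outer product.
        self.projs = nn.ModuleList(
            nn.Conv2d(channels, proj_dim, 1) for _ in range(num_candidates))

    def forward(self, x):                      # x: (B, C, H, W)
        gap = x.mean(dim=(2, 3))               # first-order summary used for gating
        scores = self.gate(gap)                # (B, N) candidate scores
        topv, topi = scores.topk(self.top_k, dim=1)
        weights = torch.softmax(topv, dim=1)   # weights over the selected K only
        outs = []
        for b in range(x.size(0)):
            pooled = []
            for j, w in zip(topi[b].tolist(), weights[b]):
                f = self.projs[j](x[b:b+1]).flatten(2)        # (1, D, HW)
                cov = f @ f.transpose(1, 2) / f.size(-1)      # second-order pooling
                pooled.append(w * cov.flatten(1))
            outs.append(torch.stack(pooled).sum(0))
        return torch.cat(outs, dim=0)          # (B, D*D) representation

feat = torch.randn(2, 64, 7, 7)
print(GatedMixtureSOP()(feat).shape)           # torch.Size([2, 256])
```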

AAAI Conference 2018 Conference Paper

Learning a Wavelet-Like Auto-Encoder to Accelerate Deep Neural Networks

  • Tianshui Chen
  • Liang Lin
  • Wangmeng Zuo
  • Xiaonan Luo
  • Lei Zhang

Accelerating deep neural networks (DNNs) has been attracting increasing attention as it can benefit a wide range of applications, e.g., enabling mobile systems with limited computing resources to own powerful visual recognition ability. A practical strategy toward this goal usually relies on a two-stage process: operating on the trained DNNs (e.g., approximating the convolutional filters with tensor decomposition) and fine-tuning the amended network, leading to difficulty in balancing the trade-off between acceleration and maintaining recognition performance. In this work, aiming at a general and comprehensive way to accelerate neural networks, we develop a Wavelet-like Auto-Encoder (WAE) that decomposes the original input image into two low-resolution channels (sub-images), and we incorporate the WAE into the classification neural networks for joint training. The two decomposed channels, in particular, are encoded to carry the low-frequency information (e.g., image profiles) and the high-frequency information (e.g., image details or noise), respectively, and enable reconstructing the original input image through the decoding process. Then, we feed the low-frequency channel into a standard classification network such as VGG or ResNet and employ a very lightweight network to fuse it with the high-frequency channel to obtain the classification result. Compared to existing DNN acceleration solutions, our framework has the following advantages: i) it is compatible with any existing convolutional neural network for classification without amending its structure; ii) the WAE provides an interpretable way to preserve the main components of the input image for classification.
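
The decomposition described above can be sketched as an encoder that splits the input into two half-resolution sub-images plus a decoder that reconstructs the input from both, so only one sub-image has to pass through the heavy classification network. Layer choices below are assumptions, not the authors' WAE.

```python
# Hedged sketch of a wavelet-like auto-encoder producing two half-resolution channels.
import torch
import torch.nn as nn

class WaveletLikeAE(nn.Module):
    def __init__(self, c=16):
        super().__init__()
        self.enc_low = nn.Conv2d(3, 3, 3, stride=2, padding=1)   # low-frequency sub-image
        self.enc_high = nn.Conv2d(3, 3, 3, stride=2, padding=1)  # high-frequency sub-image
        self.dec = nn.Sequential(
            nn.Conv2d(6, c, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(c, 3, 4, stride=2, padding=1))

    def forward(self, x):
        low, high = self.enc_low(x), self.enc_high(x)
        recon = self.dec(torch.cat([low, high], dim=1))
        return low, high, recon     # classify on `low`; train with a ||recon - x|| term

x = torch.randn(1, 3, 224, 224)
low, high, recon = WaveletLikeAE()(x)
print(low.shape, recon.shape)       # (1, 3, 112, 112) (1, 3, 224, 224)
```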

IJCAI Conference 2017 Conference Paper

Discriminant Tensor Dictionary Learning with Neighbor Uncorrelation for Image Set Based Classification

  • Fei Wu
  • Xiao-Yuan Jing
  • Wangmeng Zuo
  • Ruiping Wang
  • Xiaoke Zhu

Image set based classification (ISC) has attracted much research interest in recent years. Several ISC methods have been developed, and dictionary learning based methods obtain state-of-the-art performance. However, existing ISC methods usually transform the image samples of a set into vectors for subsequent processing, which breaks the inherent spatial structure of the image samples and the set. In this paper, we utilize tensors to model an image set with two spatial modes and one set mode, which can fully explore the intrinsic structure of the image set. We propose a novel ISC approach, named discriminant tensor dictionary learning with neighbor uncorrelation (DTDLNU), which jointly learns two spatial dictionaries and one set dictionary. The spatial and set dictionaries are composed of set-specific sub-dictionaries corresponding to the class labels, such that the reconstruction error is discriminative. To obtain dictionaries with favorable discriminative power, DTDLNU designs a neighbor-uncorrelated discriminant tensor dictionary term, which minimizes the within-class scatter of the training sets in the projected tensor space and reduces tensor dictionary correlation among set-specific sub-dictionaries corresponding to neighboring sets from different classes. Experiments on three challenging datasets demonstrate the effectiveness of DTDLNU.

AAAI Conference 2017 Conference Paper

Learning Heterogeneous Dictionary Pair with Feature Projection Matrix for Pedestrian Video Retrieval via Single Query Image

  • Xiaoke Zhu
  • Xiao-Yuan Jing
  • Fei Wu
  • Yunhong Wang
  • Wangmeng Zuo
  • Wei-Shi Zheng

Person re-identification (re-id) plays an important role in video surveillance and forensics applications. In many cases, person re-id needs to be conducted between an image and a video clip, e.g., re-identifying a suspect from large quantities of pedestrian videos given a single image of him. We refer to re-id in this scenario as image-to-video person re-id (IVPR). In practice, images and videos are usually represented with different features, and large variations usually exist between frames within each video. These factors make matching between image and video a very challenging task. In this paper, we propose a joint feature projection matrix and heterogeneous dictionary pair learning (PHDL) approach for IVPR. Specifically, PHDL jointly learns an intra-video projection matrix and a pair of heterogeneous image and video dictionaries. With the learned projection matrix, the influence of variations within each video on the matching can be reduced. With the learned dictionary pair, the heterogeneous image and video features can be transformed into coding coefficients of the same dimension, such that matching can be conducted using the coding coefficients. Furthermore, to ensure that the obtained coding coefficients have favorable discriminability, PHDL designs a point-to-set coefficient discriminant term. Experiments on the public iLIDS-VID and PRID 2011 datasets demonstrate the effectiveness of the proposed approach.

AAAI Conference 2017 Conference Paper

Learning Patch-Based Dynamic Graph for Visual Tracking

  • Chenglong Li
  • Liang Lin
  • Wangmeng Zuo
  • Jin Tang

Existing visual tracking methods usually localize the object with a bounding box, in which case foreground object trackers/detectors are often disturbed by the introduced background information. To handle this problem, we aim to learn a more robust object representation for visual tracking. In particular, the tracked object is represented with a graph structure (i.e., a set of non-overlapping image patches), in which the weight of each node (patch) indicates how likely it belongs to the foreground, and edges are also weighted to indicate the appearance compatibility of two neighboring nodes. This graph is dynamically learned (i.e., the nodes and edges receive weights) and applied in object tracking and model updating. We constrain the graph learning from two aspects: i) the global low-rank structure over all nodes and ii) the local sparseness of node neighbors. During the tracking process, our method performs the following steps at each frame. First, the graph is initialized by assigning either 1 or 0 to the weights of some image patches according to the predicted bounding box. Second, the graph is optimized through a newly designed ALM (Augmented Lagrange Multiplier) based algorithm. Third, the object feature representation is updated by imposing the weights of the patches on the extracted image features. The object location is finally predicted by adopting the Struck tracker (Hare, Saffari, and Torr 2011). Extensive experiments show that our approach outperforms state-of-the-art tracking methods on two standard benchmarks, i.e., OTB100 and NUS-PRO.

AAAI Conference 2017 Conference Paper

Multiset Feature Learning for Highly Imbalanced Data Classification

  • Fei Wu
  • Xiao-Yuan Jing
  • Shiguang Shan
  • Wangmeng Zuo
  • Jing-Yu Yang

With the expansion of data, increasingly imbalanced data has emerged. When the imbalance ratio of the data is high, most existing imbalanced learning methods decline in classification performance. To address this problem, a few highly imbalanced learning methods have been presented. However, most of them are still sensitive to a high imbalance ratio. This work aims to provide an effective solution for the highly imbalanced data classification problem. We conduct highly imbalanced learning from the perspective of feature learning. We partition the majority class into multiple blocks, each balanced with respect to the minority class, and combine each block with the minority class to construct a balanced sample set. Multiset feature learning (MFL) is performed on these sets to learn discriminant features. We thus propose an uncorrelated cost-sensitive multiset learning (UCML) approach. UCML provides a multiple-set construction strategy, incorporates the cost-sensitive factor into MFL, and designs a weighted uncorrelated constraint to remove the correlation among multiset features. Experiments on five highly imbalanced datasets indicate that UCML outperforms state-of-the-art imbalanced learning methods.
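
The balanced-set construction described above is straightforward to sketch: shuffle the majority class, cut it into blocks of the minority-class size, and pair each block with the full minority class. Random shuffling is the assumed partitioning strategy here; the learning step performed on each set is omitted.

```python
# Hedged sketch of building balanced sample sets from a highly imbalanced dataset.
import numpy as np

def build_balanced_sets(X_major, X_minor, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_major))
    block = len(X_minor)
    sets = []
    for start in range(0, len(X_major) - block + 1, block):
        blk = X_major[idx[start:start + block]]               # one majority block
        X = np.vstack([blk, X_minor])                         # balanced subset
        y = np.concatenate([np.zeros(len(blk)), np.ones(len(X_minor))])
        sets.append((X, y))
    return sets
```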

IJCAI Conference 2016 Conference Paper

A Self-Representation Induced Classifier

  • Pengfei Zhu
  • Lei Zhang
  • Wangmeng Zuo
  • Xiangchu Feng
  • Qinghua Hu

Almost all existing representation based classifiers represent a query sample as a linear combination of training samples, and their time and memory cost increase rapidly with the number of training samples. We investigate the representation based classification problem from a rather different perspective in this paper: we learn how each feature (i.e., each element) of a sample can be represented by the features of the sample itself. Such a self-representation property of sample features can be readily employed for pattern classification, and a novel self-representation induced classifier (SRIC) is proposed. SRIC learns a self-representation matrix for each class. Given a query sample, its self-representation residual can be computed with each of the learned self-representation matrices, and classification can then be performed by comparing these residuals. In light of the principle of SRIC, a discriminative SRIC (DSRIC) method is developed. For each class, a discriminative self-representation matrix is trained to minimize the self-representation residual of this class while representing the features of other classes poorly. Experimental results on different pattern recognition tasks show that DSRIC achieves comparable or superior recognition rates to state-of-the-art representation based classifiers, yet it is much more efficient and needs much less storage space.
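
The self-representation idea described above can be sketched with a ridge-regularized least-squares fit per class: a matrix W_c reconstructs each sample's features from the sample's own features, and a query goes to the class with the smallest self-representation residual. The regularized objective below is an assumed simplification of the paper's formulation.

```python
# Hedged sketch of per-class self-representation matrices and residual-based classification.
import numpy as np

def fit_self_representation(X, lam=1e-1):
    # X: (n_samples, n_features); W solves min ||X - X W||_F^2 + lam ||W||_F^2.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ X)

def classify(x, class_mats):
    # Assign the query to the class whose matrix reconstructs it best.
    residuals = [np.linalg.norm(x - x @ W) for W in class_mats]
    return int(np.argmin(residuals))

rng = np.random.default_rng(0)
class_mats = [fit_self_representation(rng.normal(size=(50, 8)) + c) for c in range(3)]
print(classify(rng.normal(size=8) + 1.0, class_mats))
```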

JBHI Journal 2016 Journal Article

Comparison of Three Different Types of Wrist Pulse Signals by Their Physical Meanings and Diagnosis Performance

  • Wangmeng Zuo
  • Peng Wang
  • David Zhang

Increasing interest has been focused on computational pulse diagnosis, where sensors are developed to acquire pulse signals and machine learning techniques are exploited to analyze health conditions based on the acquired signals. By far, a number of sensors have been employed for pulse signal acquisition, which can be grouped into three major categories, i.e., pressure, photoelectric, and ultrasonic sensors. To guide sensor selection for computational pulse diagnosis, in this paper we analyze the physical meanings and sensitivities of the signals acquired by these three types of sensors. The dependence and complementarity of the different sensors are discussed from both the perspective of cardiovascular fluid dynamics and comparative experiments evaluating disease classification performance. Experimental results indicate that each sensor is more appropriate for diagnosing the specific diseases whose physiological changes it can effectively reflect, e.g., the ultrasonic sensor for diabetes and the pressure sensor for arteriosclerosis, and that improved diagnosis performance can be obtained by combining the three types of signals.

AAAI Conference 2016 Conference Paper

Coupled Dictionary Learning for Unsupervised Feature Selection

  • Pengfei Zhu
  • Qinghua Hu
  • Changqing Zhang
  • Wangmeng Zuo

Unsupervised feature selection (UFS) aims to reduce the time complexity and storage burden as well as improve generalization performance. Most existing methods convert UFS into a supervised learning problem by generating labels with specific techniques (e.g., spectral analysis, matrix factorization, and linear predictors). Instead, we propose a novel coupled analysis–synthesis dictionary learning method, which is free of label generation. The representation coefficients are used to model the cluster structure and data distribution. Specifically, the synthesis dictionary is used to reconstruct samples, while the analysis dictionary analytically codes the samples and assigns probabilities to them. Afterwards, the analysis dictionary is used to select features that can well preserve the data distribution. An effective L2,p-norm (0 < p ≤ 1) regularization is imposed on the analysis dictionary to obtain a sparser solution that is more effective for feature selection. We propose an iterative reweighted least squares algorithm to solve the L2,p-norm optimization problem and prove that it converges to a fixed point. Experiments on benchmark datasets validate the effectiveness of the proposed method.
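
The solver mentioned above is in the iteratively reweighted least squares (IRLS) family; a generic sketch for an L2,p-regularized least-squares problem is given below. The coupled-dictionary terms of the actual objective are omitted, so this only illustrates the reweighting scheme.

```python
# Hedged sketch of IRLS for min_W ||Y - X W||_F^2 + lam * sum_i ||w_i||_2^p.
import numpy as np

def irls_l2p(X, Y, lam=0.1, p=0.5, iters=30, eps=1e-6):
    # X: (n, d) data, Y: (n, c) targets; W: (d, c) row-sparse solution.
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)   # ridge initialization
    for _ in range(iters):
        row_norms = np.linalg.norm(W, axis=1) + eps
        D = np.diag((p / 2.0) * row_norms ** (p - 2.0))        # IRLS reweighting matrix
        W = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
        # Rows with small norm get heavily penalized in the next round,
        # driving the row-sparsity used for feature selection/ranking.
    return W   # rank features by np.linalg.norm(W, axis=1)
```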

AAAI Conference 2016 Conference Paper

Two-Stream Contextualized CNN for Fine-Grained Image Classification

  • Jiang Liu
  • Chenqiang Gao
  • Deyu Meng
  • Wangmeng Zuo

The human cognition system suggests that context information provides a potentially powerful clue when recognizing objects. However, for fine-grained image classification, the contribution of context may vary over different images, and sometimes the context even confuses the classification result. To alleviate this problem, we develop a novel approach, the two-stream contextualized Convolutional Neural Network, which provides a simple but efficient context–content joint classification model under the deep learning framework. The network merely requires the raw image and a coarse segmentation as input to extract both content and context features, without the need for human interaction. Moreover, our network adopts a weighted fusion scheme to combine the content and context classifiers, while a subnetwork is introduced to adaptively determine the weight for each image. According to our experiments on public datasets, our approach achieves considerably high recognition accuracy without any tedious human involvement, as compared with state-of-the-art approaches.

NeurIPS Conference 2014 Conference Paper

Deep Joint Task Learning for Generic Object Extraction

  • Xiaolong Wang
  • Liliang Zhang
  • Liang Lin
  • Zhujin Liang
  • Wangmeng Zuo

This paper investigates how to extract objects of interest without relying on hand-crafted features and sliding-window approaches, aiming to jointly solve two sub-tasks: (i) rapidly localizing salient objects from images, and (ii) accurately segmenting the objects based on the localizations. We present a general joint task learning framework, in which each task (either object localization or object segmentation) is tackled via a multi-layer convolutional neural network, and the two networks work collaboratively to boost performance. In particular, we propose to incorporate latent variables bridging the two networks in a joint optimization manner. The first network directly predicts the positions and scales of salient objects from raw images, and the latent variables adjust the object localizations to feed the second network, which produces pixelwise object masks. An EM-type method is then studied for the joint optimization, iterating over two steps: (i) using the two networks, it estimates the latent variables with an MCMC-based sampling method; (ii) it optimizes the parameters of the two networks jointly via back-propagation, with the latent variables fixed. Extensive experiments demonstrate that our joint learning framework significantly outperforms other state-of-the-art approaches in both accuracy and efficiency (e.g., 1000 times faster than competing approaches).

NeurIPS Conference 2014 Conference Paper

Projective dictionary pair learning for pattern classification

  • Shuhang Gu
  • Lei Zhang
  • Wangmeng Zuo
  • Xiangchu Feng

Discriminative dictionary learning (DL) has been widely studied in various pattern classification problems. Most existing DL methods aim to learn a synthesis dictionary to represent the input signal while enforcing the representation coefficients and/or representation residual to be discriminative. However, the $\ell_0$- or $\ell_1$-norm sparsity constraint on the representation coefficients adopted in many DL methods makes the training and testing phases time-consuming. We propose a new discriminative DL framework, namely projective dictionary pair learning (DPL), which learns a synthesis dictionary and an analysis dictionary jointly to achieve the goal of signal representation and discrimination. Compared with conventional DL methods, the proposed DPL method can not only greatly reduce the time complexity of the training and testing phases, but also lead to very competitive accuracies in a variety of visual classification tasks.
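
The classification rule of a dictionary pair can be sketched as follows: per class, a synthesis dictionary D_k and an analysis dictionary P_k are chosen so that D_k P_k x reconstructs same-class samples well, and a query is assigned to the class with the smallest reconstruction residual. Below, a PCA-based pair is an assumed stand-in for the learned DPL dictionaries, and only the classification step is mimicked, not DPL training.

```python
# Hedged sketch of residual-based classification with per-class dictionary pairs.
import numpy as np

def fit_pair(X, n_atoms=5):
    # X: (n_samples, n_features) for one class; D: (d, r), P: (r, d).
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    D, P = Vt[:n_atoms].T, Vt[:n_atoms]      # PCA-based stand-in for a learned pair
    return D, P, mu

def classify(x, pairs):
    # Smallest residual ||x - D_k P_k x|| (after centering) wins.
    residuals = [np.linalg.norm((x - mu) - D @ (P @ (x - mu)))
                 for D, P, mu in pairs]
    return int(np.argmin(residuals))
```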

ICML Conference 2014 Conference Paper

Robust Principal Component Analysis with Complex Noise

  • Qian Zhao 0002
  • Deyu Meng
  • Zongben Xu
  • Wangmeng Zuo
  • Lei Zhang 0006

Research on robust principal component analysis (RPCA) has been attracting much attention recently. The original RPCA model assumes sparse noise and uses the L_1-norm to characterize the error term. In practice, however, the noise is much more complex, and it is not appropriate to simply use a certain L_p-norm for noise modeling. We propose a generative RPCA model under the Bayesian framework by modeling the data noise as a mixture of Gaussians (MoG). The MoG is a universal approximator of continuous distributions, and thus our model is able to fit a wide range of noises such as Laplacian, Gaussian, sparse noise, and any combinations of them. A variational Bayes algorithm is presented to infer the posterior of the proposed model. All involved parameters can be recursively updated in closed form. The advantage of our method is demonstrated by extensive experiments on synthetic data, face modeling, and background subtraction.