Compressing Streamable Free-Viewpoint Videos to 0.1 MB per Frame

Luyang Tang; Jiayu Yang; Rui Peng; Yongqi Zhai; Shihe Shen; Ronggang Wang

doi:10.1609/aaai.v39i7.32780

AAAI 2025

Conference Paper; AAAI Technical Track on Computer Vision VI; Artificial Intelligence

Abstract

The success of 3D Gaussian Splatting (3DGS) in static scenes has inspired numerous attempts to construct Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos. Despite advancements in current techniques, simultaneously achieving photo-realistic view synthesis results, fast on-the-fly training, real-time rendering, and low storage costs remains a formidable problem. To address these challenges, we propose the first Gaussian-based streamable FVV intelligent compression framework named iFVC. Specifically, we utilize an anchor-based Gaussian representation to model the scene. To achieve on-the-fly training, we propose a Binary Transformation Cache (BTC) to model the dynamic changes between adjacent timesteps, which not only ensures compactness but also supports precise bit rate estimation. Furthermore, we carefully design a high-resolution transformation tri-plane assisted by a saliency grid as our BTC, allowing for accurate dynamic capture. The entire pipeline is regarded as a joint optimization of rate and distortion to achieve optimal compression performance. Experiments on widely used datasets demonstrate the state-of-the-art performance of our framework in both synthesis quality and efficiency, i.e., achieving per-frame training in 13 seconds with a storage cost of 0.1 MB and real-time rendering at 120 FPS.
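The joint rate-distortion objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `bernoulli_bits` and `rd_loss`, the Bernoulli entropy model over binary symbols, the mean-squared-error distortion, and the trade-off weight `lam` are all assumptions chosen only to show why a binary cache admits a direct bit estimate that can be added to a reconstruction loss.

```python
import numpy as np


def bernoulli_bits(symbols, p_one):
    """Estimated code length in bits of binary symbols.

    Assumes (hypothetically) that each symbol is coded under a Bernoulli
    model with probability p_one of being 1.
    """
    s = np.asarray(symbols, dtype=np.float64)
    # Clip probabilities to avoid log2(0).
    p = np.clip(np.asarray(p_one, dtype=np.float64), 1e-9, 1.0 - 1e-9)
    return float(np.sum(-(s * np.log2(p) + (1.0 - s) * np.log2(1.0 - p))))


def rd_loss(rendered, target, symbols, p_one, lam):
    """Rate-distortion objective: distortion + lam * rate.

    Distortion is mean squared error (an assumption); rate is the
    estimated number of bits per binary symbol.
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    s = np.asarray(symbols, dtype=np.float64)
    distortion = float(np.mean((rendered - target) ** 2))
    rate = bernoulli_bits(s, p_one) / s.size
    return distortion + lam * rate
```

Under this sketch, four binary symbols coded with `p_one = 0.5` cost exactly one bit each; a more confident and correct probability model lowers the estimated rate, which is what lets the optimizer trade storage against rendering quality.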
