AAAI 2025

Compressing Streamable Free-Viewpoint Videos to 0.1 MB per Frame

Conference Paper, AAAI Technical Track on Computer Vision VI, Artificial Intelligence

Abstract

The success of 3D Gaussian Splatting (3DGS) in static scenes has inspired numerous attempts to construct Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos. Despite advancements in current techniques, simultaneously achieving photo-realistic view synthesis results, fast on-the-fly training, real-time rendering, and low storage costs remains a formidable problem. To address these challenges, we propose the first Gaussian-based streamable FVV intelligent compression framework named iFVC. Specifically, we utilize an anchor-based Gaussian representation to model the scene. To achieve on-the-fly training, we propose a Binary Transformation Cache (BTC) to model the dynamic changes between adjacent timesteps, which not only ensures compactness but also supports precise bit rate estimation. Furthermore, we carefully design a high-resolution transformation tri-plane assisted by a saliency grid as our BTC, allowing for accurate dynamic capture. The entire pipeline is regarded as a joint optimization of rate and distortion to achieve optimal compression performance. Experiments on widely used datasets demonstrate the state-of-the-art performance of our framework in both synthesis quality and efficiency, i.e., achieving per-frame training in 13 seconds with a storage cost of 0.1 MB and real-time rendering at 120 FPS.
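The joint rate-distortion objective and the bit rate estimate mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `estimated_bits` and `rd_loss`, the fixed Bernoulli probability `p_one`, the weight `lam`, and the example values are all assumptions chosen for illustration. The sketch only shows why binary symbols make bit cost easy to estimate (each symbol costs the negative log2 of its modeled probability) and how that cost can be added to a distortion term.

```python
import math


def estimated_bits(symbols, p_one):
    """Estimate the bits needed to entropy-code a list of binary symbols.

    Assumes (for illustration only) that every symbol is 1 with the same
    probability p_one; each symbol then costs -log2(probability) bits.
    """
    if not 0.0 < p_one < 1.0:
        raise ValueError("p_one must lie strictly between 0 and 1")
    bits = 0.0
    for s in symbols:
        p = p_one if s == 1 else 1.0 - p_one
        bits += -math.log2(p)
    return bits


def rd_loss(distortion, bits, lam):
    """Rate-distortion objective: distortion plus lambda-weighted rate."""
    return distortion + lam * bits


# Hypothetical binary cache entries and distortion value.
cache = [1, 0, 0, 0, 1, 0, 0, 0]
bits = estimated_bits(cache, p_one=0.25)
loss = rd_loss(distortion=0.02, bits=bits, lam=1e-3)
print(f"estimated bits: {bits:.3f}, loss: {loss:.5f}")
```

A larger `lam` weights storage more heavily against reconstruction quality. In a learned system the probability would typically be predicted per symbol rather than fixed as it is here.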

Context

Venue
AAAI Conference on Artificial Intelligence
Paper id
742089400218848988