Arrow Research search

Author name cluster

Luyang Tang

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact name matches; it is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers (5)

AAAI 2025 · Conference Paper

Compressing Streamable Free-Viewpoint Videos to 0.1 MB per Frame

  • Luyang Tang
  • Jiayu Yang
  • Rui Peng
  • Yongqi Zhai
  • Shihe Shen
  • Ronggang Wang

The success of 3D Gaussian Splatting (3DGS) in static scenes has inspired numerous attempts to construct Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos. Despite advancements in current techniques, simultaneously achieving photo-realistic view synthesis results, fast on-the-fly training, real-time rendering, and low storage costs remains a formidable problem. To address these challenges, we propose the first Gaussian-based streamable FVV intelligent compression framework named iFVC. Specifically, we utilize an anchor-based Gaussian representation to model the scene. To achieve on-the-fly training, we propose a Binary Transformation Cache (BTC) to model the dynamic changes between adjacent timesteps, which not only ensures compactness but also supports precise bit rate estimation. Furthermore, we carefully design a high-resolution transformation tri-plane assisted by a saliency grid as our BTC, allowing for accurate dynamic capture. The entire pipeline is regarded as a joint optimization of rate and distortion to achieve optimal compression performance. Experiments on widely used datasets demonstrate the state-of-the-art performance of our framework in both synthesis quality and efficiency, i.e., achieving per-frame training in 13 seconds with a storage cost of 0.1 MB and real-time rendering at 120 FPS.
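The "joint optimization of rate and distortion" mentioned above is the standard learned-compression objective L = D + λ·R. Below is a minimal PyTorch-style sketch, assuming the Binary Transformation Cache is stored as per-entry logits whose Bernoulli entropy acts as a differentiable bit-rate proxy; all names and shapes here are hypothetical illustrations, not the paper's actual API.

```python
import torch

def rate_distortion_loss(rendered, target, btc_logits, lam=0.01):
    """Joint rate-distortion objective: L = D + lam * R.

    rendered/target: (H, W, 3) images; btc_logits: raw logits of the
    binary transformation cache entries (hypothetical stand-in).
    """
    # Distortion: photometric error between the rendering and ground truth.
    distortion = torch.nn.functional.l1_loss(rendered, target)
    # Rate: expected bits of the binarized cache, estimated from the
    # Bernoulli entropy of each entry (differentiable bit-rate proxy).
    p = torch.sigmoid(btc_logits)
    eps = 1e-6
    bits = -(p * torch.log2(p + eps) + (1 - p) * torch.log2(1 - p + eps))
    return distortion + lam * bits.sum()
```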

ICLR 2025 · Conference Paper

Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene

  • Jiahao Wu
  • Rui Peng
  • Zhiyan Wang
  • Lu Xiao
  • Luyang Tang
  • Jinbo Yan
  • Kaiqiang Xiong
  • Ronggang Wang

Novel view synthesis has long been a practical but challenging task. Although numerous methods have been introduced to solve this problem, even those combining advanced representations like 3D Gaussian Splatting still struggle to recover high-quality results, and they often consume excessive storage and training time. In this paper we propose Swift4D, a divide-and-conquer 3D Gaussian Splatting method that handles static and dynamic primitives separately, achieving a good trade-off between rendering quality and efficiency, motivated by the fact that most of a scene consists of static primitives that require no additional dynamic properties. Concretely, we model dynamic transformations only for the dynamic primitives, which benefits both efficiency and quality. We first employ a learnable decomposition strategy to separate the primitives, relying on an additional parameter to classify each primitive as static or dynamic. For the dynamic primitives, we employ a compact multi-resolution 4D Hash mapper to transform them from canonical space into deformation space at each timestamp, and then mix the static and dynamic primitives to produce the final output. This divide-and-conquer approach facilitates efficient training and reduces storage redundancy. Our method achieves state-of-the-art rendering quality while training 20× faster than previous SOTA methods, with a minimum storage requirement of only 30 MB on real-world datasets.
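The divide-and-conquer idea can be pictured as a masked deformation: a learnable per-primitive parameter classifies each Gaussian, and only the dynamic subset is pushed through the 4D field. A minimal PyTorch sketch under assumed shapes follows; hash4d and every other name here is a hypothetical stand-in for the paper's multi-resolution 4D Hash mapper, not its released code.

```python
import torch

def deform_primitives(xyz, dyn_logit, t, hash4d, tau=0.5):
    """Deform only the 'dynamic' Gaussians; static ones pass through.

    xyz: (N, 3) canonical positions; dyn_logit: (N,) learnable
    classification parameter per primitive; t: scalar timestamp;
    hash4d: callable mapping (M, 4) space-time coords to (M, 3) offsets.
    """
    dynamic = torch.sigmoid(dyn_logit) > tau              # (N,) boolean mask
    out = xyz.clone()
    if dynamic.any():
        n_dyn = int(dynamic.sum())
        xyzt = torch.cat([xyz[dynamic], xyz.new_full((n_dyn, 1), t)], dim=1)
        out[dynamic] = xyz[dynamic] + hash4d(xyzt)        # canonical -> deformed
    return out                                            # mix static + dynamic
```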

NeurIPS 2024 · Conference Paper

Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis

  • Rui Peng
  • Wangze Xu
  • Luyang Tang
  • Liwei Liao
  • Jianbo Jiao
  • Ronggang Wang

Despite the substantial progress of novel view synthesis, existing methods, whether based on Neural Radiance Fields (NeRF) or the more recent 3D Gaussian Splatting (3DGS), suffer significant degradation when the input becomes sparse. Numerous efforts have been introduced to alleviate this problem, but they still struggle to synthesize satisfactory results efficiently, especially in large scenes. In this paper, we propose SCGaussian, a Structure Consistent Gaussian Splatting method that uses matching priors to learn a 3D-consistent scene structure. Considering the high interdependence of Gaussian attributes, we optimize the scene structure on two fronts: the rendered geometry and, more importantly, the positions of the Gaussian primitives, which are hard to constrain directly in vanilla 3DGS due to their unstructured nature. To achieve this, we present a hybrid Gaussian representation. Besides the ordinary non-structured Gaussian primitives, our model also contains ray-based Gaussian primitives that are bound to matching rays and whose positions can only be optimized along those rays. We can thus use the matching correspondences to directly force the positions of these Gaussian primitives to converge to the surface points where the rays intersect. Extensive experiments on forward-facing, surrounding, and complex large scenes show the effectiveness of our approach, with state-of-the-art performance and high efficiency. Code is available at https://github.com/prstrive/SCGaussian.
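The ray-based primitives can be parameterized so that gradient descent literally cannot move them off their matching rays: fix each ray's origin and direction and learn only a positive depth along it. A minimal PyTorch sketch; class and attribute names are hypothetical, not taken from the released code.

```python
import torch
import torch.nn.functional as F

class RayBoundGaussians(torch.nn.Module):
    """Primitives whose positions are constrained to fixed matching rays."""

    def __init__(self, origins, dirs):
        super().__init__()
        self.register_buffer("origins", origins)                 # (N, 3), fixed
        self.register_buffer("dirs", F.normalize(dirs, dim=-1))  # (N, 3), fixed
        # One learnable depth per primitive: the only positional DOF.
        self.depth = torch.nn.Parameter(torch.ones(origins.shape[0]))

    def positions(self):
        # Softplus keeps depths positive; gradients can only slide each
        # primitive along its ray toward the matched surface point.
        return self.origins + F.softplus(self.depth)[:, None] * self.dirs
```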

NeurIPS 2023 · Conference Paper

GenS: Generalizable Neural Surface Reconstruction from Multi-View Images

  • Rui Peng
  • Xiaodong Gu
  • Luyang Tang
  • Shihe Shen
  • Fanqi Yu
  • Ronggang Wang

Combining the signed distance function (SDF) and differentiable volume rendering has emerged as a powerful paradigm for surface reconstruction from multi-view images without 3D supervision. However, current methods are hampered by long per-scene optimization and cannot generalize to new scenes. In this paper, we present GenS, an end-to-end generalizable neural surface reconstruction model. Unlike coordinate-based methods that train a separate network for each scene, we construct a generalized multi-scale volume to directly encode all scenes. Compared with existing solutions, our representation is more powerful: it can recover high-frequency details while maintaining global smoothness. Meanwhile, we introduce a multi-scale feature-metric consistency that imposes multi-view consistency in a more discriminative multi-scale feature space, which is robust to failures of photometric consistency. The learnable features can be self-enhanced to continuously improve matching accuracy and mitigate aggregation ambiguity. Furthermore, we design a view contrast loss that forces the model to be robust in regions covered by few viewpoints by distilling the geometric prior from dense input to sparse input. Extensive experiments on popular benchmarks show that our model generalizes well to new scenes and outperforms existing state-of-the-art methods, even those employing ground-truth depth supervision. Code will be available at https://github.com/prstrive/GenS.
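A feature-metric consistency term of the kind described above is typically a variance cost over features sampled at each 3D point's projection in every view; GenS applies it across a multi-scale feature pyramid, of which a single scale is sketched here with hypothetical names and shapes.

```python
import torch
import torch.nn.functional as F

def feature_metric_consistency(feats, pts_2d):
    """Variance-based multi-view feature consistency at one scale.

    feats: (V, C, H, W) per-view feature maps from a shared encoder;
    pts_2d: (V, M, 2) projections of M candidate surface points into
    each view, normalized to [-1, 1] for grid_sample.
    """
    grid = pts_2d.unsqueeze(1)                        # (V, 1, M, 2)
    sampled = F.grid_sample(feats, grid, align_corners=True)
    sampled = sampled.squeeze(2)                      # (V, C, M)
    # Low variance across views = features agree = geometry is consistent.
    return sampled.var(dim=0, unbiased=False).mean()
```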

IROS 2022 · Conference Paper

RECCraft System: Towards Reliable and Efficient Collective Robotic Construction

  • Qiwei Xu
  • Yizheng Zhang
  • Shenghao Zhang 0001
  • Rui Zhao
  • Zhuoxing Wu
  • Dongsheng Zhang
  • Cheng Zhou
  • Xiong Li 0001

This research presents a novel Collective Robotic Construction (CRC) system named RECCraft. The RECCraft hardware system is composed of mobile manipulation vehicles, cubic blocks, and folding ramp blocks. Solid connection and easy removal of the blocks are achieved by an electropermanent magnet and silicon steel sheets. With a one-degree-of-freedom (DOF) lifting manipulator, the robot can carry a block 3.7 times its own volume. An active folding ramp block provides a robust passage to the upper level for the robot. Our study focuses on systematically improving the construction speed and reliability of the robotic construction system. A visual perception system based on AprilTag, featuring convenient deployment and high precision, is adopted to provide a reliable guarantee for robotic construction. An RL-based planner provides an end-to-end solution for the planning tasks of building multi-layer constructions, validated on both a simulation platform and a real prototype. In terms of construction speed, the proposed RECCraft system achieves a state-of-the-art level among existing robotic construction systems. The robot builds a 2-layer construction with the RL-based planner in 4 minutes and 16 seconds, achieving a construction volumetric throughput of 6.7×10⁵ mm³/s.
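The reported throughput can be sanity-checked against the stated build time; the implied total built volume below is derived from the abstract's own figures, not reported separately.

```python
# All inputs are figures quoted in the abstract.
build_time_s = 4 * 60 + 16          # 4 min 16 s -> 256 s
throughput_mm3_per_s = 6.7e5        # 6.7 x 10^5 mm^3/s
implied_volume_mm3 = throughput_mm3_per_s * build_time_s
print(implied_volume_mm3)           # ~1.7e8 mm^3, i.e. about 0.17 m^3 built
```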