Arrow Research search

Author name cluster

Luyang Tang

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact name matches; it is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers (5)

AAAI 2025 · Conference Paper

Compressing Streamable Free-Viewpoint Videos to 0.1 MB per Frame

  • Luyang Tang
  • Jiayu Yang
  • Rui Peng
  • Yongqi Zhai
  • Shihe Shen
  • Ronggang Wang

The success of 3D Gaussian Splatting (3DGS) in static scenes has inspired numerous attempts to construct Free-Viewpoint Videos (FVVs) of dynamic scenes from multi-view videos. Despite advancements in current techniques, simultaneously achieving photo-realistic view synthesis results, fast on-the-fly training, real-time rendering, and low storage costs remains a formidable problem. To address these challenges, we propose the first Gaussian-based streamable FVV intelligent compression framework named iFVC. Specifically, we utilize an anchor-based Gaussian representation to model the scene. To achieve on-the-fly training, we propose a Binary Transformation Cache (BTC) to model the dynamic changes between adjacent timesteps, which not only ensures compactness but also supports precise bit rate estimation. Furthermore, we carefully design a high-resolution transformation tri-plane assisted by a saliency grid as our BTC, allowing for accurate dynamic capture. The entire pipeline is regarded as a joint optimization of rate and distortion to achieve optimal compression performance. Experiments on widely used datasets demonstrate the state-of-the-art performance of our framework in both synthesis quality and efficiency, i.e., achieving per-frame training in 13 seconds with a storage cost of 0.1 MB and real-time rendering at 120 FPS.
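The "joint optimization of rate and distortion" mentioned above is the standard learned-compression objective L = D + λ·R. Below is a minimal PyTorch-style sketch, assuming the Binary Transformation Cache is stored as per-entry logits whose Bernoulli entropy acts as a differentiable bit-rate proxy; all names and shapes here are hypothetical illustrations, not the paper's actual API.

```python
import torch

def rate_distortion_loss(rendered, target, btc_logits, lam=0.01):
    """Joint rate-distortion objective: L = D + lam * R.

    rendered/target: (H, W, 3) images; btc_logits: raw logits of the
    binary transformation cache entries (hypothetical stand-in).
    """
    # Distortion: photometric error between the rendering and ground truth.
    distortion = torch.nn.functional.l1_loss(rendered, target)
    # Rate: expected bits of the binarized cache, estimated from the
    # Bernoulli entropy of each entry (differentiable bit-rate proxy).
    p = torch.sigmoid(btc_logits)
    eps = 1e-6
    bits = -(p * torch.log2(p + eps) + (1 - p) * torch.log2(1 - p + eps))
    return distortion + lam * bits.sum()
```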

ICLR 2025 · Conference Paper

Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene

  • Jiahao Wu
  • Rui Peng
  • Zhiyan Wang
  • Lu Xiao
  • Luyang Tang
  • Jinbo Yan
  • Kaiqiang Xiong
  • Ronggang Wang

Novel view synthesis has long been a practical but challenging task. Although numerous methods have been introduced to solve this problem, even those combining advanced representations like 3D Gaussian Splatting still struggle to recover high-quality results, and they often consume excessive storage and training time. In this paper we propose Swift4D, a divide-and-conquer 3D Gaussian Splatting method that handles static and dynamic primitives separately, achieving a good trade-off between rendering quality and efficiency, motivated by the fact that most of a scene consists of static primitives that require no additional dynamic properties. Concretely, we model dynamic transformations only for the dynamic primitives, which benefits both efficiency and quality. We first employ a learnable decomposition strategy to separate the primitives, relying on an additional parameter to classify each primitive as static or dynamic. For the dynamic primitives, we employ a compact multi-resolution 4D Hash mapper to transform them from canonical space into deformation space at each timestamp, and then mix the static and dynamic primitives to produce the final output. This divide-and-conquer approach facilitates efficient training and reduces storage redundancy. Our method achieves state-of-the-art rendering quality while training 20× faster than previous SOTA methods, with a minimum storage requirement of only 30 MB on real-world datasets.
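The divide-and-conquer idea can be pictured as a masked deformation: a learnable per-primitive parameter classifies each Gaussian, and only the dynamic subset is pushed through the 4D field. A minimal PyTorch sketch under assumed shapes follows; hash4d and every other name here is a hypothetical stand-in for the paper's multi-resolution 4D Hash mapper, not its released code.

```python
import torch

def deform_primitives(xyz, dyn_logit, t, hash4d, tau=0.5):
    """Deform only the 'dynamic' Gaussians; static ones pass through.

    xyz: (N, 3) canonical positions; dyn_logit: (N,) learnable
    classification parameter per primitive; t: scalar timestamp;
    hash4d: callable mapping (M, 4) space-time coords to (M, 3) offsets.
    """
    dynamic = torch.sigmoid(dyn_logit) > tau              # (N,) boolean mask
    out = xyz.clone()
    if dynamic.any():
        n_dyn = int(dynamic.sum())
        xyzt = torch.cat([xyz[dynamic], xyz.new_full((n_dyn, 1), t)], dim=1)
        out[dynamic] = xyz[dynamic] + hash4d(xyzt)        # canonical -> deformed
    return out                                            # mix static + dynamic
```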

NeurIPS 2024 · Conference Paper

Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis

  • Rui Peng
  • Wangze Xu
  • Luyang Tang
  • Liwei Liao
  • Jianbo Jiao
  • Ronggang Wang

Despite the substantial progress of novel view synthesis, existing methods, whether based on Neural Radiance Fields (NeRF) or the more recent 3D Gaussian Splatting (3DGS), suffer significant degradation when the input becomes sparse. Numerous efforts have been introduced to alleviate this problem, but they still struggle to synthesize satisfactory results efficiently, especially in large scenes. In this paper, we propose SCGaussian, a Structure Consistent Gaussian Splatting method that uses matching priors to learn a 3D-consistent scene structure. Considering the high interdependence of Gaussian attributes, we optimize the scene structure on two fronts: the rendered geometry and, more importantly, the positions of the Gaussian primitives, which are hard to constrain directly in vanilla 3DGS due to their unstructured nature. To achieve this, we present a hybrid Gaussian representation. Besides the ordinary non-structured Gaussian primitives, our model also contains ray-based Gaussian primitives that are bound to matching rays and whose positions can only be optimized along those rays. We can thus use the matching correspondences to directly force the positions of these Gaussian primitives to converge to the surface points where the rays intersect. Extensive experiments on forward-facing, surrounding, and complex large scenes show the effectiveness of our approach, with state-of-the-art performance and high efficiency. Code is available at https://github.com/prstrive/SCGaussian.
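The ray-based primitives can be parameterized so that gradient descent literally cannot move them off their matching rays: fix each ray's origin and direction and learn only a positive depth along it. A minimal PyTorch sketch; class and attribute names are hypothetical, not taken from the released code.

```python
import torch
import torch.nn.functional as F

class RayBoundGaussians(torch.nn.Module):
    """Primitives whose positions are constrained to fixed matching rays."""

    def __init__(self, origins, dirs):
        super().__init__()
        self.register_buffer("origins", origins)                 # (N, 3), fixed
        self.register_buffer("dirs", F.normalize(dirs, dim=-1))  # (N, 3), fixed
        # One learnable depth per primitive: the only positional DOF.
        self.depth = torch.nn.Parameter(torch.ones(origins.shape[0]))

    def positions(self):
        # Softplus keeps depths positive; gradients can only slide each
        # primitive along its ray toward the matched surface point.
        return self.origins + F.softplus(self.depth)[:, None] * self.dirs
```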

NeurIPS 2023 · Conference Paper

GenS: Generalizable Neural Surface Reconstruction from Multi-View Images

  • Rui Peng
  • Xiaodong Gu
  • Luyang Tang
  • Shihe Shen
  • Fanqi Yu
  • Ronggang Wang

Combining the signed distance function (SDF) and differentiable volume rendering has emerged as a powerful paradigm for surface reconstruction from multi-view images without 3D supervision. However, current methods are hampered by long per-scene optimization and cannot generalize to new scenes. In this paper, we present GenS, an end-to-end generalizable neural surface reconstruction model. Unlike coordinate-based methods that train a separate network for each scene, we construct a generalized multi-scale volume to directly encode all scenes. Compared with existing solutions, our representation is more powerful: it can recover high-frequency details while maintaining global smoothness. Meanwhile, we introduce a multi-scale feature-metric consistency that imposes multi-view consistency in a more discriminative multi-scale feature space, which is robust to failures of photometric consistency. The learnable features can be self-enhanced to continuously improve matching accuracy and mitigate aggregation ambiguity. Furthermore, we design a view contrast loss that forces the model to be robust in regions covered by few viewpoints by distilling the geometric prior from dense input to sparse input. Extensive experiments on popular benchmarks show that our model generalizes well to new scenes and outperforms existing state-of-the-art methods, even those employing ground-truth depth supervision. Code will be available at https://github.com/prstrive/GenS.
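A feature-metric consistency term of the kind described above is typically a variance cost over features sampled at each 3D point's projection in every view; GenS applies it across a multi-scale feature pyramid, of which a single scale is sketched here with hypothetical names and shapes.

```python
import torch
import torch.nn.functional as F

def feature_metric_consistency(feats, pts_2d):
    """Variance-based multi-view feature consistency at one scale.

    feats: (V, C, H, W) per-view feature maps from a shared encoder;
    pts_2d: (V, M, 2) projections of M candidate surface points into
    each view, normalized to [-1, 1] for grid_sample.
    """
    grid = pts_2d.unsqueeze(1)                        # (V, 1, M, 2)
    sampled = F.grid_sample(feats, grid, align_corners=True)
    sampled = sampled.squeeze(2)                      # (V, C, M)
    # Low variance across views = features agree = geometry is consistent.
    return sampled.var(dim=0, unbiased=False).mean()
```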

IROS 2022 · Conference Paper

RECCraft System: Towards Reliable and Efficient Collective Robotic Construction

  • Qiwei Xu
  • Yizheng Zhang
  • Shenghao Zhang 0001
  • Rui Zhao
  • Zhuoxing Wu
  • Dongsheng Zhang
  • Cheng Zhou
  • Xiong Li 0001

This research presents a novel Collective Robotic Construction (CRC) system named RECCraft. The RECCraft hardware system is composed of mobile manipulation vehicles, cubic blocks, and folding ramp blocks. Solid connection and easy removal of the blocks are achieved by an electropermanent magnet and silicon steel sheets. With a one-degree-of-freedom (DOF) lifting manipulator, the robot can carry a block 3.7 times its own volume. An active folding ramp block provides a robust passage to the upper level for the robot. Our study focuses on systematically improving the construction speed and reliability of the robotic construction system. A visual perception system based on AprilTag, featuring convenient deployment and high precision, is adopted to provide a reliable guarantee for robotic construction. An RL-based planner provides an end-to-end solution for the planning tasks of building multi-layer constructions, validated on both a simulation platform and a real prototype. In terms of construction speed, the proposed RECCraft system achieves a state-of-the-art level among existing robotic construction systems. The robot builds a 2-layer construction with the RL-based planner in 4 minutes and 16 seconds, achieving a construction volumetric throughput of 6.7×10⁵ mm³/s.
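The reported throughput can be sanity-checked against the stated build time; the implied total built volume below is derived from the abstract's own figures, not reported separately.

```python
# All inputs are figures quoted in the abstract.
build_time_s = 4 * 60 + 16          # 4 min 16 s -> 256 s
throughput_mm3_per_s = 6.7e5        # 6.7 x 10^5 mm^3/s
implied_volume_mm3 = throughput_mm3_per_s * build_time_s
print(implied_volume_mm3)           # ~1.7e8 mm^3, i.e. about 0.17 m^3 built
```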