Arrow Research search

Author name cluster

Luyao Zhang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers (4)

AAAI Conference 2026 Conference Paper

DriveLiDAR4D: Sequential and Controllable LiDAR Scene Generation for Autonomous Driving

  • Kaiwen Cai
  • Xinze Liu
  • Xia Zhou
  • Hengtong Hu
  • Jie Xiang
  • Luyao Zhang
  • Xueyang Zhang
  • Kun Zhan

The generation of realistic LiDAR point clouds plays a crucial role in the development and evaluation of autonomous driving systems. Although recent methods for 3D LiDAR point cloud generation have shown significant improvements, they still face notable limitations, including the lack of sequential generation capabilities and the inability to produce accurately positioned foreground objects and realistic backgrounds. These shortcomings hinder their practical applicability. In this paper, we introduce DriveLiDAR4D, a novel LiDAR generation pipeline consisting of multimodal conditions and a novel sequential noise prediction model LiDAR4DNet, capable of producing temporally consistent LiDAR scenes with highly controllable foreground objects and realistic backgrounds. To the best of our knowledge, this is the first work to address the sequential generation of LiDAR scenes with full scene manipulation capability in an end-to-end manner. We evaluated DriveLiDAR4D on the nuScenes and KITTI datasets, where we achieved an FRD score of 743.13 and an FVD score of 16.96 on the nuScenes dataset, surpassing the current state-of-the-art (SOTA) method, UniScene, with a performance boost of 37.2% in FRD and 24.1% in FVD, respectively.

AAAI Conference 2026 Conference Paper

RSVG-ZeroOV: Exploring a Training-Free Framework for Zero-Shot Open-Vocabulary Visual Grounding in Remote Sensing Images

  • Ke Li
  • Di Wang
  • Ting Wang
  • Fuyu Dong
  • Yiming Zhang
  • Luyao Zhang
  • Xiangyu Wang
  • Shaofeng Li

Remote sensing visual grounding (RSVG) aims to localize objects in remote sensing images based on free-form natural language expressions. Existing approaches are typically constrained to closed-set vocabularies, limiting their applicability in open-world scenarios. While recent attempts leverage generic foundation models for open-vocabulary RSVG, they rely heavily on expensive high-quality datasets and time-consuming fine-tuning. To address these limitations, we propose RSVG-ZeroOV, a training-free framework that aims to explore the potential of frozen generic foundation models for zero-shot open-vocabulary RSVG. Specifically, RSVG-ZeroOV comprises three key stages: (i) Overview: We utilize a vision-language model (VLM) to obtain cross-attention maps that capture semantic correlations between text queries and visual regions. (ii) Focus: By leveraging the fine-grained modeling priors of a diffusion model (DM), we fill in gaps in structural and shape information of objects, which are often overlooked by the VLM. (iii) Evolve: A simple yet effective attention evolution module is introduced to suppress irrelevant activations, yielding purified segmentation masks over the referred objects. Without cumbersome task-specific training, RSVG-ZeroOV offers an efficient and scalable solution. Extensive experiments demonstrate that the proposed framework consistently outperforms existing weakly-supervised and zero-shot methods.
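The fuse-then-suppress idea behind the Focus and Evolve stages can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's actual module: the function name, the linear fusion weight `alpha`, and the quantile-based suppression rule are all assumptions made for the example.

```python
import numpy as np

def evolve_attention(vlm_attn, dm_attn, alpha=0.5, keep_ratio=0.2):
    """Fuse a VLM cross-attention map with a diffusion-model prior,
    then suppress weak activations to yield a purified binary mask.
    Hypothetical sketch; not the RSVG-ZeroOV implementation."""
    # Linear fusion of the two attention maps (alpha weights the VLM map).
    fused = alpha * vlm_attn + (1.0 - alpha) * dm_attn
    # Normalize to [0, 1] so the threshold is scale-independent.
    fused = (fused - fused.min()) / (fused.max() - fused.min() + 1e-8)
    # Keep only the strongest keep_ratio fraction of activations.
    thresh = np.quantile(fused, 1.0 - keep_ratio)
    return (fused >= thresh).astype(np.uint8)
```

In practice the fusion weight and suppression rule would be chosen per stage; the point is only that a purified mask can be obtained from frozen models without any training.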

NeurIPS Conference 2024 Conference Paper

Croissant: A Metadata Format for ML-Ready Datasets

  • Mubashara Akhtar
  • Omar Benjelloun
  • Costanza Conforti
  • Luca Foschini
  • Pieter Gijsbers
  • Joan Giner-Miguelez
  • Sujata Goswami
  • Nitisha Jain

Data is a critical resource for machine learning (ML), yet working with data remains a key friction point. This paper introduces Croissant, a metadata format for datasets that creates a shared representation across ML tools, frameworks, and platforms. Croissant makes datasets more discoverable, portable, and interoperable, thereby addressing significant challenges in ML data management. Croissant is already supported by several popular dataset repositories, spanning hundreds of thousands of datasets, enabling easy loading into the most commonly-used ML frameworks, regardless of where the data is stored. Our initial evaluation by human raters shows that Croissant metadata is readable, understandable, complete, yet concise.
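Croissant metadata is expressed as JSON-LD. The following is a minimal sketch of what such a record can look like, assembled as a plain Python dictionary; the dataset name and file entry are invented for illustration, and the field set shown is a small, simplified subset of the full format rather than a complete, spec-conformant record.

```python
import json

# Illustrative Croissant-style JSON-LD metadata for a toy dataset.
# Field names follow the general shape of the format; consult the
# Croissant specification for the authoritative vocabulary.
croissant = {
    "@context": {"@vocab": "https://schema.org/"},
    "@type": "Dataset",
    "name": "example-dataset",
    "description": "A toy dataset described with ML-ready metadata.",
    "distribution": [
        {
            "@type": "FileObject",
            "name": "data.csv",
            "encodingFormat": "text/csv",
        }
    ],
}

serialized = json.dumps(croissant, indent=2)
```

Because the representation is plain JSON-LD, any repository or ML framework can parse it with a standard JSON library, which is what makes datasets described this way portable across tools.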

ICRA Conference 2021 Conference Paper

Reachability-based Push Recovery for Humanoid Robots with Variable-Height Inverted Pendulum

  • Shunpeng Yang
  • Hua Chen 0007
  • Luyao Zhang
  • Zhefeng Cao
  • Patrick M. Wensing
  • Yizhang Liu
  • Jianxin Pang
  • Wei Zhang 0013

This paper studies push recovery for humanoid robots based on a variable-height inverted pendulum (VHIP) model. We first develop an approach for treating zero-step capturability of the VHIP with a novel methodology based on Hamilton-Jacobi (HJ) reachability analysis. Such an approach uses the sub-zero level set of a value function to encode capturability of the VHIP, where the value function is obtained by numerically solving a HJ variational inequality offline. Based on this analysis, a simple and effective method for adjusting foothold locations is then devised for cases where the VHIP state is not zero-step capturable. In addition, the HJ reachability analysis naturally induces an optimal control law that allows for rapid planning with the VHIP during push recovery online. To enable use of the strategy with a position-controlled humanoid robot, an associated differential inverse kinematics based tracking controller is employed. The effectiveness of the overall framework is demonstrated with the UBTECH Walker robot in the MuJoCo simulator. Simulation validations show a significant improvement in push robustness as compared to the methods based on the classical linear inverted pendulum model.
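For intuition, zero-step capturability under the classical linear inverted pendulum (LIP) model that the abstract uses as its baseline reduces to a closed-form check: the instantaneous capture point must lie within the support polygon. The sketch below shows that baseline test, not the paper's VHIP/HJ-reachability method (which requires solving a variational inequality numerically); the pendulum height and foot bounds are illustrative values.

```python
import math

def zero_step_capturable(x, xdot, z0=0.9, g=9.81,
                         foot_min=-0.10, foot_max=0.10):
    """Classical LIP zero-step capturability test (1-D).

    x, xdot  : center-of-mass position and velocity [m, m/s]
    z0       : constant pendulum height [m] (illustrative value)
    foot_min, foot_max : support-polygon bounds [m] (illustrative)
    """
    omega = math.sqrt(g / z0)          # natural frequency of the LIP
    capture_point = x + xdot / omega   # instantaneous capture point
    return foot_min <= capture_point <= foot_max
```

The VHIP analysis in the paper generalizes exactly this question to variable height, where no closed form exists, which is why the capturable set is instead encoded as the sub-zero level set of a numerically computed value function.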