Arrow Research search

Author name cluster

Yu Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

39 papers
2 author rows

Possible papers

39

AAAI Conference 2026 Conference Paper

Text-based Aerial-Ground Person Retrieval

  • Xinyu Zhou
  • Yu Wu
  • Jiayao Ma
  • Wenhao Wang
  • Min Cao
  • Mang Ye

This work introduces Text-based Aerial-Ground Person Retrieval (TAG-PR), which aims to retrieve person images from heterogeneous aerial and ground views with textual descriptions. Unlike traditional Text-based Person Retrieval (T-PR), which focuses solely on ground-view images, TAG-PR offers greater practical significance and presents unique challenges due to the large viewpoint discrepancy across images. To support this task, we contribute: (1) the TAG-PEDES dataset, constructed from public benchmarks with automatically generated textual descriptions and enhanced by a diversified text generation paradigm to ensure robustness under view heterogeneity; and (2) TAG-CLIP, a novel retrieval framework that addresses view heterogeneity through a hierarchically-routed mixture-of-experts module to learn view-specific and view-agnostic features and a viewpoint decoupling strategy to decouple view-specific features for better cross-modal alignment. We evaluate the effectiveness of TAG-CLIP on both the proposed TAG-PEDES and existing T-PR benchmarks.

IROS Conference 2025 Conference Paper

An Online Motion Planning Framework for Navigating Torpedo-shaped Autonomous Underwater Vehicles in Unknown Underwater Environments

  • Tianyou Yu
  • Zhaoxuan Dong
  • Yu Wu
  • Xingjie Fu

Navigating unknown underwater environments is a significant challenge for autonomous underwater vehicles (AUVs), especially those with torpedo-like shapes. Lacking a prior map, these vehicles rely on real-time sensor data for perception. Although online motion planning addresses this challenge, many existing methods are primarily tested on more maneuverable robots, such as multicopters and ground vehicles, and do not account for the unique kinematics of torpedo-shaped AUVs, such as limited lateral movement, or the need for 3D motion planning. In this paper, we propose an online motion planning system specifically designed for torpedo-shaped AUVs to navigate 3D underwater terrain without prior environmental knowledge. The system employs a receding horizon planning framework to ensure safe navigation by replanning the trajectory when collisions are detected or the planning horizon is reached. For trajectory generation, a search-based method uses a 3D Dubins curve heuristic to guide the generation of an optimal 3D trajectory that adheres to the AUV's kinematic constraints. To further enhance safety and smoothness, gradient-based optimization is applied to refine the trajectory. Experiments in simulated environments validate the proposed method, demonstrating its ability to generate safe trajectories for AUVs in complex and unknown environments. We release our code as an open-source package.

NeurIPS Conference 2025 Conference Paper

BNMusic: Blending Environmental Noises into Personalized Music

  • Chi Zuo
  • Martin Møller
  • Pablo Martínez-Nuevo
  • Huayang Huang
  • Yu Wu
  • Ye Zhu

Acoustic masking is a conventional audio-engineering technique for reducing the annoyance of environmental noises by covering them up with other dominant yet less intrusive sounds. However, misalignment between the dominant sound and the noise, such as mismatched downbeats, often requires an excessive volume increase to achieve effective masking. Motivated by recent advances in cross-modal generation, in this work, we introduce an alternative to acoustic masking, aiming to reduce the noticeability of environmental noises by blending them into personalized music generated from user-provided text prompts. Following the paradigm of music generation using mel-spectrogram representations, we propose a Blending Noises into Personalized Music (BNMusic) framework with two key stages. The first stage synthesizes a complete piece of music in a mel-spectrogram representation that encapsulates the musical essence of the noise. In the second stage, we adaptively amplify the generated music segment to further reduce noise perception and enhance the blending effectiveness, while preserving auditory quality. Our experiments with comprehensive evaluations on MusicBench, EPIC-SOUNDS, and ESC-50 demonstrate the effectiveness of our framework, highlighting the ability to blend environmental noise with rhythmically aligned, adaptively amplified, and enjoyable music segments, minimizing the noticeability of the noise and thereby improving overall acoustic experiences. Project page: https://d-fas.github.io/BNMusic_page/.
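The adaptive amplification stage can be pictured with a simple level-matching sketch. The criterion below (RMS level of the music relative to the noise, with a gain cap) is an assumption for illustration only; the paper's actual amplification strategy is not specified in the abstract, and `music` and `noise` are assumed to be mono waveforms at the same sample rate.

```python
import numpy as np

def rms(x: np.ndarray) -> float:
    """Root-mean-square level of a mono waveform."""
    return float(np.sqrt(np.mean(np.square(x)) + 1e-12))

def adaptive_gain(music: np.ndarray, noise: np.ndarray,
                  margin_db: float = 3.0, max_gain_db: float = 12.0) -> np.ndarray:
    """Scale the generated music so its level sits margin_db above the noise,
    capped at max_gain_db to preserve auditory quality. Hypothetical criterion
    for illustration; not BNMusic's published procedure."""
    gain_db = 20.0 * np.log10(rms(noise) / rms(music)) + margin_db
    gain_db = float(np.clip(gain_db, 0.0, max_gain_db))
    return music * (10.0 ** (gain_db / 20.0))
```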

IROS Conference 2025 Conference Paper

DPSN: Dual Prior Knowledge Induced Tactile paving and Obstacle Joint Segmentation Network

  • Youqi Song
  • Wenqi Li
  • Zhao Zhang
  • Yu Wu
  • Zilong Jin
  • Changbo Wang
  • Gaoqi He

Accurate semantic segmentation of both tactile paving and obstacles is crucial for the safe mobility of visually impaired individuals. However, existing methods face two major challenges: (i) discontinuous segmentation fragments; (ii) inaccurate obstacle recognition. To address challenge (i), we propose incorporating appearance priors of complete tactile pavings to prevent the model from directly learning irregular ground truth masks. To tackle challenge (ii), we propose introducing cross-modal semantic priors to complement the semantic information of obstacles. We implement these strategies in the proposed Dual Prior knowledge induced tactile paving and obstacle joint Segmentation Network (DPSN). Based on a bilateral network architecture, DPSN merges obstacle category masks into tactile paving categories, constructing a complete tactile paving mask. Utilizing the complete mask, DPSN transfers appearance prior knowledge to detail features from boundary and structural perspectives. Concurrently, DPSN leverages the CLIP Text Encoder to guide visual feature decoding through attention mechanisms, transferring rich cross-modal semantic prior knowledge to the visual feature maps. Furthermore, we propose the TPO-Dataset, the first dataset for joint tactile paving and obstacle segmentation acquired from actual scenes. Experiments demonstrate that DPSN achieves state-of-the-art results on the TPO-Dataset, with relative gains of 27.16% in obstacle IoU and 30.53% in accuracy metrics compared to baseline methods. Notably, DPSN achieves real-time performance at 88.25 FPS at the maximum scale of 2048×512 resolution.

AAAI Conference 2025 Conference Paper

Efficient Robustness Evaluation via Constraint Relaxation

  • Chao Pan
  • Yu Wu
  • Ke Tang
  • Qing Li
  • Xin Yao

The study of enhancing model robustness against adversarial examples has become increasingly critical in the security of deep learning, leading to the development of numerous adversarial defense techniques. While these defense methods have shown promise in mitigating the impact of adversarial perturbations, evaluating their effectiveness remains a critical challenge. The recently introduced AutoAttack technique has been recognized as a standardized method for assessing model robustness. However, the computational demands of the AutoAttack method significantly limit its applicability, underscoring the urgent need for efficient evaluation techniques. To address this challenge, we propose a novel and efficient evaluation framework based on strategic constraint relaxation. Our key insight is that temporarily expanding the adversarial perturbation bounds during the attack process can help discover more effective adversarial examples. Based on this insight, we develop the Constraint Relaxation Attack (CR Attack) method, which systematically relaxes and resets perturbation constraints during optimization. Extensive experiments on 105 robust models show that CR Attack outperforms AutoAttack in both attack success rate and efficiency, reducing forward and backward propagation time by 38.3× and 15.9×, respectively. Through comprehensive analysis, we validate that the constraint relaxation mechanism is crucial for the method's effectiveness.
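The core idea described here, temporarily relaxing the perturbation bound during the attack and re-projecting afterwards, can be sketched as a variant of PGD. This is an illustrative reading of the abstract, not the authors' CR Attack code; the relaxation schedule, step sizes, and argument names are assumptions.

```python
import torch
import torch.nn.functional as F

def cr_attack_sketch(model, x, y, eps=8/255, alpha=2/255, steps=40,
                     relax_factor=2.0, relax_every=10):
    """PGD-style attack that periodically relaxes the L-inf bound to
    relax_factor * eps and later re-projects to the original bound.
    Sketch of the constraint-relaxation idea only; the schedule is assumed."""
    x_adv = x.clone().detach()
    for t in range(steps):
        # Use a relaxed bound for the first half of every relax_every-step cycle.
        bound = relax_factor * eps if (t % relax_every) < relax_every // 2 else eps
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - bound), x + bound).clamp(0.0, 1.0)
    # Final reset: the returned example must satisfy the true eps constraint.
    return torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0).detach()
```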

JBHI Journal 2025 Journal Article

Efficient Vulnerability Assessment of Multi-Bit Faults in Embedded Healthcare Software

  • Jinting Ren
  • Yu Wu
  • Hengyi Ren
  • Xing Li
  • Zhaoyang Han

With the growing use of wearable and implantable medical devices, ensuring the reliability of embedded software is crucial for patient safety. These devices are susceptible to soft errors, and traditional single-bit fault models often fail to account for multi-bit faults, leading to either insufficient protection or excessive overhead. We introduce V-FAME, a lightweight and efficient vulnerability analysis framework for embedded healthcare systems. V-FAME uses machine learning and a multi-bit fault model to identify fault-sensitive code regions quickly and accurately. Unlike existing tools that focus on instruction-level analysis, V-FAME uses a basic block-level classification approach to significantly prune the fault space. Our experiments show V-FAME achieves a speedup of over 6.16x while maintaining an accuracy of over 80%. This framework supports reliable and cost-effective fault mitigation in medical applications.

IJCAI Conference 2025 Conference Paper

Spotlighting Partially Visible Cinematic Language for Video-to-Audio Generation via Self-distillation

  • Feizhen Huang
  • Yu Wu
  • Yutian Lin
  • Bo Du

Video-to-Audio (V2A) Generation has achieved significant progress and plays a crucial role in film and video post-production. However, current methods overlook the cinematic language, a critical component of artistic expression in filmmaking. As a result, their performance deteriorates in scenarios where Foley targets are only partially visible. To address this challenge, we propose a simple self-distillation approach to extend V2A models to cinematic language scenarios. By simulating the cinematic language variations, the student model learns to align the video features of training pairs with the same audio-visual correspondences, enabling it to effectively capture the associations between sounds and partial visual information. Our method not only achieves impressive improvements under partial visibility across all evaluation metrics, but also enhances performance on the large-scale V2A dataset, VGGSound.

AAAI Conference 2025 Conference Paper

WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration

  • Yao Zhang
  • Zijian Ma
  • Yunpu Ma
  • Zhen Han
  • Yu Wu
  • Volker Tresp

LLM-based autonomous agents often fail to execute complex web tasks that require dynamic interaction, largely due to the inherent uncertainty and complexity of these environments. Existing LLM-based web agents typically rely on rigid, expert-designed policies specific to certain states and actions, lacking the flexibility and generalizability needed to adapt to unseen tasks. In contrast, humans excel by exploring unknowns, continuously adapting strategies based on new observations, and resolving ambiguities through exploration. To emulate human-like adaptability, web agents need strategic exploration and complex decision-making. Monte Carlo Tree Search (MCTS) is well-suited for this, but classical MCTS struggles with vast action spaces, unpredictable state transitions, and incomplete information in web tasks. In light of this, we develop WebPilot, a multi-agent system with a dual optimization strategy that improves MCTS to better handle complex web environments. Specifically, the Global Optimization phase involves generating a high-level plan by breaking down tasks into manageable subtasks, continuously refining this plan through reflective analysis of new observations and previous subtask attempts, thereby focusing the search process and mitigating challenges posed by vast action spaces in classical MCTS. Subsequently, the Local Optimization phase executes each subtask using a tailored MCTS designed for complex environments, effectively addressing uncertainties and managing incomplete information by iteratively refining decisions based on new observations. Experimental results on WebArena and MiniWoB++ demonstrate the effectiveness of WebPilot. Notably, on WebArena, WebPilot achieves SOTA performance with GPT-4, achieving a 93% relative increase in success rate over the concurrent tree search-based method. WebPilot advances autonomous agents, enabling more reliable decision-making in practical environments.
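WebPilot builds on Monte Carlo Tree Search; for context, the generic UCT selection rule that classical MCTS uses is sketched below. This is standard MCTS machinery rather than WebPilot's tailored variant, and the child-node dictionary fields are hypothetical.

```python
import math

def uct_select(children, c=1.4):
    """Pick the child maximizing mean value plus a UCT exploration bonus.
    Each child is a dict with hypothetical fields 'visits' and 'value'.
    Generic MCTS selection shown for background; WebPilot augments MCTS with
    reflective planning and LLM-guided expansion not shown here."""
    total_visits = sum(ch["visits"] for ch in children)
    def score(ch):
        if ch["visits"] == 0:
            return float("inf")  # explore unvisited children first
        exploit = ch["value"] / ch["visits"]
        explore = c * math.sqrt(math.log(total_visits) / ch["visits"])
        return exploit + explore
    return max(children, key=score)
```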

SoCS Conference 2024 Conference Paper

From Space-Time to Space-Order: Directly Planning a Temporal Planning Graph by Redefining CBS (Extended Abstract)

  • Yu Wu
  • Rishi Veerapaneni
  • Jiaoyang Li 0001
  • Maxim Likhachev

The majority of multi-agent path finding (MAPF) methods compute collision-free space-time paths which require agents to be at a specific location at a specific discretized timestep. However, executing these space-time paths directly on robotic systems is infeasible due to real-time execution differences (e.g., delays) which can lead to collisions. To combat this, current methods translate the space-time paths into a temporal plan graph (TPG) that only requires that agents observe the order in which they navigate through locations where their paths cross. However, planning space-time paths and then post-processing them into a TPG does not reduce the required agent-to-agent coordination, which is fixed once the space-time paths are computed. To that end, we propose a novel algorithm, Space-Order CBS, that can directly plan a TPG and explicitly minimize coordination. Our main theoretical insight is our novel perspective on viewing a TPG as a set of space-visitation order paths where agents visit locations in relative orders (e.g., 1st vs. 2nd) as opposed to specific timesteps. We redefine conflicts and constraints to adapt CBS for space-order planning. We experimentally validate that Space-Order CBS can return TPGs which significantly reduce coordination, thereby reducing the amount of agent-agent communication and leading to more robustness to delays during execution.
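The abstract's central object, a temporal plan graph viewed as space-visitation order paths, can be made concrete with a small data structure. The sketch below is an illustration of the concept, not the authors' implementation; the field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SpaceOrderPath:
    """One agent's path as an ordered list of locations, with no timesteps."""
    agent: int
    locations: list  # e.g. [(0, 0), (0, 1), (1, 1)]

@dataclass
class TPG:
    """Temporal plan graph: per-agent space-order paths plus inter-agent
    ordering edges (a, i, b, j), meaning agent a must reach its i-th location
    before agent b enters its j-th location (they share that cell)."""
    paths: dict = field(default_factory=dict)      # agent -> SpaceOrderPath
    orderings: list = field(default_factory=list)  # list of (a, i, b, j)

    def coordination(self) -> int:
        """Amount of agent-to-agent coordination, counted here simply as the
        number of inter-agent ordering constraints."""
        return len(self.orderings)
```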

ICAPS Conference 2024 Conference Paper

MAPF in 3D Warehouses: Dataset and Analysis

  • Qian Wang
  • Rishi Veerapaneni
  • Yu Wu
  • Jiaoyang Li 0001
  • Maxim Likhachev

Recent works have made significant progress in multi-agent path finding (MAPF), with modern methods being able to scale to hundreds of agents, handle unexpected delays, work in groups, etc. The vast majority of these methods have focused on 2D "grid world" domains. However, modern warehouses often utilize multi-agent robotic systems that can move in 3D, enabling dense storage but resulting in a more complex multi-agent planning problem. Motivated by this, we introduce and experimentally analyze the application of MAPF to 3D warehouse management, and release the first open-source 3D MAPF dataset (see http://mapf.info/index.php/Main/Benchmarks). We benchmark two state-of-the-art MAPF methods, EECBS and MAPF-LNS2, and show how different hyper-parameters affect these methods across various 3D MAPF problems. We also investigate how the warehouse structure itself affects MAPF performance. Based on our experimental analysis, we find that a fast low-level search is critical for 3D MAPF, EECBS

JBHI Journal 2024 Journal Article

Redefining the Game: MVAE-DFDPnet's Low-Dimensional Embeddings for Superior Drug-Protein Interaction Predictions

  • Liang-Yong Xia
  • Yu Wu
  • Longfei Zhao
  • Leying Chen
  • Shiyi Zhang
  • Mengdi Wang
  • Jie Luo

Precisely predicting drug-protein interactions (DPIs) is pivotal for drug discovery and advancing precision medicine. A significant challenge in this domain is the high-dimensional and heterogeneous data characterizing drug and protein attributes, along with their intricate interactions. In our study, we introduce a novel deep learning architecture: the Multi-view Variational Auto-Encoder followed by a cascade Deep Forest (MVAE-DFDPnet). This framework adeptly learns ultra-low-dimensional embeddings for drugs and proteins. Notably, our t-SNE analysis reveals that two-dimensional embeddings can clearly define clusters corresponding to diverse drug classes and protein families. These ultra-low-dimensional embeddings likely contribute to the enhanced robustness and generalizability of our MVAE-DFDPnet. Our model surpasses current leading methods on benchmark datasets while operating in significantly reduced dimensional spaces. The model's resilience is further evidenced by its sustained accuracy in predicting interactions involving novel drugs, proteins, and drug classes. Additionally, we have corroborated several newly identified DPIs with experimental evidence from the scientific literature.

NeurIPS Conference 2024 Conference Paper

ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization

  • Huayang Huang
  • Yu Wu
  • Qian Wang

Watermarking generative content serves as a vital tool for authentication, ownership protection, and mitigation of potential misuse. Existing watermarking methods face the challenge of balancing robustness and concealment. They empirically inject a watermark that is both invisible and robust and passively achieve concealment by limiting the strength of the watermark, thus reducing the robustness. In this paper, we propose to explicitly introduce a watermark hiding process to actively achieve concealment, thus allowing the embedding of stronger watermarks. To be specific, we implant a robust watermark in an intermediate diffusion state and then guide the model to hide the watermark in the final generated image. We employ an adversarial optimization algorithm to produce the optimal hiding prompt guiding signal for each watermark. The prompt embedding is optimized to minimize artifacts in the generated image, while the watermark is optimized to achieve maximum strength. The watermark can be verified by reversing the generation process. Experiments on various diffusion models demonstrate the watermark remains verifiable even under significant image tampering and shows superior invisibility compared to other state-of-the-art robust watermarking methods.

NeurIPS Conference 2024 Conference Paper

RobIR: Robust Inverse Rendering for High-Illumination Scenes

  • Ziyi Yang
  • Yanzhen Chen
  • Xinyu Gao
  • Yazhen Yuan
  • Yu Wu
  • Xiaowei Zhou
  • Xiaogang Jin

Implicit representation has opened up new possibilities for inverse rendering. However, existing implicit neural inverse rendering methods struggle to handle strongly illuminated scenes with significant shadows and slight reflections. The existence of shadows and reflections can lead to an inaccurate understanding of the scene, making precise factorization difficult. To this end, we present RobIR, an implicit inverse rendering approach that uses ACES tone mapping and regularized visibility estimation to reconstruct accurate BRDF of the object. By accurately modeling the indirect radiance field, normal, visibility, and direct light simultaneously, we are able to accurately decouple environment lighting and the object's PBR materials without imposing strict constraints on the scene. Even in high-illumination scenes with shadows and specular reflections, our method can recover high-quality albedo and roughness with no shadow interference. RobIR outperforms existing methods in both quantitative and qualitative evaluations.
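The abstract leans on ACES tone mapping; for readers unfamiliar with it, the widely used single-curve approximation of the ACES filmic tone mapper (due to Narkowicz) is sketched below. This is background context only; RobIR may apply a different or fuller ACES transform than this approximation.

```python
import numpy as np

def aces_tonemap(x: np.ndarray) -> np.ndarray:
    """Narkowicz's common approximation of the ACES filmic curve, mapping
    linear HDR radiance to [0, 1]. Shown for context; not taken from RobIR."""
    a, b, c, d, e = 2.51, 0.03, 2.43, 0.59, 0.14
    return np.clip((x * (a * x + b)) / (x * (c * x + d) + e), 0.0, 1.0)
```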

NeurIPS Conference 2024 Conference Paper

Toward Real Ultra Image Segmentation: Leveraging Surrounding Context to Cultivate General Segmentation Model

  • Sai Wang
  • Yutian Lin
  • Yu Wu
  • Bo Du

Existing ultra image segmentation methods suffer from two major challenges, namely the scalability issue (i.e., they lack the stability and generality of standard segmentation models, as they are tailored to specific datasets) and the architectural issue (i.e., they are incompatible with real-world ultra image scenes, as they compromise between image size and computing resources). To tackle these issues, we revisit the classic sliding inference framework, upon which we propose a Surrounding Guided Segmentation framework (SGNet) for ultra image segmentation. SGNet leverages a larger area around each image patch to refine the general segmentation results of local patches. Specifically, we propose a surrounding context integration module to absorb surrounding context information and extract specific features that are beneficial to local patches. Note that SGNet can be seamlessly integrated into any general segmentation model. Extensive experiments on five datasets demonstrate that SGNet achieves competitive performance and consistent improvements across a variety of general segmentation models, surpassing traditional ultra image segmentation methods by a large margin.
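The sliding-inference-with-surrounding-context idea can be sketched at the pipeline level. The helpers `seg_model` and `ctx_model` below are hypothetical callables standing in for the general segmentation model and the surrounding context integration module; the real SGNet fuses context inside the network rather than as a post hoc step.

```python
import numpy as np

def sliding_inference(image, seg_model, ctx_model, patch=512, context=1024):
    """Segment each local patch, then refine it using a larger crop centred on
    the same patch. Illustrative sketch of surrounding-guided sliding inference."""
    H, W = image.shape[:2]
    out = np.zeros((H, W), dtype=np.int64)
    pad = (context - patch) // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            local = image[y:y + patch, x:x + patch]
            surround = padded[y:y + context, x:x + context]  # centred on the patch
            logits = seg_model(local)              # coarse per-pixel logits
            logits = ctx_model(logits, surround)   # refine with surrounding context
            pred = logits.argmax(axis=-1)
            out[y:y + patch, x:x + patch] = pred[:local.shape[0], :local.shape[1]]
    return out
```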

NeurIPS Conference 2023 Conference Paper

Boundary Guided Learning-Free Semantic Control with Diffusion Models

  • Ye Zhu
  • Yu Wu
  • Zhiwei Deng
  • Olga Russakovsky
  • Yan Yan

Applying pre-trained generative denoising diffusion models (DDMs) for downstream tasks such as image semantic editing usually requires either fine-tuning DDMs or learning auxiliary editing networks in the existing literature. In this work, we present our BoundaryDiffusion method for efficient, effective and light-weight semantic control with frozen pre-trained DDMs, without learning any extra networks. As one of the first learning-free diffusion editing works, we start by seeking a more comprehensive understanding of the intermediate high-dimensional latent spaces by theoretically and empirically analyzing their probabilistic and geometric behaviors in the Markov chain. We then propose to further explore the critical step in the denoising trajectory that characterizes the convergence of a pre-trained DDM and introduce an automatic search method. Last but not least, in contrast to the conventional understanding that DDMs have relatively poor semantic behaviors (in generic latent spaces), we prove that the critical latent space we found already forms semantic subspace boundaries at the generic level in unconditional DDMs, which allows us to do controllable manipulation by guiding the denoising trajectory towards the targeted boundary via a single-step operation. We conduct extensive experiments on multiple DDM architectures (DDPM, iDDPM) and datasets (CelebA, CelebA-HQ, LSUN-church, LSUN-bedroom, AFHQ-dog) with different resolutions (64, 256), achieving superior or state-of-the-art performance in various task scenarios (image semantic editing, text-based editing, unconditional semantic control), demonstrating the effectiveness of our method.

ICML Conference 2023 Conference Paper

Magneto: A Foundation Transformer

  • Hongyu Wang 0009
  • Shuming Ma
  • Shaohan Huang
  • Li Dong 0004
  • Wenhui Wang 0003
  • Zhiliang Peng
  • Yu Wu
  • Payal Bajaj

A big convergence of model architectures across language, vision, speech, and multimodal is emerging. However, under the same name "Transformers", the above areas use different implementations for better performance, e.g., Post-LayerNorm for BERT, and Pre-LayerNorm for GPT and vision Transformers. We call for the development of Foundation Transformer for true general-purpose modeling, which serves as a go-to architecture for various tasks and modalities with guaranteed training stability. In this work, we introduce a Transformer variant, named Magneto, to fulfill the goal. Specifically, we propose Sub-LayerNorm for good expressivity, and the initialization strategy theoretically derived from DeepNet for stable scaling up. Extensive experiments demonstrate its superior performance and better stability than the de facto Transformer variants designed for various applications, including language modeling (i.e., BERT and GPT), machine translation, vision pretraining (i.e., BEiT), speech recognition, and multimodal pretraining (i.e., BEiT-3).
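A minimal PyTorch rendering of the Sub-LayerNorm idea as read from the abstract: an extra LayerNorm inside each sublayer, on top of the usual pre-norm. This is a sketch under that assumption; the exact placement in Magneto and its DeepNet-derived initialization may differ and are not shown.

```python
import torch
import torch.nn as nn

class SubLNFeedForward(nn.Module):
    """Feed-forward sublayer with a Sub-LayerNorm-style extra normalization:
    one LayerNorm before the sublayer (as in Pre-LN) and one before the output
    projection. Illustrative sketch, not the released Magneto implementation."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.ln_in = nn.LayerNorm(d_model)
        self.fc1 = nn.Linear(d_model, d_ff)
        self.ln_inner = nn.LayerNorm(d_ff)  # the extra "sub" LayerNorm
        self.fc2 = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.fc2(self.ln_inner(torch.relu(self.fc1(self.ln_in(x)))))
```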

NeurIPS Conference 2023 Conference Paper

Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective

  • Yingying Fan
  • Yu Wu
  • Bo Du
  • Yutian Lin

We focus on the weakly-supervised audio-visual video parsing task (AVVP), which aims to identify and locate all the events in the audio and visual modalities. Previous works only concentrate on video-level overall label denoising across modalities, but overlook segment-level label noise, where adjacent video segments (i.e., 1-second video clips) may contain different events. However, recognizing events at the segment level is challenging because a segment's label could be any combination of events that occur in the video. To address this issue, we consider tackling AVVP from the language perspective, since language can freely describe how various events appear in each segment beyond fixed labels. Specifically, we design language prompts to describe all cases of event appearance for each video. Then, the similarity between language prompts and segments is calculated, and the event of the most similar prompt is regarded as the segment-level label. In addition, to deal with mislabeled segments, we propose to perform dynamic re-weighting on the unreliable segments to adjust their labels. Experiments show that our simple yet effective approach outperforms state-of-the-art methods by a large margin.
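The segment-labeling step described here, scoring each 1-second segment against a set of event prompts and taking the best match, can be sketched with a CLIP-style cosine similarity. The feature tensors and their provenance are assumptions for illustration; the paper additionally re-weights unreliable segments, which is omitted.

```python
import torch
import torch.nn.functional as F

def segment_labels(seg_feats: torch.Tensor, prompt_feats: torch.Tensor):
    """Assign each segment the event combination of its most similar prompt.
    seg_feats:    (S, D) per-segment audio/visual features
    prompt_feats: (P, D) features of language prompts describing event cases
    Returns the best-matching prompt index per segment plus the full similarity
    matrix. Illustrative sketch of the labeling idea from the abstract."""
    seg = F.normalize(seg_feats, dim=-1)
    prm = F.normalize(prompt_feats, dim=-1)
    sim = seg @ prm.t()          # (S, P) cosine similarities
    best = sim.argmax(dim=-1)    # best prompt per segment
    return best, sim
```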

NeurIPS Conference 2023 Conference Paper

RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments

  • Mengxue Qu
  • Yu Wu
  • Wu Liu
  • Xiaodan Liang
  • Jingkuan Song
  • Yao Zhao
  • Yunchao Wei

Intention-oriented object detection aims to detect desired objects based on specific intentions or requirements. For instance, when we desire to "lie down and rest", we instinctively seek out a suitable option such as a "bed" or a "sofa" that can fulfill our needs. Previous work in this area is limited either by the number of intention descriptions or by the affordance vocabulary available for intention objects. These limitations make it challenging to handle intentions in open environments effectively. To facilitate this research, we construct a comprehensive dataset called Reasoning Intention-Oriented Objects (RIO). In particular, RIO is specifically designed to incorporate diverse real-world scenarios and a wide range of object categories. It offers the following key features: 1) intention descriptions in RIO are represented as natural sentences rather than a mere word or verb phrase, making them more practical and meaningful; 2) the intention descriptions are contextually relevant to the scene, enabling a broader range of potential functionalities associated with the objects; 3) the dataset comprises a total of 40,214 images and 130,585 intention-object pairs. With the proposed RIO, we evaluate the ability of some existing models to reason intention-oriented objects in open environments.

NeurIPS Conference 2022 Conference Paper

Enabling Detailed Action Recognition Evaluation Through Video Dataset Augmentation

  • Jihoon Chung
  • Yu Wu
  • Olga Russakovsky

It is well-known in the video understanding community that human action recognition models suffer from background bias, i.e., over-relying on scene cues in making their predictions. However, it is difficult to quantify this effect using existing evaluation frameworks. We introduce the Human-centric Analysis Toolkit (HAT), which enables evaluation of learned background bias without the need for new manual video annotation. It does so by automatically generating synthetically manipulated videos and leveraging the recent advances in image segmentation and video inpainting. Using HAT we perform an extensive analysis of 74 action recognition models trained on the Kinetics dataset. We confirm that all these models focus more on the scene background than on the human motion; further, we demonstrate that certain model design decisions (such as training with fewer frames per video or using dense as opposed to uniform temporal sampling) appear to worsen the background bias. We open-source HAT to enable the community to design more robust and generalizable human action recognition models.

NeurIPS Conference 2022 Conference Paper

Two-Stream Network for Sign Language Recognition and Translation

  • Yutong Chen
  • Ronglai Zuo
  • Fangyun Wei
  • Yu Wu
  • Shujie Liu
  • Brian Mak

Sign languages are visual languages using manual articulations and non-manual elements to convey information. For sign language recognition and translation, the majority of existing approaches directly encode RGB videos into hidden representations. RGB videos, however, are raw signals with substantial visual redundancy, leading the encoder to overlook the key information for sign language understanding. To mitigate this problem and better incorporate domain knowledge, such as handshape and body movement, we introduce a dual visual encoder containing two separate streams to model both the raw videos and the keypoint sequences generated by an off-the-shelf keypoint estimator. To make the two streams interact with each other, we explore a variety of techniques, including bidirectional lateral connection, sign pyramid network with auxiliary supervision, and frame-level self-distillation. The resulting model is called TwoStream-SLR, which is competent for sign language recognition (SLR). TwoStream-SLR is extended to a sign language translation (SLT) model, TwoStream-SLT, by simply attaching an extra translation network. Experimentally, our TwoStream-SLR and TwoStream-SLT achieve state-of-the-art performance on SLR and SLT tasks across a series of datasets including Phoenix-2014, Phoenix-2014T, and CSL-Daily.

AAAI Conference 2020 Conference Paper

A Dataset for Low-Resource Stylized Sequence-to-Sequence Generation

  • Yu Wu
  • Yunli Wang
  • Shujie Liu

Low-resource stylized sequence-to-sequence (S2S) generation is in high demand. However, its development is hindered by datasets with limitations in scale and in automatic evaluation methods. We construct two large-scale, multiple-reference datasets for low-resource stylized S2S: the Machine Translation Formality Corpus (MTFC), which is easy to evaluate, and the Twitter Conversation Formality Corpus (TCFC), which tackles an important problem in chatbots. These datasets contain context-to-source-style parallel data, source-style-to-target parallel data, and non-parallel sentences in the target style to enable semi-supervised learning. We provide three baselines: the pivot-based method, the teacher-student method, and the back-translation method. We find that the pivot-based method performs worst, while the other two methods achieve the best scores on different metrics.

AAAI Conference 2020 Conference Paper

Bridging the Gap between Pre-Training and Fine-Tuning for End-to-End Speech Translation

  • Chengyi Wang
  • Yu Wu
  • Shujie Liu
  • Zhenglu Yang
  • Ming Zhou

End-to-end speech translation, a hot topic in recent years, aims to translate a segment of audio into a specific language with an end-to-end model. Conventional approaches employ multi-task learning and pre-training methods for this task, but they suffer from the huge gap between pre-training and fine-tuning. To address these issues, we propose a Tandem Connectionist Encoding Network (TCEN) which bridges the gap by reusing all subnets in fine-tuning, keeping the roles of subnets consistent, and pre-training the attention module. Furthermore, we propose two simple but effective methods to guarantee the speech encoder outputs and the MT encoder inputs are consistent in terms of semantic representation and sequence length. Experimental results show that our model leads to significant improvements in En-De and En-Fr translation irrespective of the backbones.

AAAI Conference 2020 Conference Paper

RobuTrans: A Robust Transformer-Based Text-to-Speech Model

  • Naihan Li
  • Yanqing Liu
  • Yu Wu
  • Shujie Liu
  • Sheng Zhao
  • Ming Liu

Recently, neural network based speech synthesis has achieved outstanding results, by which the synthesized audios are of excellent quality and naturalness. However, current neural TTS models suffer from the robustness issue, which results in abnormal audios (bad cases) especially for unusual text (unseen context). To build a neural model which can synthesize both natural and stable audios, in this paper, we make a deep analysis of why the previous neural TTS models are not robust, based on which we propose RobuTrans (Robust Transformer), a robust neural TTS model based on Transformer. Compared to TransformerTTS, our model first converts input texts to linguistic features, including phonemic features and prosodic features, and then feeds them to the encoder. In the decoder, the encoder-decoder attention is replaced with a duration-based hard attention mechanism, and the causal self-attention is replaced with a "pseudo non-causal attention" mechanism to model the holistic information of the input. Besides, the position embedding is replaced with a 1-D CNN, since it constrains the maximum length of synthesized audio. With these modifications, our model not only fixes the robustness problem, but also achieves an on-par MOS (4.36) with TransformerTTS (4.37) and Tacotron2 (4.37) on our general set.

AAAI Conference 2020 Conference Paper

Symbiotic Attention with Privileged Information for Egocentric Action Recognition

  • Xiaohan Wang
  • Yu Wu
  • Linchao Zhu
  • Yi Yang

Egocentric video recognition is a natural testbed for diverse interaction reasoning. Due to the large action vocabulary in egocentric video datasets, recent studies usually utilize a two-branch structure for action recognition, i.e., one branch for verb classification and the other branch for noun classification. However, the correlation between the verb and noun branches has been largely ignored. Besides, the two branches fail to exploit local features due to the absence of a position-aware attention mechanism. In this paper, we propose a novel Symbiotic Attention framework leveraging Privileged information (SAP) for egocentric video recognition. Finer position-aware object detection features can facilitate the understanding of the actor's interaction with the object. We introduce these features in action recognition and regard them as privileged information. Our framework enables mutual communication among the verb branch, the noun branch, and the privileged information. This communication process not only injects local details into global features, but also exploits implicit guidance about the spatio-temporal position of an on-going action. We introduce a novel symbiotic attention (SA) to enable effective communication. It first normalizes the detection-guided features on one branch to underline the action-relevant information from the other branch. SA adaptively enhances the interactions among the three sources. To further catalyze this communication, spatial relations are uncovered for the selection of the most action-relevant information. It identifies the most valuable and discriminative feature for classification. We validate the effectiveness of our SAP quantitatively and qualitatively. Notably, it achieves the state-of-the-art on two large-scale egocentric video datasets.

AAAI Conference 2019 Conference Paper

Dictionary-Guided Editing Networks for Paraphrase Generation

  • Shaohan Huang
  • Yu Wu
  • Furu Wei
  • Zhongzhi Luan

An intuitive way for a human to write paraphrase sentences is to replace words or phrases in the original sentence with their corresponding synonyms and make necessary changes to ensure the new sentences are fluent and grammatically correct. We propose a novel approach to modeling this process with dictionary-guided editing networks, which effectively conduct rewriting on the source sentence to generate paraphrase sentences. It jointly learns the selection of appropriate word-level and phrase-level paraphrase pairs in the context of the original sentence from an off-the-shelf dictionary as well as the generation of fluent natural language sentences. Specifically, the system retrieves a set of word-level and phrase-level paraphrase pairs derived from the Paraphrase Database (PPDB) for the original sentence, which is used to guide the decision of which words might be deleted or inserted with the soft attention mechanism under the sequence-to-sequence framework. We conduct experiments on two benchmark datasets for paraphrase generation, namely the MSCOCO and Quora datasets. The automatic evaluation results demonstrate that our dictionary-guided editing networks outperform the baseline methods. Human evaluation results indicate that the generated paraphrases are grammatically correct and relevant to the input sentence.

AAAI Conference 2019 Conference Paper

Response Generation by Context-Aware Prototype Editing

  • Yu Wu
  • Furu Wei
  • Shaohan Huang
  • Yunli Wang
  • Zhoujun Li
  • Ming Zhou

Open domain response generation has achieved remarkable progress in recent years, but sometimes yields short and uninformative responses. We propose a new paradigm, prototype-then-edit, for response generation, which first retrieves a prototype response from a pre-defined index and then edits the prototype response according to the differences between the prototype context and the current context. Our motivation is that the retrieved prototype provides a good starting point for generation because it is grammatical and informative, and the post-editing process further improves the relevance and coherence of the prototype. In practice, we design a context-aware editing model that is built upon an encoder-decoder framework augmented with an editing vector. We first generate an edit vector by considering lexical differences between a prototype context and the current context. After that, the edit vector and the prototype response representation are fed to a decoder to generate a new response. Experimental results on a large-scale dataset demonstrate that our new paradigm significantly increases the relevance, diversity and originality of generation results, compared to traditional generative models. Furthermore, our model outperforms retrieval-based methods in terms of relevance and originality.
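A toy rendering of the edit-vector construction described above: embed the words inserted into and deleted from the prototype context and combine them. This is an illustrative reading only; the paper's edit vector is built with attention inside the encoder-decoder, and `embed` is a hypothetical word-embedding lookup.

```python
import numpy as np

def edit_vector(proto_ctx: list, cur_ctx: list, embed: dict, dim: int = 300):
    """Toy edit vector: average embeddings of words inserted into the current
    context and of words deleted from the prototype context, then concatenate.
    Not the authors' exact attention-based formulation."""
    inserted = [w for w in cur_ctx if w not in proto_ctx]
    deleted = [w for w in proto_ctx if w not in cur_ctx]

    def avg(words):
        vecs = [embed[w] for w in words if w in embed]
        return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

    return np.concatenate([avg(inserted), avg(deleted)])
```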

AAAI Conference 2018 Conference Paper

Hierarchical Recurrent Attention Network for Response Generation

  • Chen Xing
  • Yu Wu
  • Wei Wu
  • Yalou Huang
  • Ming Zhou

We study multi-turn response generation in chatbots where a response is generated according to a conversation context. Existing work has modeled the hierarchy of the context, but does not pay enough attention to the fact that words and utterances in the context are differentially important. As a result, they may lose important information in context and generate irrelevant responses. We propose a hierarchical recurrent attention network (HRAN) to model both the hierarchy and the importance variance in a unified framework. In HRAN, a hierarchical attention mechanism attends to important parts within and among utterances with word level attention and utterance level attention respectively. Empirical studies on both automatic evaluation and human judgment show that HRAN can significantly outperform state-of-the-art models for context based response generation.

AAAI Conference 2018 Conference Paper

Knowledge Enhanced Hybrid Neural Network for Text Matching

  • Yu Wu
  • Wei Wu
  • Can Xu
  • Zhoujun Li

Long text brings a big challenge to neural network based text matching approaches due to their complicated structures. To tackle the challenge, we propose a knowledge enhanced hybrid neural network (KEHNN) that leverages prior knowledge to identify useful information and filter out noise in long text and performs matching from multiple perspectives. The model fuses prior knowledge into word representations by knowledge gates and establishes three matching channels with words, sequential structures of text given by Gated Recurrent Units (GRUs), and knowledge enhanced representations. The three channels are processed by a convolutional neural network to generate high level features for matching, and the features are synthesized as a matching score by a multilayer perceptron. In this paper, we focus on exploring the use of taxonomy knowledge for text matching. Evaluation results from extensive experiments on public data sets of question answering and conversation show that KEHNN can significantly outperform state-of-the-art matching models and particularly improve matching accuracy on pairs with long text.
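The knowledge-gate fusion described above can be sketched in one plausible form: a sigmoid gate computed from the word and knowledge vectors blends knowledge into the word representation. This is a schematic reading of the abstract, not KEHNN's exact parameterization; `W` and `b` are hypothetical gate parameters.

```python
import torch

def knowledge_gate(word_emb: torch.Tensor, knowledge_emb: torch.Tensor,
                   W: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Blend prior knowledge into a word representation with an element-wise
    gate. word_emb, knowledge_emb: (..., d); W: (2d, d); b: (d,).
    One plausible form only, shown for illustration."""
    g = torch.sigmoid(torch.cat([word_emb, knowledge_emb], dim=-1) @ W + b)
    return g * word_emb + (1 - g) * knowledge_emb
```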

AAAI Conference 2018 Conference Paper

Neural Response Generation With Dynamic Vocabularies

  • Yu Wu
  • Wei Wu
  • Dejian Yang
  • Can Xu
  • Zhoujun Li

We study response generation for open domain conversation in chatbots. Existing methods assume that words in responses are generated from an identical vocabulary regardless of their inputs, which not only makes them vulnerable to generic patterns and irrelevant noise, but also causes a high cost in decoding. We propose a dynamic vocabulary sequence-to-sequence (DVS2S) model which allows each input to possess its own vocabulary in decoding. In training, vocabulary construction and response generation are jointly learned by maximizing a lower bound of the true objective with a Monte Carlo sampling method. In inference, the model dynamically allocates a small vocabulary for an input with the word prediction model, and conducts decoding only with the small vocabulary. Because of the dynamic vocabulary mechanism, DVS2S eludes many generic patterns and irrelevant words in generation, and enjoys efficient decoding at the same time. Experimental results on both automatic metrics and human annotations show that DVS2S can significantly outperform state-of-the-art methods in terms of response quality, while requiring only 60% of the decoding time of the most efficient baseline.
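The inference-time effect of a dynamic vocabulary can be sketched as masking the decoder's logits down to an input-specific word set. This sketch shows only that behaviour; in DVS2S the small vocabulary itself comes from a learned word-prediction model, and efficiency comes from computing only the allowed logits rather than masking a full softmax.

```python
import torch

def dynamic_vocab_logits(logits: torch.Tensor, allowed_ids: torch.Tensor) -> torch.Tensor:
    """Restrict decoding to a small, input-specific vocabulary by masking all
    other logits to -inf, so softmax/argmax only considers the allowed words.
    Illustrative sketch of the inference-time behaviour described above."""
    mask = torch.full_like(logits, float("-inf"))
    mask[..., allowed_ids] = 0.0
    return logits + mask
```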

AAAI Conference 2017 Conference Paper

Topic Aware Neural Response Generation

  • Chen Xing
  • Wei Wu
  • Yu Wu
  • Jie Liu
  • Yalou Huang
  • Ming Zhou
  • Wei-Ying Ma

We consider incorporating topic information into a sequence-to-sequence framework to generate informative and interesting responses for chatbots. To this end, we propose a topic aware sequence-to-sequence (TA-Seq2Seq) model. The model utilizes topics to simulate prior human knowledge that guides people to form informative and interesting responses in conversation, and leverages topic information in generation by a joint attention mechanism and a biased generation probability. The joint attention mechanism summarizes the hidden vectors of an input message as context vectors by message attention and synthesizes topic vectors by topic attention from the topic words of the message obtained from a pre-trained LDA model, with these vectors jointly affecting the generation of words in decoding. To increase the possibility of topic words appearing in responses, the model modifies the generation probability of topic words by adding an extra probability item to bias the overall distribution. Empirical studies on both automatic evaluation metrics and human annotations show that TA-Seq2Seq can generate more informative and interesting responses, significantly outperforming state-of-the-art response generation models.
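The biased generation probability, adding an extra probability item for topic words and renormalizing, can be sketched as follows. This is a schematic version of the idea in the abstract; the tensor shapes and the way the extra mass is computed are assumptions.

```python
import torch

def biased_topic_distribution(gen_probs: torch.Tensor, topic_probs: torch.Tensor,
                              topic_ids: torch.Tensor) -> torch.Tensor:
    """Add extra probability mass to topic words and renormalize.
    gen_probs:   (V,) standard decoder distribution over the vocabulary
    topic_probs: (K,) extra mass assigned to the K topic words of the message
    topic_ids:   (K,) vocabulary indices of those topic words
    Schematic rendering of TA-Seq2Seq's biased generation probability."""
    out = gen_probs.clone()
    out[topic_ids] = out[topic_ids] + topic_probs
    return out / out.sum()
```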

AAAI Conference 2016 Conference Paper

Improving Recommendation of Tail Tags for Questions in Community Question Answering

  • Yu Wu
  • Wei Wu
  • Zhoujun Li
  • Ming Zhou

We study tag recommendation for questions in community question answering (CQA). Tags, which represent the semantic summarization of questions, are useful for navigation and expert finding in CQA and can facilitate content consumption such as searching and mining on these web sites. The task is challenging, as both questions and tags are short and a large fraction of tags are tail tags that occur very infrequently. To solve these problems, we propose matching questions and tags not only by themselves, but also by similar questions and similar tags. The idea is then formalized as a model in which we calculate question-tag similarity using a linear combination of similarity with similar questions and tags, weighted by tag importance. Question similarity, tag similarity, and tag importance are learned in a supervised random walk framework by fusing multiple features. Our model thus can not only accurately identify question-tag similarity for head tags, but also improve the accuracy of recommendation of tail tags. Experimental results show that the proposed method significantly outperforms state-of-the-art methods on tag recommendation for questions. Particularly, it improves tail tag recommendation accuracy by a large margin.

IJCAI Conference 2015 Conference Paper

Inapproximability of Treewidth and Related Problems (Extended Abstract)

  • Yu Wu
  • Per Austrin
  • Toniann Pitassi
  • David Liu

Graphical models, such as Bayesian networks and Markov networks, play an important role in artificial intelligence and machine learning. Inference is a central problem to be solved on these networks. This, and other problems on these graph models, are often known to be hard to solve in general, but tractable on graphs with bounded treewidth. Therefore, finding or approximating the treewidth of a graph is a fundamental problem related to inference in graphical models. In this paper, we study the approximability of a number of graph problems: Treewidth and Pathwidth of graphs, Minimum Fill-In, and a variety of different graph layout problems such as Minimum Cut Linear Arrangement. We show that, assuming the Small Set Expansion Conjecture, all of these problems are NP-hard to approximate to within any constant factor in polynomial time. This paper is an extended abstract of the Journal of Artificial Intelligence Research article [Wu et al., 2014].

AAAI Conference 2015 Conference Paper

Mining Query Subtopics from Questions in Community Question Answering

  • Yu Wu
  • Wei Wu
  • Zhoujun Li
  • Ming Zhou

This paper proposes mining query subtopics from questions in community question answering (CQA). The subtopics are represented as a number of clusters of questions with keywords summarizing the clusters. The task is unique in that the subtopics from questions can not only facilitate user browsing in CQA search, but also describe aspects of queries from a question-answering perspective. The challenges of the task include how to group semantically similar questions and how to find keywords capable of summarizing the clusters. We formulate the subtopic mining task as a non-negative matrix factorization (NMF) problem and further extend the model of NMF to incorporate question similarity estimated from metadata of CQA into learning. Compared with existing methods, our method can jointly optimize question clustering and keyword extraction and encourage the former task to enhance the latter. Experimental results on large scale real world CQA datasets show that the proposed method significantly outperforms the existing methods in terms of keyword extraction, while achieving a comparable performance to the state-of-the-art methods for question clustering.
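A plain NMF baseline for the clustering-plus-keyword step looks like the sketch below: factorize the question-term matrix, assign each question to its strongest factor, and read keywords off the factor-term weights. The paper's model additionally incorporates question similarity from CQA metadata into the factorization, which this sketch omits.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

def mine_subtopics(questions, n_subtopics=5, n_keywords=5):
    """Baseline subtopic mining with vanilla NMF (not the paper's extended model).
    Returns a cluster id per question and the top keywords per subtopic."""
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(questions)                   # (num_questions, num_terms)
    model = NMF(n_components=n_subtopics, init="nndsvda", random_state=0)
    W = model.fit_transform(X)                         # question-subtopic weights
    H = model.components_                              # subtopic-term weights
    terms = np.array(vec.get_feature_names_out())
    clusters = W.argmax(axis=1)
    keywords = [terms[np.argsort(h)[::-1][:n_keywords]].tolist() for h in H]
    return clusters, keywords
```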

TIST Journal 2013 Journal Article

A machine learning approach to college drinking prediction and risk factor identification

  • Jinbo Bi
  • Jiangwen Sun
  • Yu Wu
  • Howard Tennen
  • Stephen Armeli

Alcohol misuse is one of the most serious public health problems facing adolescents and young adults in the United States. National statistics show that nearly 90% of alcohol consumed by youth under 21 years of age involves binge drinking and 44% of college students engage in high-risk drinking activities. Conventional alcohol intervention programs, which aim at instilling either an alcohol reduction norm or a prohibition against underage drinking, have yielded little progress in controlling college binge drinking over the years. Existing alcohol studies are deductive: data are collected to investigate a psychological/behavioral hypothesis, and statistical analysis is applied to the data to confirm the hypothesis. Due to this confirmatory manner of analysis, the resulting statistical models are cohort-specific and typically fail to replicate on a different sample. This article presents two machine learning approaches for a secondary analysis of longitudinal data collected in college alcohol studies sponsored by the National Institute on Alcohol Abuse and Alcoholism. Our approach aims to discover knowledge, from multiwave cohort-sequential daily data, which may or may not align with the original hypothesis but quantifies predictive models with higher likelihood to generalize to new samples. We first propose a so-called temporally-correlated support vector machine to construct a classifier as a function of daily moods, stress, and drinking expectancies to distinguish days with nighttime binge drinking from days without for individual students. We then propose a combination of cluster analysis and feature selection, where cluster analysis is used to identify drinking patterns based on averaged daily drinking behavior and feature selection is used to identify risk factors associated with each pattern. We evaluate our methods on two cohorts of 530 total college students recruited during the Spring and Fall semesters, respectively. Cross validation on these two cohorts and further on 100 random partitions of the total students demonstrates that our methods improve model generalizability in comparison with traditional multilevel logistic regression. The discovered risk factors and the interaction of these factors delineated in our models can set a potential basis and offer insights for a new design of more effective college alcohol interventions.

ICML Conference 2013 Conference Paper

Learning Fair Representations

  • Richard S. Zemel
  • Yu Wu
  • Kevin Swersky
  • Toniann Pitassi
  • Cynthia Dwork

We propose a learning algorithm for fair classification that achieves both group fairness (the proportion of members in a protected group receiving positive classification is identical to the proportion in the population as a whole) and individual fairness (similar individuals should be treated similarly). We formulate fairness as an optimization problem of finding a good representation of the data with two competing goals: to encode the data as well as possible, while simultaneously obfuscating any information about membership in the protected group. We show positive results of our algorithm relative to other known techniques on three datasets. Moreover, we demonstrate several advantages of our approach. First, our intermediate representation can be used for other classification tasks (i.e., transfer learning is possible); second, we take a step toward learning a distance metric which can find important dimensions of the data for classification.
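Schematically, the optimization described above balances three competing terms in one weighted objective. The notation below is ours, following the structure implied by the abstract rather than the paper's exact equations: L_x penalizes poor reconstruction of the data from the representation, L_y penalizes classification error on the label, L_z penalizes differences between how the protected and unprotected groups are mapped to the representation, and the A coefficients are trade-off weights.

```latex
% Schematic combined objective for learning fair representations
% (notation ours; see the paper for the exact definition of each term).
L \;=\; A_z \, L_z \;+\; A_x \, L_x \;+\; A_y \, L_y
```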