Arrow Research search

Author name cluster

Yue Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

60 papers
2 author rows

Possible papers

60

AAAI Conference 2026 Conference Paper

Adaptive Graph Attention Based Discrete Hashing for Incomplete Cross-modal Retrieval

  • Shuang Zhang
  • Yue Wu
  • Lei Shi
  • Huilong Jin
  • Feifei Kou
  • Pengfei Zhang
  • Mingying Xu
  • Pengtao Lv

Cross-modal hashing has emerged as a pivotal solution for efficient retrieval across diverse modalities, such as images and texts, by mapping them into compact binary hash spaces. However, in real-world scenarios, modality data are often missing or misaligned. Most existing methods rely on fully paired training data and ignore missing or misaligned modality data, resulting in semantic inconsistencies. To address these challenges, we propose an Adaptive Graph Attention-Based Discrete Hashing (AGADH) method, which consists of three parts. First, to solve the problem of missing modalities, AGADH employs a masked completion strategy to reconstruct missing modalities. Second, to mitigate semantic misalignment, AGADH leverages a Graph Attention Network (GAT) encoder-decoder architecture with an alignment module to construct features from different modalities. Additionally, to enhance fusion performance, an adaptive fusion module is proposed that dynamically adjusts the contributions of the image and text modalities with learnable weighting coefficients. Extensive experiments on three benchmark datasets, MS-COCO, NUS-WIDE, and MIRFlickr-25K, demonstrate that AGADH outperforms state-of-the-art methods in both fully paired and incompletely paired scenarios, showing its robustness and effectiveness in cross-modal retrieval tasks.

AAAI Conference 2026 Conference Paper

DcSplat: Dual-Constraint Human Gaussian Splatting with Latent Multi-View Consistency

  • Tengfei Xiao
  • Yue Wu
  • Zhigang Gao
  • Yongzhe Yuan
  • Can Qin
  • Hao Li
  • Mingyang Zhang

Human Novel View Synthesis (HNVS) aims to synthesize photorealistic human images from novel viewpoints given observations from known views. Despite significant advances achieved by existing methods such as NeRF, diffusion models, and 3DGS, they still face substantial challenges in achieving stable modeling from a single image. In this paper, we introduce Dual-Constraint Human Gaussian Splatting (DcSplat), a novel, simple, and efficient 3D Gaussian-based framework for single-view 3D human reconstruction. To address occlusion-induced texture missing and depth ambiguities, we introduce two key components: a Latent Multi-View Consistency Constraint Mechanism and a Geometric Constraint Module. The former employs a Latent-space Appearance Transformer (LatentFormer) to learn semantically coherent, view-consistent appearance priors via SMPL-guided pseudo-view fusion. The latter refines noisy SMPL-based depth through a U-Net-like structure conditioned on latent appearance features. These two modules are jointly optimized to generate high-quality Gaussian parameters in a unified latent space. Extensive experiments demonstrate that DcSplat outperforms existing SOTA methods in both geometry and texture quality, while achieving fast inference and lower computational cost.

AAAI Conference 2026 Conference Paper

Flora: Effortless Context Construction to Arbitrary Length and Scale

  • Tianxiang Chen
  • Zhentao Tan
  • Xiaofan Bo
  • Yue Wu
  • Tao Gong
  • Qi Chu
  • Jieping Ye

Effectively handling long contexts is challenging for Large Language Models (LLMs) due to the rarity of long texts, high computational demands, and substantial forgetting of short-context abilities. Recent approaches have attempted to construct long contexts for instruction tuning, but these methods often require LLMs or human intervention, which is both costly and limited in length and diversity. Moreover, present long-context LLMs still show a significant drop in short-context performance. In this paper, we introduce Flora, an effortless (human/LLM-free) long-context construction strategy. Flora can markedly enhance the long-context performance of LLMs by arbitrarily assembling short instructions based on categories and instructing LLMs to generate responses based on long-context meta-instructions. This enables Flora to produce contexts of arbitrary length and scale with rich diversity, while only slightly compromising short-context performance. Experiments on Llama3-8B-Instruct and QwQ-32B show that LLMs enhanced by Flora excel in three long-context benchmarks while maintaining strong performance in short-context tasks.

AAAI Conference 2026 Conference Paper

Hybrid Vector-Occupancy Field for Robust Implicit 3D Surface Reconstruction

  • Yue Wu
  • Zhigang Gao
  • Tengfei Xiao
  • Can Qin
  • Yongzhe Yuan
  • Hao Li
  • Kaiyuan Feng
  • Wenping Ma

We introduce the Hybrid Vector-Occupancy Field (HVOF), a new implicit 3D representation for reconstructing both open and closed surfaces from sparse point clouds. Existing approaches face severe limitations: occupancy fields and signed distance fields struggle with open surfaces, while unsigned distance fields and neural vector fields exhibit directional instability in complex topologies and ridge regions. HVOF addresses these challenges by incorporating a smoothly decaying occupancy field around the surface, while capturing precise local geometry using truncated displacement vectors, naturally mitigating direction-field ambiguities near ridge regions. This unified design forms a robust hybrid representation that leverages both occupancy and vector fields. To realize it, we design a Hybrid Field variational autoencoder comprising a hierarchical cross-attention encoder and a dual-branch decoder that jointly learn occupancy and vector fields through continuous weighting. Extensive experiments demonstrate that HVOF consistently outperforms state-of-the-art methods across the ShapeNet, ABC, and MGN datasets, accurately reconstructing both open and closed surfaces while preserving fine geometric details in complex regions.

ICLR Conference 2025 Conference Paper

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

  • Hui Yuan 0002
  • Yifan Zeng
  • Yue Wu
  • Huazheng Wang
  • Mengdi Wang 0001
  • Liu Leqi

Reinforcement Learning from Human Feedback (RLHF) has become the predominant approach for aligning language models (LMs) to be more helpful and less harmful. At its core, RLHF uses a margin-based loss for preference optimization, which specifies the ideal LM behavior only in terms of the difference between preferred and dispreferred responses. In this paper, we identify a common pitfall of margin-based methods---the under-specification of ideal LM behavior on preferred and dispreferred responses individually, which results in two unintended consequences as the margin increases: (1) The probability of dispreferred (e.g., unsafe) responses may increase, resulting in potential safety alignment failures. (2) The probability of preferred responses may decrease, even when those responses are ideal. We demystify the reasons behind these problematic behaviors: margin-based losses couple the change in the preferred probability with the gradient of the dispreferred one, and vice versa, often preventing the preferred probability from increasing while the dispreferred one decreases, and thus causing a synchronized increase or decrease in both probabilities. We term this effect, inherent in margin-based objectives, gradient entanglement. Formally, we derive conditions for general margin-based alignment objectives under which gradient entanglement becomes concerning: the inner product between the gradient of preferred log-probability and the gradient of dispreferred log-probability is large relative to the individual gradient norms. Furthermore, we theoretically investigate why such inner products can be large when aligning language models and empirically validate our findings. Empirical implications of our framework further extend to explaining important differences in the training dynamics of various preference optimization algorithms and suggesting future directions for improvement.
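The entanglement condition described above, a large inner product between the gradients of the preferred and dispreferred log-probabilities relative to their individual norms, can be sketched numerically. The toy gradient vectors below are assumed purely for illustration; they are not from the paper:

```python
import numpy as np

# Hypothetical gradients of the preferred and dispreferred log-probabilities
# with respect to model parameters (toy 4-D vectors, chosen for illustration).
grad_preferred = np.array([1.0, 0.5, 0.2, 0.1])
grad_dispreferred = np.array([0.9, 0.6, 0.1, 0.2])

inner = float(grad_preferred @ grad_dispreferred)
norm_p = float(np.linalg.norm(grad_preferred))
norm_d = float(np.linalg.norm(grad_dispreferred))

# Normalized entanglement indicator: values near 1 mean the two gradients
# point in nearly the same direction, so a margin-based update tends to move
# both log-probabilities up or down together.
entanglement = inner / (norm_p * norm_d)
print(entanglement)
```

Here the two toy gradients are nearly parallel, so the indicator is close to 1 and the margin objective cannot raise the preferred probability while lowering the dispreferred one.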

AAAI Conference 2025 Conference Paper

AdvDisplay: Adversarial Display Assembled by Thermoelectric Cooler for Fooling Thermal Infrared Detectors

  • Hao Li
  • Fanggao Wan
  • Yue Su
  • Yue Wu
  • Mingyang Zhang
  • Maoguo Gong

When current physical adversarial patches fail to deceive thermal infrared detectors, existing techniques must mount adversarial attacks from scratch, including digital patch generation, material production, and physical deployment. Besides, it is difficult to finely regulate infrared radiation. To address these issues, this paper designs an adversarial thermal display (AdvDisplay) by assembling thermoelectric coolers (TECs) into an array. Specifically, to reduce the gap between patches in the physical and digital worlds and decrease the power consumption of the AdvDisplay device, a heat transfer loss and an electric power loss are designed to guide the patch optimization. In addition, a precise temperature control scheme for AdvDisplay is proposed based on proportional-integral-derivative (PID) control. Thanks to the accurate temperature regulation and the reusability of AdvDisplay, our method improves both the attack success rate and the efficiency of physical deployment. Extensive experimental results indicate that the proposed method possesses superior adversarial effectiveness compared to other methods and demonstrates strong robustness in physical attacks.

ICML Conference 2025 Conference Paper

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

  • Yifan Zhang
  • Ge Zhang
  • Yue Wu
  • Kangping Xu
  • Quanquan Gu

Modeling human preferences is crucial for aligning foundation models with human values. Traditional reward modeling methods, such as the Bradley-Terry (BT) reward model, fall short in expressiveness, particularly in addressing intransitive preferences. In this paper, we introduce preference embedding, an approach that embeds responses into a latent space to capture intricate preference structures efficiently, achieving linear query complexity. Additionally, we propose preference score-based General Preference Optimization (GPO), which generalizes reward-based reinforcement learning from human feedback (RLHF). Experimental results show that our General Preference embedding Model (GPM) consistently outperforms the BT reward model on the RewardBench benchmark and effectively models cyclic preferences where any BT reward model behaves like a random guess. Furthermore, evaluations on downstream tasks such as AlpacaEval 2.0, following the language model post-training with GPO and our general preference model, reveal performance improvements over BT models. These findings indicate that our method may enhance the alignment of foundation models with nuanced human values. The code is available at https://github.com/general-preference/general-preference-model.
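Why a scalar (BT-style) reward cannot model cyclic preferences, while a latent preference embedding can, is easy to see in a toy setup. The 2-D embeddings and the skew-symmetric operator R below are assumed for demonstration and are not the paper's trained model:

```python
import numpy as np

# Skew-symmetric operator: score(y1 ≻ y2) = v1ᵀ R v2, so swapping the pair
# flips the sign. A toy stand-in for a preference-embedding score.
R = np.array([[0.0, -1.0], [1.0, 0.0]])

def embed(theta_deg):
    """Place a response on the unit circle (hypothetical 2-D embedding)."""
    t = np.radians(theta_deg)
    return np.array([np.cos(t), np.sin(t)])

def score(v1, v2):
    return float(v1 @ R @ v2)

# Three responses 120° apart.
a, b, c = embed(240), embed(120), embed(0)

# Cyclic preference a ≻ b ≻ c ≻ a: no scalar reward r(·) can satisfy
# r(a) > r(b) > r(c) > r(a), but all three scores below are positive.
print(score(a, b), score(b, c), score(c, a))
```

Each score equals sin(120°) ≈ 0.866, so the cycle is represented exactly, whereas any scalar reward would have to break it.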

AAAI Conference 2025 Conference Paper

Both Supply and Precision: Sample Debias and Ranking Consistency Joint Learning for Large Scale Pre-Ranking System

  • Feng Gao
  • Xin Zhou
  • Yinning Shao
  • Yue Wu
  • Jiahua Gao
  • Yujian Ren
  • Fengyang Qi
  • Ruochen Deng

Cascade ranking architecture, composed of matching, pre-ranking, ranking and re-ranking stages, is usually adopted to balance efficiency and effectiveness in real-world recommendation systems (RS). As the middle stage of an RS, pre-ranking aims to quickly filter out the low-quality items selected at the matching stage and forward high-quality items to the ranking stage. Existing pre-ranking approaches mainly suffer from two problems: 1) the Sample Selection Bias (SSB) problem, which heavily limits the performance of filtering out low-quality items because the data flow between stages is ignored; and 2) the Ranking Consistency (RC) problem, which may cause the ranked lists of the ranking stage and the preceding pre-ranking stage to be inconsistent. As a result, competitive items with high scores at the ranking stage may not be selected because of low scores at the pre-ranking stage. Both problems can cause sub-optimal performance, but previous works usually focus on only one of them. In this paper, we propose a novel Sample Debias and Ranking Consistency Joint Learning Framework (SDCL) to jointly alleviate the SSB and RC problems. SDCL consists of two main modules: 1) a Multi-Task Distillation Module (MTD), which enhances the ability to identify high-quality items by distilling knowledge across all tasks simultaneously from a more complex ranking model jointly trained with the pre-ranking model; and 2) an Adaptive Negative Sample Learning Module (ANSL), which improves the performance of filtering out low-quality items by adaptively adjusting the learning weights of negative samples based on the current performance of the model. SDCL seamlessly integrates the two modules in an end-to-end multi-task learning framework. Evaluations on both real-world large-scale traffic logs and an online A/B test demonstrate the efficacy and superiority of SDCL.

NeurIPS Conference 2025 Conference Paper

DISC: Dynamic Decomposition Improves LLM Inference Scaling

  • Jonathan Li
  • Wei Cheng
  • Benjamin Riviere
  • Yue Wu
  • Masafumi Oyamada
  • Mengdi Wang
  • Yisong Yue
  • Santiago Paternain

Inference scaling methods for LLMs often rely on decomposing problems into steps (or groups of tokens), followed by sampling and selecting the best next steps. However, these steps and their sizes are often predetermined or manually designed based on domain knowledge. We propose dynamic decomposition, a method that adaptively and automatically partitions solution and reasoning traces into manageable steps during inference. By more effectively allocating compute -- particularly through subdividing challenging steps and prioritizing their sampling -- dynamic decomposition significantly improves inference efficiency. Experiments on benchmarks such as APPS, MATH, and LiveCodeBench demonstrate that dynamic decomposition outperforms static approaches, including token-level, sentence-level, and single-step decompositions, reducing the pass@10 error rate by 5.0%, 6.7%, and 10.5% respectively. These findings highlight the potential of dynamic decomposition to improve a wide range of inference scaling techniques.
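The core idea, subdividing only the steps that look hard so sampling budget concentrates there, can be sketched with a recursive splitter. The `decompose` function, the token-based difficulty proxy, and the threshold are all assumed for illustration and are not the paper's implementation:

```python
# Toy sketch of dynamic decomposition: recursively split a reasoning trace,
# subdividing only spans whose estimated difficulty exceeds a threshold,
# so that hard spans end up as small, individually sampled steps.
def decompose(tokens, difficulty, threshold, min_len=2):
    """Return a list of steps (token spans); hard spans are split in half."""
    if len(tokens) <= min_len or difficulty(tokens) <= threshold:
        return [tokens]
    mid = len(tokens) // 2
    return (decompose(tokens[:mid], difficulty, threshold, min_len)
            + decompose(tokens[mid:], difficulty, threshold, min_len))

# Hypothetical difficulty proxy: fraction of "hard" tokens in the span.
hard = {"solve", "prove"}
diff = lambda span: sum(t in hard for t in span) / len(span)

trace = ["read", "plan", "solve", "prove", "check", "done"]
steps = decompose(trace, diff, threshold=0.25)
print(steps)
```

In a real system the difficulty estimate would come from model signals (e.g. sampled success rates) rather than a keyword set, but the allocation pattern, finer steps around hard spans, is the same.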

NeurIPS Conference 2025 Conference Paper

Dynamic Masking and Auxiliary Hash Learning for Enhanced Cross-Modal Retrieval

  • Shuang Zhang
  • Yue Wu
  • Lei Shi
  • Yingxue Zhang
  • Feifei Kou
  • Huilong Jin
  • Pengfei Zhang
  • Meiyu Liang

The demand for multimodal data processing drives the development of information technology. Cross-modal hash retrieval has attracted much attention because it overcomes modal differences and achieves efficient retrieval, showing great application potential in many practical scenarios. Existing cross-modal hashing methods have difficulty fully capturing the semantic information of different modal data, which leads to a significant semantic gap between modalities. Moreover, these methods often ignore differences in channel importance, and the limitation of a single objective also degrades the matching between hash codes to a certain extent. To address these issues, we propose a Dynamic Masking and Auxiliary Hash Learning (AHLR) method for enhanced cross-modal retrieval. By jointly leveraging the dynamic masking and auxiliary hash learning mechanisms, our approach effectively resolves the problems of channel information imbalance and insufficient key information capture, thereby significantly improving retrieval accuracy. Specifically, we introduce a dynamic masking mechanism that automatically screens and weights the key information in images and texts during training, enhancing the accuracy of feature matching. We further construct an auxiliary hash layer to adaptively balance the weights of features across each channel, compensating for the deficiencies of traditional methods in key information capture and channel processing. In addition, we design a contrastive loss function to optimize the generation of hash codes and enhance their discriminative power, further improving the performance of cross-modal retrieval. Comprehensive experimental results on the NUS-WIDE, MIRFlickr-25K and MS-COCO benchmark datasets show that the proposed AHLR algorithm outperforms several existing algorithms.

NeurIPS Conference 2025 Conference Paper

FedQS: Optimizing Gradient and Model Aggregation for Semi-Asynchronous Federated Learning

  • Yunbo Li
  • Jiaping Gui
  • Zhihang Deng
  • Fanchao Meng
  • Yue Wu

Federated learning (FL) enables collaborative model training across multiple parties without sharing raw data, with semi-asynchronous FL (SAFL) emerging as a balanced approach between synchronous and asynchronous FL. However, SAFL faces significant challenges in optimizing both gradient-based (e.g., FedSGD) and model-based (e.g., FedAvg) aggregation strategies, which exhibit distinct trade-offs in accuracy, convergence speed, and stability. While gradient aggregation achieves faster convergence and higher accuracy, it suffers from pronounced fluctuations, whereas model aggregation offers greater stability but slower convergence and suboptimal accuracy. This paper presents FedQS, the first framework to theoretically analyze and address these disparities in SAFL. FedQS introduces a divide-and-conquer strategy to handle client heterogeneity by classifying clients into four distinct types and adaptively optimizing their local training based on data distribution characteristics and available computational resources. Extensive experiments on computer vision, natural language processing, and real-world tasks demonstrate that FedQS achieves the highest accuracy, attains the lowest loss, and ranks among the fastest in convergence speed, outperforming state-of-the-art baselines. Our work bridges the gap between aggregation strategies in SAFL, offering a unified solution for stable, accurate, and efficient federated learning. The code and datasets are available at https://github.com/bkjod/FedQS_.
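The contrast between the two aggregation styles can be shown in one toy round. The weights, gradients, and step counts below are assumed for illustration (and with a single local step per client the two strategies would coincide):

```python
import numpy as np

# Toy server round contrasting gradient vs. model aggregation
# (illustrative values only; not the FedQS algorithm itself).
w_global = np.array([1.0, -2.0])
lr = 0.1
client_grads = [np.array([0.2, 0.4]), np.array([0.6, 0.0])]

# Gradient aggregation (FedSGD-style): average client gradients,
# then take one server step.
w_fedsgd = w_global - lr * np.mean(client_grads, axis=0)

# Model aggregation (FedAvg-style): each client runs several local steps
# (constant local gradient here for simplicity), then the server
# averages the resulting client models.
local_steps = 2
client_models = [w_global - lr * local_steps * g for g in client_grads]
w_fedavg = np.mean(client_models, axis=0)

print(w_fedsgd, w_fedavg)  # the two strategies land on different models
```

With multiple local steps the averaged models drift away from the single-step gradient update, which is one source of the accuracy/stability trade-off the abstract describes.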

AAAI Conference 2025 Conference Paper

Infer Human’s Intentions Before Following Natural Language Instructions

  • Yanming Wan
  • Yue Wu
  • Yiping Wang
  • Jiayuan Mao
  • Natasha Jaques

For AI agents to be helpful to humans, they should be able to follow natural language instructions to complete everyday cooperative tasks in human environments. However, real human instructions inherently possess ambiguity, because the human speakers assume sufficient prior knowledge about their hidden goals and intentions. Standard language grounding and planning methods fail to address such ambiguities because they do not model human internal goals as additional partially observable factors in the environment. We propose a new framework, Follow Instructions with Social and Embodied Reasoning (FISER), aiming for better natural language instruction following in collaborative embodied tasks. Our framework makes explicit inferences about human goals and intentions as intermediate reasoning steps. We implement a set of Transformer-based models and evaluate them over a challenging benchmark, HandMeThat. We empirically demonstrate that using social reasoning to explicitly infer human intentions before making action plans surpasses purely end-to-end approaches. We also compare our implementation with strong baselines, including Chain of Thought prompting on the largest available pre-trained language models, and find that FISER provides better performance on the embodied social reasoning tasks under investigation, reaching the state-of-the-art on HandMeThat.

NeurIPS Conference 2025 Conference Paper

Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering

  • Yuyang Hong
  • Jiaqi Gu
  • Yang Qi
  • Lubin Fan
  • Yue Wu
  • Ying Wang
  • Kun Ding
  • Shiming Xiang

The task of Knowledge-Based Visual Question Answering (KB-VQA) requires the model to understand visual features and retrieve external knowledge. Retrieval-Augmented Generation (RAG) has been employed to address this problem through knowledge base querying. However, existing work demonstrates two limitations: insufficient interactivity during knowledge retrieval and ineffective organization of retrieved information for the Vision-Language Model (VLM). To address these challenges, we propose a three-stage vision-language model with a Process, Retrieve and Filter (VLM-PRF) framework. For interactive retrieval, VLM-PRF uses reinforcement learning (RL) to guide the model to strategically process information via tool-driven operations. For knowledge filtering, our method trains the VLM to transform the raw retrieved information into task-specific knowledge. With a dual reward as the supervisory signal, VLM-PRF enables the model to optimize retrieval strategies and answer generation capabilities simultaneously. Experiments on two datasets demonstrate the effectiveness of our framework.

ICLR Conference 2025 Conference Paper

Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval

  • Yue Wu
  • Zhaobo Qi
  • Yiling Wu
  • Junshu Sun
  • Yaowei Wang 0001
  • Shuhui Wang

With the explosive growth of video data, finding videos that meet detailed requirements in large datasets has become a challenge. To address this, the composed video retrieval task has been introduced, enabling users to retrieve videos using complex queries that involve both visual and textual information. However, the inherent heterogeneity between the modalities poses significant challenges. Textual data are highly abstract, while video content contains substantial redundancy. The modality gap in information representation makes existing methods struggle with the modality fusion and alignment required for fine-grained composed retrieval. To overcome these challenges, we first introduce FineCVR-1M, a fine-grained composed video retrieval dataset containing 1,010,071 video-text triplets with detailed textual descriptions. This dataset is constructed through an automated process that identifies key concept changes between video pairs to generate textual descriptions for both static and action concepts. For fine-grained retrieval methods, the key challenge lies in understanding the detailed requirements. Text descriptions serve as clear expressions of intent, but they require models to distinguish subtle differences in the description of video semantics. Therefore, we propose a textual Feature Disentanglement and Cross-modal Alignment framework (FDCA) that disentangles features at both the sentence and token levels. At the sequence level, we separate text features into retained and injected features. At the token level, an Auxiliary Token Disentangling mechanism is proposed to disentangle texts into retained, injected, and excluded tokens. The disentanglement at both levels extracts fine-grained features, which are aligned and fused with the reference video to extract global representations for video retrieval. Experiments on the FineCVR-1M dataset demonstrate the superior performance of FDCA. Our code and dataset are available at: https://may2333.github.io/FineCVR/.

ICML Conference 2025 Conference Paper

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations

  • Kaixuan Huang
  • Jiacheng Guo
  • Zihao Li
  • Xiang Ji
  • Jiawei Ge 0003
  • Wenzhe Li
  • Yingqing Guo
  • Tianle Cai

Large language models have demonstrated impressive performance on challenging mathematical reasoning tasks, which has triggered the discussion of whether the performance is achieved by true reasoning capability or memorization. To investigate this question, prior work has constructed mathematical benchmarks when questions undergo simple perturbations – modifications that still preserve the underlying reasoning patterns of the solutions. However, no work has explored hard perturbations, which fundamentally change the nature of the problem so that the original solution steps do not apply. To bridge the gap, we construct MATH-P-Simple and MATH-P-Hard via simple perturbation and hard perturbation, respectively. Each consists of 279 perturbed math problems derived from level-5 (hardest) problems in the MATH dataset (Hendrycks et al., 2021). We observe significant performance drops on MATH-P-Hard across various models, including o1-mini (-16.49%) and gemini-2.0-flash-thinking (-12.9%). We also raise concerns about a novel form of memorization where models blindly apply learned problem-solving skills without assessing their applicability to modified contexts. This issue is amplified when using original problems for in-context learning. We call for research efforts to address this challenge, which is critical for developing more robust and reliable reasoning models. The project is available at https://math-perturb.github.io/.

AAAI Conference 2025 Conference Paper

MUCD: Unsupervised Point Cloud Change Detection via Masked Consistency

  • Yue Wu
  • Zhipeng Wang
  • Yongzhe Yuan
  • Maoguo Gong
  • Hao Li
  • Mingyang Zhang
  • Wenping Ma
  • Qiguang Miao

3D Change Detection (3DCD) has gradually become another research hotspot after image change detection. Recent works focus on using artificial labels for supervised or weakly-supervised training of siamese networks to segment changed points. However, labeling every point of multi-temporal point clouds is very expensive and time-consuming. In addition, these works lack effective self-supervised signals, and existing self-supervised signals often fail to capture sufficiently rich change information. To solve this problem, we assume that a powerful representation of 3D objects should model the consistency information of unchanged regions and distinguish different objects. Based on this assumption, we propose a new unsupervised framework called MUCD that learns change information of multi-temporal point clouds through bidirectional optimization of a change segmentor and a feature extractor. The training of the network is divided into two stages. We first design a foreknowledge point contrastive loss based on the characteristics of the 3DCD task to initialize the feature extractor, and then propose a masked consistency loss to further learn the shared geometric information of unchanged regions in the multi-temporal point clouds, utilizing it as a free and powerful supervised signal to train the change segmentor. In the inference stage, only the segmentor is used: it takes multi-temporal point clouds as input and produces the change segmentation result. Extensive experiments are conducted on SLPCCD and Urb3DCD, two real-world datasets of streets and urban buildings, verifying that our proposed unsupervised method is highly competitive and even outperforms supervised methods in scenes where semantic information changes occur, exhibiting better generalization ability and robustness.

AAAI Conference 2025 Conference Paper

Partial Point Cloud Registration with Multi-view 2D Image Learning

  • Yue Zhang
  • Yue Wu
  • Wenping Ma
  • Maoguo Gong
  • Hao Li
  • Biao Hou

Learning representations from abundant 2D image data has shown promising performance, yet very few works apply these representations to point cloud registration. In this paper, we explore how to leverage 2D information to assist point cloud registration, and propose IAPReg, an Image-Assisted Partial 3D point cloud Registration framework based on multi-view images generated from the input point cloud. It is expected to enrich 3D information with 2D knowledge, and to leverage that knowledge to assist point cloud registration. Specifically, we create multi-view depth maps by projecting the input point cloud from several specific views, and then extract 2D and 3D features using well-established models. To fuse the information learned from the 2D and 3D modalities, an inter-modality multi-view learning module is proposed to enhance geometric information and complement semantic information. Weighted SVD is a common method to reduce the impact of inaccurate correspondences on registration; however, determining the correspondence weights is not trivial. Therefore, we design a 2D-weighted SVD method, where the 2D knowledge is employed to provide weight information for correspondences. Extensive experiments show that our method outperforms the state-of-the-art methods without additional 2D training data.
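Weighted SVD itself is the classic weighted Kabsch solver: down-weighted correspondences barely influence the estimated rotation and translation. A minimal sketch follows; the synthetic points, the injected outlier, and the hand-picked weights stand in for the 2D-derived weights the paper proposes:

```python
import numpy as np

def weighted_svd(src, tgt, w):
    """Weighted Kabsch: rigid (R, t) minimizing the weighted alignment error."""
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(0)          # weighted centroids
    mu_t = (w[:, None] * tgt).sum(0)
    H = (src - mu_s).T @ np.diag(w) @ (tgt - mu_t)   # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_t - R @ mu_s
    return R, t

rng = np.random.default_rng(0)
src = rng.normal(size=(10, 3))
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 0.1])
tgt = src @ R_true.T + t_true
tgt[0] += 5.0                      # one gross outlier correspondence...
w = np.ones(10); w[0] = 1e-6       # ...suppressed by a near-zero weight
R, t = weighted_svd(src, tgt, w)
print(np.allclose(R, R_true, atol=1e-4), np.allclose(t, t_true, atol=1e-4))
```

With uniform weights the outlier would corrupt the estimate, which is exactly why choosing good weights, here from 2D knowledge, matters.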

NeurIPS Conference 2025 Conference Paper

PointTruss: K-Truss for Point Cloud Registration

  • Yue Wu
  • Jun Jiang
  • Yongzhe Yuan
  • Maoguo Gong
  • Qiguang Miao
  • Hao Li
  • Mingyang Zhang
  • Wenping Ma

Point cloud registration is a fundamental task in 3D computer vision. Recent advances have shown that graph-based methods are effective for outlier rejection in this context. However, existing clique-based methods impose overly strict constraints and are NP-hard, making it difficult to achieve both robustness and efficiency. The k-core reduces computational complexity but considers only node degree and ignores higher-order topological structures such as triangles, limiting its effectiveness in complex scenarios. To overcome these limitations, we introduce the $k$-truss from graph theory into point cloud registration, leveraging triangle support as a constraint for inlier selection. We further propose a consensus voting-based low-scale sampling strategy to efficiently extract the structural skeleton of the point cloud prior to $k$-truss decomposition. Additionally, we design a spatial distribution score that balances coverage and uniformity of inliers, preventing selections that concentrate on sparse local clusters. Extensive experiments on KITTI, 3DMatch, and 3DLoMatch demonstrate that our method consistently outperforms both traditional and learning-based approaches in various indoor and outdoor scenarios, achieving state-of-the-art results.
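The $k$-truss constraint itself is simple: every surviving edge must be supported by at least $k-2$ triangles. A minimal peeling sketch of the decomposition follows (an assumed simplification for illustration, not the paper's full pipeline with sampling and scoring):

```python
# Minimal k-truss peeling: repeatedly delete edges whose triangle support
# (number of common neighbors of the two endpoints) is below k - 2.
def k_truss(edges, k):
    edges = {tuple(sorted(e)) for e in edges}
    while True:
        adj = {}
        for u, v in edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        weak = {(u, v) for u, v in edges if len(adj[u] & adj[v]) < k - 2}
        if not weak:
            return edges
        edges -= weak

# A 4-clique on {1,2,3,4} plus a pendant edge (4,5): for k = 3 every edge
# must lie in at least one triangle, so the pendant edge is peeled away.
truss = k_truss([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)], k=3)
print(sorted(truss))
```

In the registration setting, nodes would be candidate correspondences and edges compatibility relations, so the surviving truss keeps mutually consistent inliers without the NP-hard cost of maximum-clique search.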

ICML Conference 2025 Conference Paper

Ranking with Multiple Oracles: From Weak to Strong Stochastic Transitivity

  • Tao Jin 0002
  • Yue Wu
  • Quanquan Gu
  • Farzad Farnoud

We study the problem of efficiently aggregating the preferences of items from multiple information sources (oracles) and inferring the ranking under both the weak stochastic transitivity (WST) and the strong stochastic transitivity (SST) conditions. When the underlying preference model satisfies the WST condition, we propose an algorithm named RMO-WST, which has a bi-level design: at the higher level, it actively allocates comparison budgets to all undetermined pairs until the full ranking is recovered; at the lower level, it attempts to compare the pair of items and selects the more accurate oracles simultaneously. We prove that the sample complexity of RMO-WST is $\tilde O( N\sum_{i=2}^{N}H_{\sigma^{-1}(i), {\sigma^{-1}(i-1)}} )$, where $N$ is the number of items to rank, $H$ is a problem-dependent hardness factor, and $\sigma^{-1}(i)$ represents the $i$-th best item. We also provide a tight lower bound that matches the upper bound of approximate ranking under the WST condition, answering a previously open problem. In addition, when the SST condition is satisfied, we propose an algorithm named RMO-SST, which can achieve an $\tilde{O}(\sum_{i=1}^{N} H_i \log(N))$ sample complexity. This outperforms the best-known sample complexity by a factor of $\log(N)$. The theoretical advantages of our algorithms are verified by empirical experiments in a simulated environment.

ICML Conference 2025 Conference Paper

ROPO: Robust Preference Optimization for Large Language Models

  • Xize Liang
  • Chao Chen 0026
  • Shuang Qiu
  • Jie Wang 0005
  • Yue Wu
  • Zhihang Fu
  • Hanzhu Chen
  • Feng Wu 0001

The prevalent noise in the preference data unavoidably poses significant challenges to the preference alignment of large language models (LLMs). Existing efforts for this problem either marginally alleviate the impact of noise without noise reduction, or rely on external LLMs that incur substantial computational costs. To address these challenges, we propose Robust Preference Optimization (ROPO), an iterative alignment approach that integrates noise-tolerance and noise filtering without the aid of external models. Specifically, ROPO first formulates the training process with adaptive noise reduction as an optimization problem, which can be efficiently solved in an iterative paradigm. Then, to equip this solving process with noise-tolerance and noise-identification capabilities, we derive a robust loss that suppresses the gradients from samples with high uncertainty. We demonstrate both empirically and theoretically that the derived loss is key to the noise-tolerance and effective filtering of noisy samples. The derived loss further inspires a robustness-guided rejection sampling technique to compensate for the potentially important information in discarded queries. Extensive experiments on several widely-used datasets and model architectures demonstrate that ROPO significantly outperforms all baselines under four practical noise settings and the random symmetric noise, with its advantage increasing as the noise rate increases.

NeurIPS Conference 2025 Conference Paper

SD-VLM: Spatial Measuring and Understanding with Depth-Encoded Vision-Language Models

  • Pingyi Chen
  • Yujing Lou
  • Shen Cao
  • Jinhui Guo
  • Lubin Fan
  • Yue Wu
  • Lin Yang
  • Lizhuang Ma

While vision language models (VLMs) excel in 2D semantic visual understanding, their ability to quantitatively reason about 3D spatial relationships remains underexplored due to the deficiency of spatial representation ability of 2D images. In this paper, we analyze the problem hindering VLMs’ spatial understanding abilities and propose SD-VLM, a novel framework that significantly enhances fundamental spatial perception abilities of VLMs through two key contributions: (1) propose Massive Spatial Measuring and Understanding (MSMU) dataset with precise spatial annotations, and (2) introduce a simple depth positional encoding method strengthening VLMs’ spatial awareness. MSMU dataset includes massive quantitative spatial tasks with 700K QA pairs, 2.5M physical numerical annotations, and 10K chain-of-thought augmented samples. We have trained SD-VLM, a strong generalist VLM which shows superior quantitative spatial measuring and understanding capability. SD-VLM not only achieves state-of-the-art performance on our proposed MSMU-Bench, but also shows spatial generalization abilities on other spatial understanding benchmarks including Q-Spatial and SpatialRGPTBench. Extensive experiments demonstrate that SD-VLM outperforms GPT-4o and Intern-VL3-78B by 26.91% and 25.56% respectively on MSMU-Bench. Code and models are released at https://github.com/cpystan/SD-VLM.

ICLR Conference 2025 Conference Paper

Self-Play Preference Optimization for Language Model Alignment

  • Yue Wu
  • Zhiqing Sun
  • Huizhuo Yuan
  • Kaixuan Ji
  • Yiming Yang 0002
  • Quanquan Gu

Standard reinforcement learning from human feedback (RLHF) approaches relying on parametric models like the Bradley-Terry model fall short in capturing the intransitivity and irrationality in human preferences. Recent advancements suggest that directly working with preference probabilities can yield a more accurate reflection of human preferences, enabling more flexible and accurate language model alignment. In this paper, we propose a self-play-based method for language model alignment, which treats the problem as a constant-sum two-player game aimed at identifying the Nash equilibrium policy. Our approach, dubbed *Self-Play Preference Optimization* (SPPO), utilizes iterative policy updates to provably approximate the Nash equilibrium. Additionally, we propose a new SPPO objective which is both strongly motivated by theory and is simple and effective in practice. In our experiments, using only 60k prompts (without responses) from the UltraFeedback dataset and without any prompt augmentation, by leveraging a pre-trained preference model PairRM with only 0.4B parameters, SPPO can obtain a model from fine-tuning Mistral-7B-Instruct-v0.2 that achieves the state-of-the-art length-controlled win-rate of 28.53\% against GPT-4-Turbo on AlpacaEval 2.0. It also outperforms the (iterative) DPO and IPO on MT-Bench, Arena-Hard, and the Open LLM Leaderboard. Starting from a stronger base model Llama-3-8B-Instruct, we are able to achieve a length-controlled win rate of 38.77\%. Notably, the strong performance of SPPO is achieved without additional external supervision (e.g., responses, preferences, etc.) from GPT-4 or other stronger language models.

ICLR Conference 2025 Conference Paper

SFS: Smarter Code Space Search improves LLM Inference Scaling

  • Jonathan Light
  • Yue Wu
  • Yiyou Sun
  • Wenchao Yu
  • Yanchi Liu
  • Xujiang Zhao
  • Ziniu Hu
  • Haifeng Chen

We frame code generation as a black-box optimization problem within the code space and demonstrate how optimization-inspired techniques can enhance inference scaling over text. Based on this perspective, we propose **SCATTERED FOREST SEARCH (SFS)**, a novel approach that improves solution diversity during evolutionary search, thereby avoiding local optima. Our theoretical analysis illustrates how these methods improve exploration and enhance efficiency. Extensive experiments on *HumanEval, MBPP, APPS, CodeContests,* and *Leetcode* reveal significant performance gains. For instance, our method achieves a **pass@1 rate of 67.1% on HumanEval+** and **87.2% on HumanEval with GPT-3.5**, marking improvements of **8.6%** and **4.3%** over the state-of-the-art, while also halving the iterations needed to find the correct solution. Furthermore, our approach scales more efficiently than existing search techniques, including **tree search, line search,** and **repeated sampling (Best of N)**.

AAAI Conference 2025 Conference Paper

Where Precision Meets Efficiency: Transformation Diffusion Model for Point Cloud Registration

  • Yongzhe Yuan
  • Yue Wu
  • Xiaolong Fan
  • Maoguo Gong
  • Qiguang Miao
  • Wenping Ma

We propose a transformation diffusion model for point cloud registration to balance precision and efficiency. Our method formulates point cloud registration as a denoising diffusion process from noisy transformation to object transformation, which is represented by quaternion and translation. Specifically, in the training stage, object transformation diffuses from ground-truth transformation to random distribution, and the model learns to reverse this noising process. In the sampling stage, the model refines randomly generated transformation to the optimal transformation in a progressive way. We derive the variational bound in closed form for training and provide instantiation of the model. Our diffusion model maps transformation into latent space, and splits the transformation into two components (rotation and translation) based on the fact that they belong to different solution spaces. In addition, our work provides the following crucial findings: (i) Point cloud registration, one of the representative discriminative tasks, can be solved in a generative way and mapped into latent space to obtain a new unified probabilistic formulation. (ii) Our model, Transformation Diffusion Model (TDM), can be a plug-and-play agent for point cloud registration, making our method applicable to different deep registration networks. Experimental results on synthetic and real-world datasets demonstrate that, in correspondence-free and correspondence-based scenarios, TDM can achieve over 60% performance improvements and higher efficiency simultaneously.

ICLR Conference 2024 Conference Paper

Boosting Vanilla Lightweight Vision Transformers via Re-parameterization

  • Zhentao Tan
  • Xiaodan Li
  • Yue Wu
  • Qi Chu 0001
  • Le Lu 0001
  • Nenghai Yu
  • Jieping Ye

Large-scale Vision Transformers have achieved promising performance on downstream tasks through feature pre-training. However, the performance of vanilla lightweight Vision Transformers (ViTs) is still far from satisfactory compared to that of recent lightweight CNNs or hybrid networks. In this paper, we aim to unlock the potential of vanilla lightweight ViTs by exploring the adaptation of the widely-used re-parameterization technology to ViTs for improving learning ability during training without increasing the inference cost. The main challenge comes from the fact that re-parameterization pairs naturally with the convolution and batch normalization in CNNs, while vanilla Transformer architectures are mainly comprised of linear and layer normalization layers. We propose to incorporate the nonlinear ensemble into linear layers by expanding the depth of the linear layers with batch normalization and fusing multiple linear features with hierarchical representation ability through a pyramid structure. We also discover and solve a new transformer-specific distribution rectification problem caused by multi-branch re-parameterization. Finally, we propose our Two-Dimensional Re-parameterized Linear module (TDRL) for ViTs. Under the popular self-supervised pre-training and supervised fine-tuning strategy, our TDRL can be used in these two stages to enhance both generic and task-specific representation. Experiments demonstrate that our proposed method not only boosts the performance of vanilla ViT-Tiny on various vision tasks to new state-of-the-art (SOTA) but also shows promising generalization ability on other networks. Code will be available.

ICML Conference 2024 Conference Paper

Borda Regret Minimization for Generalized Linear Dueling Bandits

  • Yue Wu
  • Tao Jin 0002
  • Qiwei Di
  • Hao Lou
  • Farzad Farnoud
  • Quanquan Gu

Dueling bandits are widely used to model preferential feedback prevalent in many applications such as recommendation systems and ranking. In this paper, we study the Borda regret minimization problem for dueling bandits, which aims to identify the item with the highest Borda score while minimizing the cumulative regret. We propose a rich class of generalized linear dueling bandit models, which cover many existing models. We first prove a regret lower bound of order $\Omega(d^{2/3} T^{2/3})$ for the Borda regret minimization problem, where $d$ is the dimension of contextual vectors and $T$ is the time horizon. To attain this lower bound, we propose an explore-then-commit type algorithm for the stochastic setting, which has a nearly matching regret upper bound $\tilde{O}(d^{2/3} T^{2/3})$. We also propose an EXP3-type algorithm for the adversarial linear setting, where the underlying model parameter can change in each round. Our algorithm achieves an $\tilde{O}(d^{2/3} T^{2/3})$ regret, which is also optimal. Empirical evaluations on both synthetic data and a simulated real-world environment are conducted to corroborate our theoretical analysis.
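The target statistic in this abstract, the Borda score, has a direct definition: the Borda score of item $i$ is its probability of beating another item chosen uniformly at random. A minimal sketch of that definition given a known pairwise preference matrix (illustrative only; the paper's algorithms estimate these scores from bandit feedback under a generalized linear model, which is not shown here):

```python
import numpy as np

def borda_scores(pref):
    """Borda score of item i = average of P(i beats j) over all j != i.
    `pref[i, j]` holds the probability that item i wins a duel against j."""
    n = pref.shape[0]
    off_diag = pref * (1 - np.eye(n))   # zero out self-comparisons
    return off_diag.sum(axis=1) / (n - 1)
```

The Borda winner is then simply `borda_scores(pref).argmax()`.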

NeurIPS Conference 2024 Conference Paper

Delving into the Reversal Curse: How Far Can Large Language Models Generalize?

  • Zhengkai Lin
  • Zhihang Fu
  • Kai Liu
  • Liang Xie
  • Binbin Lin
  • Wenxiao Wang
  • Deng Cai
  • Yue Wu

While large language models (LLMs) showcase unprecedented capabilities, they also exhibit certain inherent limitations when facing seemingly trivial tasks. A prime example is the recently debated "reversal curse", which surfaces when models, having been trained on the fact "A is B", struggle to generalize this knowledge to infer that "B is A". In this paper, we examine the manifestation of the reversal curse across various tasks and delve into both the generalization abilities and the problem-solving mechanisms of LLMs. This investigation leads to a series of significant insights: (1) LLMs are able to generalize to "B is A" when both A and B are presented in the context as in the case of a multiple-choice question. (2) This generalization ability is highly correlated to the structure of the fact "A is B" in the training documents. For example, this generalization only applies to biographies structured in "[Name] is [Description]" but not to "[Description] is [Name]". (3) We propose and verify the hypothesis that LLMs possess an inherent bias in fact recalling during knowledge application, which explains and underscores the importance of the document structure to successful learning. (4) The negative impact of this bias on the downstream performance of LLMs can hardly be mitigated through training alone. Based on these intriguing findings, our work not only presents a novel perspective for interpreting LLMs' generalization abilities from their intrinsic working mechanism but also provides new insights for the development of more effective learning methods for LLMs.

ICLR Conference 2024 Conference Paper

DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text

  • Xianjun Yang
  • Wei Cheng 0002
  • Yue Wu
  • Linda Ruth Petzold
  • William Yang Wang
  • Haifeng Chen

Large language models (LLMs) have notably enhanced the fluency and diversity of machine-generated text. However, this progress also presents a significant challenge in detecting the origin of a given text, and current research on detection methods lags behind the rapid evolution of LLMs. Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT). Given a text, we first truncate it in the middle and then use only the preceding portion as input to the LLMs to regenerate the new remaining parts. By analyzing the differences between the original and new remaining parts through N-gram analysis in black-box or probability divergence in white-box, we can clearly illustrate significant discrepancies between machine-generated and human-written text. We conducted extensive experiments on the most advanced LLMs from OpenAI, including text-davinci-003, GPT-3.5-turbo, and GPT-4, as well as open-source models such as GPT-NeoX-20B and LLaMa-13B. Results show that our zero-shot approach exhibits state-of-the-art performance in distinguishing between human and GPT-generated text on four English and one German dataset, outperforming OpenAI's own classifier, which is trained on millions of text. Additionally, our methods provide reasonable explanations and evidence to support our claim, which is a unique feature of explainable detection. Our method is also robust under the revised text attack and can additionally solve model sourcing.
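The black-box side of the idea above can be sketched concretely: truncate a text, have the LLM regenerate the remainder from the prefix, and measure n-gram overlap between the original and regenerated continuations. A toy scoring function under that reading (the exact weighting and n-gram range in DNA-GPT may differ; this is an assumption-laden illustration, with the LLM regeneration step left out):

```python
def ngrams(tokens, n):
    """Set of word n-grams of a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_score(original_rest, regenerated_rest, n_range=(2, 5)):
    """Average n-gram set overlap between the original continuation and an
    LLM-regenerated one; higher overlap suggests the original text was
    itself machine-generated."""
    a, b = original_rest.split(), regenerated_rest.split()
    scores = []
    for n in range(n_range[0], n_range[1] + 1):
        ga, gb = ngrams(a, n), ngrams(b, n)
        if not ga or not gb:
            continue
        scores.append(len(ga & gb) / min(len(ga), len(gb)))
    return sum(scores) / len(scores) if scores else 0.0
```

Identical continuations score 1.0 and fully disjoint ones score 0.0; real machine-generated text tends to sit closer to the former when regenerated by the same model.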

NeurIPS Conference 2024 Conference Paper

Enhancing LLM’s Cognition via Structurization

  • Kai Liu
  • Zhihang Fu
  • Chao Chen
  • Wei Zhang
  • Rongxin Jiang
  • Fan Zhou
  • Yaowu Chen
  • Yue Wu

When reading long-form text, human cognition is complex and structurized. While large language models (LLMs) process input contexts through a causal and sequential perspective, this approach can potentially limit their ability to handle intricate and complex inputs effectively. To enhance LLM’s cognition capability, this paper presents a novel concept of context structurization. Specifically, we transform the plain, unordered contextual sentences into well-ordered and hierarchically structurized elements. By doing so, LLMs can better grasp intricate and extended contexts through precise attention and information-seeking along the organized structures. Extensive evaluations are conducted across various model architectures and sizes (including a series of auto-regressive LLMs as well as BERT-like masking models) on a diverse set of NLP tasks (e.g., context-based question-answering, exhaustive hallucination evaluation, and passage-level dense retrieval). Empirical results show consistent and significant performance gains afforded by a single-round structurization. In particular, we boost the open-sourced LLaMA2-70B model to achieve comparable performance against GPT-3.5-Turbo as the hallucination evaluator. Besides, we show the feasibility of distilling advanced LLMs’ language processing abilities to a smaller yet effective StruXGPT-7B to execute structurization, addressing the practicality of our approach. Code is available at https://github.com/alibaba/struxgpt.

ICLR Conference 2024 Conference Paper

INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection

  • Chao Chen 0026
  • Kai Liu 0023
  • Ze Chen 0001
  • Yi Gu
  • Yue Wu
  • Mingyuan Tao
  • Zhihang Fu
  • Jieping Ye

Knowledge hallucinations have raised widespread concerns for the security and reliability of deployed LLMs. Previous efforts in detecting hallucinations have been employed at logit-level uncertainty estimation or language-level self-consistency evaluation, where the semantic information is inevitably lost during the token-decoding procedure. Thus, we propose to explore the dense semantic information retained within LLMs' \textbf{IN}ternal \textbf{S}tates for halluc\textbf{I}nation \textbf{DE}tection (\textbf{INSIDE}). In particular, a simple yet effective \textbf{EigenScore} metric is proposed to better evaluate responses' self-consistency, which exploits the eigenvalues of responses' covariance matrix to measure the semantic consistency/diversity in the dense embedding space. Furthermore, from the perspective of self-consistent hallucination detection, a test time feature clipping approach is explored to truncate extreme activations in the internal states, which reduces overconfident generations and potentially benefits the detection of overconfident hallucinations. Extensive experiments and ablation studies are performed on several popular LLMs and question-answering (QA) benchmarks, showing the effectiveness of our proposal.
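The eigenvalue idea behind EigenScore can be sketched in a few lines: embed K sampled responses, form the covariance (Gram) matrix of the centered embeddings, and summarize its log-eigenvalues; larger values mean more semantic spread across responses. A toy sketch, not the paper's exact formula (the regularization constant and the mean-log aggregation are assumptions):

```python
import numpy as np

def eigenscore(embeddings, alpha=1e-3):
    """Mean log-eigenvalue of the regularized covariance of K response
    embeddings (rows). Higher = more semantic diversity across responses,
    which INSIDE links to a higher chance of hallucination."""
    z = np.asarray(embeddings, dtype=float)
    z = z - z.mean(axis=0, keepdims=True)   # center across responses
    k = z.shape[0]
    cov = z @ z.T / k                       # K x K covariance/Gram matrix
    cov += alpha * np.eye(k)                # regularize so log is defined
    eigvals = np.linalg.eigvalsh(cov)
    return float(np.mean(np.log(eigvals)))
```

Near-identical responses yield a near-degenerate covariance (eigenvalues collapse toward the regularizer), so consistent answers score lower than diverse ones.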

AAAI Conference 2024 Conference Paper

M3SOT: Multi-Frame, Multi-Field, Multi-Space 3D Single Object Tracking

  • Jiaming Liu
  • Yue Wu
  • Maoguo Gong
  • Qiguang Miao
  • Wenping Ma
  • Cai Xu
  • Can Qin

3D Single Object Tracking (SOT) stands as a forefront task of computer vision, proving essential for applications like autonomous driving. Sparse and occluded data in scene point clouds introduce variations in the appearance of tracked objects, adding complexity to the task. In this research, we unveil M3SOT, a novel 3D SOT framework, which synergizes multiple input frames (template sets), multiple receptive fields (continuous contexts), and multiple solution spaces (distinct tasks) in ONE model. Remarkably, M3SOT pioneers in modeling temporality, contexts, and tasks directly from point clouds, revisiting a perspective on the key factors influencing SOT. To this end, we design a transformer-based network centered on point cloud targets in the search area, aggregating diverse contextual representations and propagating target cues by employing historical frames. As M3SOT spans varied processing perspectives, we've streamlined the network—trimming its depth and optimizing its structure—to ensure a lightweight and efficient deployment for SOT applications. We posit that, backed by practical construction, M3SOT sidesteps the need for complex frameworks and auxiliary components to deliver sterling results. Extensive experiments on benchmarks such as KITTI, nuScenes, and Waymo Open Dataset demonstrate that M3SOT achieves state-of-the-art performance at 38 FPS. Our code and models are available at https://github.com/ywu0912/TeamCode.git.

TMLR Journal 2024 Journal Article

MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning

  • Mao Hong
  • Zhiyue Zhang
  • Yue Wu
  • Yanxun Xu

Model-based offline reinforcement learning methods (RL) have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability. Despite these advancements, existing model-based offline RL approaches either focus on theoretical studies without developing practical algorithms or rely on a restricted parametric policy space, thus not fully leveraging the advantages of an unrestricted policy space inherent to model-based methods. To address this limitation, we develop MoMA, a model-based mirror ascent algorithm with general function approximations under partial coverage of offline data. MoMA distinguishes itself from existing literature by employing an unrestricted policy class. In each iteration, MoMA conservatively estimates the value function by a minimization procedure within a confidence set of transition models in the policy evaluation step, then updates the policy with general function approximations instead of commonly-used parametric policy classes in the policy improvement step. Under some mild assumptions, we establish theoretical guarantees for MoMA by proving an upper bound on the suboptimality of the returned policy. We also provide a practically implementable, approximate version of the algorithm. The effectiveness of MoMA is demonstrated via numerical studies.

AAAI Conference 2024 Conference Paper

Neural Gaussian Similarity Modeling for Differential Graph Structure Learning

  • Xiaolong Fan
  • Maoguo Gong
  • Yue Wu
  • Zedong Tang
  • Jieyi Liu

Graph Structure Learning (GSL) has demonstrated considerable potential in the analysis of graph-unknown non-Euclidean data across a wide range of domains. However, constructing an end-to-end graph structure learning model poses a challenge due to the impediment of gradient flow caused by the nearest neighbor sampling strategy. In this paper, we construct a differential graph structure learning model by replacing the non-differentiable nearest neighbor sampling with a differentiable sampling using the reparameterization trick. Under this framework, we argue that the act of sampling nearest neighbors may not invariably be essential, particularly in instances where node features exhibit a significant degree of similarity. To alleviate this issue, the bell-shaped Gaussian Similarity (GauSim) modeling is proposed to sample non-nearest neighbors. To adaptively model the similarity, we further propose Neural Gaussian Similarity (NeuralGauSim) with learnable parameters featuring flexible sampling behaviors. In addition, we develop a scalable method by transferring the large-scale graph to the transition graph to significantly reduce the complexity. Experimental results demonstrate the effectiveness of the proposed methods.
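Two ingredients of this abstract are concrete enough to sketch: a bell-shaped Gaussian similarity that peaks at a nonzero distance (so non-nearest neighbors can be favored), and differentiable neighbor sampling via a reparameterization trick. The sketch below uses Gumbel-softmax as the reparameterization; whether the paper uses exactly this relaxation, and the parameter names `mu`, `sigma`, `tau`, are assumptions:

```python
import numpy as np

def gaussian_similarity(dist, mu, sigma):
    """Bell-shaped similarity over pairwise distances: maximal at distance
    mu rather than at 0, unlike a monotone nearest-neighbor kernel."""
    return np.exp(-((dist - mu) ** 2) / (2 * sigma ** 2))

def sample_edges_gumbel(sim, tau=0.5, seed=0):
    """Forward pass of Gumbel-softmax neighbor sampling: perturb log
    similarities with Gumbel noise, then take a temperature-controlled
    softmax, giving a differentiable relaxation of discrete sampling."""
    rng = np.random.default_rng(seed)
    logits = np.log(sim + 1e-9)
    gumbel = -np.log(-np.log(rng.uniform(size=sim.shape)))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max(axis=-1, keepdims=True))   # stable softmax
    return y / y.sum(axis=-1, keepdims=True)
```

Each row of the output is a probability distribution over candidate neighbors; as `tau` shrinks it approaches a hard one-hot sample.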

ICML Conference 2024 Conference Paper

Protein Conformation Generation via Force-Guided SE(3) Diffusion Models

  • Yan Wang
  • Lihao Wang
  • Yuning Shen
  • Yiqun Wang
  • Huizhuo Yuan
  • Yue Wu
  • Quanquan Gu

The conformational landscape of proteins is crucial to understanding their functionality in complex biological processes. Traditional physics-based computational methods, such as molecular dynamics (MD) simulations, suffer from rare event sampling and long equilibration time problems, hindering their applications in general protein systems. Recently, deep generative modeling techniques, especially diffusion models, have been employed to generate novel protein conformations. However, existing score-based diffusion methods cannot properly incorporate important physical prior knowledge to guide the generation process, causing large deviations in the sampled protein conformations from the equilibrium distribution. In this paper, to overcome these limitations, we propose a force-guided $\mathrm{SE}(3)$ diffusion model, ConfDiff, for protein conformation generation. By incorporating a force-guided network with a mixture of data-based score models, ConfDiff can generate protein conformations with rich diversity while preserving high fidelity. Experiments on a variety of protein conformation prediction tasks, including 12 fast-folding proteins and the Bovine Pancreatic Trypsin Inhibitor (BPTI), demonstrate that our method surpasses the state-of-the-art method.

AAAI Conference 2024 Conference Paper

Reliable Conflictive Multi-View Learning

  • Cai Xu
  • Jiajun Si
  • Ziyu Guan
  • Wei Zhao
  • Yue Wu
  • Xiyue Gao

Multi-view learning aims to combine multiple features to achieve more comprehensive descriptions of data. Most previous works assume that multiple views are strictly aligned. However, real-world multi-view data may contain low-quality conflictive instances, which show conflictive information in different views. Previous methods for this problem mainly focus on eliminating the conflictive data instances by removing them or replacing conflictive views. Nevertheless, real-world applications usually require making decisions for conflictive instances rather than only eliminating them. To solve this, we point out a new Reliable Conflictive Multi-view Learning (RCML) problem, which requires the model to provide decision results and attached reliabilities for conflictive multi-view data. We develop an Evidential Conflictive Multi-view Learning (ECML) method for this problem. ECML first learns view-specific evidence, which could be termed as the amount of support to each category collected from data. Then, we can construct view-specific opinions consisting of decision results and reliability. In the multi-view fusion stage, we propose a conflictive opinion aggregation strategy and theoretically prove this strategy can exactly model the relation of multi-view common and view-specific reliabilities. Experiments performed on 6 datasets verify the effectiveness of ECML. The code is released at https://github.com/jiajunsi/RCML.

AAAI Conference 2024 Conference Paper

TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection

  • Tianxiang Chen
  • Zhentao Tan
  • Qi Chu
  • Yue Wu
  • Bin Liu
  • Nenghai Yu

Infrared small target detection (ISTD) is critical to national security and has been extensively applied in military areas. ISTD aims to segment small target pixels from background. Most ISTD networks focus on designing feature extraction blocks or feature fusion modules, but rarely describe the ISTD process from the feature map evolution perspective. In the ISTD process, the network attention gradually shifts towards target areas. We abstract this process as the directional movement of feature map pixels to target areas through convolution, pooling and interactions with surrounding pixels, which can be analogous to the movement of thermal particles constrained by surrounding variables and particles. In light of this analogy, we propose the Thermal Conduction-Inspired Transformer (TCI-Former) based on the theoretical principles of thermal conduction. According to the thermal conduction differential equation in heat dynamics, we derive the pixel movement differential equation (PMDE) in the image domain and further develop two modules: Thermal Conduction-Inspired Attention (TCIA) and Thermal Conduction Boundary Module (TCBM). TCIA incorporates the finite difference method with PMDE to reach a numerical approximation so that target body features can be extracted. To further remove errors in boundary areas, TCBM is designed and supervised by boundary masks to refine target body features with fine boundary details. Experiments on IRSTD-1k and NUAA-SIRST demonstrate the superiority of our method.

ICLR Conference 2024 Conference Paper

Variance-aware Regret Bounds for Stochastic Contextual Dueling Bandits

  • Qiwei Di
  • Tao Jin 0002
  • Yue Wu
  • Heyang Zhao
  • Farzad Farnoud
  • Quanquan Gu

Dueling bandits is a prominent framework for decision-making involving preferential feedback, a valuable feature that fits various applications involving human interaction, such as ranking, information retrieval, and recommendation systems. While substantial efforts have been made to minimize the cumulative regret in dueling bandits, a notable gap in the current research is the absence of regret bounds that account for the inherent uncertainty in pairwise comparisons between the dueling arms. Intuitively, greater uncertainty suggests a higher level of difficulty in the problem. To bridge this gap, this paper studies the problem of contextual dueling bandits, where the binary comparison of dueling arms is generated from a generalized linear model (GLM). We propose a new SupLinUCB-type algorithm that enjoys computational efficiency and a variance-aware regret bound $\tilde O\big(d\sqrt{\sum_{t=1}^T\sigma_t^2} + d\big)$, where $\sigma_t$ is the variance of the pairwise comparison at round $t$, $d$ is the dimension of the context vectors, and $T$ is the time horizon. Our regret bound naturally aligns with the intuitive expectation — in scenarios where the comparison is deterministic, the algorithm only suffers from an $\tilde O(d)$ regret. We perform empirical experiments on synthetic data to confirm the advantage of our method over previous variance-agnostic algorithms.

ICLR Conference 2023 Conference Paper

Avoiding spurious correlations via logit correction

  • Sheng Liu
  • Xu Zhang
  • Nitesh Sekhar
  • Yue Wu
  • Prateek Singhal
  • Carlos Fernandez-Granda

Empirical studies suggest that machine learning models trained with empirical risk minimization (ERM) often rely on attributes that may be spuriously correlated with the class labels. Such models typically lead to poor performance during inference for data lacking such correlations. In this work, we explicitly consider a situation where potential spurious correlations are present in the majority of training data. In contrast with existing approaches, which use the ERM model outputs to detect the samples without spurious correlations and either heuristically upweight or upsample those samples, we propose the logit correction (LC) loss, a simple yet effective improvement on the softmax cross-entropy loss, to correct the sample logit. We demonstrate that minimizing the LC loss is equivalent to maximizing the group-balanced accuracy, so the proposed LC could mitigate the negative impacts of spurious correlations. Our extensive experimental results further reveal that the proposed LC loss outperforms state-of-the-art solutions on multiple popular benchmarks by a large margin, an average 5.5% absolute improvement, without access to spurious attribute labels. LC is also competitive with oracle methods that make use of the attribute labels.
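The core move described above, correcting the sample logit before the softmax cross-entropy, can be illustrated in the style of logit adjustment: shift each logit by the log of its group/class prior so that predicting the spuriously correlated majority becomes "cheaper" and gradients concentrate on the minority. This is a sketch of that family of losses, not necessarily the paper's exact LC formulation (the use of class priors as the correction term is an assumption):

```python
import numpy as np

def logit_corrected_ce(logits, labels, class_priors):
    """Softmax cross-entropy with logits shifted by log class priors.
    A larger prior on the label's class lowers its loss contribution,
    down-weighting easy majority-group samples."""
    adjusted = logits + np.log(class_priors)         # the correction term
    adjusted = adjusted - adjusted.max(axis=1, keepdims=True)  # stability
    log_probs = adjusted - np.log(np.exp(adjusted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With uniform priors the shift cancels and this reduces to plain cross-entropy, which makes it easy to sanity-check.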

ICML Conference 2023 Conference Paper

Personalized Federated Learning under Mixture of Distributions

  • Yue Wu
  • Shuaicheng Zhang
  • Wenchao Yu
  • Yanchi Liu
  • Quanquan Gu
  • Dawei Zhou 0003
  • Haifeng Chen
  • Wei Cheng 0002

The recent trend towards Personalized Federated Learning (PFL) has garnered significant attention as it allows for the training of models that are tailored to each client while maintaining data privacy. However, current PFL techniques primarily focus on modeling the conditional distribution heterogeneity (i.e., concept shift), which can result in suboptimal performance when the distribution of input data across clients diverges (i.e., covariate shift). Additionally, these techniques often lack the ability to adapt to unseen data, further limiting their effectiveness in real-world scenarios. To address these limitations, we propose a novel approach, FedGMM, which utilizes Gaussian mixture models (GMM) to effectively fit the input data distributions across diverse clients. The model parameters are estimated by maximum likelihood estimation utilizing a federated Expectation-Maximization algorithm, which is solved in closed form and does not assume gradient similarity. Furthermore, FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification. Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.

NeurIPS Conference 2023 Conference Paper

Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals

  • Yue Wu
  • Yewen Fan
  • Paul Pu Liang
  • Amos Azaria
  • Yuanzhi Li
  • Tom M. Mitchell

High sample complexity has long been a challenge for RL. On the other hand, humans learn to perform tasks not only from interaction or demonstrations, but also by reading unstructured text documents, e.g., instruction manuals. Instruction manuals and wiki pages are among the most abundant data that could inform agents of valuable features and policies or task-specific environmental dynamics and reward structures. Therefore, we hypothesize that the ability to utilize human-written instruction manuals to assist learning policies for specific tasks should lead to a more efficient and better-performing agent. We propose the Read and Reward framework. Read and Reward speeds up RL algorithms on Atari games by reading manuals released by the Atari game developers. Our framework consists of a QA Extraction module that extracts and summarizes relevant information from the manual and a Reasoning module that evaluates object-agent interactions based on information from the manual. An auxiliary reward is then provided to a standard A2C RL agent, when interaction is detected. Experimentally, various RL algorithms obtain significant improvement in performance and training speed when assisted by our design. Code at github.com/Holmeswww/RnR

NeurIPS Conference 2023 Conference Paper

SPRING: Studying Papers and Reasoning to play Games

  • Yue Wu
  • So Yeon Min
  • Shrimai Prabhumoye
  • Yonatan Bisk
  • Russ R. Salakhutdinov
  • Amos Azaria
  • Tom M. Mitchell
  • Yuanzhi Li

Open-world survival games pose significant challenges for AI algorithms due to their multi-tasking, deep exploration, and goal prioritization requirements. Despite reinforcement learning (RL) being popular for solving games, its high sample complexity limits its effectiveness in complex open-world games like Crafter or Minecraft. We propose a novel approach, SPRING, to read Crafter's original academic paper and use the knowledge learned to reason and play the game through a large language model (LLM). Prompted with the LaTeX source as game context and a description of the agent's current observation, our SPRING framework employs a directed acyclic graph (DAG) with game-related questions as nodes and dependencies as edges. We identify the optimal action to take in the environment by traversing the DAG and calculating LLM responses for each node in topological order, with the LLM's answer to the final node directly translating to environment actions. In our experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter environment. Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories. Quantitatively, SPRING with GPT-4 outperforms all state-of-the-art RL baselines, trained for 1M steps, without any training. Finally, we show the potential of Crafter as a test bed for LLMs. Code at github.com/holmeswww/SPRING
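The DAG traversal described above can be sketched with Python's standard `graphlib`. The question graph and the `answer` stand-in below are hypothetical; in SPRING each call would prompt an LLM with the paper's LaTeX source, the observation, and the parents' answers:

```python
from graphlib import TopologicalSorter

# Hypothetical question DAG: each node depends on its parents' answers.
deps = {
    "top_action": {"requirements_met", "best_subgoal"},
    "best_subgoal": {"current_state"},
    "requirements_met": {"current_state"},
    "current_state": set(),
}

def answer(question, parent_answers):
    """Stand-in for an LLM call; the real system would build a prompt
    from the game context plus the parents' answers."""
    return f"answer({question} | {sorted(parent_answers)})"

answers = {}
for q in TopologicalSorter(deps).static_order():   # parents before children
    answers[q] = answer(q, {p: answers[p] for p in deps[q]})

action = answers["top_action"]  # the final node's answer maps to an env action
```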

AAMAS Conference 2023 Conference Paper

Sybil-Proof Diffusion Auction in Social Networks

  • Hongyin Chen
  • Xiaotie Deng
  • Ying Wang
  • Yue Wu
  • Dengji Zhao

A diffusion auction is a market to sell commodities over a social network, where the challenge is to incentivize existing buyers to invite their neighbors in the network to join the market. Existing mechanisms have been designed to solve the challenge in various settings, aiming at desirable properties such as non-deficiency, incentive compatibility and social welfare maximization. Since the mechanisms are employed in dynamic networks with ever-changing structures, buyers could easily generate fake nodes in the network to manipulate the mechanisms for their own benefits, which is commonly known as the Sybil attack. We observe that strategic agents may gain an unfair advantage in existing mechanisms through such attacks. To resist this potential attack, we propose two diffusion auction mechanisms, the Sybil tax mechanism (STM) and the Sybil cluster mechanism (SCM), to achieve both Sybil-proofness and incentive compatibility in the single-item setting. Our proposal provides the first mechanisms to protect the interests of buyers against Sybil attacks with a mild sacrifice of social welfare and revenue.

UAI Conference 2023 Conference Paper

Uniform-PAC Guarantees for Model-Based RL with Bounded Eluder Dimension

  • Yue Wu
  • Jiafan He
  • Quanquan Gu

Recently, there has been remarkable progress in reinforcement learning (RL) with general function approximation. However, all these works only provide regret or sample complexity guarantees. It is still an open question if one can achieve stronger performance guarantees, i.e., the uniform probably approximately correct (Uniform-PAC) guarantee that can imply both a sub-linear regret bound and a polynomial sample complexity for any target learning accuracy. We study this problem by proposing algorithms for both nonlinear bandits and model-based episodic RL using the general function class with a bounded eluder dimension. The key idea of the proposed algorithms is to assign each action to different levels according to its width with respect to the confidence set. The achieved Uniform-PAC sample complexity is tight in the sense that it matches the state-of-the-art regret bounds or sample complexity guarantees when reduced to the linear case. To the best of our knowledge, this is the first work for Uniform-PAC guarantees on bandit and RL that goes beyond linear cases.

AAAI Conference 2023 Conference Paper

User-Controllable Arbitrary Style Transfer via Entropy Regularization

  • Jiaxin Cheng
  • Yue Wu
  • Ayush Jaiswal
  • Xu Zhang
  • Pradeep Natarajan
  • Prem Natarajan

Ensuring the overall end-user experience is a challenging task in arbitrary style transfer (AST) due to the subjective nature of style transfer quality. A good practice is to provide users with many AST results instead of one. However, existing approaches require running multiple AST models or running inference with a diversified AST (DAST) solution multiple times, and thus they are either slow in speed or limited in diversity. In this paper, we propose a novel solution ensuring both efficiency and diversity for generating multiple user-controllable AST results by systematically modulating AST behavior at run-time. We begin with reformulating three prominent AST methods into a unified assign-and-mix problem and discover that the entropies of their assignment matrices exhibit a large variance. We then solve the unified problem in an optimal transport framework using the Sinkhorn-Knopp algorithm with a user input ε to control the said entropy and thus modulate stylization. Empirical results demonstrate the superiority of the proposed solution, with speed and stylization quality comparable to or better than existing AST and significantly more diverse than previous DAST works. Code is available at https://github.com/cplusx/eps-Assign-and-Mix.
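The entropy-controlled assignment can be illustrated with a generic Sinkhorn-Knopp iteration. The cost matrix below is a toy stand-in for the AST-specific one, and `eps` plays the role of the user input ε: a smaller value yields a sharper (lower-entropy) assignment, a larger value a smoother one.

```python
import numpy as np

def sinkhorn(cost, eps, n_iters=200):
    """Entropy-regularized assignment via Sinkhorn-Knopp (generic sketch)."""
    n, m = cost.shape
    K = np.exp(-cost / eps)           # Gibbs kernel; eps sets the entropy level
    r = np.full(n, 1.0 / n)           # uniform row marginals
    c = np.full(m, 1.0 / m)           # uniform column marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):          # alternate row/column scaling
        u = r / (K @ v)
        v = c / (K.T @ u)
    return (u[:, None] * K) * v[None, :]

cost = np.array([[0.0, 1.0], [1.0, 0.0]])   # toy stand-in cost matrix
P_sharp = sinkhorn(cost, eps=0.05)          # near-hard assignment
P_smooth = sinkhorn(cost, eps=5.0)          # near-uniform mixing
```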

NeurIPS Conference 2022 Conference Paper

Active Ranking without Strong Stochastic Transitivity

  • Hao Lou
  • Tao Jin
  • Yue Wu
  • Pan Xu
  • Quanquan Gu
  • Farzad Farnoud

Ranking from noisy comparisons is of great practical interest in machine learning. In this paper, we consider the problem of recovering the exact full ranking for a list of items under ranking models that do *not* assume the Strong Stochastic Transitivity property. We propose a $\delta$-correct algorithm, Probe-Rank, that actively learns the ranking of the items from noisy pairwise comparisons. We prove a sample complexity upper bound for Probe-Rank, which only depends on the preference probabilities between items that are adjacent in the true ranking. This improves upon existing sample complexity results that depend on the preference probabilities for all pairs of items. Probe-Rank thus outperforms existing methods over a large collection of instances that do not satisfy Strong Stochastic Transitivity. Thorough numerical experiments in various settings are conducted, demonstrating that Probe-Rank is significantly more sample-efficient than the state-of-the-art active ranking method.

NeurIPS Conference 2022 Conference Paper

AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars

  • Yue Wu
  • Yu Deng
  • Jiaolong Yang
  • Fangyun Wei
  • Qifeng Chen
  • Xin Tong

Although 2D generative models have made great progress in face image generation and animation, they often suffer from undesirable artifacts such as 3D inconsistency when rendering images from different camera viewpoints. This prevents them from synthesizing video animations indistinguishable from real ones. Recently, 3D-aware GANs extend 2D GANs for explicit disentanglement of camera pose by leveraging 3D scene representations. These methods can well preserve the 3D consistency of the generated images across different views, yet they cannot achieve fine-grained control over other attributes, among which facial expression control is arguably the most useful and desirable for face animation. In this paper, we propose an animatable 3D-aware GAN for multiview consistent face animation generation. The key idea is to decompose the 3D representation of the 3D-aware GAN into a template field and a deformation field, where the former represents different identities with a canonical expression, and the latter characterizes expression variations of each identity. To achieve meaningful control over facial expressions via deformation, we propose a 3D-level imitative learning scheme between the generator and a parametric 3D face model during adversarial training of the 3D-aware GAN. This helps our method achieve high-quality animatable face image generation with strong visual 3D consistency, even though trained with only unstructured 2D images. Extensive experiments demonstrate our superior performance over prior works. Project page: https://yuewuhkust.github.io/AniFaceGAN/

NeurIPS Conference 2022 Conference Paper

Towards Understanding the Mixture-of-Experts Layer in Deep Learning

  • Zixiang Chen
  • Yihe Deng
  • Yue Wu
  • Quanquan Gu
  • Yuanzhi Li

The Mixture-of-Experts (MoE) layer, a sparsely-activated model controlled by a router, has achieved great success in deep learning. However, the understanding of such architecture remains elusive. In this paper, we formally study how the MoE layer improves the performance of neural network learning and why the mixture model will not collapse into a single model. Our empirical results suggest that the cluster structure of the underlying problem and the non-linearity of the expert are pivotal to the success of MoE. This motivates us to consider a challenging classification problem with intrinsic cluster structures. Theoretically, we prove that this problem is hard to solve by a single expert such as a two-layer convolutional neural network (CNN). Yet with the MoE layer with each expert being a two-layer CNN, the problem can be solved successfully. In particular, our theory shows that the router can learn the cluster-center features, which helps divide the input complex problem into simpler classification sub-problems that individual experts can conquer. To our knowledge, this is the first theoretical result toward formally understanding the mechanism of the MoE layer for deep learning.
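A minimal sparsely-activated MoE forward pass, with a softmax router selecting the top-1 expert per input, can be sketched as follows. This is an illustrative toy (random linear router, small tanh experts), not the paper's construction:

```python
import numpy as np

def moe_forward(x, router_w, experts, top_k=1):
    """Route each input to its top-k experts and mix their outputs by
    the softmax-normalized router scores (illustrative sketch)."""
    scores = x @ router_w                               # (batch, n_experts)
    top = np.argsort(scores, axis=1)[:, -top_k:]        # top-k expert indices
    out_dim = experts[0](x[:1]).shape[1]
    out = np.zeros((x.shape[0], out_dim))
    for i in range(x.shape[0]):
        sel = top[i]
        w = np.exp(scores[i, sel] - scores[i, sel].max())
        w /= w.sum()                                    # softmax over selected
        for j, wt in zip(sel, w):
            out[i] += wt * experts[j](x[i:i + 1])[0]    # only k experts run
    return out

rng = np.random.default_rng(0)
d, n_experts = 4, 3
router_w = rng.normal(size=(d, n_experts))
expert_ws = [rng.normal(size=(d, 2)) for _ in range(n_experts)]
experts = [lambda z, W=W: np.tanh(z @ W) for W in expert_ws]  # non-linear experts
x = rng.normal(size=(5, d))
y = moe_forward(x, router_w, experts)
```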

IJCAI Conference 2021 Conference Paper

Towards Understanding the Spectral Bias of Deep Learning

  • Yuan Cao
  • Zhiying Fang
  • Yue Wu
  • Ding-Xuan Zhou
  • Quanquan Gu

An intriguing phenomenon observed during training neural networks is the spectral bias, which states that neural networks are biased towards learning less complex functions. The priority of learning functions with low complexity might be at the core of explaining the generalization ability of neural networks, and certain efforts have been made to provide a theoretical explanation for spectral bias. However, there is still no satisfying theoretical result justifying the underlying mechanism of spectral bias. In this paper, we give a comprehensive and rigorous explanation for spectral bias and relate it with the neural tangent kernel function proposed in recent work. We prove that the training process of neural networks can be decomposed along different directions defined by the eigenfunctions of the neural tangent kernel, where each direction has its own convergence rate and the rate is determined by the corresponding eigenvalue. We then provide a case study when the input data is uniformly distributed over the unit sphere, and show that lower degree spherical harmonics are easier to be learned by over-parameterized neural networks. Finally, we provide numerical experiments to demonstrate the correctness of our theory. Our experimental results also show that our theory can tolerate certain model misspecification in terms of the input data distribution.
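The decomposition described above can be written compactly. In a standard NTK analysis (a generic form, not quoted from the paper), the training residual projected onto each eigenfunction decays at a rate set by the corresponding eigenvalue:

$$f_t - y \approx \sum_i e^{-\eta \lambda_i t}\, \langle f_0 - y, \phi_i \rangle\, \phi_i,$$

where $\lambda_i$ and $\phi_i$ are the eigenvalues and eigenfunctions of the neural tangent kernel and $\eta$ is the learning rate: directions with large $\lambda_i$, corresponding to low-complexity components, converge first.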

NeurIPS Conference 2020 Conference Paper

Improving GAN Training with Probability Ratio Clipping and Sample Reweighting

  • Yue Wu
  • Pan Zhou
  • Andrew G. Wilson
  • Eric Xing
  • Zhiting Hu

Despite success on a wide range of problems related to vision, generative adversarial networks (GANs) often suffer from inferior performance due to unstable training, especially for text generation. To solve this issue, we propose a new variational GAN training framework which enjoys superior training stability. Our approach is inspired by a connection of GANs and reinforcement learning under a variational perspective. The connection leads to (1) probability ratio clipping that regularizes generator training to prevent excessively large updates, and (2) a sample re-weighting mechanism that improves discriminator training by downplaying bad-quality fake samples. Moreover, our variational GAN framework can provably overcome the training issue in many GANs that an optimal discriminator cannot provide any informative gradient to the generator. By plugging the training approach into diverse state-of-the-art GAN architectures, we obtain significantly improved performance over a range of tasks, including text generation, text style transfer, and image generation.
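Probability ratio clipping has the same shape as PPO's clipped surrogate objective; the sketch below shows that generic form (the paper's exact variational objective differs). Ratios far from 1 are clipped, so a single update cannot move the generator too far from its previous distribution.

```python
import numpy as np

def clipped_objective(ratio, advantage, clip=0.2):
    """PPO-style clipped surrogate (generic form, not the paper's exact loss).

    `ratio` is p_new/p_old per sample; clipping it to [1-clip, 1+clip] and
    taking the pessimistic minimum bounds the size of each update."""
    clipped = np.clip(ratio, 1.0 - clip, 1.0 + clip)
    return np.minimum(ratio * advantage, clipped * advantage).mean()

ratio = np.array([0.5, 1.0, 3.0])   # hypothetical per-sample ratios
advantage = np.ones(3)              # hypothetical per-sample scores
obj = clipped_objective(ratio, advantage)
```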

IJCAI Conference 2018 Conference Paper

Feature Hashing for Network Representation Learning

  • Qixiang Wang
  • Shanfeng Wang
  • Maoguo Gong
  • Yue Wu

The goal of network representation learning is to embed nodes so as to encode the proximity structures of a graph into a continuous low-dimensional feature space. In this paper, we propose a novel algorithm called node2hash based on feature hashing for generating node embeddings. This approach follows the encoder-decoder framework. There are two main mapping functions in this framework. The first is an encoder to map each node into high-dimensional vectors. The second is a decoder to hash these vectors into a lower dimensional feature space. More specifically, we first derive a proximity measure called expected distance as the target, which combines position distribution and co-occurrence statistics of nodes over random walks so as to build a proximity matrix, then introduce a set of T different hash functions into feature hashing to generate uniformly distributed vector representations of nodes from the proximity matrix. Compared with the existing state-of-the-art network representation learning approaches, node2hash shows a competitive performance on multi-class node classification and link prediction tasks on three real-world networks from various domains.
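The hashing step, mapping a high-dimensional sparse vector into a low-dimensional space with T hash functions, can be sketched as below. The node keys and values are hypothetical, and the expected-distance proximity matrix the paper derives is assumed to be computed already:

```python
import hashlib
import numpy as np

def feature_hash(sparse_vec, dim, n_hashes=2):
    """Hash a sparse high-dimensional vector into `dim` buckets using
    `n_hashes` (the paper's T) hash functions; signed hashing reduces
    the bias introduced by bucket collisions."""
    out = np.zeros(dim)
    for key, val in sparse_vec.items():
        for t in range(n_hashes):
            h = int(hashlib.md5(f"{t}:{key}".encode()).hexdigest(), 16)
            sign = 1.0 if (h >> 1) % 2 == 0 else -1.0
            out[h % dim] += sign * val / np.sqrt(n_hashes)
    return out

# hypothetical proximity-row entries for one node
emb = feature_hash({"node_3": 0.7, "node_9": 1.2}, dim=8)
```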

NeurIPS Conference 2018 Conference Paper

Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation

  • Liwei Wang
  • Lunjia Hu
  • Jiayuan Gu
  • Zhiqiang Hu
  • Yue Wu
  • Kun He
  • John Hopcroft

It is widely believed that learning good representations is one of the main reasons for the success of deep neural networks. Although highly intuitive, there is a lack of theory and systematic approach quantitatively characterizing what representations deep neural networks learn. In this work, we move a tiny step towards a theory and better understanding of the representations. Specifically, we study a simpler problem: how similar are the representations learned by two networks with identical architecture but trained from different initializations? We develop a rigorous theory based on the neuron activation subspace match model. The theory gives a complete characterization of the structure of neuron activation subspace matches, where the core concepts are maximum match and simple match which describe the overall and the finest similarity between sets of neurons in two networks respectively. We also propose efficient algorithms to find the maximum match and simple matches. Finally, we conduct extensive experiments using our algorithms. Experimental results suggest that, surprisingly, representations learned by the same convolutional layers of networks trained from different initializations are not as similar as prevalently expected, at least in terms of subspace match.

IROS Conference 2015 Conference Paper

Identification and reconstruction of complex weld geometry based on modified entropy

  • Soheil Keshmiri
  • Yan Zhi Tan
  • Xin Zheng
  • Syeda Mariam Ahmed
  • Yue Wu
  • Wen Feng Lu
  • Chee-Meng Chew
  • Chee Khiang Pang

In this paper, a modified entropy-based algorithm is proposed for identification and reconstruction of a complex weld geometry. The edge of the weld geometry is identified based on minimizing a modified entropy-type cost function, and the weld geometry is reconstructed based on the detected edge. In addition, the volume of the weld geometry is computed using the point cloud samples of the identified weld geometry, and the effects of Gaussian noise are also considered. Our simulation results using the proposed reconstruction algorithm demonstrate efficient identification and reconstruction of a complex weld geometry in the presence of Gaussian noise.

NeurIPS Conference 2014 Conference Paper

Gaussian Process Volatility Model

  • Yue Wu
  • José Miguel Hernández-Lobato
  • Zoubin Ghahramani

The prediction of time-changing variances is an important task in the modeling of financial data. Standard econometric models are often limited as they assume rigid functional relationships for the evolution of the variance. Moreover, functional parameters are usually learned by maximum likelihood, which can lead to overfitting. To address these problems we introduce GP-Vol, a novel non-parametric model for time-changing variances based on Gaussian Processes. This new model can capture highly flexible functional relationships for the variances. Furthermore, we introduce a new online algorithm for fast inference in GP-Vol. This method is much faster than current offline inference procedures and it avoids overfitting problems by following a fully Bayesian approach. Experiments with financial data show that GP-Vol performs significantly better than current standard alternatives.

ICML Conference 2013 Conference Paper

Dynamic Covariance Models for Multivariate Financial Time Series

  • Yue Wu
  • José Miguel Hernández-Lobato
  • Zoubin Ghahramani

The accurate prediction of time-changing covariances is an important problem in the modeling of multivariate financial data. However, some of the most popular models suffer from a) overfitting problems and multiple local optima, b) failure to capture shifts in market conditions and c) large computational costs. To address these problems we introduce a novel dynamic model for time-changing covariances. Over-fitting and local optima are avoided by following a Bayesian approach instead of computing point estimates. Changes in market conditions are captured by assuming a diffusion process in parameter values, and finally computationally efficient and scalable inference is performed using particle filters. Experiments with financial data show excellent performance of the proposed method with respect to current standard models.