Arrow Research

Author name cluster

Yuxin Deng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
1 author row

Possible papers

12

AAAI Conference 2026 Conference Paper

SGPFeat: Semantic and Geometric Priors for Multi-modal Image Matching

  • Yuxin Deng
  • Botian Wang
  • Kaining Zhang
  • Hao Zhang
  • Jiayi Ma

Multi-modal image matching is a fundamental task in multi-view and multi-modal image processing. Its key challenge lies in extracting features that remain consistent despite drastic appearance variations across modalities. However, learning such features is hindered by the scarcity and inaccurate alignment of existing multi-modal datasets. To address this, we propose a knowledge distillation framework, termed SGPFeat, that transfers rich prior knowledge from large-scale unimodal tasks to enhance multi-modal representation learning. Specifically, semantic priors from a vision foundation model guide the feature extractor to identify shared semantic structures across modalities, enabling better generalization under large appearance gaps. In parallel, geometric priors derived from accurately aligned visible-light datasets improve detection precision on noisily aligned multi-modal pairs. Furthermore, we introduce a Heterogeneous Feature Aggregation (HFA) module to facilitate effective distillation and feature representation. Extensive experiments demonstrate that semantic and geometric priors bring significant improvements to SGPFeat across diverse multi-modal image matching benchmarks.
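
A minimal PyTorch sketch of how such a prior-guided objective might be assembled: a semantic term that pulls student features toward frozen foundation-model features, plus a geometric keypoint term supervised by labels transferred from aligned visible-light data. The module name, the 1x1 projection, and the loss weights are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticDistillLoss(nn.Module):
    """Pull student feature maps toward frozen foundation-model features (assumed form)."""

    def __init__(self, student_dim: int, teacher_dim: int):
        super().__init__()
        # 1x1 projection so the student can match the teacher's channel width.
        self.proj = nn.Conv2d(student_dim, teacher_dim, kernel_size=1)

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        # student_feat: (B, Cs, H, W); teacher_feat: (B, Ct, H, W), teacher is frozen.
        s = F.normalize(self.proj(student_feat), dim=1)
        t = F.normalize(teacher_feat.detach(), dim=1)
        # Cosine-style alignment: maximize per-pixel similarity with the teacher.
        return (1.0 - (s * t).sum(dim=1)).mean()


def sgp_style_loss(student_feat, teacher_feat, kp_logits, kp_targets,
                   semantic_loss: SemanticDistillLoss, w_sem=1.0, w_geo=1.0):
    """Combine the semantic-prior term with a geometric (keypoint) supervision term."""
    l_sem = semantic_loss(student_feat, teacher_feat)
    # Geometric prior: keypoint labels transferred from accurately aligned
    # visible-light data (a plain BCE placeholder here).
    l_geo = F.binary_cross_entropy_with_logits(kp_logits, kp_targets)
    return w_sem * l_sem + w_geo * l_geo
```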

AAAI Conference 2025 Conference Paper

BEVSync: Asynchronous Data Alignment for Camera-based Vehicle-Infrastructure Cooperative Perception Under Uncertain Delays

  • Wentao Wang
  • Jiaqian Wang
  • Yuxin Deng
  • Guang Tan

Vehicle-to-infrastructure (V2I) cooperative perception systems can enhance the sensing abilities of autonomous vehicles. Existing V2I solutions often rely on LiDAR devices rather than cameras, even though cameras are the most prevalent sensors thanks to their low cost and wide deployment. In addition, a major challenge that has been underexplored is the time asynchrony between image frames from different sources. This asynchrony arises from clock differences and varying data processing and transmission times, causing uncertain delays that complicate data alignment and can reduce perception accuracy. We propose BEVSync, a camera-based V2I cooperative perception system that adaptively aligns frames from the ego-vehicle and the infrastructure by compensating for motion deviations. Specifically, we develop an extractor-compensator model that extracts and predicts perceptual features from historical frames, thereby smoothing out the data misalignment. Experiments on the real-world dataset DAIR-V2X show that our approach surpasses existing methods in both performance and robustness.
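
Purely to illustrate the idea of compensating delayed infrastructure features before fusion (the module name, interface, and delay encoding below are assumptions, not the BEVSync architecture), a small PyTorch sketch could look like this:

```python
import torch
import torch.nn as nn


class DelayCompensator(nn.Module):
    """Predict the current infrastructure BEV feature from delayed historical frames."""

    def __init__(self, channels: int, history: int):
        super().__init__()
        # Fuse stacked historical features with a per-pixel encoding of the delay.
        self.fuse = nn.Sequential(
            nn.Conv2d(channels * history + 1, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, history_feats: torch.Tensor, delay: torch.Tensor) -> torch.Tensor:
        # history_feats: (B, T, C, H, W) delayed infrastructure BEV features;
        # delay: (B,) estimated latency of the most recent frame, in seconds.
        b, t, c, h, w = history_feats.shape
        stacked = history_feats.reshape(b, t * c, h, w)
        delay_map = delay.view(b, 1, 1, 1).expand(b, 1, h, w)
        # Residual prediction: start from the latest frame and add a learned correction.
        latest = history_feats[:, -1]
        return latest + self.fuse(torch.cat([stacked, delay_map], dim=1))


# Shape-level usage example.
comp = DelayCompensator(channels=64, history=3)
feats = torch.randn(2, 3, 64, 100, 100)
delay = torch.tensor([0.12, 0.30])
aligned = comp(feats, delay)  # (2, 64, 100, 100), ready to fuse with ego-vehicle features
```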

AAAI Conference 2024 Conference Paper

ResMatch: Residual Attention Learning for Feature Matching

  • Yuxin Deng
  • Kaining Zhang
  • Shihua Zhang
  • Yansheng Li
  • Jiayi Ma

Attention-based graph neural networks have made great progress in feature matching. However, the literature lacks a comprehensive understanding of how the attention mechanism operates for feature matching. In this paper, we rethink cross- and self-attention from the viewpoint of traditional feature matching and filtering. To facilitate the learning of matching and filtering, we incorporate descriptor similarity into cross-attention and relative positions into self-attention. In this way, attention can concentrate on learning residual matching and filtering functions with reference to the basic functions of measuring visual and spatial correlation. Moreover, we leverage descriptor similarity and relative positions to extract inter- and intra-neighbors, so that sparse attention for each point can be performed only within its neighborhoods for higher computational efficiency. Extensive experiments, including feature matching, pose estimation, and visual localization, confirm the superiority of the proposed method. Our code is available at https://github.com/ACuOoOoO/ResMatch.
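
As a hypothetical single-head sketch of the "residual" idea (descriptor similarity injected as a prior on cross-attention logits, so the learned part only models the residual), with dimensions and scaling chosen for exposition rather than taken from the ResMatch code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualCrossAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor,
                desc_a: torch.Tensor, desc_b: torch.Tensor) -> torch.Tensor:
        # x_a: (B, N, D) and x_b: (B, M, D) features; desc_a, desc_b: raw descriptors.
        logits = torch.einsum("bnd,bmd->bnm", self.q(x_a), self.k(x_b)) * self.scale
        # Prior: cosine similarity of the original descriptors serves as the "basic"
        # matching function; the learned logits then capture the residual.
        prior = torch.einsum("bnd,bmd->bnm",
                             F.normalize(desc_a, dim=-1),
                             F.normalize(desc_b, dim=-1))
        attn = torch.softmax(logits + prior, dim=-1)
        return x_a + attn @ self.v(x_b)
```

An analogous self-attention variant would add a relative-position prior to the logits instead of descriptor similarity.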

AAAI Conference 2024 Conference Paper

SDGMNet: Statistic-Based Dynamic Gradient Modulation for Local Descriptor Learning

  • Yuxin Deng
  • Jiayi Ma

Rescaling the backpropagated gradient of contrastive loss has brought significant progress in descriptor learning. However, current gradient modulation strategies do not account for the varying distribution of global gradients, so they suffer when training phases or datasets change. In this paper, we propose a dynamic gradient modulation method, named SDGMNet, for contrastive local descriptor learning. The core of our method is formulating modulation functions with dynamically estimated statistical characteristics. First, after a detailed analysis of the backpropagation of pair-wise losses, we introduce angle as the distance measure. On this basis, auto-focus modulation moderates the impact of statistically uncommon individual pairs in stochastic gradient descent optimization; a probabilistic margin cuts off the gradients of triplets that have already been sufficiently optimized; and power adjustment balances the total weights of negative and positive pairs. Extensive experiments demonstrate that our novel descriptor surpasses previous state-of-the-art methods in several tasks, including patch verification, retrieval, pose estimation, and 3D reconstruction.
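
To illustrate the general flavor only (SDGMNet's actual modulation functions differ; the Gaussian weighting, margin rule, and balancing below are assumptions), a hedged PyTorch sketch of statistic-based pair weighting might be:

```python
import torch


def modulated_contrastive_loss(pos_dist: torch.Tensor, neg_dist: torch.Tensor,
                               num_std: float = 2.0) -> torch.Tensor:
    # pos_dist / neg_dist: angular distances of positive / hardest-negative pairs, shape (B,).
    # Weights are detached so they only rescale the backpropagated gradient.
    def focus(d: torch.Tensor) -> torch.Tensor:
        z = (d - d.mean()) / (d.std() + 1e-8)
        return torch.exp(-0.5 * z ** 2).detach()

    # "Probabilistic margin": drop negatives already farther than mean + k * std.
    margin = (neg_dist.mean() + num_std * neg_dist.std()).detach()
    active = (neg_dist < margin).float()

    w_pos = focus(pos_dist)
    w_neg = focus(neg_dist) * active
    # "Power adjustment": balance the total weight of negatives against positives.
    w_neg = w_neg * (w_pos.sum() / (w_neg.sum() + 1e-8))

    # Pull positives together; push still-active negatives out to the margin.
    return (w_pos * pos_dist).mean() + (w_neg * (margin - neg_dist).clamp(min=0)).mean()
```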