Arrow Research search

Author name cluster

Yifan Xing

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers

5

NeurIPS Conference 2025 Conference Paper

Salient Concept-Aware Generative Data Augmentation

  • Tianchen Zhao
  • Xuanbai Chen
  • Zhihua Li
  • Jun Fang
  • Dongsheng An
  • Xiang Xu
  • Zhuowen Tu
  • Yifan Xing

Recent generative data augmentation methods conditioned on both image and text prompts struggle to balance fidelity and diversity, as it is challenging to preserve essential image details while aligning with varied text prompts. This challenge arises because representations in the synthesis process often become entangled with non-essential input image attributes such as environmental contexts, creating conflicts with text prompts intended to modify these elements. To address this, we propose a personalized image generation framework that uses a salient concept-aware image embedding model to reduce the influence of irrelevant visual details during the synthesis process, thereby maintaining intuitive alignment between image and text inputs. By generating images that better preserve class-discriminative features with additional controlled variations, our framework effectively enhances the diversity of training datasets and thereby improves the robustness of downstream models. Our approach demonstrates superior performance across eight fine-grained vision datasets, outperforming state-of-the-art augmentation methods with average classification accuracy improvements of 0.73% and 6.5% under conventional and long-tail settings, respectively.

IROS Conference 2024 Conference Paper

Object-based SLAM Using Superquadrics

  • Yifan Xing
  • Noe Samano
  • Wen Fan 0001
  • Andrew Calway

Visual SLAM uses visual information, typically point features, to localise a camera and, at the same time, map the environment. In recent years, there has been interest in using scene-understanding capabilities to enhance the mapping process, and object-level SLAM systems have appeared in response. However, most previous work relies on pre-stored object models or pre-trained networks to represent the objects, which limits working scenarios or uses representations with limited scope, such as cubes or quadrics. To address this, we propose to use superquadrics as the object representation and, in this paper, present a proof-of-principle SLAM system in which object-based mapping is fully integrated with camera tracking via keyframe optimisation. The system was tested on simulated and real datasets, and the results show that it can achieve a lightweight and comparatively good object representation whilst also giving good camera trajectory estimates under certain scenarios.

ICLR Conference 2024 Conference Paper

Threshold-Consistent Margin Loss for Open-World Deep Metric Learning

  • Qin Zhang
  • Linghan Xu
  • Jun Fang
  • Qingming Tang
  • Ying Nian Wu
  • Joseph Tighe
  • Yifan Xing

Existing losses used in deep metric learning (DML) for image retrieval often lead to highly non-uniform intra-class and inter-class representation structures across test classes and data distributions. When combined with the common practice of using a fixed threshold to declare a match, this gives rise to significant performance variations in terms of false accept rate (FAR) and false reject rate (FRR) across test classes and data distributions. We define this issue in DML as threshold inconsistency. In real-world applications, such inconsistency often complicates the threshold selection process when deploying large-scale image retrieval systems. To measure this inconsistency, we propose a novel variance-based metric called Operating-Point-Inconsistency-Score (OPIS) that quantifies the variance in the operating characteristics across classes. Using the OPIS metric, we find that achieving high accuracy levels in a DML model does not automatically guarantee threshold consistency. In fact, our investigation reveals a Pareto frontier in the high-accuracy regime, where existing methods to improve accuracy often lead to degradation in threshold consistency. To address this trade-off, we introduce the Threshold-Consistent Margin (TCM) loss, a simple yet effective regularization technique that promotes uniformity in representation structures across classes by selectively penalizing hard sample pairs. Large-scale experiments demonstrate TCM's effectiveness in enhancing threshold consistency while preserving accuracy, simplifying the threshold selection process in practical DML settings.

ICRA Conference 2023 Conference Paper

Tac-VGNN: A Voronoi Graph Neural Network for Pose-Based Tactile Servoing

  • Wen Fan 0001
  • Max Yang
  • Yifan Xing
  • Nathan F. Lepora
  • Dandan Zhang 0001

Tactile pose estimation and tactile servoing are fundamental capabilities of robot touch. Reliable and precise pose estimation can be provided by applying deep learning models to high-resolution optical tactile sensors. Given the recent successes of Graph Neural Networks (GNNs) and the effectiveness of Voronoi features, we developed a Tactile Voronoi Graph Neural Network (Tac-VGNN) to achieve reliable pose-based tactile servoing relying on a biomimetic optical tactile sensor (TacTip). The GNN is well suited to modeling the distribution relationship between shear motions of the tactile markers, while the Voronoi diagram supplements this with area-based tactile features related to contact depth. The experimental results showed that the Tac-VGNN model significantly enhances data interpretability during graph generation and model training efficiency compared with CNN-based methods. It also improved pose estimation accuracy along vertical depth by 28.57% over a vanilla GNN without Voronoi features and achieved better performance on real surface-following tasks with smoother robot control trajectories. For more project details, please view our website: https://sites.google.com/view/tac-vgnn/home