Author name cluster

Yi Han

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

2 author rows

ICRA Conference 2025 Conference Paper

NeRF-Based Transparent Object Grasping Enhanced by Shape Priors

Yi Han
Zixin Lin
Dongjie Li
Lvping Chen
Yongliang Shi
Gan Ma

Transparent object grasping remains a persistent challenge in robotics, largely due to the difficulty of acquiring precise 3D information. Conventional optical 3D sensors struggle to capture transparent objects, and machine learning methods are often hindered by their reliance on high-quality datasets. Leveraging NeRF's capability for continuous spatial opacity modeling, our proposed architecture integrates a NeRF-based approach for reconstructing the 3D information of transparent objects. Despite this, certain portions of the reconstructed 3D information may remain incomplete. To address these deficiencies, we introduce a shape-prior-driven completion mechanism, further refined by a geometric pose estimation method we have developed. This allows us to obtain complete and reliable 3D information about transparent objects. Using this refined data, we perform scene-level grasp prediction and deploy the results in real-world robotic systems. Experimental validation demonstrates the efficacy of our architecture, showcasing its capability to reliably capture 3D information of various transparent objects in cluttered scenes and, correspondingly, to achieve high-quality, stable, and executable grasp predictions.

NeurIPS Conference 2025 Conference Paper

Reverse Diffusion Sequential Monte Carlo Samplers

Luhuan Wu
Yi Han
Christian Andersson Naesseth
John Cunningham

We propose a novel sequential Monte Carlo (SMC) method for sampling from unnormalized target distributions based on a reverse denoising diffusion process. While recent diffusion-based samplers simulate the reverse diffusion using approximate score functions, they can suffer from accumulating errors due to time discretization and imperfect score estimation. In this work, we introduce a principled SMC framework that formalizes diffusion-based samplers as proposals while systematically correcting for their biases. The core idea is to construct informative intermediate target distributions that progressively steer the sampling trajectory toward the final target distribution. Although ideal intermediate targets are intractable, we develop exact approximations using quantities from the score estimation-based proposal, without requiring additional model training or inference overhead. The resulting sampler, termed Reverse Diffusion Sequential Monte Carlo, enables consistent sampling and unbiased estimation of the target's normalization constant under mild conditions. We demonstrate the effectiveness of our method on a range of synthetic targets and real-world Bayesian inference problems.

NeurIPS Conference 2025 Conference Paper

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

Enshen Zhou
Jingkun An
Cheng Chi
Yi Han
Shanyu Rong
Chi Zhang
Pengwei Wang
Zhongyuan Wang

Spatial referring is a fundamental capability of embodied robots to interact with the 3D physical world. However, even with the powerful pretrained VLMs, recent approaches are still not qualified to accurately understand the complex 3D scenes and dynamically reason about the instruction-indicated locations for interaction. To this end, we propose RoboRefer, a 3D-aware vision language model (VLM) that can first achieve precise spatial understanding by integrating a disentangled but dedicated depth encoder via supervised fine-tuning (SFT). Moreover, RoboRefer advances generalized multi-step spatial reasoning via reinforcement fine-tuning (RFT), with metric-sensitive process reward functions tailored for spatial referring tasks. To support SFT and RFT training, we introduce RefSpatial, a large-scale dataset of 20M QA pairs (2x prior), covering 31 spatial relations (vs. 15 prior) and supporting complex reasoning processes (up to 5 steps). In addition, we introduce RefSpatial-Bench, a challenging benchmark filling the gap in evaluating spatial referring with multi-step reasoning. Experiments show that SFT-trained RoboRefer achieves state-of-the-art spatial understanding, with an average success rate of 89.6%. RFT-trained RoboRefer further outperforms all other baselines by a large margin, even surpassing Gemini-2.5-Pro by 12.4% in average accuracy on RefSpatial-Bench. Notably, RoboRefer can be integrated with various control policies to execute long-horizon, dynamic tasks across diverse robots (e.g., UR5, G1 humanoid) in cluttered real-world scenes.

EAAI Journal 2025 Journal Article

Ship fuel consumption prediction based on transfer learning: Models and applications

Xi Luo
Mingyang Zhang
Yi Han
Ran Yan
Shuaian Wang

IJCAI Conference 2022 Conference Paper

Modeling Precursors for Temporal Knowledge Graph Reasoning via Auto-encoder Structure

Yifu Gao
Linhui Feng
Zhigang Kan
Yi Han
Linbo Qiao
Dongsheng Li

Temporal knowledge graph (TKG) reasoning that infers missing facts in the future is an essential and challenging task. When predicting a future event, there must be a narrative evolutionary process composed of closely related historical facts to support the event's occurrence, namely fact precursors. However, most existing models employ a sequential reasoning process in an auto-regressive manner, which cannot capture precursor information. This paper proposes a novel auto-encoder architecture that introduces a relation-aware graph attention layer into transformer (rGalT) to accommodate inference over the TKG. Specifically, we first calculate the correlation between historical and predicted facts through multiple attention mechanisms along intra-graph and inter-graph dimensions, then constitute these mutually related facts into diverse fact segments. Next, we borrow the translation generation idea to decode in parallel the precursor information associated with the given query, which enables our model to infer future unknown facts by progressively generating graph structures. Experimental results on four benchmark datasets demonstrate that our model outperforms other state-of-the-art methods, and precursor identification provides supporting evidence for prediction.

YNICL Journal 2021 Journal Article

Uncinate fasciculus and its cortical terminals in aphasia after subcortical stroke: A multi-modal MRI study

Binlong Zhang
Jingling Chang
Joel Park
Zhongjian Tan
Lu Tang
Tianli Lyu
Yi Han
Ruiwen Fan