Author name cluster

Jieping Ye

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

127 papers

2 author rows

AAAI Conference 2026 Conference Paper

Bridging the Language Gap: Uncovering and Aligning Shared Circuits for Multi-Hop Reasoning in Multilingual LLMs

Chenghao Sun
Zhen Huang
Yonggang Zhang
Xinmei Tian
Xu Shen
Jieping Ye

Large language models (LLMs) present a paradox: they can correctly answer a multi-hop factual query in a high-resource language like English, yet fail on the identical query in another language. This raises a fundamental question about the nature of multilingual knowledge: are facts missing, or merely inaccessible? The underlying mechanisms for this knowledge gap have remained largely unexplored. In this work, we resolve this question by introducing a mechanistic interpretability framework that traces the causal pathways of multi-hop knowledge reasoning. Our analysis reveals a core, non-obvious finding: cross-lingual inconsistencies do not stem from a knowledge deficit. Instead, factual knowledge is robustly stored in a set of **shared, language-agnostic semantic neurons**. The failure originates from **misaligned attention pathways**, where a common set of critical attention heads fails to correctly route information along the reasoning chain to the appropriate knowledge neurons in lower-resource languages. This mechanistic diagnosis motivates a targeted alignment strategy: a surgical fine-tuning of only these critical heads. Experiments demonstrate that our method achieves significant improvements in multilingual multi-hop factuality—with positive cross-lingual transfer—while uniquely preserving general model capabilities, offering a scalable and mechanistically-grounded approach to building more reliable multilingual models.

AAAI Conference 2026 Conference Paper

Enhancing Spatial Reasoning Through Visual and Textual Thinking

Xun Liang
Xin Guo
Zhongming Jin
Weihang Pan
Penghui Shang
Deng Cai
Binbin Lin
Jieping Ye

The spatial reasoning task aims to reason about spatial relationships in 2D and 3D space, which is a fundamental capability for Visual Question Answering (VQA) and robotics. Although vision language models (VLMs) have developed rapidly in recent years, they still struggle with the spatial reasoning task. In this paper, we introduce a method that can enhance Spatial reasoning through Visual and Textual thinking Simultaneously (SpatialVTS). In the spatial visual thinking phase, our model is trained to generate location-related specific tokens of important targets automatically. Not only are the objects mentioned in the problem addressed, but also the potential objects related to the reasoning are considered. During the spatial textual thinking phase, our model conducts long-term thinking based on visual cues and dialogues and gradually infers the answers to spatial reasoning problems. To effectively support the model's training, we made manual corrections to the existing spatial reasoning dataset, eliminating numerous incorrect labels resulting from automatic annotation, restructuring the data input format to enhance generalization, and developing a reasoning framework for model thinking. Without introducing any additional information (such as masks or depth), our model achieves a significantly higher overall average performance on several spatial understanding tasks than other models.

AAAI Conference 2026 Conference Paper

FGD-Align: Pluralistic Alignment for Large Language Models via Fuzzy Group Decision-Making

Weihang Pan
Zhengxu Yu
Yong Wu
Xun Liang
Zhongming Jin
Qiang Fu
Penghui Shang
Binbin Lin

Ensuring alignment with human values is essential for modern large language models (LLMs), especially amid growing concerns around AI safety and social impact. Yet achieving such alignment remains challenging due to the limited, noisy, and often conflicting nature of human feedback from diverse annotators. Most existing approaches, such as Direct Preference Optimization (DPO), assume consistent and conflict-free supervision, overlooking the ambiguity, inconsistency, and value trade-offs inherent in real-world preferences—often leading to reduced robustness and exclusion of minority views. To address this, we propose FGD-Align, a novel pluralistic alignment framework grounded in Fuzzy Group Decision-Making theory. Our approach rigorously models and aggregates human preferences while retaining the complexity of real-world value trade-offs. Unlike traditional methods that rely on coarse-grained preference pairs, FGD-Align introduces fuzzy preference modeling via triangular fuzzy numbers to capture nuanced, multi-criteria human judgments. We further develop a new training objective, Probabilistic Fuzzy DPO, which incorporates fuzzy preference strength as adaptive loss weights and gradient filters, enhancing robustness to ambiguity and inconsistency in feedback. Comprehensive experiments demonstrate that FGD-Align consistently outperforms both DPO variants and advanced preference aggregation methods in terms of preference accuracy and robustness to ambiguity. It achieves superior alignment stability and better preserves minority preferences, all with minimal computational overhead. Our work bridges the gap between algorithmic tractability and the nuanced landscape of human values, enabling more scalable, inclusive, and socially-aware AI alignment.

AAAI Conference 2026 Conference Paper

Flora: Effortless Context Construction to Arbitrary Length and Scale

Tianxiang Chen
Zhentao Tan
Xiaofan Bo
Yue Wu
Tao Gong
Qi Chu
Jieping Ye

Effectively handling long contexts is challenging for Large Language Models (LLMs) due to the rarity of long texts, high computational demands, and substantial forgetting of short-context abilities. Recent approaches have attempted to construct long contexts for instruction tuning, but these methods often require LLMs or human interventions, which are both costly and limited in length and diversity. In addition, current long-context LLMs still show a significant drop in short-context performance. In this paper, we introduce Flora, an effortless (human/LLM-free) long-context construction strategy. Flora can markedly enhance the long-context performance of LLMs by arbitrarily assembling short instructions based on categories and instructing LLMs to generate responses based on long-context meta-instructions. This enables Flora to produce contexts of arbitrary length and scale with rich diversity, while only slightly compromising short-context performance. Experiments on Llama3-8B-Instruct and QwQ-32B show that LLMs enhanced by Flora excel on three long-context benchmarks while maintaining strong performance on short-context tasks.

AIIM Journal 2025 Journal Article

CATI: A medical context-enhanced framework for diagnosis code assignment in the UK Biobank study

Yue Shen
Jie Wang
Zhe Wang
Zhihao Shi
Hanzhu Chen
Zheng Wang
Yukang Jiang
Xiaopu Wang

Diagnosis codes are a standard coding format for diseases or medical conditions. This study aims to assign diagnosis codes to patients in large-scale biobanks, particularly addressing the issue of missing codes for some patients. This is crucial for downstream disease-related tasks. While recent methods primarily rely on structured biobank data for code assignment, they often overlook the valuable medical context provided by textual information in the biobanks and the hierarchical structure of the disease coding system. To address this gap, we have developed CATI, a medical context-enhanced framework for diagnosis Code Assignment by integrating Textual details derived from key features and disease hIerarchy. The study is based on the UK Biobank data and considers Phecodes and ICD-10 codes as standard disease formats. We start by representing ten informative codified features using their formal names and then integrate them into CATI as text embeddings, achieved through prompt tuning on the pre-trained language model BioBERT. Recognizing the hierarchical structure of diagnosis codes, we have developed a novel convolution layer in our method that effectively propagates logits between adjacent diagnosis codes. Evaluation results demonstrate that CATI outperforms existing state-of-the-art methods in terms of both Phecodes and ICD-10 codes, boasting at least a 5.16% improvement in average AUROC for unseen disease codes and an 8.68% rise in average AUPRC for disease codes with training instances ranging in (1000, 10000]. This framework contributes to the formation of well-defined cohorts for downstream studies and offers a unique perspective for addressing complex healthcare tasks by incorporating vital medical context.

ICRA Conference 2025 Conference Paper

CoL3D: Collaborative Learning of Single-view Depth and Camera Intrinsics for Metric 3D Shape Recovery

Chenghao Zhang
Lubin Fan
Shen Cao
Bojian Wu
Jieping Ye

Recovering the metric 3D shape from a single image is particularly relevant for robotics and embodied intelligence applications, where accurate spatial understanding is crucial for navigation and interaction with environments. Usually, the mainstream approaches achieve it through monocular depth estimation. However, without camera intrinsics, the 3D metric shape cannot be recovered from depth alone. In this study, we theoretically demonstrate that depth serves as a 3D prior constraint for estimating camera intrinsics and uncover the reciprocal relations between these two elements. Motivated by this, we propose a collaborative learning framework for jointly estimating depth and camera intrinsics, named CoL3D, to learn metric 3D shapes from single images. Specifically, CoL3D adopts a unified network and performs collaborative optimization at three levels: depth, camera intrinsics, and 3D point clouds. For camera intrinsics, we design a canonical incidence field mechanism as a prior that enables the model to learn the residual incident field for enhanced calibration. Additionally, we incorporate a shape similarity measurement loss in the point cloud space, which improves the quality of 3D shapes essential for robotic applications. As a result, when training and testing on a single dataset with in-domain settings, CoL3D delivers outstanding performance in both depth estimation and camera calibration across several indoor and outdoor benchmark datasets, which leads to remarkable 3D shape quality for the perception capabilities of robots.

NeurIPS Conference 2025 Conference Paper

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Kongcheng Zhang
Qi Yao
Shunyu Liu
Yingjie Wang
Baisheng Lai
Jieping Ye
Mingli Song
Dacheng Tao

Recent advances in Reinforcement Learning (RL) have highlighted its potential in complex reasoning tasks, yet effective training often relies on external supervision, which limits its broader applicability. In this work, we propose a novel self-rewarding reinforcement learning framework to enhance Large Language Model (LLM) reasoning by leveraging the consistency of intermediate reasoning states across different reasoning trajectories. Our key insight is that correct responses often exhibit consistent trajectory patterns in terms of model likelihood: their intermediate reasoning states tend to converge toward their own final answers (high consistency) with minimal deviation toward other candidates (low volatility). Inspired by this observation, we introduce CoVo, an intrinsic reward mechanism that integrates Consistency and Volatility via a robust vector-space aggregation strategy, complemented by a curiosity bonus to promote diverse exploration. CoVo enables LLMs to perform RL in a self-rewarding manner, offering a scalable pathway for learning to reason without external supervision. Extensive experiments on diverse reasoning benchmarks show that CoVo achieves performance comparable to or even surpassing supervised RL. Our code is available at https://github.com/sastpg/CoVo.