Arrow Research search

Author name cluster

Junfeng Zhao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
1 author row

Possible papers

12

AAAI Conference 2026 Conference Paper

Toward Better EHR Reasoning in LLMs: Reinforcement Learning with Expert Attention Guidance

  • Yue Fang
  • Yuxin Guo
  • Jiaran Gao
  • Hongxin Ding
  • Xinke Jiang
  • Weibin Liao
  • Yongxin Xu
  • Yinghao Zhu

Improving large language models (LLMs) for electronic health record (EHR) reasoning is essential for enabling accurate and generalizable clinical predictions. While LLMs excel at medical text understanding, they underperform on EHR-based prediction tasks due to challenges in modeling temporally structured, high-dimensional data. Existing approaches often rely on hybrid paradigms, where LLMs serve merely as frozen prior retrievers while downstream deep learning (DL) models handle prediction, failing to improve the LLM’s intrinsic reasoning capacity and inheriting the generalization limitations of DL models. To this end, we propose EAG-RL, a novel two-stage training framework designed to intrinsically enhance LLMs’ EHR reasoning ability through expert attention guidance, where expert EHR models refer to task-specific DL models trained on EHR data. Concretely, EAG-RL first constructs high-quality, stepwise reasoning trajectories using expert-guided Monte Carlo Tree Search to effectively initialize the LLM’s policy. Then, EAG-RL further optimizes the policy via reinforcement learning by aligning the LLM’s attention with clinically salient features identified by expert EHR models. Extensive experiments on two real-world EHR datasets show that EAG-RL improves the intrinsic EHR reasoning ability of LLMs by an average of 14.62%, while also enhancing robustness to feature perturbations and generalization to unseen clinical domains. These results demonstrate the practical potential of EAG-RL for real-world deployment in clinical prediction tasks.
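The attention-alignment idea in the second stage can be illustrated with a toy reward term. This is a sketch, not the paper's actual formulation: the reward shape, KL direction, and normalization below are assumptions. It scores how closely the LLM's attention over EHR features matches an expert model's saliency scores:

```python
import math

def attention_alignment_reward(llm_attention, expert_saliency, eps=1e-8):
    """Reward the LLM for attending to features an expert EHR model deems salient.

    Both inputs are non-negative scores over the same EHR features; we
    normalize them to distributions and return the negative KL divergence
    KL(expert || llm), so perfect alignment gives reward 0 and any
    mismatch gives a negative reward.
    """
    z_a = sum(llm_attention) + eps
    z_e = sum(expert_saliency) + eps
    p = [a / z_a for a in llm_attention]      # LLM attention distribution
    q = [e / z_e for e in expert_saliency]    # expert saliency distribution
    kl = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return -kl

# Attention concentrated on the same features as the expert scores higher.
aligned = attention_alignment_reward([0.7, 0.2, 0.1], [0.7, 0.2, 0.1])
misaligned = attention_alignment_reward([0.1, 0.2, 0.7], [0.7, 0.2, 0.1])
```

Such a term would be added to the task reward during RL, trading off prediction accuracy against attention fidelity.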

AAAI Conference 2025 Conference Paper

DearLLM: Enhancing Personalized Healthcare via Large Language Models-Deduced Feature Correlations

  • Yongxin Xu
  • Xinke Jiang
  • Xu Chu
  • Rihong Qiu
  • Yujie Feng
  • Hongxin Ding
  • Junfeng Zhao
  • Yasha Wang

Exploring the correlations between medical features is essential for extracting patient health patterns from electronic health record (EHR) data and for strengthening medical predictions and decision-making. To constrain the hypothesis space of purely data-driven deep learning in the context of limited annotated data, a common trend is to incorporate external knowledge, especially knowledge priors related to personalized health contexts, to optimize model training. However, most existing methods lack flexibility and are constrained by the uncertainties brought about by fixed feature correlation priors. In addition, when utilizing knowledge, these methods overlook knowledge that is informative for personalized healthcare. To this end, we propose DearLLM, a novel and effective framework that leverages feature correlations deduced by large language models (LLMs) to enhance personalized healthcare. Concretely, DearLLM captures and learns quantitative correlations between medical features by calculating the conditional perplexity of LLMs' deduction based on personalized patient backgrounds. Then, DearLLM enhances healthcare predictions by emphasizing knowledge that carries unique patient information through a feature-frequency-aware graph pooling method. Extensive experiments on two real-world benchmark datasets show significant performance gains brought by DearLLM. Furthermore, the discovered findings align well with medical literature, offering meaningful clinical interpretations.
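The conditional-perplexity idea can be made concrete. Assuming (hypothetically) that per-token log-probabilities for the LLM's deduction are available, one natural correlation score is the drop in perplexity when the prompt is conditioned on the related feature; DearLLM's exact scoring may differ:

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a token sequence from per-token log-probabilities."""
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

def feature_correlation(logprobs_conditioned, logprobs_unconditioned):
    """Score how strongly feature A relates to feature B for one patient:
    the drop in the LLM's perplexity on a statement about B when the
    prompt is conditioned on A (plus the patient's background).
    A larger positive score suggests a stronger deduced correlation."""
    return perplexity(logprobs_unconditioned) - perplexity(logprobs_conditioned)

# Conditioning on a related feature makes the deduction less "surprising",
# so the score comes out positive. Log-probs here are hypothetical values.
score = feature_correlation(
    logprobs_conditioned=[-0.2, -0.3, -0.1],
    logprobs_unconditioned=[-1.1, -0.9, -1.4],
)
```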

AAAI Conference 2025 Conference Paper

KnowPO: Knowledge-Aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models

  • Ruizhe Zhang
  • Yongxin Xu
  • Yuzhen Xiao
  • Runchuan Zhu
  • Xinke Jiang
  • Xu Chu
  • Junfeng Zhao
  • Yasha Wang

By integrating external knowledge, Retrieval-Augmented Generation (RAG) has become an effective strategy for mitigating the hallucination problems that large language models (LLMs) encounter when dealing with knowledge-intensive tasks. However, in the process of integrating external non-parametric supporting evidence with internal parametric knowledge, inevitable knowledge conflicts may arise, leading to confusion in the model's responses. To enhance the knowledge selection of LLMs in various contexts, some research has focused on refining their behavior patterns through instruction-tuning. Nonetheless, due to the absence of explicit negative signals and comparative objectives, models fine-tuned in this manner may still exhibit undesirable behaviors such as contextual ignorance and contextual overinclusion. To this end, we propose a Knowledge-aware Preference Optimization strategy, dubbed KnowPO, aimed at achieving adaptive knowledge selection based on contextual relevance in real retrieval scenarios. Concretely, we propose a general paradigm for constructing knowledge conflict datasets that comprehensively cover various error types, and we train the model to avoid these negative behaviors through preference optimization. We also propose a rewriting strategy and a data-ratio optimization strategy to address preference imbalances. Experimental results show that KnowPO outperforms previous methods for handling knowledge conflicts by over 37%, while also exhibiting robust generalization across various out-of-distribution datasets.
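Preference optimization of the kind KnowPO builds on can be illustrated with a standard DPO-style loss for a single preference pair. This is a sketch of the general technique, not KnowPO's exact objective; all numbers are illustrative:

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct-preference-optimization loss for one preference pair.
    In a KnowPO-like setting, the chosen response would be the one with
    correct knowledge selection; the rejected one exhibits contextual
    ignorance or overinclusion. Lower loss means the policy prefers the
    chosen response more strongly than the reference model does."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# A policy that has learned the preference (positive margin) gets lower loss
# than one indifferent between the two responses (zero margin, loss = ln 2).
learned = dpo_loss(-5.0, -9.0, -7.0, -7.0)
unlearned = dpo_loss(-7.0, -7.0, -7.0, -7.0)
```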

NeurIPS Conference 2025 Conference Paper

MODEL SHAPLEY: Find Your Ideal Parameter Player via One Gradient Backpropagation

  • Chu Xu
  • Xinke Jiang
  • Rihong Qiu
  • Jiaran Gao
  • Junfeng Zhao

Measuring parameter importance is crucial for understanding and optimizing large language models (LLMs). Existing work predominantly focuses on pruning or probing at the neuron/feature level without fully considering the cooperative behaviors of model parameters. In this paper, we introduce a novel approach, Model Shapley, to quantify parameter importance based on the Shapley value, a principled method from cooperative game theory that captures both individual and synergistic contributions among parameters, using only one gradient backpropagation. We derive a scalable second-order approximation to compute Shapley values at the parameter level, leveraging blockwise Fisher information for tractability in large-scale settings. Our method enables fine-grained differentiation of parameter importance, facilitating targeted knowledge injection and model compression. Through mini-batch Monte Carlo updates and efficient approximation of the Hessian structure, we achieve robust Shapley-based attribution with only modest computational overhead. Experimental results indicate that this cooperative game perspective enhances interpretability, guides more effective parameter-specific fine-tuning and model compression, and paves the way for continuous model improvement in various downstream tasks.
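As a rough illustration of single-backward-pass importance scoring, the sketch below uses the common diagonal-Fisher approximation (F_ii ≈ g_i²). The paper's blockwise second-order Shapley approximation is more involved, so treat this only as a stand-in for the general idea:

```python
def fisher_importance(params, grads):
    """Crude per-parameter importance from one backward pass.
    With the diagonal-Fisher approximation F_ii ≈ g_i², the score for
    parameter i is ½ · g_i² · θ_i²: a second-order estimate of the loss
    increase from zeroing that parameter. This stands in for, but is not,
    the paper's blockwise Shapley approximation."""
    return [0.5 * (g * g) * (p * p) for p, g in zip(params, grads)]

scores = fisher_importance(params=[2.0, 0.5, -1.0], grads=[0.1, 2.0, 0.0])
# Rank parameters by importance: a large weight with a small gradient can
# matter less than a small weight the loss is sharply sensitive to.
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```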

NeurIPS Conference 2024 Conference Paper

RAGraph: A General Retrieval-Augmented Graph Learning Framework

  • Xinke Jiang
  • Rihong Qiu
  • Yongxin Xu
  • Wentao Zhang
  • Yichen Zhu
  • Ruizhe Zhang
  • Yuchen Fang
  • Xu Chu

Graph Neural Networks (GNNs) have become essential in interpreting relational data across various domains, yet they often struggle to generalize to unseen graph data that differs markedly from training instances. In this paper, we introduce a novel framework called General Retrieval-Augmented Graph Learning (RAGraph), which brings external graph data into the general graph foundation model to improve model generalization on unseen scenarios. At the top of our framework is a toy graph vector library that we established, which captures key attributes, such as features and task-specific label information. During inference, RAGraph adeptly retrieves similar toy graphs based on key similarities in downstream tasks, integrating the retrieved data to enrich the learning context via the message-passing prompting mechanism. Our extensive experimental evaluations demonstrate that RAGraph significantly outperforms state-of-the-art graph learning methods in multiple tasks such as node classification, link prediction, and graph classification across both dynamic and static datasets. Furthermore, extensive testing confirms that RAGraph consistently maintains high performance without the need for task-specific fine-tuning, highlighting its adaptability, robustness, and broad applicability.
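The retrieval step can be sketched simply: given precomputed graph embeddings (an assumption; the names and structures below are illustrative, not RAGraph's actual API), fetch the k toy graphs most similar to the query graph by cosine similarity:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def retrieve_toy_graphs(query_emb, library, k=2):
    """Return the k toy graphs whose embeddings are most similar to the
    query graph's embedding; their features/labels would then be injected
    into message passing as prompts. Embeddings are assumed precomputed."""
    ranked = sorted(library, key=lambda item: -cosine(query_emb, item["emb"]))
    return [item["name"] for item in ranked[:k]]

library = [
    {"name": "toy_a", "emb": [1.0, 0.0, 0.0]},
    {"name": "toy_b", "emb": [0.9, 0.1, 0.0]},
    {"name": "toy_c", "emb": [0.0, 0.0, 1.0]},
]
hits = retrieve_toy_graphs([1.0, 0.05, 0.0], library, k=2)  # nearest two graphs
```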

NeurIPS Conference 2024 Conference Paper

SMART: Towards Pre-trained Missing-Aware Model for Patient Health Status Prediction

  • Zhihao Yu
  • Xu Chu
  • Yujie Jin
  • Yasha Wang
  • Junfeng Zhao

Electronic health record (EHR) data has emerged as a valuable resource for analyzing patient health status. However, the prevalence of missing data in EHR poses significant challenges to existing methods, leading to spurious correlations and suboptimal predictions. While various imputation techniques have been developed to address this issue, they often fixate on difficult-to-interpolate details and may introduce additional noise when making clinical predictions. To tackle this problem, we propose SMART, a Self-Supervised Missing-Aware RepresenTation Learning approach for patient health status prediction, which encodes missing information via missing-aware temporal and variable attentions and learns to impute missing values through a novel self-supervised pre-training approach that reconstructs missing data representations in the latent space rather than in the input space, as is usual. By adopting elaborated attentions and focusing on learning higher-order representations, SMART promotes better generalization and robustness to missing data. We validate the effectiveness of SMART through extensive experiments on six EHR tasks, demonstrating its superiority over state-of-the-art methods.
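The latent-space reconstruction objective can be sketched as follows, with a toy linear encoder standing in for SMART's missing-aware attention encoder (all names, shapes, and the masking convention here are illustrative assumptions):

```python
def latent_reconstruction_loss(full_obs, masked_obs, encoder):
    """Self-supervised objective in the spirit of SMART: encode both the
    complete and the masked observation and penalize their distance in
    *latent* space, instead of reconstructing raw input values. `encoder`
    is any function mapping an observation to a latent vector."""
    z_full = encoder(full_obs)
    z_masked = encoder(masked_obs)
    return sum((a - b) ** 2 for a, b in zip(z_full, z_masked)) / len(z_full)

# Toy linear encoder; a real model would use missing-aware attention.
def toy_encoder(x):
    return [x[0] + x[1], x[1] - x[2]]

full = [1.0, 2.0, 3.0]
masked = [1.0, 0.0, 3.0]  # second variable masked out (zeroed)
loss = latent_reconstruction_loss(full, masked, toy_encoder)
```

Minimizing this loss pushes the encoder to infer the masked variable's contribution from the remaining context, rather than to reproduce its raw value.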

NeurIPS Conference 2023 Conference Paper

Fused Gromov-Wasserstein Graph Mixup for Graph-level Classifications

  • Xinyu Ma
  • Xu Chu
  • Yasha Wang
  • Yang Lin
  • Junfeng Zhao
  • Liantao Ma
  • Wenwu Zhu

Graph data augmentation has shown superiority in enhancing generalizability and robustness of GNNs in graph-level classifications. However, existing methods primarily focus on the augmentation in the graph signal space and the graph structure space independently, neglecting the joint interaction between them. In this paper, we address this limitation by formulating the problem as an optimal transport problem that aims to find an optimal inter-graph node matching strategy considering the interactions between graph structures and signals. To solve this problem, we propose a novel graph mixup algorithm called FGWMixup, which seeks a "midpoint" of source graphs in the Fused Gromov-Wasserstein (FGW) metric space. To enhance the scalability of our method, we introduce a relaxed FGW solver that accelerates FGWMixup by improving the convergence rate from $\mathcal{O}(t^{-1})$ to $\mathcal{O}(t^{-2})$. Extensive experiments conducted on five datasets using both classic (MPNNs) and advanced (Graphormers) GNN backbones demonstrate that FGWMixup effectively improves the generalizability and robustness of GNNs. Codes are available at https://github.com/ArthurLeoM/FGWMixup.
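The "midpoint" intuition can be illustrated in the degenerate case where the inter-graph node coupling is fixed to the identity: the FGW barycenter of two same-sized graphs then reduces to convex interpolation of structure and signal. FGWMixup itself optimizes this coupling, so the sketch below shows only the trivial special case:

```python
def graph_mixup_fixed_matching(A1, X1, A2, X2, lam=0.5):
    """Mixup of two same-sized graphs under a *fixed identity node
    matching*: convexly interpolate the adjacency matrices (structure)
    and node feature matrices (signal). FGWMixup instead searches for
    the optimal node coupling in the Fused Gromov-Wasserstein metric;
    this is only the degenerate case where that coupling is identity."""
    n = len(A1)
    A = [[lam * A1[i][j] + (1 - lam) * A2[i][j] for j in range(n)]
         for i in range(n)]
    X = [[lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
         for x1, x2 in zip(X1, X2)]
    return A, X

# Mix an edge-connected 2-node graph with an edgeless one, lam = 0.5.
A_mix, X_mix = graph_mixup_fixed_matching(
    A1=[[0, 1], [1, 0]], X1=[[1.0], [0.0]],
    A2=[[0, 0], [0, 0]], X2=[[0.0], [1.0]],
    lam=0.5,
)
```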

AAAI Conference 2023 Conference Paper

KerPrint: Local-Global Knowledge Graph Enhanced Diagnosis Prediction for Retrospective and Prospective Interpretations

  • Kai Yang
  • Yongxin Xu
  • Peinie Zou
  • Hongxin Ding
  • Junfeng Zhao
  • Yasha Wang
  • Bing Xie

While recent developments of deep learning models have led to record-breaking achievements in many areas, the lack of sufficient interpretation remains a problem for many specific applications, such as the diagnosis prediction task in healthcare. Previous knowledge graph (KG)-enhanced approaches mainly focus on learning clinically meaningful representations, the importance of medical concepts, and even the knowledge paths from inputs to labels. However, it is infeasible to interpret the diagnosis prediction, which needs to consider different medical concepts, various medical relationships, and the time-effectiveness of knowledge triples in different patient contexts. More importantly, retrospective and prospective interpretations of disease processes are valuable to clinicians when patients have confounding diseases. We propose KerPrint, a novel KG-enhanced approach for retrospective and prospective interpretations to tackle these problems. Specifically, we propose a time-aware KG attention method to solve the problem of knowledge decay over time for trustworthy retrospective interpretation. We also propose a novel element-wise attention method to select candidate global knowledge using comprehensive representations from the local KG for prospective interpretation. We validate the effectiveness of KerPrint through an extensive experimental study on a real-world dataset and a public dataset. The results show that our proposed approach not only achieves significant improvement over knowledge-enhanced methods but also provides interpretability of diagnosis prediction in both retrospective and prospective views.
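The time-aware attention idea can be sketched as relevance scores discounted by how stale each knowledge triple is; the half-life decay form below is an assumption for illustration, not the paper's exact parameterization:

```python
import math

def time_aware_attention(scores, elapsed_days, half_life=180.0):
    """Attention over knowledge triples where each raw relevance score is
    discounted by staleness: the weight halves every `half_life` days
    since the triple was last supported by the patient's record.
    Returns softmax-normalized attention weights."""
    decayed = [s * 0.5 ** (t / half_life) for s, t in zip(scores, elapsed_days)]
    m = max(decayed)                          # subtract max for stability
    exps = [math.exp(d - m) for d in decayed]
    z = sum(exps)
    return [e / z for e in exps]

# Two equally relevant triples: the recent one dominates the stale one.
w = time_aware_attention(scores=[2.0, 2.0], elapsed_days=[0.0, 720.0])
```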

IJCAI Conference 2023 Conference Paper

VecoCare: Visit Sequences-Clinical Notes Joint Learning for Diagnosis Prediction in Healthcare Data

  • Yongxin Xu
  • Kai Yang
  • Chaohe Zhang
  • Peinie Zou
  • Zhiyuan Wang
  • Hongxin Ding
  • Junfeng Zhao
  • Yasha Wang

Due to the insufficiency of electronic health record (EHR) data utilized in practical diagnosis prediction scenarios, most works are devoted to learning powerful patient representations either from structured EHR data (e.g., temporal medical events, lab test results) or from unstructured data (e.g., clinical notes). However, synthesizing rich information from both of them still needs to be explored. Firstly, the heterogeneous semantic biases across them heavily hinder the synthesis of representation spaces, which is critical for diagnosis prediction. Secondly, the intermingled quality of partial clinical notes leads to inadequate representations of to-be-predicted patients. Thirdly, typical attention mechanisms mainly focus on aggregating information from similar patients, ignoring important auxiliary information from others. To tackle these challenges, we propose a novel visit sequences-clinical notes joint learning approach, dubbed VecoCare. It performs a Gromov-Wasserstein Distance (GWD)-based contrastive learning task and an adaptive masked language model task in a sequential pre-training manner to reduce heterogeneous semantic biases. After pre-training, VecoCare further aggregates information from both similar and dissimilar patients through a dual-channel retrieval mechanism. We conduct diagnosis prediction experiments on two real-world datasets, which indicate that VecoCare outperforms state-of-the-art approaches. Moreover, the findings discovered by VecoCare are consistent with medical research.

AAAI Conference 2020 Conference Paper

COTSAE: CO-Training of Structure and Attribute Embeddings for Entity Alignment

  • Kai Yang
  • Shaoqin Liu
  • Junfeng Zhao
  • Yasha Wang
  • Bing Xie

Entity alignment is a fundamental and vital task in Knowledge Graph (KG) construction and fusion. Previous works mainly focus on capturing the structural semantics of entities by learning entity embeddings on relational triples and pre-aligned "seed entities". Some works also seek to incorporate attribute information to assist in refining the entity embeddings. However, there are still many problems not considered, which dramatically limits the utilization of attribute information in entity alignment. Different KGs may have many different attribute types, and even the same attribute may have diverse data structures and value granularities. Most importantly, attributes may have various "contributions" to entity alignment. To solve these problems, we propose COTSAE, which combines the structure and attribute information of entities by co-training two embedding learning components, respectively. We also propose a joint attention method in our model to learn the attentions of attribute types and values cooperatively. We verified COTSAE on several datasets from real-world KGs, and the results show that it is significantly better than the latest entity alignment methods. The structure and attribute information complement each other, and both contribute to the performance improvement.

AAAI Conference 2017 Conference Paper

TaGiTeD: Predictive Task Guided Tensor Decomposition for Representation Learning from Electronic Health Records

  • Kai Yang
  • Xiang Li
  • Haifeng Liu
  • Jing Mei
  • Guotong Xie
  • Junfeng Zhao
  • Bing Xie
  • Fei Wang

With the increasing availability of healthcare data such as Electronic Health Records (EHR), more and more data analytics methodologies are being developed with the aim of mining insights from them to improve the quality of care delivery. There are many challenges in analyzing EHRs, such as high dimensionality and event sparsity. Moreover, different from other application domains, EHR analysis algorithms need to be highly interpretable to be clinically useful. This makes representation learning from EHRs of key importance. In this paper, we propose an algorithm called Predictive Task Guided Tensor Decomposition (TaGiTeD) to analyze EHRs. Specifically, TaGiTeD learns event interaction patterns that are highly predictive for certain tasks from EHRs with supervised tensor decomposition. Compared with unsupervised methods, TaGiTeD can learn effective EHR representations in a more focused way. This is crucial because most medical problems have very limited patient samples, which are not enough for unsupervised algorithms to learn meaningful representations from. We apply TaGiTeD to a real-world EHR data warehouse and demonstrate that it can learn representations that are both interpretable and predictive.