Arrow Research search

Author name cluster

Hongxia Xu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

AAAI 2026 Conference Paper

ALEX: A Light Editing-knowledge Extractor

  • Minghu Wang
  • ShuLiang Zhao
  • Yuanyuan Zhao
  • Hongxia Xu

The static nature of knowledge within Large Language Models (LLMs) makes it difficult for them to adapt to evolving information, rendering knowledge editing a critical task. However, existing methods struggle with challenges of scalability and retrieval efficiency, particularly when handling complex, multi-hop questions that require multi-step reasoning. To address these challenges, this paper introduces ALEX (A Light Editing-knowledge Extractor), a lightweight knowledge editing framework. The core innovation of ALEX is its hierarchical memory architecture, which organizes knowledge updates (edits) into semantic clusters. This design fundamentally reduces retrieval complexity from a linear O(N) to a highly scalable O(K+N/C). Furthermore, the framework integrates an Inferential Query Synthesis (IQS) module to bridge the semantic gap between queries and facts, and a Dynamic Evidence Adjudication (DEA) engine that executes an efficient two-stage retrieval process. Experiments on the MQUAKE benchmark demonstrate that ALEX significantly improves both the accuracy of multi-hop answers (MultiHop-ACC) and the reliability of reasoning paths (HopWise-ACC). It also reduces the required search space by over 80%, presenting a promising path toward building scalable, efficient, and accurate knowledge editing systems.
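The claimed drop from O(N) to O(K+N/C) can be illustrated with a toy two-stage lookup: scan the K cluster centroids first, then scan only the roughly N/C edits inside the chosen cluster. This is a minimal sketch under assumed data shapes, not ALEX's actual DEA engine; the names (`cluster_retrieve`, `nearest`) are hypothetical.

```python
import math

def nearest(query, items, key):
    """Return the item whose vector is closest to the query (Euclidean)."""
    return min(items, key=lambda it: math.dist(query, key(it)))

def cluster_retrieve(query, clusters):
    """Two-stage lookup: O(K) centroid scan + O(N/C) scan inside one cluster.

    `clusters` is a list of (centroid, edits) pairs, where each edit is a
    (vector, fact) tuple -- a toy stand-in for semantic clusters of edits.
    """
    centroid, edits = nearest(query, clusters, key=lambda c: c[0])
    vec, fact = nearest(query, edits, key=lambda e: e[0])
    return fact

clusters = [
    ((0.0, 0.0), [((0.1, 0.0), "fact A"), ((0.0, 0.2), "fact B")]),
    ((5.0, 5.0), [((5.1, 4.9), "fact C")]),
]
print(cluster_retrieve((4.8, 5.2), clusters))  # -> fact C
```

Only one cluster's members are ever compared against the query, which is where the N/C term comes from.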

JBHI 2026 Journal Article

RetinexDA: Progressive Disentanglement Domain Adaptation for Unsupervised Cross-Modality Medical Image Segmentation

  • Yixuan Wu
  • Mingze Yin
  • Zitai Kong
  • Jintai Chen
  • Jian Wu
  • Honghao Gao
  • Hongxia Xu

Deep neural networks have achieved strong performance in medical image segmentation when the training and testing data share similar appearance characteristics. However, this assumption is rarely satisfied in practical clinical scenarios, where imaging protocols, scanner vendors, and modality physics differ substantially, resulting in severe performance degradation when the model is deployed to new environments. To address this challenge, we propose RetinexDA, a novel unsupervised domain adaptation framework that explicitly decomposes a medical image into domain-invariant structural and domain-specific appearance representations. This Retinex-inspired formulation preserves essential anatomical details while mitigating modality-dependent variations. Furthermore, we introduce Disentangled Knowledge Distillation (DKD) to ensure mutual semantic alignment between the structure–appearance decomposition in pixel space and the encoded features in latent space, strengthening fine-grained segmentation capability. In addition, a Bézier-curve domain bridging strategy is developed to generate smoothly transitioned intermediate samples across domains, improving adaptation robustness under large modality discrepancies. Extensive experiments on abdominal CT and cardiac MRI segmentation tasks demonstrate that RetinexDA surpasses state-of-the-art unsupervised domain adaptation approaches, showing strong potential for scalable and reliable clinical deployment.
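One common way to realize Bézier-curve bridging of this kind is to blend a source image toward a target-styled counterpart along a quadratic Bézier curve and sample intermediate t values as transitional training images. A minimal pixel-wise sketch with hypothetical names and flattened toy images, not the paper's exact formulation:

```python
def bezier_blend(src, ctrl, tgt, t):
    """Quadratic Bezier point B(t) = (1-t)^2*src + 2(1-t)*t*ctrl + t^2*tgt,
    applied pixel-wise; t=0 returns the source image, t=1 the target-like one."""
    w0, w1, w2 = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
    return [w0 * a + w1 * b + w2 * c for a, b, c in zip(src, ctrl, tgt)]

src = [0.0, 0.5, 1.0]    # flattened source-domain intensities
tgt = [1.0, 0.5, 0.0]    # same pixels with target-domain appearance
ctrl = [0.5, 0.5, 0.5]   # control image steering the transition shape
intermediates = [bezier_blend(src, ctrl, tgt, t) for t in (0.25, 0.5, 0.75)]
```

Moving the control image bends the intensity trajectory, so the intermediate samples need not be plain linear mixtures of the two domains.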

AAAI 2026 Conference Paper

WDT-MD: Wavelet Diffusion Transformers for Microaneurysm Detection in Fundus Images

  • Yifei Sun
  • Yuzhi He
  • Junhao Jia
  • Jinhong Wang
  • Ruiquan Ge
  • Changmiao Wang
  • Hongxia Xu

Microaneurysms (MAs), the earliest pathognomonic signs of Diabetic Retinopathy (DR), present as sub-60 μm lesions in fundus images with highly variable photometric and morphological characteristics, rendering manual screening not only labor-intensive but inherently error-prone. While diffusion-based anomaly detection has emerged as a promising approach for automated MA screening, its clinical application is hindered by three fundamental limitations. First, these models often fall prey to "identity mapping", where they inadvertently replicate the input image. Second, they struggle to distinguish MAs from other anomalies, leading to high false positives. Third, their suboptimal reconstruction of normal features hampers overall performance. To address these challenges, we propose a Wavelet Diffusion Transformer framework for MA Detection (WDT-MD), which features three key innovations: a noise-encoded image conditioning mechanism to avoid "identity mapping" by perturbing image conditions during training; pseudo-normal pattern synthesis via inpainting to introduce pixel-level supervision, enabling discrimination between MAs and other anomalies; and a wavelet diffusion Transformer architecture that combines the global modeling capability of diffusion Transformers with multi-scale wavelet analysis to enhance reconstruction of normal retinal features. Comprehensive experiments on the IDRiD and e-ophtha MA datasets demonstrate that WDT-MD outperforms state-of-the-art methods in both pixel-level and image-level MA detection. This advancement holds significant promise for improving early DR screening.
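The multi-scale wavelet analysis the architecture builds on can be illustrated with a single-level 2D Haar decomposition, which splits an image into one coarse subband and three detail subbands. This is a generic sketch (an unnormalized averaging variant), not the paper's transform:

```python
def haar2d(img):
    """One-level 2D Haar transform: split an HxW image (H, W even) into
    low-low, low-high, high-low and high-high subbands of size H/2 x W/2."""
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(img), 2):
        rll, rlh, rhl, rhh = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rll.append((a + b + c + d) / 4)   # local average (coarse scale)
            rlh.append((a - b + c - d) / 4)   # horizontal detail
            rhl.append((a + b - c - d) / 4)   # vertical detail
            rhh.append((a - b - c + d) / 4)   # diagonal detail
        ll.append(rll); lh.append(rlh); hl.append(rhl); hh.append(rhh)
    return ll, lh, hl, hh
```

Recursing on the low-low subband yields the coarser scales; tiny lesions such as MAs live mostly in the high-frequency detail subbands.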

EAAI 2025 Journal Article

Advancing interpretable cardiac disease diagnosis via a transformer-convolutional hybrid network on electrocardiograms

  • Xiaoqiang Liu
  • Yinlong Xu
  • Hongxia Xu
  • Liang He
  • Siyu Long
  • Yisen Huang
  • Yubin Wang
  • Yingzhou Lu

Manual heart disease diagnosis from the electrocardiogram (ECG) is difficult due to intertwined signal features and a lengthy diagnostic procedure, especially for 24-hour dynamic ECG signals. Consequently, even experienced cardiologists may struggle to produce consistently accurate ECG reports. In recent years, Artificial Intelligence (AI), particularly neural network-based automatic ECG diagnosis methods, has exhibited promising performance, suggesting a potential alternative to the labor-intensive examination conducted by cardiologists. However, many existing approaches fail to adequately consider the temporal and channel dimensions when assembling features, and ignore interpretability. Clinical theory underscores the necessity of prolonged signal observation for diagnosing certain ECG conditions such as tachycardia. Moreover, specific heart diseases manifest primarily through distinct ECG leads, represented as channels. In response to these challenges, this paper introduces a novel neural network architecture for ECG classification (diagnosis). The proposed model incorporates Lead Fusing blocks, transformer-XL (extra long) encoder-based Encoder modules, and hierarchical temporal attention. Importantly, the classifier operates directly on raw ECG time-series signals rather than on cardiac cycles. Signal integration begins with the Lead Fusing blocks, followed by the Encoder modules and hierarchical temporal attention, enabling the extraction of long-range dependent features. Furthermore, existing convolution-based methods have been argued to compromise interpretability, whereas the proposed network offers improved clarity in this regard. Experimental evaluations on a comprehensive public dataset confirm the superiority of the proposed classifier over state-of-the-art methods. Moreover, a visualization method generates a location map showing the signal regions emphasized by the model, further enhancing interpretability.

NeurIPS 2024 Conference Paper

Enhancing Semi-Supervised Learning via Representative and Diverse Sample Selection

  • Qian Shao
  • Jiangrui Kang
  • Qiyuan Chen
  • Zepeng Li
  • Hongxia Xu
  • Yiwen Cao
  • Jiajuan Liang
  • Jian Wu

Semi-Supervised Learning (SSL) has become a preferred paradigm in many deep learning tasks, as it reduces the need for human labeling. Previous studies primarily focus on effectively utilizing labeled and unlabeled data to improve performance. However, we observe that how samples are selected for labeling also significantly impacts performance, particularly under extremely low-budget settings. This sample selection task in SSL has long been under-explored. To fill this gap, we propose a Representative and Diverse Sample Selection approach (RDSS). By adopting a modified Frank-Wolfe algorithm to minimize a novel criterion, $\alpha$-Maximum Mean Discrepancy ($\alpha$-MMD), RDSS samples a representative and diverse subset for annotation from the unlabeled data. We demonstrate that minimizing $\alpha$-MMD enhances the generalization ability of low-budget learning. Experimental results show that RDSS consistently improves the performance of several popular SSL frameworks and outperforms state-of-the-art sample selection approaches used in Active Learning (AL) and Semi-Supervised Active Learning (SSAL), even with constrained annotation budgets. Our code is available at [RDSS](https://github.com/YanhuiAILab/RDSS).
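The MMD criterion underlying $\alpha$-MMD can be sketched with a biased RBF-kernel estimator: a subset whose distribution matches the unlabeled pool scores lower than a concentrated one. The $\alpha$ modification and the Frank-Wolfe solver are omitted here; all names are illustrative, not the authors' implementation.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(X, Y, gamma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy between X and Y."""
    k = lambda A, B: sum(rbf(a, b, gamma) for a in A for b in B)
    return (k(X, X) / len(X) ** 2
            - 2 * k(X, Y) / (len(X) * len(Y))
            + k(Y, Y) / len(Y) ** 2)

pool = [(0.0,), (0.1,), (0.9,), (1.0,)]
subset_good = [(0.0,), (1.0,)]   # spread across the pool
subset_bad = [(0.0,), (0.1,)]    # concentrated in one region
# a representative subset should score lower:
print(mmd2(subset_good, pool) < mmd2(subset_bad, pool))  # -> True
```

Selecting the budget-sized subset that minimizes this score is what the modified Frank-Wolfe procedure in the paper optimizes.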

ICLR 2024 Conference Paper

Making Pre-trained Language Models Great on Tabular Prediction

  • Jiahuan Yan
  • Bo Zheng 0011
  • Hongxia Xu
  • Yiheng Zhu 0002
  • Danny Z. Chen
  • Jimeng Sun 0001
  • Jian Wu 0001
  • Jintai Chen

The transferability of deep neural networks (DNNs) has driven significant progress in image and language processing. However, due to the heterogeneity among tables, this transferability bonus remains far from well exploited on tabular data prediction (e.g., regression or classification tasks). By condensing knowledge from diverse domains, language models (LMs) can comprehend feature names from various tables, potentially serving as versatile learners that transfer knowledge across distinct tables and diverse prediction tasks; however, their discrete text representation space is inherently incompatible with the numerical feature values in tables. In this paper, we present TP-BERTa, an LM specifically pre-trained for tabular data prediction. Concretely, a novel relative magnitude tokenization converts scalar numerical feature values into finely discrete, high-dimensional tokens, and an intra-feature attention approach integrates feature values with their corresponding feature names. Comprehensive experiments demonstrate that our pre-trained TP-BERTa leads in performance among tabular DNNs and is competitive with Gradient Boosted Decision Tree models in the typical tabular data regime.
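The spirit of relative magnitude tokenization can be approximated by quantile-binning each numeric feature into discrete magnitude tokens that an LM vocabulary can host. A simplified sketch with hypothetical helper names; the paper's actual scheme additionally maps magnitudes into high-dimensional token embeddings.

```python
import bisect

def fit_bins(values, n_bins=4):
    """Quantile boundaries so each bin holds roughly the same number of values."""
    s = sorted(values)
    return [s[len(s) * i // n_bins] for i in range(1, n_bins)]

def magnitude_token(value, boundaries, prefix="mag"):
    """Map a scalar feature value to a discrete magnitude token string."""
    return f"{prefix}_{bisect.bisect_right(boundaries, value)}"

ages = [18, 22, 25, 31, 40, 47, 52, 63]
bounds = fit_bins(ages, n_bins=4)
print(magnitude_token(23, bounds))  # -> mag_0
```

Because bins are relative to the feature's own distribution, the same token can be reused across tables whose raw value ranges differ.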

JBHI 2024 Journal Article

Polygonal Approximation Learning for Convex Object Segmentation in Biomedical Images With Bounding Box Supervision

  • Wenhao Zheng
  • Jintai Chen
  • Kai Zhang
  • Jiahuan Yan
  • Jinhong Wang
  • Yi Cheng
  • Bang Du
  • Danny Z. Chen

As a common and critical medical image analysis task, deep learning based biomedical image segmentation is hindered by its dependence on costly fine-grained annotations. To alleviate this data dependence, this article proposes a novel approach, Polygonal Approximation Learning (PAL), for convex object instance segmentation with only bounding-box supervision. The key idea behind PAL is that a detection model for convex objects already contains the information needed to segment them: their convex hulls, which can be approximated by the intersection of bounding boxes, are equivalent to the masks representing the objects. To extract this information from the detection model, repeated detection is performed on biomedical images rotated to various angles, and a Dice loss on the projections of the rotated detection results serves as the supervision signal for training the segmentation model. In biomedical imaging tasks involving convex objects, such as nuclei instance segmentation, PAL outperforms known models (e.g., BoxInst) that rely solely on box supervision. Furthermore, PAL achieves performance comparable to mask-supervised models, including Mask R-CNN and Cascade Mask R-CNN. Interestingly, PAL also demonstrates remarkable performance on non-convex object instance segmentation tasks, for example, surgical instrument and organ instance segmentation.
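The premise that intersecting rotated bounding boxes approximates a convex object's hull can be checked with a half-plane sketch: each rotated box contributes four linear constraints, and the approximate mask is the set of points satisfying all of them. Names and conventions here are hypothetical, not PAL's training code.

```python
import math

def halfplanes_from_boxes(boxes):
    """Each (cx, cy, w, h, angle) rotated box contributes four half-plane
    constraints n . p <= c; their intersection approximates a convex mask."""
    planes = []
    for cx, cy, w, h, ang in boxes:
        cos_a, sin_a = math.cos(ang), math.sin(ang)
        for nx, ny, limit in [(cos_a, sin_a, w / 2), (-cos_a, -sin_a, w / 2),
                              (-sin_a, cos_a, h / 2), (sin_a, -cos_a, h / 2)]:
            # constraint n . (p - center) <= limit, folded into n . p <= c
            planes.append((nx, ny, nx * cx + ny * cy + limit))
    return planes

def inside(p, planes):
    """True iff point p lies in the intersection of all half-planes."""
    return all(nx * p[0] + ny * p[1] <= c for nx, ny, c in planes)

# Two boxes around the origin, one rotated 45 degrees: their intersection is
# a tighter convex region (an octagon) than either box alone.
boxes = [(0, 0, 2, 2, 0.0), (0, 0, 2, 2, math.pi / 4)]
planes = halfplanes_from_boxes(boxes)
print(inside((0.0, 0.0), planes), inside((0.95, 0.95), planes))  # -> True False
```

Adding more rotation angles adds more half-planes, tightening the polygonal approximation of the object's convex hull.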