
Author name cluster

Hui Xu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
1 author row

Possible papers


AAAI Conference 2026 Conference Paper

Negative Entity Suppression for Zero-Shot Captioning with Synthetic Images

  • Zimao Lu
  • Hui Xu
  • Bing Liu
  • Ke Wang

Text-only training provides an attractive approach to address data scarcity challenges in zero-shot image captioning (ZIC), avoiding the expense of collecting paired image-text annotations. However, although these approaches perform well within training domains, they suffer from poor cross-domain generalization, often producing hallucinated content when encountering novel visual environments. Retrieval-based methods attempt to mitigate this limitation by leveraging external knowledge, but they can paradoxically exacerbate hallucination when retrieved captions contain entities irrelevant to the inputs. We introduce the concept of negative entities (objects that appear in the generated caption but are absent from the input) and propose Negative Entity Suppression (NES) to tackle this challenge. NES seamlessly integrates three stages: (1) it employs synthetic images to ensure consistent image-to-text retrieval across both training and inference; (2) it filters negative entities from retrieved content to enhance accuracy; and (3) it applies attention-level suppression using identified negative entities to further minimize the impact of hallucination-prone features. Evaluation across multiple benchmarks demonstrates that NES maintains competitive in-domain performance while improving cross-domain transfer and reducing hallucination rates, achieving new state-of-the-art results in ZIC.
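
As a rough illustration of stages (2) and (3), the sketch below filters retrieved captions whose entities never appear in the image and masks attention logits on negative-entity tokens. Entity extraction, retrieval, and the captioning model are all stubbed out; every name here is hypothetical rather than taken from the authors' code.

```python
import numpy as np

def filter_negative_entities(retrieved_captions, caption_entities, image_entities):
    """Stage (2): drop retrieved captions mentioning entities absent from the image."""
    image_entities = set(image_entities)
    return [cap for cap, ents in zip(retrieved_captions, caption_entities)
            if not (set(ents) - image_entities)]

def suppress_attention(logits, token_entities, image_entities):
    """Stage (3): push attention logits of negative-entity tokens to -inf before softmax."""
    logits = np.asarray(logits, dtype=float).copy()
    for i, ent in enumerate(token_entities):
        if ent is not None and ent not in image_entities:
            logits[i] = -1e9  # effectively zero weight after softmax
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()

# Toy usage: "dog" never appears in the image, so its caption and token are suppressed.
print(filter_negative_entities(["a cat on a mat", "a dog runs"],
                               [{"cat", "mat"}, {"dog"}], {"cat", "mat"}))
print(suppress_attention([1.0, 2.0, 0.5], ["cat", "dog", None], {"cat", "mat"}))
```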

NeurIPS Conference 2025 Conference Paper

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

  • Ziheng Cheng
  • Yixiao Huang
  • Hui Xu
  • Somayeh Sojoudi
  • Xuandong Zhao
  • Dawn Song
  • Song Mei

Text-to-Image (T2I) models have achieved remarkable success in generating visual content from text inputs. Although multiple safety alignment strategies have been proposed to prevent harmful outputs, they often lead to overly cautious behavior, rejecting even benign prompts, a phenomenon known as over-refusal that reduces the practical utility of T2I models. Despite over-refusal having been observed in practice, there is no large-scale benchmark that systematically evaluates this phenomenon for T2I models. In this paper, we present an automatic workflow to construct synthetic evaluation data, resulting in OVERT (OVEr-Refusal evaluation on Text-to-image models), the first large-scale benchmark for assessing over-refusal behaviors in T2I models. OVERT includes 4,600 seemingly harmful but benign prompts across nine safety-related categories, along with 1,785 genuinely harmful prompts (OVERT-unsafe) to evaluate the safety–utility trade-off. Using OVERT, we evaluate several leading T2I models and find that over-refusal is a widespread issue across various categories (Figure 1), underscoring the need for further research to enhance the safety alignment of T2I models without compromising their functionality. As a preliminary attempt to reduce over-refusal, we explore prompt rewriting; however, we find it often compromises faithfulness to the meaning of the original prompts. Finally, we demonstrate the flexibility of our generation framework in accommodating diverse safety requirements by generating customized evaluation data adapting to user-defined policies.
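
The trade-off the benchmark targets is simple to state. The sketch below computes the refusal rate on benign prompts against the refusal rate on unsafe prompts, assuming a hypothetical `model` callable that returns True when the T2I system refuses; it is not the paper's evaluation harness.

```python
def refusal_rates(model, benign_prompts, unsafe_prompts):
    """Over-refusal rate on benign prompts vs. refusal rate on unsafe prompts."""
    benign = sum(model(p) for p in benign_prompts) / len(benign_prompts)
    unsafe = sum(model(p) for p in unsafe_prompts) / len(unsafe_prompts)
    return {"over_refusal_rate": benign, "unsafe_refusal_rate": unsafe}

# Toy "model" that refuses any prompt containing the word "weapon".
refuses = lambda prompt: "weapon" in prompt

print(refusal_rates(
    refuses,
    benign_prompts=["a toy weapon prop for a school play", "a red apple"],
    unsafe_prompts=["a realistic weapon with assembly instructions"]))
# {'over_refusal_rate': 0.5, 'unsafe_refusal_rate': 1.0}
```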

AAAI Conference 2025 Conference Paper

Temporal Action Localization with Cross Layer Task Decoupling and Refinement

  • Qiang Li
  • Di Liu
  • Jun Kong
  • Sen Li
  • Hui Xu
  • Jianzhong Wang

Temporal action localization (TAL) involves dual tasks to classify and localize actions within untrimmed videos. However, the two tasks often have conflicting requirements for features. Existing methods typically employ separate heads for the classification and localization tasks but share the same input feature, leading to suboptimal performance. To address this issue, we propose a novel TAL method with Cross Layer Task Decoupling and Refinement (CLTDR). Based on the feature pyramid of the video, the CLTDR strategy integrates semantically strong features from higher pyramid layers and detailed boundary-aware features from lower pyramid layers to effectively disentangle the action classification and localization tasks. Moreover, features from multiple cross layers are also employed to refine and align the disentangled classification and regression results. Finally, a lightweight Gated Multi-Granularity (GMG) module is proposed to comprehensively extract and aggregate video features at instant, local, and global temporal granularities. Benefiting from the CLTDR and GMG modules, our method achieves state-of-the-art performance on five challenging benchmarks: THUMOS14, MultiTHUMOS, EPIC-KITCHENS-100, ActivityNet-1.3, and HACS. Code: https://github.com/LiQiang0307/CLTDR-GMG
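
A minimal sketch of the routing idea the abstract describes: classification reads from a semantically strong higher pyramid layer while boundary regression reads from a detail-rich lower layer. The refinement and GMG stages are omitted, and all shapes and names are illustrative assumptions, not the released code.

```python
import numpy as np

rng = np.random.default_rng(0)
T, C, num_classes = 16, 32, 5              # time steps, channels, action classes

# Two pyramid levels: lower layers keep fine temporal detail,
# higher layers carry stronger semantics.
pyramid = {"low": rng.normal(size=(T, C)),
           "high": rng.normal(size=(T, C))}

W_cls = rng.normal(size=(C, num_classes))  # classification head
W_reg = rng.normal(size=(C, 2))            # regression head: (start, end) offsets

cls_logits = pyramid["high"] @ W_cls       # classify from the semantic layer
boundaries = pyramid["low"] @ W_reg        # localize from the detail layer
print(cls_logits.shape, boundaries.shape)  # (16, 5) (16, 2)
```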

JBHI Journal 2023 Journal Article

Multimodal Data Matters: Language Model Pre-Training Over Structured and Unstructured Electronic Health Records

  • Sicen Liu
  • Xiaolong Wang
  • Yongshuai Hou
  • Ge Li
  • Hui Wang
  • Hui Xu
  • Yang Xiang
  • Buzhou Tang

As two important textual modalities in electronic health records (EHRs), both structured data (clinical codes) and unstructured data (clinical narratives) have recently been increasingly applied to the healthcare domain. Most existing EHR-oriented studies, however, either focus on a particular modality or integrate data from different modalities in a straightforward manner, which usually treats structured and unstructured data as two independent sources of information about a patient admission and ignores the intrinsic interactions between them. In fact, the two modalities are documented during the same encounter, where structured data inform the documentation of unstructured data and vice versa. In this paper, we propose a Medical Multimodal Pre-trained Language Model, named MedM-PLM, to learn enhanced EHR representations over structured and unstructured data and explore the interaction of the two modalities. In MedM-PLM, two Transformer-based neural network components are first adopted to learn representative characteristics from each modality. A cross-modal module is then introduced to model their interactions. We pre-trained MedM-PLM on the MIMIC-III dataset and verified the effectiveness of the model on three downstream clinical tasks, i.e., medication recommendation, 30-day readmission prediction, and ICD coding. Extensive experiments demonstrate the power of MedM-PLM compared with state-of-the-art methods. Further analyses and visualizations show the robustness of our model, which could potentially provide more comprehensive interpretations for clinical decision-making.
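
The cross-modal module is described only at a high level; the sketch below shows one plausible reading, with single-head cross-attention letting each modality's encoder output attend over the other's. Dimensions and function names are assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, context):
    """Single-head cross-attention: queries attend over the other modality."""
    scores = queries @ context.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ context

rng = np.random.default_rng(0)
codes = rng.normal(size=(10, 64))   # per-modality encoder output: clinical codes
notes = rng.normal(size=(40, 64))   # per-modality encoder output: narrative tokens

codes_enriched = cross_attend(codes, notes)  # codes informed by the narrative
notes_enriched = cross_attend(notes, codes)  # narrative informed by the codes
print(codes_enriched.shape, notes_enriched.shape)  # (10, 64) (40, 64)
```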

JBHI Journal 2023 Journal Article

SHAPE: A Sample-Adaptive Hierarchical Prediction Network for Medication Recommendation

  • Sicen Liu
  • Xiaolong Wang
  • Jingcheng Du
  • Yongshuai Hou
  • Xianbing Zhao
  • Hui Xu
  • Hui Wang
  • Yang Xiang

Effective medication recommendation under complex multimorbidity conditions is a critical yet challenging task in healthcare. Most existing works predict medications from longitudinal records, assuming that the encoding of intra-visit medical events is serialized and that the information-transmission patterns learned over longitudinal sequences are stable. However, two conditions may have been ignored: 1) a more compact encoder for the relationships among medical events within a visit is needed; 2) strategies for learning accurate representations of patients' variable-length longitudinal sequences differ. In this article, we propose a novel Sample-adaptive Hierarchical medicAtion Prediction nEtwork, termed SHAPE, to tackle the above challenges in the medication recommendation task. Specifically, we design a compact intra-visit set encoder to capture the relationships among medical events and obtain a visit-level representation, and then develop an inter-visit longitudinal encoder to learn the patient-level longitudinal representation efficiently. To endow the model with the capability of modeling variable visit lengths, we introduce a soft curriculum learning method that automatically assigns each sample's difficulty according to its visit length. Extensive experiments on a benchmark dataset verify the superiority of our model compared with several state-of-the-art baselines.
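
A hedged sketch of the soft curriculum idea, assuming a sigmoid pacing function (the abstract does not give the exact schedule): each sample's weight is derived from its visit length and the current training progress, so short records dominate early and long ones phase in.

```python
import math

def sample_weight(visit_len, max_len, progress, sharpness=8.0):
    """Soft weight in (0, 1): short records dominate early, long ones phase in.

    progress: training progress in [0, 1]; sharpness controls the transition.
    """
    difficulty = visit_len / max_len
    return 1.0 / (1.0 + math.exp(sharpness * (difficulty - progress)))

# Early (progress=0.2) the 30-visit record is nearly ignored;
# by the end (progress=1.0) every record carries substantial weight.
for visits in (2, 10, 30):
    print(visits,
          round(sample_weight(visits, 30, progress=0.2), 3),
          round(sample_weight(visits, 30, progress=1.0), 3))
```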

AAAI Conference 2023 Conference Paper

Temporal Knowledge Graph Reasoning with Historical Contrastive Learning

  • Yi Xu
  • Junjie Ou
  • Hui Xu
  • Luoyi Fu

Temporal knowledge graphs, serving as an effective way to store and model dynamic relations, show promising prospects in event forecasting. However, most temporal knowledge graph reasoning methods are highly dependent on the recurrence or periodicity of events, which brings challenges to inferring future events related to entities that lack historical interaction. In fact, the current moment is often the combined effect of a small part of historical information and unobserved underlying factors. To this end, we propose a new event forecasting model called Contrastive Event Network (CENET), based on a novel training framework of historical contrastive learning. CENET learns both the historical and non-historical dependency to distinguish the most potential entities that can best match the given query. Simultaneously, it trains representations of queries to investigate whether the current moment depends more on historical or non-historical events through contrastive learning. The representations further help train a binary classifier whose output is a Boolean mask indicating the related entities in the search space. During the inference process, CENET employs a mask-based strategy to generate the final results. We evaluate our proposed model on five benchmark graphs. The results demonstrate that CENET significantly outperforms all existing methods on most metrics, achieving at least an 8.3% relative improvement in Hits@1 over previous state-of-the-art baselines on event-based datasets.
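
The mask-based inference step can be pictured as follows: a Boolean mask from the binary classifier gates the candidate-entity scores before ranking. The scores, mask, and function below are toy stand-ins for CENET's learned components, not the authors' code.

```python
import numpy as np

def masked_ranking(entity_scores, historical_mask, prefer_historical):
    """Gate candidate entities with the classifier's Boolean mask, then rank."""
    keep = historical_mask if prefer_historical else ~historical_mask
    gated = np.where(keep, entity_scores, -np.inf)
    return np.argsort(-gated)  # masked-out entities fall to the back

scores = np.array([0.9, 0.4, 0.7, 0.2])             # combined dependency scores
seen_before = np.array([True, False, True, False])  # entities with history

print(masked_ranking(scores, seen_before, prefer_historical=True))   # e.g. [0 2 1 3]
print(masked_ranking(scores, seen_before, prefer_historical=False))  # e.g. [1 3 0 2]
```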

IJCAI Conference 2020 Conference Paper

Exploring Parameter Space with Structured Noise for Meta-Reinforcement Learning

  • Hui Xu
  • Chong Zhang
  • Jiaxing Wang
  • Deqiang Ouyang
  • Yu Zheng
  • Jie Shao

Efficient exploration is a major challenge in Reinforcement Learning (RL) and has been studied extensively. However, for a new task, existing methods explore either by taking actions that maximize task-agnostic objectives (such as information gain) or by applying a simple dithering strategy (such as noise injection), which might not be effective enough. In this paper, we investigate whether previous learning experiences can be leveraged to guide exploration of the current new task. To this end, we propose a novel Exploration with Structured Noise in Parameter Space (ESNPS) approach. ESNPS utilizes meta-learning and directly uses meta-policy parameters, which contain prior knowledge, as structured noise to perturb the base model for effective exploration in new tasks. Experimental results on four groups of tasks (cheetah velocity, cheetah direction, ant velocity and ant direction) demonstrate the superiority of ESNPS against a number of competitive baselines.
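
A minimal sketch of parameter-space exploration with structured noise, under the illustrative assumption that the policy is a flat parameter vector: the perturbation direction comes from the meta-policy parameters rather than from isotropic Gaussian noise. Scales and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
base_params = rng.normal(size=128)  # parameters of the base policy
meta_params = rng.normal(size=128)  # meta-policy parameters (prior knowledge)

def perturb(base, meta, scale=0.1):
    """Structured noise: step along the meta-policy direction instead of
    drawing independent Gaussian noise for every parameter."""
    z = rng.normal()                # one random coefficient shapes the whole step
    return base + scale * z * meta

exploration_policy = perturb(base_params, meta_params)
print(float(np.linalg.norm(exploration_policy - base_params)))
```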