Author name cluster

Jiale Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

2 author rows

AAAI Conference 2026 Conference Paper

Cross Modal Fine-grained Alignment via Granularity-aware and Region-uncertain Modeling

Jiale Liu
Haoming Zhou
Yishu Liu
Bingzhi Chen
Yuncheng Jiang

Fine-grained image-text alignment is a pivotal challenge in multimodal learning, underpinning key applications such as visual question answering, image captioning, and vision-language navigation. Unlike global alignment, fine-grained alignment requires precise correspondence between localized visual regions and textual tokens, often hindered by noisy attention mechanisms and oversimplified modeling of cross-modal relationships. In this work, we identify two fundamental limitations of existing approaches: the lack of robust intra-modal mechanisms to assess the significance of visual and textual tokens, leading to poor generalization in complex scenes; and the absence of fine-grained uncertainty modeling, which fails to capture the one-to-many and many-to-one nature of region-word correspondences. To address these issues, we propose a unified approach that incorporates significance-aware and granularity-aware modeling and region-level uncertainty modeling. Our method leverages modality-specific biases to identify salient features without relying on brittle cross-modal attention, and represents region features as a mixture of Gaussian distributions to capture fine-grained uncertainty. Extensive experiments on Flickr30K and MS-COCO demonstrate that our approach achieves state-of-the-art performance across various backbone architectures, significantly enhancing the robustness and interpretability of fine-grained image-text alignment.


JBHI Journal 2026 Journal Article

RLAD: A Reliable Hippo-Guided Multi-Task Model for Alzheimer's Disease Diagnosis

Zhenxin Lei
Wenjing Zhu
Jiale Liu
Cong Hua
Johann Li
Syed Afaq Ali Shah
Liang Zhang
Mohammed Bennamoun

Early diagnosis of Alzheimer's disease (AD) is crucial for its prevention, and hippocampal atrophy is a significant lesion for early diagnosis. Current DL-based AD diagnosis methods focus on either AD classification or hippocampus segmentation independently, neglecting the correlation between the two tasks and lacking pathological interpretability. To address this issue, we propose a Reliable Hippo-guided Learning model for Alzheimer's Disease diagnosis (RLAD), which employs multi-task learning for AD classification as a main task supplemented by hippocampus segmentation. More specifically, our model consists of 1) a hybrid shared feature encoder that encodes local and global information in MRI to enhance the model's ability to learn discriminative features; 2) Task Specific Decoders to accomplish AD classification and hippocampus segmentation; and 3) a Task Coordination module to correlate the two tasks and guide the classification task to focus on the hippocampus area. Our proposed RLAD model is evaluated on MRI scans of 1631 subjects from three independent datasets, including ADNI-1, ADNI-2, and HarP. Our extensive experimental results demonstrate that the proposed model significantly improves the performance of AD classification and hippocampus segmentation with strong generalization capabilities.


EAAI Journal 2025 Journal Article

Automatic measurement of surface parameters of composite braided fabrics based on modified U-shaped network

Yicen Gao
Jiale Liu
Zhongde Shan
Zitong Guo
Zheng Sun
Xiangyu Zhu
Jiaqi Sun

Braided composite materials have broad application prospects in advanced fields such as aerospace and rail transportation. Studies have shown that surface parameters of braided fabrics, such as braiding angles and pitch lengths, significantly influence their performance. However, measuring these parameters during mass production presents challenges related to automation, efficiency, and precision. Traditional image processing methods often suffer from limited accuracy, while deep learning approaches are difficult to implement, and their generalization ability remains unverified. In this paper, a measurement method based on a modified U-shaped network (denoted as U-Net) is proposed. First, a data-augmented image dataset of fabric surfaces with yarn edge annotations is constructed. Then, an enhanced U-Net model incorporating Dropout and an encoder based on the Visual Geometry Group's 16-layer (denoted as VGG16) network is applied, along with a weighted mixed loss function combining Focal Loss and Dice Loss to improve detection of weak yarn edge regions. Morphological operations and distortion rectification are employed for post-processing and noise reduction. Experimental results show that the proposed method improves measurement accuracy by up to 1.84% across various fabric samples, demonstrating excellent stability and generalization performance.
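The weighted mixed loss mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting or the hyperparameters, so the mixing weight `weight`, the focal parameters `gamma` and `alpha`, and the Dice smoothing term `smooth` below are assumptions chosen for illustration, and the function names are hypothetical. The sketch works on flat lists of predicted edge probabilities and 0/1 labels rather than image tensors.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)          # clip to keep log() finite
        p_t = p if t == 1 else 1.0 - p           # probability of the true class
        a_t = alpha if t == 1 else 1.0 - alpha   # class-balancing factor
        total += -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 minus the smoothed Dice overlap coefficient."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def mixed_loss(probs, targets, weight=0.5):
    """Weighted sum of focal and Dice losses; `weight` is an assumed value."""
    return weight * focal_loss(probs, targets) + (1.0 - weight) * dice_loss(probs, targets)
```

In this sketch the focal term down-weights pixels that are already classified confidently, while the Dice term scores overlap over the whole mask, which is a common reason for pairing the two when the foreground (here, thin yarn edges) is sparse.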


AAAI Conference 2025 Conference Paper

Self-Explainable Graph Transformer for Link Sign Prediction

Lu Li
Jiale Liu
Xingyu Ji
Maojun Wang
Zeyu Zhang

Signed Graph Neural Networks (SGNNs) have been shown to be effective in analyzing complex patterns in real-world situations where positive and negative links coexist. However, SGNN models suffer from poor explainability, which limits their adoption in critical scenarios that require understanding the rationale behind predictions. To the best of our knowledge, there is currently no research work on the explainability of SGNN models. Our goal is to address the explainability of decision-making for the downstream task of link sign prediction specific to signed graph neural networks. Since post-hoc explanations are not derived directly from the models, they may be biased and misrepresent the true explanations. Therefore, in this paper we introduce a Self-Explainable Signed Graph transformer (SE-SGformer) framework, which outputs explainable information while ensuring high prediction accuracy. Specifically, we propose a new Transformer architecture for signed graphs and theoretically demonstrate that using positional encoding based on signed random walks has greater expressive power than current SGNN methods and other positional encoding graph Transformer-based approaches. We construct a novel explainable decision process by discovering the K-nearest (farthest) positive (negative) neighbors of a node to replace the neural network-based decoder for predicting edge signs. These K positive (negative) neighbors represent crucial information about the formation of positive (negative) edges between nodes and thus can serve as important explanatory information in the decision-making process. We conducted experiments on several real-world datasets to validate the effectiveness of SE-SGformer, which outperforms the state-of-the-art methods, improving prediction accuracy by 2.2% and explainability accuracy by 73.1% in the best-case scenario.


ICML Conference 2025 Conference Paper

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

Shaokun Zhang
Ming Yin 0009
Jieyu Zhang 0001
Jiale Liu
Zhiguang Han
Jingyang Zhang
Beibin Li
Chi Wang 0001

Failure attribution in LLM multi-agent systems (identifying the agent and step responsible for task failures) provides crucial clues for systems debugging but remains underexplored and labor-intensive. In this paper, we propose and formulate a new research area: automated failure attribution for LLM multi-agent systems. To support this initiative, we introduce the Who&When dataset, comprising extensive failure logs from 127 LLM multi-agent systems with fine-grained annotations linking failures to specific agents and decisive error steps. Using Who&When, we develop and evaluate three automated failure attribution methods, summarizing their corresponding pros and cons. The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps, with some methods performing below random. Even SOTA reasoning models, such as OpenAI o1 and DeepSeek R1, fail to achieve practical usability. These results highlight the task's complexity and the need for further research in this area. Code and dataset are available at https://github.com/mingyin1/Agents_Failure_Attribution.
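The two accuracy figures reported in this abstract (agent-level and step-level) can be read as separate exact-match rates against the annotations. The sketch below shows one plausible way to compute them; the record layout (dictionaries with `agent` and `step` keys) and the function name are hypothetical and are not taken from the Who&When release, and scoring the two levels independently is an assumption.

```python
def attribution_accuracy(predictions, annotations):
    """Return (agent_accuracy, step_accuracy) as exact-match rates.

    Each item is a dict with an "agent" name and a "step" index.
    Agent and step are scored independently (an assumption of this sketch).
    """
    if len(predictions) != len(annotations) or not annotations:
        raise ValueError("need equally sized, non-empty prediction and annotation lists")
    n = len(annotations)
    agent_hits = sum(p["agent"] == a["agent"] for p, a in zip(predictions, annotations))
    step_hits = sum(p["step"] == a["step"] for p, a in zip(predictions, annotations))
    return agent_hits / n, step_hits / n
```

Step-level accuracy is the stricter of the two in practice, since a log contains many more candidate steps than candidate agents, which is consistent with the gap between the two reported numbers.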


ICLR Conference 2024 Conference Paper

IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models

Shaokun Zhang
Xiaobo Xia
Zhaoqing Wang
Linghao Chen
Jiale Liu
Qingyun Wu
Tongliang Liu

In-context learning is a promising paradigm that utilizes in-context examples as prompts for the predictions of large language models. These prompts are crucial for achieving strong performance. However, since the prompts need to be sampled from a large volume of annotated examples, finding the right prompt may result in high annotation costs. To address this challenge, this paper introduces an influence-driven selective annotation method that aims to minimize annotation costs while improving the quality of in-context examples. The essence of our method is to select a pivotal subset from a large-scale unlabeled data pool to annotate for the subsequent sampling of prompts. Specifically, a directed graph is first constructed to represent unlabeled data. Afterward, the influence of candidate unlabeled subsets is quantified with a diffusion process. A simple yet effective greedy algorithm for unlabeled data selection is lastly introduced. It iteratively selects the data if it provides a maximum marginal gain with respect to quantified influence. Compared with previous efforts on selective annotations, our influence-driven method works in an end-to-end manner, avoids an intractable explicit balance between data diversity and representativeness, and enjoys theoretical support. Experiments confirm the superiority of the proposed method on various benchmarks, achieving better performance under lower time consumption during subset selection.
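The selection procedure outlined in this abstract (a directed graph over unlabeled data, an influence measure obtained by diffusion, and a greedy choice by maximum marginal gain) can be sketched as follows. This is a simplified illustration rather than the paper's algorithm: diffusion is replaced here by deterministic reachability along directed edges, the graph is assumed to be given as an adjacency dictionary with every node present as a key, and all names are hypothetical.

```python
def influence(graph, seeds):
    """Set of nodes reached from `seeds` by following directed edges.

    Deterministic reachability stands in for the diffusion process
    (an assumption of this sketch).
    """
    reached = set(seeds)
    frontier = list(seeds)
    while frontier:
        node = frontier.pop()
        for neighbor in graph.get(node, ()):
            if neighbor not in reached:
                reached.add(neighbor)
                frontier.append(neighbor)
    return reached


def greedy_select(graph, budget):
    """Pick up to `budget` nodes, each time taking the largest marginal gain."""
    selected = []
    covered = set()
    for _ in range(budget):
        best, best_gain = None, -1
        for candidate in sorted(graph):           # fixed order makes ties deterministic
            if candidate in selected:
                continue
            gain = len(influence(graph, selected + [candidate])) - len(covered)
            if gain > best_gain:
                best, best_gain = candidate, gain
        if best is None:                          # every node is already selected
            break
        selected.append(best)
        covered = influence(graph, selected)
    return selected
```

For example, with `graph = {0: [1, 2], 1: [2], 2: [], 3: [4], 4: [], 5: []}` and a budget of 2, the sketch first picks node 0 (which reaches three nodes) and then node 3 (which adds two more), so only those two examples would be sent for annotation.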


ICML Conference 2024 Conference Paper

Offline Training of Language Model Agents with Functions as Learnable Weights

Shaokun Zhang
Jieyu Zhang 0001
Jiale Liu
Linxin Song
Chi Wang 0001
Ranjay Krishna
Qingyun Wu

Researchers and practitioners have recently reframed powerful Large Language Models (LLMs) as agents, enabling them to automate complex tasks largely via the use of specialized functions. To facilitate the development of LLM agents, we present a novel paradigm of training LLM agents without modifying the LLM weights, which is particularly useful when the LLMs are difficult or inaccessible for modifications. Inspired by how humans continuously forge tools to adapt to real-world tasks, rather than change our biological structure to fit a static set of tools, we propose to progressively forge agent’s functions to better solve the downstream tasks instead of modifying the LLM weights. By treating the functions as learnable ‘agent parameters’ and leveraging the fundamental idea of model training in artificial intelligence, we develop AgentOptimizer that employs the LLM to update agents’ functions and devise an agent training algorithm with two strategies, roll-back, and early-stop, to streamline the training process. With extensive experiments, we showcase that the agent training paradigm could significantly improve the performance of representative LLM agents in various downstream tasks. We also study the behavior of the agent training regarding aspects like the learning curve and domain transferability.


ICML Conference 2024 Conference Paper

Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints

Xiaobo Xia
Jiale Liu
Shaokun Zhang
Qingyun Wu
Hongxin Wei
Tongliang Liu

Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms. It strives to identify a small subset from large-scale data, so that training only on the subset practically performs on par with full data. Practitioners regularly desire to identify the smallest possible coreset in realistic scenes while maintaining comparable model performance, to minimize costs and maximize acceleration. Motivated by this desideratum, for the first time, we pose the problem of refined coreset selection, in which the minimal coreset size under model performance constraints is explored. Moreover, to address this problem, we propose an innovative method, which maintains optimization priority order over the model performance and coreset size, and efficiently optimizes them in the coreset selection procedure. Theoretically, we provide the convergence guarantee of the proposed method. Empirically, extensive experiments confirm its superiority compared with previous strategies, often yielding better model performance with smaller coreset sizes. The implementation is available at https://github.com/xiaoboxia/LBCS.


ICLR Conference 2023 Conference Paper

Moderate Coreset: A Universal Method of Data Selection for Real-world Data-efficient Deep Learning

Xiaobo Xia
Jiale Liu
Jun Yu 0001
Xu Shen 0001
Bo Han 0003
Tongliang Liu

Deep learning methods nowadays rely on massive data, resulting in substantial costs of data storage and model training. Data selection is a useful tool to alleviate such costs, where a coreset of massive data is extracted to practically perform on par with full data. Based on carefully-designed score criteria, existing methods first compute the score of each data point and then select the data points whose scores lie in a certain range to construct a coreset. These methods work well in their respective preconceived scenarios but are not robust to the change of scenarios, since the optimal range of scores varies as the scenario changes. The issue limits the application of these methods, because realistic scenarios often mismatch preconceived ones, and it is inconvenient or infeasible to tune the criteria and methods accordingly. In this paper, to address the issue, a concept of the moderate coreset is discussed. Specifically, given any score criterion of data selection, different scenarios prefer data points with scores in different intervals. As the score median is a proxy of the score distribution in statistics, the data points with scores close to the score median can be seen as a proxy of full data and generalize across different scenarios, and they are used to construct the moderate coreset. As a proof-of-concept, a universal method that inherits the moderate coreset and uses the distance of a data point to its class center as the score criterion is proposed to meet complex realistic scenarios. Extensive experiments confirm the advantage of our method over prior state-of-the-art methods, leading to a strong baseline for future research. The implementation is available at https://github.com/tmllab/Moderate-DS.
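The idea of keeping the data points whose scores lie closest to the score median, with distance to the class center as the score, can be sketched in a few lines. This is an illustration under assumptions, not the released implementation: features are plain lists of floats rather than learned representations, the median is taken over all points rather than per class, and the function names are hypothetical.

```python
import math
from statistics import median


def class_centers(features, labels):
    """Mean feature vector for each class label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def moderate_coreset(features, labels, keep):
    """Indices of the `keep` points whose score is closest to the median score.

    Score = Euclidean distance from a point to the center of its own class.
    """
    centers = class_centers(features, labels)
    scores = [math.dist(x, centers[y]) for x, y in zip(features, labels)]
    med = median(scores)
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i] - med))
    return sorted(ranked[:keep])
```

Selecting around the median avoids committing to either the easiest points (closest to the center) or the hardest ones (farthest from it), which matches the abstract's argument that the best score range shifts from one scenario to another.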
