Author name cluster

Zhong Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
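As a rough illustration of what grouping by case-insensitive exact name match means, the sketch below buckets author rows by their case-folded name. The row schema (`id`, `name`) and the function name are hypothetical, and this is not Arrow's actual implementation.

```python
from collections import defaultdict

def cluster_by_exact_name(author_rows):
    """Group author rows whose names are identical once case is ignored."""
    clusters = defaultdict(list)
    for row in author_rows:
        # casefold() gives a case-insensitive key; no fuzzy matching is applied.
        clusters[row["name"].casefold()].append(row)
    return dict(clusters)

# Hypothetical rows: the first two differ only in case, so they share a cluster.
rows = [
    {"id": 1, "name": "Zhong Chen"},
    {"id": 2, "name": "ZHONG CHEN"},
    {"id": 3, "name": "Zhengzhang Chen"},
]
clusters = cluster_by_exact_name(rows)
```

Because the key is the name alone, two different people who share a name end up in the same cluster, which is why the page is not an identity disambiguation profile.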

14 papers

2 author rows

AAAI Conference 2026 Short Paper

Fine-tuning Zero-shot Large Language Models for Patient-reported Outcomes (Student Abstract)

Yang Yan
Matthew W. Chen
Jiayi Lyu
Chen Zhao
Hao Gao
Zhong Chen

Radiotherapy (RT) is a cornerstone of cancer treatment. Following RT, patient-reported outcomes (PROs) collected via standardized questionnaires are crucial for monitoring patients' quality of life and side effects. However, traditional statistical and machine learning methods, which rely on structured numerical data, often fail to capture semantic meaning within patients' health status. To address this, we developed a novel framework using zero- and few-shot large language models (LLMs) to identify patients experiencing mild to severe depression. Furthermore, classification performance is enhanced through parameter-efficient fine-tuning. Experiments on a prostate cancer PRO dataset for depression have demonstrated that our fine-tuned LLMs consistently outperformed other baseline methods across key evaluation metrics.

AAAI Conference 2026 Conference Paper

MARLIN: Multi-Agent Reinforcement Learning for Incremental DAG Discovery

Dong Li
Zhengzhang Chen
Xujiang Zhao
Linlin Yu
Zhong Chen
Yi He
Haifeng Chen
Chen Zhao

Uncovering causal structures from observational data is crucial for understanding complex systems and making informed decisions. While reinforcement learning (RL) has shown promise in identifying these structures in the form of a directed acyclic graph (DAG), existing methods often lack efficiency, making them unsuitable for online applications. In this paper, we propose MARLIN, an efficient multi-agent RL-based approach for incremental DAG learning. MARLIN uses a DAG generation policy that maps a continuous real-valued space to the DAG space as an intra-batch strategy, then incorporates two RL agents—state-specific and state-invariant—to uncover causal relationships and integrates these agents into an incremental learning framework. Furthermore, the framework leverages a factored action space to enhance parallelization efficiency. Extensive experiments on synthetic and real datasets demonstrate that MARLIN outperforms state-of-the-art methods in terms of both efficiency and effectiveness.

IJCAI Conference 2025 Conference Paper

A Novel Sparse Active Online Learning Framework for Fast and Accurate Streaming Anomaly Detection Over Data Streams

Zhong Chen
Yi He
Di Wu
Chen Zhao
Meikang Qiu

Online Anomaly Detection (OAD) is critical for identifying rare yet important data points in large, dynamic, and complex data streams. A key challenge lies in achieving accurate and consistent detection of anomalies while maintaining computational and memory efficiency. Conventional OAD approaches, which depend on distributional deviations and static thresholds, struggle with model update delays and catastrophic forgetting, leading to missed detections and high false positive rates. To address these limitations, we propose a novel Streaming Anomaly Detection (SAD) method, grounded in a sparse active online learning framework. Our approach uniquely integrates ℓ1,2-norm sparse online learning with CUR decomposition-based active learning, enabling simultaneous fast feature selection and dynamic instance selection. The efficient CUR decomposition further supports real-time residual analysis for anomaly scoring, eliminating the need for manual threshold settings about temporal data distributions. Extensive experiments on diverse streaming datasets demonstrate SAD's superiority, achieving a 14.06% reduction in detection error rates compared to five state-of-the-art competitors.
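To make the ℓ1,2-norm idea concrete, the sketch below applies the standard group soft-thresholding step (the proximal operator of the sum of row-wise ℓ2 norms), which zeroes out whole rows of a weight matrix. It assumes each row holds the weights of one feature; it is a generic textbook operator, not the update rule from the paper, and `prox_l12` and `threshold` are illustrative names.

```python
import math

def prox_l12(weights, threshold):
    """Row-wise group soft-thresholding for threshold * sum_i ||w_i||_2."""
    result = []
    for row in weights:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows with norm at or below the threshold are zeroed, dropping the feature.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        result.append([scale * v for v in row])
    return result

# Assumed layout: one row per feature. The second row has a small norm and is removed.
W = [[3.0, 4.0], [0.1, 0.2]]
W_sparse = prox_l12(W, threshold=1.0)
```

Zeroing entire rows is what lets an ℓ1,2 penalty act as feature selection rather than shrinking individual coefficients independently.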

NeurIPS Conference 2025 Conference Paper

AegisGuard: RL-Guided Adapter Tuning for TEE-Based Efficient & Secure On-Device Inference

Che Wang
Ziqi Zhang
Yinggui Wang
Tiantong Wang
Yurong Hao
Jianbo Gao
Tao Wei
Yang Cao

On-device large models (LMs) reduce cloud dependency but expose proprietary model weights to the end-user, making them vulnerable to white-box model stealing (MS) attacks. A common defense is TEE-Shielded DNN Partition (TSDP), which places all trainable LoRA adapters (fine-tuned on private data) inside a trusted execution environment (TEE). However, this design suffers from excessive host-to-TEE communication latency. We propose AegisGuard, a fine-tuning and deployment framework that selectively shields the MS-sensitive adapters while offloading the rest to the GPU, balancing security and efficiency. AegisGuard integrates two key components: (i) RL-based Sensitivity Measurement (RSM), which injects Gaussian noise during training and applies lightweight reinforcement learning to rank adapters based on their impact on model stealing; and (ii) Shielded-Adapter Compression (SAC), which structurally prunes the selected adapters to reduce both parameter size and intermediate feature maps, further lowering TEE computation and data transfer costs. Extensive experiments demonstrate that AegisGuard achieves black-box level MS resilience (surrogate accuracy around 39%, matching fully shielded baselines), while reducing end-to-end inference latency by 2–3× and cutting TEE memory usage by 4× compared to state-of-the-art TSDP methods.

AAAI Conference 2025 Conference Paper

Metric-Agnostic Continual Learning for Sustainable Group Fairness

Heng Lian
Chen Zhao
Zhong Chen
Xingquan Zhu
My T. Thai
Yi He

Group Fairness-aware Continual Learning (GFCL) aims to eradicate discriminatory predictions against certain demographic groups in a sequence of diverse learning tasks. This paper explores an even more challenging GFCL problem – how to sustain a fair classifier across a sequence of tasks with covariate shifts and unlabeled data. We propose the MacFRL solution, with its key idea to optimize the sequence of learning tasks. We hypothesize that high-confidence learning can be enabled in the optimized task sequence, where the classifier learns from a set of prioritized tasks to glean knowledge, thereby becoming more capable of handling the tasks with substantial distribution shifts that were originally deferred. Theoretical and empirical studies substantiate that MacFRL excels among its GFCL competitors in terms of prediction accuracy and group fairness metrics.

TMLR Journal 2025 Journal Article

Multi-Modal Foundation Models for Computational Pathology: A Survey

Dong Li
Guihong Wan
Xintao Wu
Xinyu Wu
Xiaohui Chen
Yi He
Zhong Chen
Peter K Sorger

Foundation models have emerged as a powerful paradigm in computational pathology (CPath), enabling scalable and generalizable analysis of histopathological images. While early developments centered on uni-modal models trained solely on visual data, recent advances have highlighted the promise of multi-modal foundation models that integrate heterogeneous data sources such as textual reports, structured domain knowledge, and molecular profiles. In this survey, we provide a comprehensive and up-to-date review of multi-modal foundation models in CPath, with a particular focus on models built upon hematoxylin and eosin (H&E) stained whole slide images (WSIs) and tile-level representations. We categorize 34 state-of-the-art multi-modal foundation models into three major paradigms: vision-language, vision-knowledge graph, and vision-gene expression. We further divide vision-language models into non-LLM-based and LLM-based approaches. Additionally, we analyze 30 available multi-modal datasets tailored for pathology, grouped into image-text pairs, instruction datasets, and image-other modality pairs. Our survey also presents a taxonomy of downstream tasks, highlights training and evaluation strategies, and identifies key challenges and future directions. We aim for this survey to serve as a valuable resource for researchers and practitioners working at the intersection of pathology and AI.

JBHI Journal 2025 Journal Article

MvKeTR: Chest CT Report Generation With Multi-View Perception and Knowledge Enhancement

Xiwei Deng
Xianchun He
Jianfeng Bao
Yudan Zhou
Shuhui Cai
Congbo Cai
Zhong Chen

CT report generation (CTRG) aims to automatically generate diagnostic reports for 3D volumes, relieving clinicians’ workload and improving patient care. Despite clinical value, existing works fail to effectively incorporate diagnostic information from multiple anatomical views and lack related clinical expertise essential for accurate and reliable diagnosis. To resolve these limitations, we propose a novel Multi-view perception Knowledge-enhanced Transformer (MvKeTR) to mimic the diagnostic workflow of clinicians. Just as radiologists first examine CT scans from multiple planes, a Multi-View Perception Aggregator (MVPA) with view-aware attention is proposed to synthesize diagnostic information from multiple anatomical views effectively. Then, inspired by how radiologists further refer to relevant clinical records to guide diagnostic decision-making, a Cross-Modal Knowledge Enhancer (CMKE) is devised to retrieve the most similar reports based on the query volume to incorporate domain knowledge into the diagnosis procedure. Furthermore, instead of traditional MLPs, we employ Kolmogorov-Arnold Networks (KANs) as the fundamental building blocks of both modules, which exhibit superior parameter efficiency and reduced spectral bias to better capture high-frequency components critical for CT interpretation while mitigating overfitting. Extensive experiments on the public CTRG-Chest-548K dataset demonstrate that our method outpaces prior state-of-the-art (SOTA) models across almost all metrics.

NeurIPS Conference 2025 Conference Paper

SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search

Dong Li
Xujiang Zhao
Linlin Yu
Yanchi Liu
Wei Cheng
Zhengzhang Chen
Zhong Chen
Feng Chen

Large Language Models (LLMs) offer promising capabilities for tackling complex reasoning tasks, including optimization problems. However, existing methods either rely on prompt engineering, which leads to poor generalization across problem types, or require costly supervised training. We introduce SolverLLM, a training-free framework that leverages test-time scaling to solve diverse optimization problems. Rather than solving directly, SolverLLM generates mathematical formulations and translates them into solver-ready code, guided by a novel Monte Carlo Tree Search (MCTS) strategy. To enhance the search process, we modify classical MCTS with (1) dynamic expansion for adaptive formulation generation, (2) prompt backpropagation to guide exploration via outcome-driven feedback, and (3) uncertainty backpropagation to incorporate reward reliability into decision-making. Experiments on six standard benchmark datasets demonstrate that SolverLLM outperforms both prompt-based and learning-based baselines, achieving strong generalization without additional training.
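For readers unfamiliar with the classical MCTS that the abstract takes as its starting point, the sketch below shows the usual UCT selection rule (average reward plus an exploration bonus). It illustrates only the textbook baseline, not SolverLLM's dynamic expansion, prompt backpropagation, or uncertainty backpropagation; the node fields and the exploration constant are assumptions made for the example.

```python
import math

def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Classical UCT: exploitation term plus exploration bonus."""
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """Pick the child with the highest UCT score."""
    return max(
        children,
        key=lambda ch: uct_score(ch["reward"], ch["visits"], parent_visits),
    )

# Hypothetical children of one search node (names and statistics are illustrative).
children = [
    {"name": "a", "reward": 6.0, "visits": 10},
    {"name": "b", "reward": 1.0, "visits": 1},
    {"name": "c", "reward": 0.0, "visits": 0},
]
best = select_child(children, parent_visits=11)
```

The exploration bonus shrinks as a child accumulates visits, so rarely tried branches keep getting a chance even when their early rewards are low.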

AAAI Conference 2024 Conference Paper

MKG-FENN: A Multimodal Knowledge Graph Fused End-to-End Neural Network for Accurate Drug–Drug Interaction Prediction

Di Wu
Wu Sun
Yi He
Zhong Chen
Xin Luo

Taking multiple incompatible drugs together may cause adverse interactions and side effects on the body. Accurate prediction of drug-drug interaction (DDI) events is essential for avoiding this issue. Recently, various artificial intelligence-based approaches have been proposed for predicting DDI events. However, DDI events are associated with complex relationships and mechanisms among drugs, targets, enzymes, transporters, molecular structures, etc. Existing approaches either partially or loosely consider these relationships and mechanisms by a non-end-to-end learning framework, resulting in sub-optimal feature extractions and fusions for prediction. Different from them, this paper proposes a Multimodal Knowledge Graph Fused End-to-end Neural Network (MKG-FENN) that consists of two main parts: multimodal knowledge graph (MKG) and fused end-to-end neural network (FENN). First, MKG is constructed by comprehensively exploiting DDI events-associated relationships and mechanisms from four knowledge graphs of drugs-chemical entities, drug-substructures, drugs-drugs, and molecular structures. Correspondingly, a four-channel graph neural network is designed to extract high-order and semantic features from MKG. Second, FENN designs a multi-layer perceptron to fuse the extracted features by end-to-end learning. With such designs, the feature extractions and fusions of DDI events are guaranteed to be comprehensive and optimal for prediction. Through extensive experiments on real drug datasets, we demonstrate that MKG-FENN exhibits high accuracy and significantly outperforms state-of-the-art models in predicting DDI events. The source code and supplementary file of this article are available at: https://github.com/wudi1989/MKG-FENN.

AAAI Conference 2023 Conference Paper

Online Semi-supervised Learning with Mix-Typed Streaming Features

Di Wu
Shengda Zhuo
Yu Wang
Zhong Chen
Yi He

Online learning with feature spaces that are not fixed but can vary over time offers a seemingly flexible learning paradigm and has thus drawn much attention. Unfortunately, two restrictions prohibit a ubiquitous application of this learning paradigm in practice. First, whereas prior studies mainly assume a homogeneous feature type, data streams generated from real applications can be heterogeneous, in which Boolean, ordinal, and continuous features co-exist. Existing methods that prescribe parametric distributions such as Gaussians would not suffice to model the correlation among such mix-typed features. Second, while full supervision seems to be a default setup, providing labels to all arriving data instances over a long time span is tangibly onerous, laborious, and economically unsustainable. Alas, a semi-supervised online learner that can deal with mix-typed, varying feature spaces is still missing. To fill the gap, this paper explores a novel problem, named Online Semi-supervised Learning with Mix-typed streaming Features (OSLMF), which strives to relax the restrictions on the feature type and supervision information. Our key idea to solve the new problem is to leverage a copula model to align the data instances with different feature spaces so as to make their distance measurable. A geometric structure underlying data instances is then established in an online fashion based on their distances, through which the limited labeling information is propagated from the scarce labeled instances to their close neighbors. Experimental results are documented to evidence the viability and effectiveness of our proposed approach. Code is released at https://github.com/wudi1989/OSLMF.

YNIMG Journal 2022 Journal Article

Single-shot multi-parametric mapping based on multiple overlapping-echo detachment (MOLED) imaging

Lingceng Ma
Jian Wu
Qinqin Yang
Zihan Zhou
Hongjian He
Jianfeng Bao
Lijun Bao
Xiaoyin Wang

Multi-parametric quantitative magnetic resonance imaging (mqMRI) allows the characterization of multiple tissue properties non-invasively and has shown great potential to enhance the sensitivity of MRI measurements. However, real-time mqMRI during dynamic physiological processes or general motions remains challenging. To overcome this bottleneck, we propose a novel mqMRI technique based on multiple overlapping-echo detachment (MOLED) imaging, termed MQMOLED, to enable mqMRI in a single shot. In the data acquisition of MQMOLED, multiple MR echo signals with different multi-parametric weightings and phase modulations are generated and acquired in the same k-space. The k-space data is Fourier transformed and fed into a well-trained neural network for the reconstruction of multi-parametric maps. We demonstrated the accuracy and repeatability of MQMOLED in simultaneously mapping apparent proton density (APD) and any two parameters among T2, T2*, and apparent diffusion coefficient (ADC) in 130–170 ms. The abundant information delivered by the multiple overlapping-echo signals in MQMOLED makes the technique potentially robust to system imperfections, such as inhomogeneity of static magnetic field or radiofrequency field. Benefitting from the single-shot feature, MQMOLED exhibits a strong motion tolerance to the continuous movements of subjects. For the first time, it captured the synchronous changes of ADC, T2, and T1-weighted APD in contrast-enhanced perfusion imaging on patients with brain tumors, providing additional information about vascular density to the hemodynamic parametric maps. We expect that MQMOLED would promote the development of mqMRI technology and greatly benefit the applications of mqMRI, including therapeutics and analysis of metabolic/functional processes.

ICRA Conference 2019 Conference Paper

Automatic Targeting of Plant Cells via Cell Segmentation and Robust Scene-Adaptive Tracking

Ishara Paranawithana
Zhong Hoo Chau
Liangjing Yang
Zhong Chen
Kamal Youcef-Toumi
U-Xuan Tan

Automatic targeting of plant cells to perform tasks like extraction of chloroplast is often desired in the study of plant biology. Hence, this paper proposes an improved cell segmentation method combined with a robust tracking algorithm for vision-guided micromanipulation in plant cells. The objective of this work is to develop an automatic plant cell detection and localization technique to complete the automated workflow for plant cell manipulation. The complex structural properties of plant cells make both segmentation of cells and visual tracking of the microneedle immensely challenging, unlike single animal cell applications. Thus, an improved version of watershed segmentation with adaptive thresholding is proposed to detect the plant cells without the need for staining of the cells or additional tedious preparations. To manipulate the needle to reach the identified centroid of the cells, tracking of the needle tip is required. Visual and motion information from two data sources namely, template tracking and projected manipulator trajectory are combined using score-based normalized weighted averaging to continuously track the microneedle. The selection of trackers is influenced by their complementary nature as the former and latter are individually robust against physical and visual uncertainties, respectively. Experimental results validate the effectiveness of the proposed method by detecting plant cell centroids accurately, tracking the microneedle constantly and reaching the plant cell of interest despite the presence of visual disturbances.
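The fusion step described in the abstract, combining two position estimates by score-based normalized weighted averaging, can be sketched as follows. The scoring scheme, coordinate convention, and function name are assumptions for illustration; the abstract does not specify how the scores are computed.

```python
def fuse_estimates(estimates):
    """Fuse (position, score) pairs by normalized weighted averaging.

    Each estimate is ((x, y), score) with a non-negative confidence score.
    """
    total = sum(score for _, score in estimates)
    if total <= 0:
        raise ValueError("at least one estimate needs a positive score")
    # Dividing by the score total normalizes the weights so they sum to 1.
    x = sum(pos[0] * score for pos, score in estimates) / total
    y = sum(pos[1] * score for pos, score in estimates) / total
    return (x, y)

# Hypothetical inputs: a template-tracking estimate and a projected-trajectory estimate.
template_estimate = ((100.0, 50.0), 0.9)
trajectory_estimate = ((104.0, 54.0), 0.3)
tip = fuse_estimates([template_estimate, trajectory_estimate])
```

When one source becomes unreliable its score drops, and the fused position shifts toward the other source without any hard switching between trackers.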

YNIMG Journal 2017 Journal Article

Changes in brain iron concentration after exposure to high-altitude hypoxia measured by quantitative susceptibility mapping

Lin Chen
Congbo Cai
Tianhe Yang
Jianzhong Lin
Shuhui Cai
Jiaxing Zhang
Zhong Chen

Hypoxia can induce physiological changes. This study aims to explore effects of high-altitude (HA) hypoxia on cerebral iron concentration. Twenty-nine healthy sea-level participants were tested shortly before and after approximately 4-week adaptation to the HA environment at the Qinghai-Tibet Plateau (4200 m), and were re-investigated after re-adaptation to the sea-level environment one year later. Iron concentration was quantified with quantitative susceptibility mapping (QSM), and the results were compared with transverse relaxation rate (R2*) measurements. The variations of magnetic susceptibility indicate that the iron concentration in gray matter regions, especially in basal ganglia, including caudate nucleus, putamen, globus pallidus and substantia nigra, increases significantly after HA exposure. This increase appears consistent with the conclusion from R2* value variations. However, unlike QSM, the R2* value fails to demonstrate the statistical difference of iron content in red nucleus. The re-investigation results show that most variations are recovered after sea-level re-adaptation for one year. Additionally, hemisphere- and gender-related differences in iron concentration changes were analyzed among cerebral regions. The results show greater possibilities in the right hemisphere and females. Further studies based on diffusion tensor imaging (DTI) suggest that the fractional anisotropy increases and the mean diffusivity decreases after HA exposure in six deep gray matter nuclei, with linear dependence on iron concentration only in putamen. In conclusion, the magnetic susceptibility value can serve as a quantitative marker of brain iron, and variations of regional susceptibility reported herein indicate that HA hypoxia can result in significant iron deposition in most deep gray matter regions. Additionally, the linear dependence of DTI metrics on iron concentration in putamen indicates a potential relationship between ferritin and water diffusion.

YNIMG Journal 2015 Journal Article

Imaging with referenceless distortion correction and flexible regions of interest using single-shot biaxial spatiotemporally encoded MRI

Jing Li
Lin Chen
Shuhui Cai
Congbo Cai
Jianhui Zhong
Zhong Chen

Owing to its intrinsic characteristics, spatiotemporally encoded (SPEN) imaging is less sensitive to adverse effects due to field inhomogeneity in comparison with echo planar imaging, a feature highly desired for functional, diffusion, and real-time MRI. However, the quality of images obtained with SPEN MRI is still degraded by geometric distortions when field inhomogeneity exists. In this study, a single-shot biaxial SPEN (bi-SPEN) pulse sequence is implemented, utilizing a 90° and a 180° chirp pulse incorporated with two orthogonal gradients. A referenceless geometric-distortion correction based on the single-shot bi-SPEN sequence is then proposed. The distorted image acquired with the single-shot bi-SPEN sequence is corrected by iterative super-resolved reconstruction involving the field gradients estimated from a field map, which in turn is obtained from its own super-resolved data after a phase-unwrapping procedure without additional scans. In addition, the distortion correction method is applied to improve the quality of the multiple region-of-interest images obtained with single-shot bi-SPEN sequence.
