Author name cluster

Zhenyu Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers

1 author row

EAAI Journal 2026 Journal Article

An intelligent vision-based method for real-time pig disease identification through postural feature analysis

Zhe Yin
Yue Cao
Hong Feng
Qiqi Guo
Xuan Wang
Zhenyu Liu

Details DOI

EAAI Journal 2025 Journal Article

A non-negative garrote shrinkage network with adaptive Swish for rotating machinery fault diagnosis under noisy environment

Pengcheng Zhong
Zhenyu Liu
Rui Li
Hui Liu
Xiaoqi Yang
Zihan Dong
Jianrong Tan

Details DOI

NeurIPS Conference 2025 Conference Paper

A Unified Analysis of Stochastic Gradient Descent with Arbitrary Data Permutations and Beyond

Yipeng Li
Xinchen Lyu
Zhenyu Liu

We aim to provide a unified convergence analysis for permutation-based Stochastic Gradient Descent (SGD), where data examples are permuted before each epoch. By examining the relations among permutations, we categorize existing permutation-based SGD algorithms into three categories: Arbitrary Permutations, Independent Permutations (including Random Reshuffling and FlipFlop; Rajput et al., 2022), and Dependent Permutations (including GraBs; Lu et al., 2022a; Cooper et al., 2023). Existing unified analyses fail to encompass the Dependent Permutations category because of the inter-epoch permutation dependency. In this work, we propose a generalized assumption that explicitly characterizes the dependence of permutations across epochs. Building upon this assumption, we develop a unified framework for permutation-based SGD with arbitrary permutations of examples, incorporating all the existing permutation-based SGD algorithms. Furthermore, we adapt our framework for Federated Learning (FL), developing a unified framework for regularized client participation FL with arbitrary permutations of clients.
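As a rough illustration of what "permutation-based SGD" means in the abstract above, the sketch below runs SGD on a toy one-dimensional least-squares problem with a pluggable rule for choosing each epoch's permutation. It is not the paper's framework or analysis. The names `permutation_sgd`, `random_reshuffling`, and `flipflop`, the toy data, and the step size are all hypothetical. The FlipFlop rule shown (reverse the previous epoch's order on every other epoch) reflects a common description of that method and should be read as an assumption.

```python
import random

def permutation_sgd(data, w0, lr, epochs, next_permutation):
    """SGD on f(w) = mean((w*x - y)^2), visiting examples in a per-epoch order.

    next_permutation(epoch, prev_perm) returns the visiting order for this
    epoch, so the rule may depend on earlier epochs (hypothetical interface).
    """
    w = w0
    perm = list(range(len(data)))
    for epoch in range(epochs):
        perm = next_permutation(epoch, perm)
        for i in perm:
            x, y = data[i]
            grad = 2.0 * (w * x - y) * x  # gradient of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

def random_reshuffling(rng):
    """Draw a fresh random permutation every epoch."""
    def rule(epoch, prev):
        perm = list(prev)
        rng.shuffle(perm)
        return perm
    return rule

def flipflop(rng):
    """Assumed FlipFlop rule: reshuffle on even epochs, reverse on odd epochs."""
    def rule(epoch, prev):
        if epoch % 2 == 1:
            return prev[::-1]
        perm = list(prev)
        rng.shuffle(perm)
        return perm
    return rule

# Toy noiseless data with true slope 3 (hypothetical example values).
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w_rr = permutation_sgd(data, 0.0, 0.05, 50, random_reshuffling(random.Random(0)))
w_ff = permutation_sgd(data, 0.0, 0.05, 50, flipflop(random.Random(0)))
```

Because the permutation rule receives the previous epoch's order, the same loop can express rules whose permutations depend on earlier epochs, which is the kind of dependence the abstract says earlier unified analyses did not cover.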

PDF Details

YNICL Journal 2025 Journal Article

Can repetitive transcranial magnetic stimulation promote recovery of consciousness in patients with disorders of consciousness? A randomized controlled trial

Zhenyu Liu
Shanshan Wu
Shuwei Wang
Huijuan Wu
Hongliang Gao
Xiao Lu

Details DOI

EAAI Journal 2025 Journal Article

Maneuvering target interception via deep reinforcement learning guidance using only line-of-sight rate measurement

Leliang Ren
Yong Xian
Zhenyu Liu
Shaopeng Li
Daqiao Zhang
Weilin Guo

Details DOI

NeurIPS Conference 2024 Conference Paper

In-Context Learning State Vector with Inner and Momentum Optimization

Dongfang Li
Zhenyu Liu
Xinshuo Hu
Zetian Sun
Baotian Hu
Min Zhang

Large Language Models (LLMs) have exhibited an impressive ability to perform In-Context Learning (ICL) from only a few examples. Recent works have indicated that the functions learned by ICL can be represented through compressed vectors derived from the transformer. However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introducing the concept of the state vector. Inspired by work on model soup and momentum-based gradient descent, we propose inner and momentum optimization methods that are applied to refine the state vector progressively as test-time adaptation. Moreover, we simulate state vector aggregation in the multiple-example setting, where demonstrations comprising numerous examples are usually too lengthy for regular ICL, and further propose a divide-and-conquer aggregation method to address this challenge. We conduct extensive experiments using Llama-2 and GPT-J in both zero-shot and few-shot settings. The experimental results show that our optimization method effectively enhances the state vector and achieves state-of-the-art performance on diverse tasks.

PDF Details DOI

AAAI Conference 2024 Conference Paper

Point Transformer with Federated Learning for Predicting Breast Cancer HER2 Status from Hematoxylin and Eosin-Stained Whole Slide Images

Bao Li
Zhenyu Liu
Lizhi Shao
Bensheng Qiu
Hong Bu
Jie Tian

Directly predicting human epidermal growth factor receptor 2 (HER2) status from widely available hematoxylin and eosin (HE)-stained whole slide images (WSIs) can reduce technical costs and expedite treatment selection. Accurately predicting HER2 requires large collections of multi-site WSIs. Federated learning enables collaborative training on these WSIs without transferring gigabyte-size WSIs and without raising data privacy concerns. However, federated learning encounters challenges in addressing label imbalance in multi-site WSIs from the real world. Moreover, existing WSI classification methods cannot simultaneously exploit local context information and long-range dependencies in the site-end feature representation of federated learning. To address these issues, we present a point transformer with federated learning for multi-site HER2 status prediction from HE-stained WSIs. Our approach incorporates two novel designs. We propose a dynamic label distribution strategy and an auxiliary classifier, which help to establish a well-initialized model and mitigate label distribution variations across sites. Additionally, we propose a farthest cosine sampling based on cosine distance. It can sample the most distinctive features and capture the long-range dependencies. Extensive experiments and analysis show that our method achieves state-of-the-art performance at four sites with a total of 2687 WSIs. Furthermore, we demonstrate that our model can generalize to two unseen sites with 229 WSIs. Code is available at: https://github.com/boyden/PointTransformerFL
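The "farthest cosine sampling" mentioned in the abstract above can be pictured as ordinary greedy farthest point sampling with cosine distance substituted for Euclidean distance. The sketch below is written under that assumption only; it is not taken from the linked repository, and the function names and toy feature vectors are hypothetical.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; zero vectors are treated as distance 1."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)

def farthest_cosine_sampling(features, k, start=0):
    """Greedily pick k indices, each maximizing its distance to the picked set."""
    n = len(features)
    k = min(k, n)
    if k == 0:
        return []
    selected = [start]
    # min_dist[i] = cosine distance from feature i to its nearest selected feature
    min_dist = [cosine_distance(f, features[start]) for f in features]
    while len(selected) < k:
        nxt = max(range(n), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i in range(n):
            d = cosine_distance(features[i], features[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected

# Hypothetical 2-D patch features; the first two point in nearly the same direction.
patch_features = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [-1.0, 0.0]]
picked = farthest_cosine_sampling(patch_features, 3)
```

In this toy input the near-duplicate direction at index 1 is left out, which matches the abstract's stated goal of sampling the most distinctive features.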

PDF Details DOI

AAAI Conference 2024 Conference Paper

Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation

Xinshuo Hu
Dongfang Li
Baotian Hu
Zihao Zheng
Zhenyu Liu
Min Zhang

Large language models (LLMs) have been widely used in various applications but are known to suffer from issues related to untruthfulness and toxicity. While parameter-efficient modules (PEMs) have demonstrated their effectiveness in equipping models with new skills, leveraging PEMs for deficiency unlearning remains underexplored. In this work, we propose a PEM operation approach, namely Extraction-before-Subtraction (Ext-Sub), to enhance the truthfulness and detoxification of LLMs through the integration of an “expert” PEM and an “anti-expert” PEM. Remarkably, even anti-expert PEMs possess valuable capabilities due to their proficiency in generating fabricated content, which necessitates language modeling and logical narrative competence. Rather than merely negating the parameters, our approach involves extracting and eliminating solely the deficiency capability within the anti-expert PEM while preserving the general capabilities. To evaluate the effectiveness of our approach in terms of truthfulness and detoxification, we conduct extensive experiments on LLMs, encompassing additional abilities such as language modeling and mathematical reasoning. Our empirical results demonstrate that our approach effectively improves truthfulness and detoxification, while largely preserving the fundamental abilities of LLMs.

PDF Details DOI

JBHI Journal 2024 Journal Article

Stimulus-Response Patterns: The Key to Giving Generalizability to Text-Based Depression Detection Models

Zhenyu Liu
Yang Wu
Haibo Zhang
Gang Li
Zhijie Ding
Bin Hu

Text content analysis for depression detection using machine learning techniques has become a prominent area of research. However, previous studies focused mainly on analyzing the textual content, neglecting the fundamental factors driving text generation. Consequently, existing models face the challenge of poor generalization to out-of-domain data as they struggle to capture the crucial features of depression. To address this, we propose a novel computational perspective of “stimulus-response patterns” that brings us closer to the essence of clinical diagnosis of depression. Adopting this computational perspective allows us to conceptually unify diverse datasets and generalize this perspective to common datasets in the field. We introduce the Stimulus-Response Patterns-aware Network (SRP-Net) as an exemplary approach within this computational perspective. To assess the performance of the SRP-Net, we constructed a multi-stimulus dataset and conducted experimental evaluations, demonstrating its exceptional cross-stimulus generalizability. Furthermore, we demonstrated the promising performance of SRP-Net in real medical scenarios and conducted an interpretability analysis of the stimulus-response patterns. Our research investigates the critical role of stimulus-response patterns in enhancing the generalizability of text-based depression detection models, which can potentially facilitate data-driven depression detection to approach the diagnostic accuracy of psychiatrists.

Details DOI

JBHI Journal 2023 Journal Article

Development of Prognostic Biomarkers by TMB-Guided WSI Analysis: A Two-Step Approach

Xiangyu Liu
Zhenyu Liu
Ye Yan
Kai Wang
Aodi Wang
Xiongjun Ye
Liwei Wang
Wei Wei

The rapid development of computational pathology has brought new opportunities for prognosis prediction using histopathological images. However, the existing deep learning frameworks lack exploration of the relationship between images and other prognostic information, resulting in poor interpretability. Tumor mutation burden (TMB) is a promising biomarker for predicting the survival outcomes of cancer patients, but its measurement is costly. Its heterogeneity may be reflected in histopathological images. Here, we report a two-step framework for prognostic prediction using whole-slide images (WSIs). First, the framework adopts a deep residual network to encode the phenotype of WSIs and classifies patient-level TMB by the deep features after aggregation and dimensionality reduction. Then, the patients' prognosis is stratified by the TMB-related information obtained during the classification model development. Deep learning feature extraction and TMB classification model construction are performed on an in-house dataset of 295 Haematoxylin & Eosin stained WSIs of clear cell renal cell carcinoma (ccRCC). The development and evaluation of prognostic biomarkers are performed on The Cancer Genome Atlas-Kidney ccRCC (TCGA-KIRC) project with 304 WSIs. Our framework achieves good performance for TMB classification with an area under the receiver operating characteristic curve (AUC) of 0.813 on the validation set. Through survival analysis, our proposed prognostic biomarkers can achieve significant stratification of patients' overall survival (P < 0.05) and outperform the original TMB signature in risk stratification of patients with advanced disease. The results indicate the feasibility of mining TMB-related information from WSI to achieve stepwise prognosis prediction.

Details DOI

AAAI Conference 2023 Conference Paper

One-for-All: Proposal Masked Cross-Class Anomaly Detection

Xincheng Yao
Chongyang Zhang
Ruoqi Li
Jun Sun
Zhenyu Liu

One of the main challenges for anomaly detection (AD) is how to learn one unified and generalizable model that adapts to multi-class and especially cross-class settings: the model is trained with normal samples from seen classes with the objective of detecting anomalies from both seen and unseen classes. In this work, we propose a novel Proposal Masked Anomaly Detection (PMAD) approach for such challenging multi- and cross-class anomaly detection. The proposed PMAD can be adapted to seen and unseen classes by two key designs: MAE-based patch-level reconstruction and prototype-guided proposal masking. First, motivated by MAE (Masked AutoEncoder), we develop a patch-level reconstruction model rather than the image-level reconstruction adopted in most AD methods, for this reason: the masked patches in unseen classes can be reconstructed well by using the visible patches and the adaptive reconstruction capability of MAE. Moreover, we improve MAE with a ViT encoder-decoder architecture, combinational masking, and visual tokens as reconstruction objectives to make it more suitable for anomaly detection. Second, we develop a two-stage anomaly detection procedure for inference. In the proposal masking stage, the prototype-guided proposal masking module is used to generate as many proposals for suspicious anomalies as possible; masked patches are then generated from the proposal regions. By masking the most likely anomalous patches, the “shortcut reconstruction” issue (i.e., anomalous regions can be well reconstructed) can be mostly avoided. In the reconstruction stage, these masked patches are then reconstructed by the trained patch-level reconstruction model to determine if they are anomalies. Extensive experiments show that the proposed PMAD can outperform current state-of-the-art models significantly under the multi- and especially cross-class settings. Code will be publicly available at https://github.com/xcyao00/PMAD.

PDF Details DOI

IJCAI Conference 2023 Conference Paper

PPAT: Progressive Graph Pairwise Attention Network for Event Causality Identification

Zhenyu Liu
Baotian Hu
Zhenran Xu
Min Zhang

Event Causality Identification (ECI) aims to identify the causality between a pair of event mentions in a document, which is composed of sentence-level ECI (SECI) and document-level ECI (DECI). Previous work applies various reasoning models to identify the implicit event causality. However, they indiscriminately reason all event causality in the same way, ignoring that most inter-sentence event causality depends on intra-sentence event causality to infer. In this paper, we propose a progressive graph pairwise attention network (PPAT) to consider the above dependence. PPAT applies a progressive reasoning strategy, as it first predicts the intra-sentence event causality, and then infers the more implicit inter-sentence event causality based on the SECI result. We construct a sentence boundary event relational graph, and PPAT leverages a simple pairwise attention mechanism, which attends to different reasoning chains on the graph. In addition, we propose a causality-guided training strategy for assisting PPAT in learning causality-related representations on every layer. Extensive experiments show that our model achieves state-of-the-art performance on three benchmark datasets (5.5%, 2.2% and 4.5% F1 gains on EventStoryLine, MAVEN-ERE and Causal-TimeBank). Code is available at https://github.com/HITsz-TMG/PPAT.

PDF Details DOI

AAAI Conference 2018 Conference Paper

Computation Error Analysis of Block Floating Point Arithmetic Oriented Convolution Neural Network Accelerator Design

Zhourui Song
Zhenyu Liu
Dongsheng Wang

The heavy burdens of computation and off-chip traffic impede deploying large-scale convolutional neural networks (CNNs) on embedded platforms. Because CNNs tolerate computation errors well, employing block floating point (BFP) arithmetic in CNN accelerators could efficiently reduce hardware cost and data traffic while maintaining classification accuracy. In this paper, we verify the effects of word width definitions in BFP on CNN performance without retraining. Several typical CNN models, including VGG16, ResNet-18, ResNet-50 and GoogLeNet, were tested in this paper. Experiments revealed that an 8-bit mantissa, including the sign bit, in the BFP representation induced less than 0.3% accuracy loss. In addition, we investigate the computational errors in theory and develop the noise-to-signal ratio (NSR) upper bound, which provides promising guidance for BFP-based CNN engine design.
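To make the block floating point idea in the abstract above concrete, the sketch below quantizes a block of values to one shared exponent plus signed integer mantissas, then measures an empirical noise-to-signal ratio. It is a minimal illustration under assumptions, not the paper's accelerator design: the round-to-nearest and saturation choices, the function names `bfp_quantize` and `noise_to_signal_ratio`, and the sample block are all hypothetical.

```python
import math

def bfp_quantize(block, mantissa_bits=8):
    """Quantize floats to a shared exponent and signed mantissas.

    mantissa_bits counts the sign bit, so magnitudes use mantissa_bits - 1 bits.
    Returns (dequantized_values, scale), where scale is the value of one
    mantissa step for this block.
    """
    max_abs = max((abs(x) for x in block), default=0.0)
    if max_abs == 0.0:
        return [0.0] * len(block), 0.0
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so max_abs < 2**e.
    shared_exp = math.frexp(max_abs)[1]
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1  # largest mantissa magnitude
    values = []
    for x in block:
        m = round(x / scale)             # round to the nearest mantissa step
        m = max(-limit, min(limit, m))   # saturate instead of overflowing
        values.append(m * scale)
    return values, scale

def noise_to_signal_ratio(original, quantized):
    """Empirical NSR: quantization error energy divided by signal energy."""
    noise = sum((q - x) ** 2 for x, q in zip(original, quantized))
    signal = sum(x ** 2 for x in original)
    return noise / signal

# Hypothetical block of weights sharing one exponent.
block = [0.5, -0.25, 0.1, 0.003]
quantized, scale = bfp_quantize(block, mantissa_bits=8)
nsr = noise_to_signal_ratio(block, quantized)
```

In this block the smallest value, 0.003, falls below half a mantissa step and is rounded to zero. Small values sharing an exponent with a much larger one lose the most precision, which is the kind of error an NSR bound is meant to capture.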

PDF Details

YNICL Journal 2018 Journal Article

Radiomics analysis allows for precise prediction of epilepsy in patients with low-grade gliomas

Zhenyu Liu
Yinyan Wang
Xing Liu
Yang Du
Zhenchao Tang
Kai Wang
Jingwei Wei
Di Dong

Details DOI