Arrow Research · Search

Author name cluster

Zhenwei Shi

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers (6)

JBHI Journal 2026 Journal Article

BLADE: Breast Lesion Analysis with Domain Expertise for DCE-MRI Diagnosis

  • Zhitao Wei
  • Yi Dai
  • Yanting Liang
  • Chinting Wong
  • Yanfen Cui
  • Xiaobo Chen
  • Zhihe Zhao
  • Xiaodong Zheng

Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) is pivotal in breast cancer diagnosis, yet radiologists face challenges in interpreting its complex data due to the lack of robust automated tools. Current lesion diagnosis systems struggle with limited datasets and insufficient integration of domain knowledge. To overcome these limitations, we propose Breast Lesion Analysis with Domain Expertise (BLADE), a novel diagnosis framework that synergizes deep learning with clinical expertise. BLADE leverages a pre-trained vertical foundation model (optimized via Momentum Contrast on 2.1 million MRI slices) as its encoder, ensuring robust feature extraction. Crucially, the system incorporates prior multi-phasic hemodynamic knowledge to emulate radiologists' diagnostic reasoning and introduces a Breast Imaging Reporting and Data System (BI-RADS)-based constraint during training to align predictions with clinical standards. Extensive experiments demonstrate that BLADE outperforms state-of-the-art methods, achieving an Area Under the Curve (AUC) of 0.9228 and 0.9553 on two external test datasets, respectively. Notably, BLADE significantly enhances clinical workflow; when used as an assistive tool, BLADE improves diagnostic accuracy by 14.31%, surpassing the standalone performance of clinicians. This work bridges the gap between AI-driven analysis and clinical practice in breast MRI interpretation. The source code is available at https://github.com/GDPHMediaLab/BLADE.
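The Momentum Contrast (MoCo) pretraining mentioned in this abstract rests on two generic pieces: a key encoder updated as a slow moving average of the query encoder, and an InfoNCE loss over one positive key and a queue of negatives. A minimal NumPy sketch of those two pieces (an illustration of the general technique, not the authors' implementation; all function names here are ours):

```python
import numpy as np

def momentum_update(theta_q, theta_k, m=0.999):
    """MoCo-style update: the key encoder slowly trails the query encoder."""
    return m * theta_k + (1.0 - m) * theta_q

def info_nce_loss(q, k_pos, queue, tau=0.07):
    """InfoNCE over one positive key and a queue of negative keys."""
    q = q / np.linalg.norm(q)
    k_pos = k_pos / np.linalg.norm(k_pos)
    negs = queue / np.linalg.norm(queue, axis=1, keepdims=True)
    logits = np.concatenate([[q @ k_pos], negs @ q]) / tau
    logits -= logits.max()                       # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

rng = np.random.default_rng(0)
theta_q, theta_k = rng.normal(size=8), rng.normal(size=8)
theta_k = momentum_update(theta_q, theta_k)      # one encoder update step
loss = info_nce_loss(rng.normal(size=16), rng.normal(size=16),
                     rng.normal(size=(4, 16)))   # toy embeddings
print(float(loss))
```

In a real setup the queue holds encoded keys from previous batches, so the negative set stays large without a large batch size.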

AAAI Conference 2026 Conference Paper

Remodeling Semantic Relationships in Vision-Language Fine-Tuning

  • Xiangyang Wu
  • Liu Liu
  • Baosheng Yu
  • Jiayan Qiu
  • Zhenwei Shi

Vision-language fine-tuning has emerged as an efficient paradigm for constructing multimodal foundation models. While textual context often highlights semantic relationships within an image, existing fine-tuning methods typically overlook this information when aligning vision and language, leading to suboptimal performance. To solve this problem, we propose a method that improves multimodal alignment and fusion based on both semantics and relationships. Specifically, we first extract multilevel semantic features from different vision encoders to capture more visual cues of the relationships. Then, we learn to project the vision features into groups of related semantics, which are more likely to have relationships. Finally, we fuse the visual features with the textual features using inheritable cross-attention, globally removing redundant visual relationships by discarding visual-language feature pairs with low correlation. We evaluate the proposed method on eight foundation models and two downstream tasks, visual question answering and image captioning, and show that it outperforms all existing methods.
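The fusion step in this abstract builds on standard cross-attention, where text tokens query vision tokens. A bare NumPy sketch of that base operation (vanilla cross-attention only, not the paper's "inheritable" variant; identity projections stand in for learned W_q/W_k/W_v):

```python
import numpy as np

def cross_attention(text, vision):
    """Text tokens attend over vision tokens: softmax(Q K^T / sqrt(d)) V."""
    d = text.shape[-1]
    scores = text @ vision.T / np.sqrt(d)        # (n_text, n_vision)
    scores -= scores.max(axis=-1, keepdims=True) # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)     # softmax over vision tokens
    return attn @ vision, attn

rng = np.random.default_rng(0)
fused, attn = cross_attention(rng.normal(size=(3, 8)),   # 3 text tokens
                              rng.normal(size=(5, 8)))   # 5 vision tokens
print(fused.shape)                               # one fused vector per text token
```

Discarding low-correlation visual-language pairs, as the abstract describes, would amount to masking the lowest-scoring entries of `attn` before the weighted sum.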

AIIM Journal 2025 Journal Article

Rethinking mitosis detection: Towards diverse data and feature representation for better domain generalization

  • Jiatai Lin
  • Hao Wang
  • Danyi Li
  • Jing Wang
  • Bingchao Zhao
  • Zhenwei Shi
  • Changhong Liang
  • Guoqiang Han

Mitosis detection is one of the fundamental tasks in computational pathology, and it is extremely challenging due to the heterogeneity of mitotic cells. Most current studies address this heterogeneity from the technical side by increasing model complexity. However, neglecting biological knowledge while relying on complex model designs can lead to overfitting and limit the generalizability of the detection model. In this paper, we systematically study the morphological appearances of different mitotic phases as well as ambiguous non-mitotic cells, and identify that balancing data and feature diversity can achieve better generalizability. Based on this observation, we propose a novel generalizable framework (MitDet) for mitosis detection. Data diversity is handled by the proposed diversity-guided sample balancing (DGSB), and feature diversity is preserved by an inter- and intra-class feature diversity-preserved module (InCDP). A stain enhancement (SE) module is introduced to enhance the domain-relevant diversity of both data and features simultaneously. Extensive experiments demonstrate that our proposed model outperforms all state-of-the-art (SOTA) approaches on several popular mitosis detection datasets, on both internal and unseen test sets, using point annotations only. Comprehensive ablation studies also prove the effectiveness of this rethinking of data and feature diversity balancing. By analyzing the results quantitatively and qualitatively, we believe that our model not only achieves SOTA performance but may also inspire future studies from new perspectives. Code is available at https://github.com/linjiatai/MitDet.

JBHI Journal 2025 Journal Article

Semi-Supervised Gland Segmentation via Feature-Enhanced Contrastive Learning and Dual-Consistency Strategy

  • Jiejiang Yu
  • Bingbing Li
  • Xipeng Pan
  • Zhenwei Shi
  • Huadeng Wang
  • Rushi Lan
  • Xiaonan Luo

Deep-learning methods have made significant progress in gland segmentation in histopathology. However, most existing methods not only require a large amount of high-quality annotated data but also tend to confuse the interior of the gland with the background. To address this challenge, we propose a new semi-supervised gland segmentation method named DCCL-Seg, which follows the teacher-student framework. Our approach consists of the following steps. First, we design a contrastive learning module to improve the ability of the student model's feature extractor to distinguish gland features from background features. Then, we introduce a Signed Distance Field (SDF) prediction task and employ a dual-consistency strategy (across tasks and models) to better reinforce the learning of gland interiors. Next, we propose a pseudo-label filtering and reweighting mechanism, which filters and reweights the pseudo labels generated by the teacher model based on confidence. However, even after reweighting, the pseudo labels may still be influenced by unreliable pixels. Finally, we design an assistant predictor to learn from the reweighted pseudo labels; it does not interfere with the student model's predictor, ensuring the reliability of the student model's predictions. Experimental results on the publicly available GlaS and CRAG datasets demonstrate that our method outperforms other semi-supervised medical image segmentation methods.
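The SDF auxiliary task referred to in this abstract maps a binary segmentation mask to signed distances, conventionally negative inside the object and positive outside. A generic brute-force sketch of that mapping (not the paper's code; fine for tiny examples, whereas real pipelines use a fast distance transform):

```python
import numpy as np

def signed_distance_field(mask):
    """Signed distance map for a binary mask: positive outside the
    object, negative inside, zero for a uniform mask."""
    mask = mask.astype(bool)
    fg, bg = np.argwhere(mask), np.argwhere(~mask)
    sdf = np.zeros(mask.shape)
    for y in range(mask.shape[0]):
        for x in range(mask.shape[1]):
            opposite = bg if mask[y, x] else fg   # nearest opposite-class pixel
            if len(opposite) == 0:
                continue                          # uniform mask: stay at 0
            d = np.sqrt(((opposite - (y, x)) ** 2).sum(axis=1)).min()
            sdf[y, x] = -d if mask[y, x] else d
    return sdf

mask = np.zeros((5, 5), dtype=int)
mask[1:4, 1:4] = 1                                # a 3x3 "gland"
sdf = signed_distance_field(mask)
print(sdf[2, 2], sdf[0, 0])                       # negative inside, positive outside
```

Regressing this map alongside the mask pushes the network to model gland interiors explicitly, which is what the dual-consistency strategy then enforces across tasks and models.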

ICLR Conference 2025 Conference Paper

Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes

  • Jianqi Chen
  • Panwen Hu
  • Xiaojun Chang
  • Zhenwei Shi
  • Michael Kampffmeyer
  • Xiaodan Liang

Recent advancements in human motion synthesis have focused on specific types of motions, such as human-scene interaction, locomotion, or human-human interaction; however, there is a lack of a unified system capable of generating a diverse combination of motion types. In response, we introduce *Sitcom-Crafter*, a comprehensive and extendable system for human motion generation in 3D space, which can be guided by extensive plot contexts to enhance workflow efficiency for anime and game designers. The system comprises eight modules: three are dedicated to motion generation, while the remaining five are augmentation modules that ensure consistent fusion of motion sequences and overall system functionality. Central to the generation modules is our novel 3D scene-aware human-human interaction module, which addresses collision issues by synthesizing implicit 3D Signed Distance Function (SDF) points around motion spaces, thereby minimizing human-scene collisions without additional data collection costs. Complementing this, our locomotion and human-scene interaction modules leverage existing methods to enrich the system's motion generation capabilities. The augmentation modules encompass plot comprehension for command generation, motion synchronization for seamless integration of different motion types, hand pose retrieval to enhance motion realism, motion collision revision to prevent human collisions, and 3D retargeting to ensure visual fidelity. Experimental evaluations validate the system's ability to generate high-quality, diverse, and physically realistic motions, underscoring its potential for advancing creative workflows. Code and demonstration videos can be found in the supplementary files.

JBHI Journal 2024 Journal Article

Protecting Prostate Cancer Classification From Rectal Artifacts via Targeted Adversarial Training

  • Lei Hu
  • Dawei Zhou
  • Jiahua Xu
  • Cheng Lu
  • Chu Han
  • Zhenwei Shi
  • Qikui Zhu
  • Xinbo Gao

Magnetic resonance imaging (MRI)-based deep neural networks (DNNs) have been widely developed to perform prostate cancer (PCa) classification. However, in real-world clinical situations, prostate MRIs can easily be impacted by rectal artifacts, which have been found to lead to incorrect PCa classification. Existing DNN-based methods typically do not consider the interference of rectal artifacts on PCa classification and do not design specific strategies to address this problem. In this study, we propose a novel Targeted adversarial training with Proprietary Adversarial Samples (TPAS) strategy to defend PCa classification models against the influence of rectal artifacts. Specifically, based on clinical prior knowledge, we generate proprietary adversarial samples with rectal-artifact-pattern adversarial noise, which can severely mislead PCa classification models optimized with the ordinary training strategy. We then jointly exploit the generated proprietary adversarial samples and the original samples to train the models. To demonstrate the effectiveness of our strategy, we conducted analytical experiments on multiple PCa classification models. Compared with the ordinary training strategy, TPAS effectively improves single- and multi-parametric PCa classification at the patient, slice, and lesion levels, and brings substantial gains to recent advanced models. In conclusion, the TPAS strategy is a valuable way to mitigate the influence of rectal artifacts on deep learning models for PCa classification.
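Adversarial training of the general shape this abstract describes — generate loss-increasing perturbations, then train on clean and perturbed samples jointly — can be sketched on a logistic-regression toy with FGSM noise (a hedged illustration of the generic recipe, not the paper's artifact-pattern noise; `fgsm_perturb` and `train` are our names):

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps=0.1):
    """FGSM: step each sample in the sign of the loss gradient w.r.t. x.
    For logistic regression, that gradient is (sigmoid(xw+b) - y) * w."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return x + eps * np.sign((p - y)[:, None] * w[None, :])

def train(X, y, X_adv=None, lr=0.5, steps=200):
    """Gradient-descent fit on clean samples, optionally joined by
    adversarial copies that keep their original labels."""
    if X_adv is not None:
        X, y = np.vstack([X, X_adv]), np.concatenate([y, y])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * (p - y).mean()
    return w, b

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])
w, b = train(X, y)                        # ordinary training
X_adv = fgsm_perturb(X, w, b, y)          # worst-case perturbed copies
w2, b2 = train(X, y, X_adv)               # joint clean + adversarial refit
acc = ((fgsm_perturb(X, w2, b2, y) @ w2 + b2 > 0) == y).mean()
print(f"accuracy under attack after joint training: {acc:.2f}")
```

TPAS differs in that its perturbations are shaped by clinical prior knowledge of rectal artifacts rather than by a generic gradient-sign attack, but the train-on-both loop is the same skeleton.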