Arrow Research search

Author name cluster

Yuexiang Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

22 papers
1 author row

Possible papers

22

NeurIPS Conference 2025 Conference Paper

Degradation-Aware Dynamic Schrödinger Bridge for Unpaired Image Restoration

  • Jingjun Yi
  • Qi Bi
  • Hao Zheng
  • Huimin Huang
  • Yixian Shen
  • Haolan Zhan
  • Wei Ji
  • Yawen Huang

Image restoration is a fundamental task in computer vision and machine learning, which learns a mapping between clear images and images degraded under various conditions (e.g., blur, low light, haze). Yet, most existing image restoration methods are highly restricted by the requirement of degraded and clear image pairs, which limits their generalization and feasibility in the many real-world scenarios where paired images are unavailable. To address this bottleneck, we propose a Degradation-aware Dynamic Schrödinger Bridge (DDSB) for unpaired image restoration. Its general idea is to learn a Schrödinger Bridge between the clear and degraded image distributions while emphasizing physical degradation priors to reduce the accumulation of errors during the restoration process. A Degradation-aware Optimal Transport (DOT) learning scheme is accordingly devised. Training a degradation model to learn the inverse restoration process is particularly challenging, as it must be applicable across different stages of the iterative restoration process. A Dynamic Transport with Consistency (DTC) learning objective is further proposed to reduce the loss of image details in the early iterations and thereby refine the degradation model. Extensive experiments on multiple image degradation tasks show its state-of-the-art performance over prior arts.

AAAI Conference 2025 Conference Paper

DGFamba: Learning Flow Factorized State Space for Visual Domain Generalization

  • Qi Bi
  • Jingjun Yi
  • Hao Zheng
  • Haolan Zhan
  • Wei Ji
  • Yawen Huang
  • Yuexiang Li

Domain generalization aims to learn a representation from the source domain that can be generalized to arbitrary unseen target domains. A fundamental challenge for visual domain generalization is the domain gap caused by dramatic style variation while the image content remains stable. Selective state space models, exemplified by VMamba, demonstrate a global receptive field for representing the content. However, how to exploit the domain-invariant property of selective state spaces is rarely explored. In this paper, we propose a novel Flow Factorized State Space model, dubbed DGFamba, for visual domain generalization. To maintain domain consistency, we map the style-augmented and the original state embeddings via flow factorization. In this latent flow space, each state embedding from a certain style is specified by a latent probability path. By aligning these probability paths in the latent space, the state embeddings are able to represent the same content distribution regardless of style differences. Extensive experiments conducted on various visual domain generalization settings show its state-of-the-art performance.

JBHI Journal 2025 Journal Article

Federated Pseudo Modality Generation for Incomplete Multi-Modal MRI Reconstruction

  • Yunlu Yan
  • Chun-Mei Feng
  • Yuexiang Li
  • Ping Li
  • Rick Siow Mong Goh
  • Baiying Lei
  • Weiming Wang
  • David Dagan Feng

While multi-modal learning has been widely used for MRI reconstruction, it relies on paired multi-modal data, which are difficult to acquire in real clinical scenarios. Especially in the federated setting, it is common for several medical institutions to suffer from missing modalities or even to hold only single-modal data. Therefore, it is infeasible to deploy a standard federated learning framework under such conditions. In this paper, we propose a novel communication-efficient federated learning framework (namely Fed-PMG) to address the missing-modality challenge in federated multi-modal MRI reconstruction. Specifically, we utilize a pseudo modality generation mechanism to recover the missing modality for each single-modal client by sharing the distribution information of the amplitude spectrum in frequency space. However, sharing the original amplitude spectra leads to heavy communication costs. To reduce the communication cost, we introduce a clustering scheme that projects the set of amplitude spectra onto a finite number of cluster centroids, which are then shared among the clients. With such an elaborate design, our approach can effectively complete the missing modality within an acceptable communication cost. Extensive experimental results demonstrate that our proposed method outperforms state-of-the-art methods and reaches performance close to the ideal scenario (i.e., all clients have the full set of modalities).
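
The abstract's key communication trick, clustering the amplitude spectra and sharing only the centroids, can be sketched in a few lines. The following is a minimal NumPy/scikit-learn illustration assuming 2D single-channel slices; the function names, shapes, and the phase-amplitude recombination step are assumptions for illustration, not the released Fed-PMG code.

```python
import numpy as np
from sklearn.cluster import KMeans

def amplitude_centroids(images, k=8):
    """Cluster the amplitude spectra of a client's images into k centroids (shared instead of raw spectra)."""
    amps = np.stack([np.abs(np.fft.fft2(img)).ravel() for img in images])
    km = KMeans(n_clusters=k, n_init=10).fit(amps)
    return km.cluster_centers_.reshape(k, *images[0].shape)

def pseudo_modality(own_image, shared_centroid):
    """Combine the local image's phase with a received amplitude centroid to synthesise a pseudo modality."""
    phase = np.angle(np.fft.fft2(own_image))
    return np.real(np.fft.ifft2(shared_centroid * np.exp(1j * phase)))

# toy usage with random stand-ins for single-modal MRI slices
client_slices = [np.random.rand(64, 64) for _ in range(32)]
centroids = amplitude_centroids(client_slices, k=4)        # only these k arrays leave the client
fake_modality = pseudo_modality(np.random.rand(64, 64), centroids[0])
```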

NeurIPS Conference 2025 Conference Paper

Learning a Cross-Modal Schrödinger Bridge for Visual Domain Generalization

  • Hao Zheng
  • Jingjun Yi
  • Qi Bi
  • Huimin Huang
  • Haolan Zhan
  • Yawen Huang
  • Yuexiang Li
  • Xian Wu

Domain generalization aims to train models that perform robustly on unseen target domains without access to target data. Vision-language foundation models have opened a new avenue owing to their inherent out-of-distribution generalization capability. However, static alignment to class-level textual anchors remains insufficient to handle the dramatic distribution discrepancy of diverse domain-specific visual features. In this work, we propose a novel cross-domain Schrödinger Bridge (SB) method, namely SBGen, which explicitly formulates the stochastic semantic evolution to gain better generalization to unseen domains. Technically, the proposed SBGen consists of three key components: (1) text-guided domain-aware feature selection to isolate semantically aligned image tokens; (2) stochastic cross-domain evolution to simulate the SB dynamics via a learnable time-conditioned drift; and (3) stochastic domain-agnostic interpolation to construct semantically grounded feature trajectories. Empirically, SBGen achieves state-of-the-art performance on domain generalization in both classification and segmentation. This work highlights the importance of modeling domain shifts as structured stochastic processes grounded in semantic alignment.

AAAI Conference 2025 Conference Paper

S³-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation

  • Gui Wang
  • Yuexiang Li
  • Wenting Chen
  • Meidan Ding
  • Wooi Ping Cheah
  • Rong Qu
  • Jianfeng Ren
  • Linlin Shen

Small lesions play a critical role in the early diagnosis and intervention of severe infections. Popular models often struggle to segment small lesions, as they occupy only a minor portion of an image, while down-sampling operations inevitably lose focus on their local features. To tackle these challenges, we propose a Small-Size-Sensitive Mamba (S³-Mamba), which promotes sensitivity to small lesions across three dimensions: channel, spatial, and training strategy. Specifically, an Enhanced Visual State Space block is designed to focus on small lesions through multiple residual connections that preserve local features, and to selectively amplify important details while suppressing irrelevant ones through channel-wise attention. A Tensor-based Cross-feature Multi-scale Attention is designed to integrate input-image and intermediate-layer features with edge features and to exploit the attentive support of features across multiple scales, thereby retaining spatial details of small lesions at various granularities. Finally, we introduce a novel regularized curriculum learning scheme to automatically assess lesion size and sample difficulty and gradually shift focus from easy samples to hard ones such as small lesions. Extensive experiments on three medical image segmentation datasets show the superiority of our S³-Mamba, especially in segmenting small lesions.
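
The regularized curriculum learning component is described only at a high level; as a loose illustration of size-aware curriculum weighting (an assumed scheme, not the paper's actual design), one could schedule per-sample loss weights from lesion area:

```python
import torch

def curriculum_weights(lesion_fraction, epoch, total_epochs, tiny=1e-3):
    """Toy schedule: down-weight very small (hard) lesions early in training and let their
    weight grow towards 1.0 as training proceeds. Purely illustrative."""
    progress = epoch / max(total_epochs - 1, 1)                          # 0 at the start, 1 at the end
    difficulty = torch.clamp(tiny / (lesion_fraction + 1e-8), max=1.0)   # tiny lesions -> difficulty near 1
    return 1.0 - difficulty * (1.0 - progress)

areas = torch.tensor([0.20, 0.01, 0.0005])                    # lesion area as a fraction of the image
print(curriculum_weights(areas, epoch=0, total_epochs=100))   # hard (tiny-lesion) samples start down-weighted
```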

AAAI Conference 2024 Conference Paper

Combinatorial CNN-Transformer Learning with Manifold Constraints for Semi-supervised Medical Image Segmentation

  • Huimin Huang
  • Yawen Huang
  • Shiao Xie
  • Lanfen Lin
  • Ruofeng Tong
  • Yen-Wei Chen
  • Yuexiang Li
  • Yefeng Zheng

Semi-supervised learning (SSL), as one of the dominant paradigms, aims at leveraging unlabeled data to deal with the annotation dilemma of supervised learning, and has attracted much attention in medical image segmentation. Most existing approaches employ a unitary convolutional neural network (CNN) and enforce consistency of predictions under small perturbations applied to inputs or models. The drawbacks of such a learning paradigm are that (1) CNN-based models place severe limitations on global learning and (2) rich and diverse class-level distributions are inhibited. In this paper, we present a novel CNN-Transformer learning framework in the manifold space for semi-supervised medical image segmentation. First, at the intra-student level, we propose a novel class-wise consistency loss to facilitate the learning of both discriminative and compact target feature representations. Then, at the inter-student level, we align the CNN and Transformer features using a prototype-based optimal transport method. Extensive experiments show that our method outperforms previous state-of-the-art methods on three public medical image segmentation benchmarks.

JBHI Journal 2024 Journal Article

Cross-Modal Vertical Federated Learning for MRI Reconstruction

  • Yunlu Yan
  • Hong Wang
  • Yawen Huang
  • Nanjun He
  • Lei Zhu
  • Yong Xu
  • Yuexiang Li
  • Yefeng Zheng

Federated learning enables multiple hospitals to cooperatively learn a shared model without privacy disclosure. Existing methods commonly assume that the data from different hospitals have the same modalities. However, such a setting is difficult to satisfy in practice, since imaging guidelines may differ between hospitals, which makes the number of individuals with the same set of modalities limited. To this end, we formulate this practical-yet-challenging cross-modal vertical federated learning task, in which data from multiple hospitals have different modalities and only a small amount of multi-modality data is collected from the same individuals. To tackle this situation, we develop a novel framework, namely Federated Consistent Regularization constrained Feature Disentanglement (Fed-CRFD), for boosting MRI reconstruction by effectively exploring the overlapping samples (i.e., the same patients with different modalities at different hospitals) and solving the domain shift problem caused by different modalities. Particularly, our Fed-CRFD involves an intra-client feature disentanglement scheme to decouple data into modality-invariant and modality-specific features, where the modality-invariant features are leveraged to mitigate the domain shift problem. In addition, a cross-client latent representation consistency constraint is proposed specifically for the overlapping samples to further align the modality-invariant features extracted from different modalities. Hence, our method can fully exploit the multi-source data from hospitals while alleviating the domain shift problem. Extensive experiments on two typical MRI datasets demonstrate that our network clearly outperforms state-of-the-art MRI reconstruction methods.

NeurIPS Conference 2024 Conference Paper

Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation

  • Qi Bi
  • Jingjun Yi
  • Hao Zheng
  • Haolan Zhan
  • Yawen Huang
  • Wei Ji
  • Yuexiang Li
  • Yefeng Zheng

The emerging vision foundation model (VFM) has inherited the ability to generalize to unseen images. Nevertheless, the key challenge of domain-generalized semantic segmentation (DGSS) lies in the domain gap attributed to cross-domain styles, i.e., the variance of urban landscapes and environment dependencies. Hence, maintaining the style-invariant property under varying domain styles becomes the key bottleneck in harnessing VFMs for DGSS. The frequency space after a Haar wavelet transformation provides a feasible way to decouple the style information from the domain-invariant content, since the content and style information are retained in the low- and high-frequency components of the space, respectively. To this end, we propose a novel Frequency-Adapted (FADA) learning scheme to advance the frontier. Its overall idea is to tackle the content and style information separately by frequency tokens throughout the learning process. Particularly, the proposed FADA consists of two branches, i.e., a low- and a high-frequency branch. The former stabilizes the scene content, while the latter learns the scene styles and eliminates their impact on DGSS. Experiments conducted on various DGSS settings show the state-of-the-art performance of our FADA and its versatility across a variety of VFMs. Source code is available at https://github.com/BiQiWHU/FADA.
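
The low/high-frequency split the abstract relies on is a single-level Haar wavelet transform. A minimal PyTorch sketch of that decomposition is shown below (the branch adapters themselves are omitted; tensor shapes are assumptions):

```python
import torch

def haar_dwt2(x):
    """One-level 2D Haar transform of a (B, C, H, W) tensor with even H and W.
    Returns the low-frequency band (content-dominant) and three high-frequency bands (style/detail)."""
    a = x[..., 0::2, 0::2]   # top-left of each 2x2 block
    b = x[..., 0::2, 1::2]   # top-right
    c = x[..., 1::2, 0::2]   # bottom-left
    d = x[..., 1::2, 1::2]   # bottom-right
    ll = (a + b + c + d) / 2
    lh, hl, hh = (a - b + c - d) / 2, (a + b - c - d) / 2, (a - b - c + d) / 2
    return ll, (lh, hl, hh)

feat = torch.randn(2, 256, 32, 32)   # hypothetical VFM feature map
ll, highs = haar_dwt2(feat)
# a FADA-style scheme would then adapt the two groups with separate low- and high-frequency branches
```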

AAAI Conference 2024 Conference Paper

Learning Generalized Medical Image Segmentation from Decoupled Feature Queries

  • Qi Bi
  • Jingjun Yi
  • Hao Zheng
  • Wei Ji
  • Yawen Huang
  • Yuexiang Li
  • Yefeng Zheng

Domain-generalized medical image segmentation requires models to learn from multiple source domains and generalize well to arbitrary unseen target domains. Such a task is both technically challenging and clinically practical, due to the domain shift problem (i.e., images collected from different hospitals and scanners). Existing methods focus on either learning shape-invariant representations or reaching consensus among the source domains. An ideal generalized representation is supposed to show similar pattern responses within the same channel for cross-domain images. However, to deal with the significant distribution discrepancy, the network tends to capture similar patterns across multiple channels, while different cross-domain patterns are also allowed to rest in the same channel. To address this issue, we propose to leverage channel-wise decoupled deep features as queries. With the aid of a cross-attention mechanism, the long-range dependency between deep and shallow features can be fully mined and then guide the learning of generalized representations. Besides, a relaxed deep whitening transformation is proposed to learn channel-wise decoupled features in a feasible way. The proposed decoupled feature query (DFQ) scheme can be seamlessly integrated into a Transformer segmentation model in an end-to-end manner. Extensive experiments show its state-of-the-art performance, notably outperforming the runner-up by 1.31% and 1.98% in DSC on the generalized fundus and prostate benchmarks, respectively. Source code is available at https://github.com/BiQiWHU/DFQ.
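
To make the "channel-wise decoupled features as queries" idea concrete, here is a rough cross-attention sketch in PyTorch. It is an assumption-level illustration (one query token per deep-feature channel, attending over shallow-feature tokens), not the paper's DFQ module or its relaxed whitening transformation.

```python
import torch
import torch.nn as nn

class DecoupledFeatureQuery(nn.Module):
    """Each channel of the deep feature map becomes one query token; cross-attention then mines its
    long-range dependency on shallow-feature tokens."""
    def __init__(self, spatial, dim, heads=4):
        super().__init__()
        self.to_query = nn.Linear(spatial, dim)   # project a flattened channel map to a query token
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, deep_feat, shallow_tokens):
        # deep_feat: (B, C, H, W); shallow_tokens: (B, N, dim)
        q = self.to_query(deep_feat.flatten(2))            # (B, C, dim): one query per channel
        out, _ = self.cross_attn(q, shallow_tokens, shallow_tokens)
        return out                                         # (B, C, dim) channel-wise guided features

block = DecoupledFeatureQuery(spatial=16 * 16, dim=256)
guided = block(torch.randn(2, 64, 16, 16), torch.randn(2, 1024, 256))
```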

NeurIPS Conference 2024 Conference Paper

Samba: Severity-aware Recurrent Modeling for Cross-domain Medical Image Grading

  • Qi Bi
  • Jingjun Yi
  • Hao Zheng
  • Wei Ji
  • Haolan Zhan
  • Yawen Huang
  • Yuexiang Li
  • Yefeng Zheng

Disease grading is a crucial task in medical image analysis. Due to the continuous progression of diseases, i.e., the variability within the same level and the similarity between adjacent stages, accurate grading is highly challenging. Furthermore, in real-world scenarios, models trained on limited source-domain datasets should also be capable of handling data from unseen target domains. Because of cross-domain variation, the feature distributions of the source and unseen target domains can differ dramatically, leading to a substantial decrease in model performance. To address these challenges in cross-domain disease grading, we propose a Severity-aware Recurrent Modeling (Samba) method in this paper. As the core objective of most staging tasks is to identify the most severe lesions, which may occupy only a small portion of the image, we propose to encode image patches in a sequential and recurrent manner. Specifically, a state space model is tailored to store and transport the severity information through hidden states. Moreover, to mitigate the impact of cross-domain variation, an Expectation-Maximization (EM) based state recalibration mechanism is designed to map the patch embeddings into a more compact space. We model the feature distributions of different lesions with a Gaussian Mixture Model (GMM) and reconstruct the intermediate features from learnable severity bases. Extensive experiments show that the proposed Samba outperforms the VMamba baseline by an average accuracy of 23.5%, 5.6%, and 4.1% on the cross-domain grading of fatigue fracture, breast cancer, and diabetic retinopathy, respectively. Source code is available at https://github.com/BiQiWHU/Samba.
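
The EM-based state recalibration can be pictured as fitting a Gaussian Mixture Model to the patch embeddings and rebuilding each embedding from the component means ("severity bases"). A toy scikit-learn sketch under that reading follows; it is not the paper's learnable, in-network implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_recalibrate(patch_embeddings, n_bases=4):
    """Fit a GMM to the patch embeddings of one image and reconstruct each embedding as a
    responsibility-weighted combination of the component means."""
    gmm = GaussianMixture(n_components=n_bases, covariance_type="diag").fit(patch_embeddings)
    resp = gmm.predict_proba(patch_embeddings)   # (N, n_bases) soft assignments
    return resp @ gmm.means_                     # (N, D) features mapped into a more compact space

patches = np.random.randn(196, 64)               # hypothetical patch embeddings of one image
recalibrated = em_recalibrate(patches)
```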

IS Journal 2024 Journal Article

Unraveling Complexity: An Exploration Into Large-Scale Multimodal Signal Processing

  • Zhenyu Wen
  • Yuheng Ye
  • Jie Su
  • Taotao Li
  • Jinhao Wan
  • Shilian Zheng
  • Zhen Hong
  • Shibo He

Advanced communication systems and military reconnaissance are increasingly prevalent in high-tech environments, greatly supported by flourishing signal processing technologies. The recent exponential proliferation of sensors has led to an unprecedented expansion in the scale and diversity of signals across various modalities. Such an influx poses significant challenges in effectively integrating multimodal signal data to deliver comprehensive and interpretive solutions across a diverse range of applications. In this article, we provide an overview of the core issues, challenges, and future research directions in the different stages of developing large-scale multimodal signal processing models. Additionally, we present a preliminary investigation into signal representation learning, in which we propose a contrastive-learning-based framework to extract fine-grained signal features under few-shot conditions. Our proposed framework achieves a 24.1% performance improvement over baseline approaches, consistently demonstrating superiority over state-of-the-art methods. The code is available at https://github.com/YYH211/LSM.
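
The contrastive-learning framework is not detailed in the abstract; as a generic point of reference, a standard InfoNCE objective over two augmented views of the same signal segments looks as follows (an assumption about the general recipe, not the LSM repository's code):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss between two augmented views; matching rows are the positive pairs."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                   # (B, B) cosine similarities
    targets = torch.arange(len(z1), device=z1.device)    # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

view_a = torch.randn(16, 128)   # embeddings of one augmentation of 16 signal segments
view_b = torch.randn(16, 128)   # embeddings of a second augmentation
loss = info_nce(view_a, view_b)
```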

JBHI Journal 2023 Journal Article

Blind Super-Resolution of 3D MRI via Unsupervised Domain Transformation

  • Hexiang Zhou
  • Yawen Huang
  • Yuexiang Li
  • Yi Zhou
  • Yefeng Zheng

High-resolution medical images can be effectively used for clinical diagnosis. However, the acquisition of high-resolution images is difficult and often limited by medical instruments. Super-resolution (SR) methods provide a solution, in which high-resolution (HR) images are reconstructed from low-resolution (LR) ones. Most existing deep neural networks for 3D medical image SR are trained in a non-blind manner, where LR images are directly degraded from HR data via a pre-determined downscaling method. Such approaches rely heavily on the assumed degradation model, resulting in inevitable deviations in real clinical practice. Blind super-resolution, a more attractive research line for this field, aims to generate HR images from LR inputs containing unknown degradation. Towards generalizing SR models to diverse types of degradation, we propose a robust blind SR method for 3D medical images that works in an unsupervised manner with domain correction and upscaling treatment. First, a CycleGAN-based architecture is implemented to translate the LR data from the source domain to the target one for domain correction. Then, an upscaling network is learned via pre-determined HR-LR couples for reconstruction. The proposed framework is able to automatically learn noise and blur correction kernels for unpaired 3D SR of magnetic resonance images (MRI). Our method achieves better and more robust performance in reconstructing HR images from LR MRI with multiple unknown degradation processes, and shows its superiority to other state-of-the-art supervised models and cycle-consistency based methods, especially in severe distortion cases.

AAAI Conference 2023 Conference Paper

ClassFormer: Exploring Class-Aware Dependency with Transformer for Medical Image Segmentation

  • Huimin Huang
  • Shiao Xie
  • Lanfen Lin
  • Ruofeng Tong
  • Yen-Wei Chen
  • Hong Wang
  • Yuexiang Li
  • Yawen Huang

Vision Transformers have recently shown impressive performance on medical image segmentation. Despite their strong capability of modeling long-range dependencies, current methods still give rise to two main concerns from a class-level perspective: (1) intra-class problem: existing methods fail to extract class-specific correspondences among different pixels, which may lead to poor object coverage and/or boundary prediction; (2) inter-class problem: existing methods fail to model explicit category dependencies among various objects, which may result in inaccurate localization. In light of these two issues, we propose a novel transformer, called ClassFormer, powered by two appealing components, i.e., an intra-class dynamic transformer and an inter-class interactive transformer, to address the challenge of fully exploring compactness and discrepancy. Technically, the intra-class dynamic transformer is first designed to decouple the representations of different categories with an adaptive selection mechanism for compact learning, which optimally highlights informative features to reflect the salient keys/values from multiple scales. We further introduce the inter-class interactive transformer to capture the category dependency among different objects and model class tokens as representative class centers to guide global semantic reasoning. As a consequence, feature consistency is ensured at the expense of intra-class penalization, while the inter-class constraint strengthens the feature discriminability between different categories. Extensive empirical evidence shows that ClassFormer can be easily plugged into any architecture and yields improvements over state-of-the-art methods on three public benchmarks.

AAAI Conference 2023 Conference Paper

Combating Mode Collapse via Offline Manifold Entropy Estimation

  • Haozhe Liu
  • Bing Li
  • Haoqian Wu
  • Hanbang Liang
  • Yawen Huang
  • Yuexiang Li
  • Bernard Ghanem
  • Yefeng Zheng

Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications in recent years. However, mode collapse remains a critical problem in GANs. In this paper, we propose a novel training pipeline to address the mode collapse issue of GANs. Different from existing methods, we propose to generalize the discriminator as a feature embedding and to maximize the entropy of the distribution in the embedding space learned by the discriminator. Specifically, two regularization terms, i.e., Deep Local Linear Embedding (DLLE) and Deep Isometric feature Mapping (DIsoMap), are introduced to encourage the discriminator to learn the structural information embedded in the data, such that the embedding space learned by the discriminator can be well formed. Based on the well-learned embedding space supported by the discriminator, a non-parametric entropy estimator is designed to efficiently maximize the entropy of the embedding vectors, serving as an approximation of maximizing the entropy of the generated distribution. By improving the discriminator and maximizing the distance between the most similar samples in the embedding space, our pipeline effectively reduces mode collapse without sacrificing the quality of generated samples. Extensive experimental results show the effectiveness of our method, which outperforms the GAN baseline MaF-GAN on CelebA (9.13 vs. 12.43 in FID) and surpasses the recent state-of-the-art energy-based model on the ANIMEFACE dataset (2.80 vs. 2.26 in Inception score).
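
The non-parametric entropy estimator is described as maximizing the distance between the most similar samples in the embedding space. A nearest-neighbour (Kozachenko-Leonenko-style) surrogate along those lines can be sketched as below; this is one plausible reading, not the paper's exact estimator.

```python
import torch

def nn_entropy_surrogate(embeddings, eps=1e-12):
    """Mean log distance from each embedding to its nearest neighbour; pushing the most similar
    pairs apart increases the surrogate."""
    diff = embeddings.unsqueeze(1) - embeddings.unsqueeze(0)            # (B, B, D) pairwise differences
    dist = (diff.pow(2).sum(-1) + eps).sqrt()                           # (B, B) Euclidean distances
    mask = torch.eye(len(embeddings), dtype=torch.bool, device=dist.device)
    nn_dist = dist.masked_fill(mask, float("inf")).min(dim=1).values    # distance to nearest other sample
    return torch.log(nn_dist).mean()

emb = torch.randn(32, 128, requires_grad=True)     # stand-in for discriminator embeddings of generated samples
regulariser = -nn_entropy_surrogate(emb)           # minimising this spreads out the most similar samples
regulariser.backward()
```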

NeurIPS Conference 2023 Conference Paper

Dynamically Masked Discriminator for GANs

  • Wentian Zhang
  • Haozhe Liu
  • Bing Li
  • Jinheng Xie
  • Yawen Huang
  • Yuexiang Li
  • Yefeng Zheng
  • Bernard Ghanem

Training Generative Adversarial Networks (GANs) remains a challenging problem. The discriminator trains the generator by learning the distribution of real/generated data. However, the distribution of generated data changes throughout the training process, which is difficult for the discriminator to learn. In this paper, we propose a novel method for GANs from the viewpoint of online continual learning. We observe that the discriminator, trained on historically generated data, often slows down its adaptation to changes in the newly arriving generated data, which accordingly decreases the quality of generated results. By treating the generated data in training as a stream, we propose to detect whether the discriminator slows down the learning of new knowledge in the generated data, so that we can explicitly enforce the discriminator to learn new knowledge fast. Particularly, we propose a new discriminator that automatically detects its retardation and then dynamically masks its features, such that the discriminator can adaptively learn the temporally varying distribution of generated data. Experimental results show our method outperforms state-of-the-art approaches.

NeurIPS Conference 2023 Conference Paper

Learning Visual Prior via Generative Pre-Training

  • Jinheng Xie
  • Kai Ye
  • Yudong Li
  • Yuexiang Li
  • Kevin Qinghong Lin
  • Yefeng Zheng
  • Linlin Shen
  • Mike Zheng Shou

Various stuff and things in visual data possess specific traits, which can be learned by deep neural networks and are implicitly represented as the visual prior, e.g., object location and shape, in the model. Such a prior potentially impacts many vision tasks. For example, in conditional image synthesis, spatial conditions that fail to adhere to the prior can result in visually inaccurate synthetic results. This work aims to explicitly learn the visual prior and enable the customization of sampling. Inspired by advances in language modeling, we propose to learn the Visual prior via Generative Pre-Training, dubbed VisorGPT. By discretizing visual locations, e.g., bounding boxes, human pose, and instance masks, into sequences, VisorGPT can model the visual prior through likelihood maximization. Besides, prompt engineering is investigated to unify various visual locations and enable customized sampling of sequential outputs from the learned prior. Experimental results demonstrate the effectiveness of VisorGPT in modeling the visual prior and extrapolating to novel scenes, suggesting that discrete visual locations can be integrated into the learning paradigm of current language models to further perceive the visual world. Code is available at https://sierkinhane.github.io/visor-gpt.
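
The core preprocessing step, turning visual locations into discrete sequences, can be illustrated with a small tokenizer for bounding boxes. The bin count, special tokens, and ordering below are assumptions for illustration rather than VisorGPT's actual vocabulary.

```python
def boxes_to_sequence(boxes, labels, num_bins=512, image_size=640):
    """Quantise box coordinates into discrete bins and serialise them as a token sequence that a
    GPT-style model can learn with likelihood maximisation."""
    seq = ["<boxes>"]
    for (x0, y0, x1, y1), name in zip(boxes, labels):
        bins = [int(round(v / image_size * (num_bins - 1))) for v in (x0, y0, x1, y1)]
        seq.append(name)
        seq.extend(f"<bin_{b}>" for b in bins)
    seq.append("</boxes>")
    return seq

tokens = boxes_to_sequence([(32.0, 48.5, 210.0, 330.0)], ["person"])
# e.g. ['<boxes>', 'person', '<bin_26>', '<bin_39>', '<bin_168>', '<bin_263>', '</boxes>']
```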

IJCAI Conference 2022 Conference Paper

Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction

  • Hong Wang
  • Yuexiang Li
  • Deyu Meng
  • Yefeng Zheng

Inspired by the great success of deep neural networks, learning-based methods have achieved promising performance for metal artifact reduction (MAR) in computed tomography (CT) images. However, most existing approaches put little emphasis on modelling and embedding the intrinsic prior knowledge underlying this specific MAR task into their network designs. To address this issue, we propose an adaptive convolutional dictionary network (ACDNet), which leverages both model-based and learning-based methods. Specifically, we explore the prior structures of metal artifacts, e.g., non-local repetitive streaking patterns, and encode them as an explicit weighted convolutional dictionary model. Then, a simple-yet-effective algorithm is carefully designed to solve the model. By unfolding every iterative sub-step of the proposed algorithm into a network module, we explicitly embed the prior structure into a deep network, yielding clear interpretability for the MAR task. Furthermore, our ACDNet can automatically learn the prior for artifact-free CT images from training data and adaptively adjust the representation kernels for each input CT image based on its content. Hence, our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods. Comprehensive experiments executed on synthetic and clinical datasets show the superiority of our ACDNet in terms of effectiveness and model generalization. Code and supplementary material are available at https://github.com/hongwang01/ACDNet.

JBHI Journal 2022 Journal Article

Mix-and-Interpolate: A Training Strategy to Deal With Source-Biased Medical Data

  • Yuexiang Li
  • Jiawei Chen
  • Dong Wei
  • Yanchun Zhu
  • Jianrong Wu
  • Junfeng Xiong
  • Yadong Gang
  • Wenbo Sun

As of March 31st, 2021, coronavirus disease 2019 (COVID-19) had reportedly infected more than 127 million people and caused over 2.5 million deaths worldwide. Timely diagnosis of COVID-19 is crucial for the management of individual patients as well as containment of the highly contagious disease. Having realized the clinical value of non-contrast chest computed tomography (CT) for diagnosing COVID-19, researchers have proposed deep learning (DL) based automated methods to aid radiologists in reading the huge quantities of CT exams generated during the pandemic. In this work, we address an overlooked problem in training deep convolutional neural networks for COVID-19 classification with real-world multi-source data, namely, the data source bias problem. It refers to the situation in which certain data sources comprise only a single class, and training with such source-biased data may make DL models learn to distinguish data sources instead of COVID-19. To overcome this problem, we propose MIx-aNd-Interpolate (MINI), a conceptually simple, easy-to-implement, efficient yet effective training strategy. The proposed MINI approach generates volumes of the absent class by combining samples collected from different hospitals, which enlarges the sample space of the original source-biased dataset. Experimental results on a large collection of real patient data (1,221 COVID-19 and 1,520 negative CT images, the latter consisting of 786 community-acquired pneumonia and 734 non-pneumonia cases) from eight hospitals and health institutions show that: 1) MINI can improve COVID-19 classification performance over the baseline (which does not deal with the source bias), and 2) MINI is superior to competing methods in terms of the extent of improvement.
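
MINI is described as combining samples from different hospitals to synthesize volumes of the class a biased source lacks. As a loose, mixup-style illustration of cross-source combination (an assumed scheme, not the exact MINI procedure):

```python
import numpy as np

def mix_and_interpolate(vol_a, vol_b, alpha=0.5):
    """Blend two CT volumes from different hospitals with a Beta-sampled coefficient."""
    lam = np.random.beta(alpha, alpha)
    return lam * vol_a + (1.0 - lam) * vol_b, lam

covid_from_hospital_a = np.random.rand(64, 128, 128)      # toy stand-ins for CT volumes
negative_from_hospital_b = np.random.rand(64, 128, 128)
mixed, lam = mix_and_interpolate(covid_from_hospital_a, negative_from_hospital_b)
# the mixed volume injects hospital-B appearance statistics into the COVID-19 class,
# discouraging the classifier from shortcutting on the data source
```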

JBHI Journal 2020 Journal Article

Efficient and Effective Training of COVID-19 Classification Networks With Self-Supervised Dual-Track Learning to Rank

  • Yuexiang Li
  • Dong Wei
  • Jiawei Chen
  • Shilei Cao
  • Hongyu Zhou
  • Yanchun Zhu
  • Jianrong Wu
  • Lan Lan

Coronavirus Disease 2019 (COVID-19) has spread rapidly worldwide since it was first reported. Timely diagnosis of COVID-19 is crucial both for disease control and patient care. Non-contrast thoracic computed tomography (CT) has been identified as an effective tool for the diagnosis, yet the disease outbreak has placed tremendous pressure on radiologists reading the exams and may potentially lead to fatigue-related mis-diagnosis. Reliable automatic classification algorithms can be very helpful; however, they usually require a considerable number of COVID-19 cases for training, which is difficult to acquire in a timely manner. Meanwhile, how to effectively utilize the existing archive of non-COVID-19 data (the negative samples) in the presence of severe class imbalance is another challenge. In addition, the sudden disease outbreak necessitates fast algorithm development. In this work, we propose a novel approach for effective and efficient training of COVID-19 classification networks using a small number of COVID-19 CT exams and an archive of negative samples. Concretely, a novel self-supervised learning method is proposed to extract features from the COVID-19 and negative samples. Then, two kinds of soft labels (‘difficulty’ and ‘diversity’) are generated for the negative samples by computing the earth mover's distances between the features of the negative and COVID-19 samples, from which the data ‘values’ of the negative samples can be assessed. A pre-set number of negative samples are selected accordingly and fed to the neural network for training. Experimental results show that our approach can achieve superior performance using about half of the negative samples, substantially reducing model training time.
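
The soft-label computation hinges on earth mover's distances between negative-sample features and the COVID-19 feature set. A toy per-dimension version with SciPy is shown below; the dimension-wise aggregation and the exact definitions of 'difficulty' and 'diversity' are assumptions for illustration.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def difficulty_scores(neg_feats, covid_feats):
    """For each negative sample, average the 1D earth mover's distances (per feature dimension)
    to the COVID-19 feature distribution; a smaller score means a more COVID-like, 'harder' negative."""
    scores = []
    for f in neg_feats:
        dims = [wasserstein_distance([f[d]], covid_feats[:, d]) for d in range(len(f))]
        scores.append(np.mean(dims))
    return np.asarray(scores)

neg = np.random.randn(100, 16)     # hypothetical self-supervised features of negative exams
pos = np.random.randn(30, 16)      # features of the COVID-19 exams
ranked = np.argsort(difficulty_scores(neg, pos))   # select a pre-set number of negatives from this ranking
```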

AAAI Conference 2020 Conference Paper

Generative Adversarial Networks for Video-to-Video Domain Adaptation

  • Jiawei Chen
  • Yuexiang Li
  • Kai Ma
  • Yefeng Zheng

Endoscopic videos from multiple centres often have different imaging conditions, e.g., color and illumination, which make models trained on one domain usually fail to generalize well to another. Domain adaptation is one potential solution to address this problem. However, few existing works have focused on the translation of video-based data. In this work, we propose a novel generative adversarial network (GAN), namely VideoGAN, to transfer video-based data across different domains. As the frames of a video share similar content and imaging conditions, the proposed VideoGAN has an X-shape generator to preserve intra-video consistency during translation. Furthermore, a loss function, namely the color histogram loss, is proposed to tune the color distribution of each translated frame. Two colonoscopic datasets from different centres, i.e., CVC-Clinic and ETIS-Larib, are adopted to evaluate the domain adaptation performance of our VideoGAN. Experimental results demonstrate that the adapted colonoscopic videos generated by our VideoGAN can significantly boost the segmentation accuracy of colorectal polyps, i.e., an improvement of 5%, on multicentre datasets. As our VideoGAN is a general network architecture, we also evaluate its performance on the cloudy-to-sunny translation task with the CamVid driving video dataset. Comprehensive experiments show that the domain gap can be substantially narrowed by our VideoGAN.
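
A color histogram loss of the kind described above can be approximated with a differentiable soft-binned histogram per channel, as in the PyTorch sketch below (bin count, kernel width, and the L1 comparison are assumed details, not VideoGAN's exact formulation):

```python
import torch

def soft_histogram(x, bins=64, sigma=0.02):
    """Differentiable per-channel colour histogram via Gaussian soft-binning; x is (C, H, W) in [0, 1]."""
    centers = torch.linspace(0.0, 1.0, bins, device=x.device)
    flat = x.flatten(1)                                                        # (C, H*W)
    weights = torch.exp(-0.5 * ((flat.unsqueeze(-1) - centers) / sigma) ** 2)  # (C, H*W, bins)
    hist = weights.sum(dim=1)
    return hist / hist.sum(dim=1, keepdim=True)

def color_histogram_loss(translated_frame, reference_frame):
    """L1 distance between the colour histograms of a translated frame and a reference frame."""
    return (soft_histogram(translated_frame) - soft_histogram(reference_frame)).abs().mean()

loss = color_histogram_loss(torch.rand(3, 128, 128), torch.rand(3, 128, 128))
```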