Arrow Research search

Author name cluster

Xiaofeng Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

28 papers
2 author rows

Possible papers

28

AAAI Conference 2026 Conference Paper

CoFact: Dynamic Coordination of Attention Heads for Improving Factual Consistency in LLMs

  • Shike Li
  • Xiaokai Wang
  • Xiaofeng Liu
  • Xin Tong
  • Hu Zhang

Large language models (LLMs) frequently generate fluent yet factually inaccurate content, a phenomenon known as hallucination. Recent inference-time approaches aim to improve truthfulness by steering model activations toward semantically meaningful directions. While effective to some extent, these methods typically process activations independently, neglecting the internal coordination structure of multi-head attention (MHA), where attention heads interact to form semantic representations. In this work, we propose CoFact, an adaptive inference-time mechanism that improves factual consistency by dynamically coordinating attention head behaviors. Inspired by cooperative game theory, CoFact conceptualizes attention heads as collaborative agents. It models the semantic utility and redundancy of each head and adaptively modulates their contributions to the final attention output. Notably, rather than directly altering intermediate representations, CoFact performs token-level coordination to encourage diverse and complementary attention patterns across heads. CoFact is plug-and-play compatible with mainstream LLM architectures and requires no additional supervision or model retraining. Experimental results across multiple standard factuality benchmarks demonstrate that CoFact consistently enhances factual accuracy while maintaining generation fluency.

AAAI Conference 2026 Conference Paper

Explainable Depression Assessment from Face Videos by Weakly Supervised Learning

  • Rongfan Liao
  • Xiangyu Kong
  • Shiqing Tang
  • Lang He
  • Changzeng Fu
  • Weicheng Xie
  • Xiaofeng Liu
  • Lu Liu

Existing video-based automatic depression assessment (ADA) approaches frequently achieve video-level depression assessment by aggregating features or predictions of individual frames or equal-length segments within the given video. While their performance has been largely enhanced by recent advanced deep learning models, they typically fail to explicitly consider the varied importance of depression-related behavioural cues across different video segments, i.e., segments within one video may contain behaviours reflecting varying levels of depression. Underestimating segment-level variations can obscure the detection of facial behaviour cues associated with depression, thereby undermining the accuracy and interpretability of video-based depression detection systems. In this paper, we propose a novel video-based ADA approach that specifically identifies and differentiates video segments that exhibit depression-related facial behaviours across varying temporal durations, providing clear insights into how each segment contributes to the video-level depression prediction. To achieve this, a novel weakly supervised strategy is proposed to compare segment-level behaviours with the video-level depression label, enabling the model to assign depression-relevant scores to video segments at multiple temporal scales and attend selectively to those most indicative of depressive states. Extensive experiments on the AVEC 2013 and AVEC 2014 face video depression datasets demonstrate the effectiveness of our approach.

JBHI Journal 2026 Journal Article

Variance Extrapolated Class-Imbalance-Aware Domain Adaptive Myocardial Segmentation in Multi-Sequence Cardiac MRI

  • Fangxu Xing
  • Xiaofeng Liu
  • Iman Aganj
  • Georges El Fakhri
  • Panki Kim
  • Byoung Wook Choi
  • Jonghye Woo

Fully automated myocardial segmentation from cardiac magnetic resonance imaging (MRI) is vital for efficient diagnosis and treatment planning. Although numerous automated methods have been proposed, they typically focus on single MRI sequences and therefore have difficulties in generalizing across vendors and across cardiac MRI protocols. Simultaneous analysis of complementary cardiac MRI sequences, such as cine, T1 mapping, and late gadolinium enhancement (LGE) MRI, remains challenging due to their distinct image characteristics and scanner-specific variations. To address these issues, we propose an unsupervised domain adaptation approach that allows robust myocardial segmentation across multi-vendor cine, T1, and LGE MRI data. In particular, we introduce a class-imbalance self-training framework to transfer information learned from a source domain with labels to any unlabeled target domain, while maintaining consistent performance across different MRI sequences. Our framework iteratively refines segmentation accuracy by generating pseudo-labels for target data using a hardness-aware strategy, thus effectively addressing the problem of class imbalance in cardiac MRI segmentation. To mitigate data scarcity following pseudo-label selection, we employ a variance-guided vicinal feature extrapolation, which expands data points in the feature space into a probabilistic distribution. This, in turn, facilitates joint source–target training by generating a larger intersection in the feature space. Experimental results demonstrate that our framework outperforms existing methods when assessed using the Dice coefficient and Hausdorff distance. Our framework enables cardiac evaluation across MRI protocols without sequence-specific manual annotations.

IROS Conference 2025 Conference Paper

In-situ Value-aligned Human-Robot Interactions with Physical Constraints

  • Hongtao Li
  • Ziyuan Jiao
  • Xiaofeng Liu
  • Hangxin Liu
  • Zilong Zheng

Equipped with Large Language Models (LLMs), human-centered robots are now capable of performing a wide range of tasks that were previously deemed challenging or unattainable. However, merely completing tasks is insufficient for cognitive robots, which should learn and apply human preferences to future scenarios. In this work, we propose a framework that combines human preferences with physical constraints, requiring robots to complete tasks while considering both. First, we developed a benchmark of everyday household activities that are often evaluated against specific preferences. We then introduced In-Context Learning from Human Feedback (ICLHF), where human feedback comes from direct instructions and adjustments made intentionally or unintentionally in daily life. Extensive sets of experiments, testing the ability of ICLHF to generate task plans and balance physical constraints with preferences, demonstrate the effectiveness of our approach.

ICLR Conference 2025 Conference Paper

Progressive Compositionality in Text-to-Image Generative Models

  • Xu Han
  • Linghao Jin
  • Xiaofeng Liu
  • Paul Pu Liang

Despite the impressive text-to-image (T2I) synthesis capabilities of diffusion models, they often struggle to understand compositional relationships between objects and attributes, especially in complex settings. Existing approaches that build compositional architectures or generate difficult negative captions often assume a fixed, prespecified compositional structure, which limits generalization to new distributions. In this paper, we argue that curriculum training is crucial to equipping generative models with a fundamental understanding of compositionality. To achieve this, we leverage large language models (LLMs) to automatically compose complex scenarios and harness Visual Question Answering (VQA) checkers to automatically curate a contrastive dataset, ConPair, consisting of 15k pairs of high-quality contrastive images. These pairs feature minimal visual discrepancies and cover a wide range of attribute categories, especially complex and natural scenarios. To learn effectively from these error cases (i.e., hard negative images), we propose EvoGen, a new multi-stage curriculum for contrastive learning of diffusion models. Through extensive experiments across a wide range of compositional scenarios, we showcase the effectiveness of our proposed framework on compositional T2I benchmarks.

NeurIPS Conference 2025 Conference Paper

Rethinking Evaluation of Infrared Small Target Detection

  • Youwei Pang
  • Xiaoqi Zhao
  • Lihe Zhang
  • Huchuan Lu
  • Georges Fakhri
  • Xiaofeng Liu
  • Shijian Lu

As an essential vision task, infrared small target detection (IRSTD) has seen significant advancements through deep learning. However, critical limitations in current evaluation protocols impede further progress. First, existing methods rely on fragmented pixel-level and target-level metrics, which fail to provide a comprehensive view of model capabilities. Second, an excessive emphasis on overall performance scores obscures crucial error analysis, which is vital for identifying failure modes and improving real-world system performance. Third, the field predominantly adopts dataset-specific training-testing paradigms, hindering the understanding of model robustness and generalization across diverse infrared scenarios. This paper addresses these issues by introducing a hybrid-level metric incorporating pixel- and target-level performance, proposing a systematic error analysis method, and emphasizing the importance of cross-dataset evaluation. Together, these aim to offer a more thorough and rational hierarchical analysis framework, ultimately fostering the development of more effective and robust IRSTD models. An open-source toolkit has been released to facilitate standardized benchmarking.

IJCAI Conference 2025 Conference Paper

Towards Robust Deterministic and Probabilistic Modeling for Predictive Learning

  • Xuesong Nie
  • Haoyuan Jin
  • Vijayakumar Bhagavatula
  • Xiaofeng Liu

Predictive modeling of unannotated spatiotemporal data presents inherent challenges, primarily due to the highly entangled visual dynamics in real-world scenes. To tackle these complexities, we introduce a novel insight through Disentangling Deterministic and Probabilistic (DDP) modeling. We note a key observation in spatiotemporal data where low-level details typically remain stable, whereas high-level motion frequently exhibits dynamic variations. The core motivation involves constructing two distinct pathways in the latent space: a deterministic path and a probabilistic path. The probabilistic path begins by defining the motion flow, which explicitly describes complex many-to-many motion patterns between patches, and models its probabilistic distribution using a motion diffuser. The deterministic path incorporates a spectral-aware enhancer to retain and amplify visual details in the frequency domain. These designs ensure visual consistency while also capturing intricate long-term motion dynamics. Extensive experiments demonstrate the superiority of DDP across diverse scenario evaluations.

NeurIPS Conference 2025 Conference Paper

UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation

  • Xiaoqi Zhao
  • Youwei Pang
  • Chenyang Yu
  • Lihe Zhang
  • Huchuan Lu
  • Shijian Lu
  • Georges Fakhri
  • Xiaofeng Liu

Multi-modal image segmentation faces real-world deployment challenges from incomplete/corrupted modalities degrading performance. While existing methods address training-inference modality gaps via specialized per-combination models, they introduce high deployment costs by requiring exhaustive model subsets and model-modality matching. In this work, we propose a unified modality-relax segmentation network (UniMRSeg) through hierarchical self-supervised compensation (HSSC). Our approach hierarchically bridges representation gaps between complete and incomplete modalities across input, feature, and output levels. First, we adopt modality reconstruction with hybrid shuffled-masking augmentation, encouraging the model to learn the intrinsic modality characteristics and generate meaningful representations for missing modalities through cross-modal fusion. Next, modality-invariant contrastive learning implicitly compensates for the feature-space distance between incomplete and complete modality pairs. Furthermore, the proposed lightweight reverse attention adapter explicitly compensates for the weak perceptual semantics in the frozen encoder. Finally, UniMRSeg is fine-tuned under a hybrid consistency constraint to ensure stable prediction under all modality combinations without large performance fluctuations. Without bells and whistles, UniMRSeg significantly outperforms state-of-the-art methods under diverse missing-modality scenarios on MRI-based brain tumor segmentation, RGB-D semantic segmentation, and RGB-D/T salient object segmentation. The code will be released at \url{https://github.com/Xiaoqi-Zhao-DLUT/UniMRSeg}.

JBHI Journal 2025 Journal Article

Unsupervised Domain Adaptation With Synchronized Self-Training for Cross-Domain Motor Imagery Recognition

  • Peiyin Chen
  • Xiaofeng Liu
  • Chao Ma
  • He Wang
  • Xiong Yang
  • Celso Grebogi
  • Xiao Gu
  • Zhongke Gao

Robust decoding performance is essential for the practical deployment of brain-computer interface (BCI) systems. Existing EEG decoding models often rely on large amounts of annotated data collected through specific experimental setups, which fail to address the heterogeneity of data distributions across different domains. This limitation hinders BCI systems from effectively managing the complexity and variability of real-world data. To overcome these challenges, we propose Synchronized Self-Training Domain Adaptation (SSTDA) for cross-domain motor imagery classification. Specifically, SSTDA leverages labeled signals from a source domain and applies self-training to unlabeled signals from a target domain, enabling the simultaneous training of a more robust classifier. The raw EEG signals are mapped into a latent space by a feature extractor for discriminative representation learning. A domain-shared latent space is then learned by optimizing the feature extractor with both source and target samples, using an easy-to-hard self-training process. We validate the method with extensive experiments on two public motor imagery datasets: Dataset IIa of BCI Competition IV and the High Gamma dataset. In the inter-subject task, our method achieves classification accuracies of 64.43% and 80.40%, respectively. It also outperforms existing methods in the inter-session task. Moreover, we developed a new six-class motor imagery dataset and achieved test accuracies of 77.09% and 80.18% across different datasets. All experimental results demonstrate that our SSTDA outperforms existing algorithms in inter-session, inter-subject, and inter-dataset validation protocols, highlighting its capability to learn discriminative, domain-invariant representations that enhance EEG decoding performance.

NeurIPS Conference 2025 Conference Paper

When and How Unlabeled Data Provably Improve In-Context Learning

  • Yingcong Li
  • Xiangyu Chang
  • Muti Kara
  • Xiaofeng Liu
  • Amit Roy-Chowdhury
  • Samet Oymak

Recent research shows that in-context learning (ICL) can be effective even when demonstrations have missing or incorrect labels. To shed light on this capability, we examine a canonical setting where the demonstrations are drawn according to a binary Gaussian mixture model (GMM) and a certain fraction of the demonstrations have missing labels. We provide a comprehensive theoretical study to show that: (1) one-layer linear attention models recover the optimal fully-supervised estimator but completely fail to exploit unlabeled data; (2) in contrast, multilayer or looped transformers can effectively leverage unlabeled data by implicitly constructing estimators of the form $\sum_{i\ge 0} a_i (X^\top X)^i X^\top y$, with $X$ and $y$ denoting features and partially-observed labels (with missing entries set to zero). We characterize the class of polynomials that can be expressed as a function of depth and draw connections to Expectation Maximization, an iterative pseudo-labeling algorithm commonly used in semi-supervised learning. Importantly, the leading polynomial power is exponential in depth, so a mild amount of depth/looping suffices. As an application of the theory, we propose looping off-the-shelf tabular foundation models to enhance their semi-supervision capabilities. Extensive evaluations on real-world datasets show that our method significantly improves semi-supervised tabular learning performance over standard single-pass inference.
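As an illustrative aside (not the paper's implementation): the estimator family $\sum_{i\ge 0} a_i (X^\top X)^i X^\top y$ quoted in the abstract can be evaluated with a simple Horner-style recurrence. The coefficients and the synthetic data below are hypothetical, chosen only to show the mechanics.

```python
import numpy as np

def polynomial_estimator(X, y, coeffs):
    """Compute w = sum_i a_i (X^T X)^i X^T y for given coefficients a_i.

    X: (n, d) feature matrix; y: (n,) labels with missing entries set to
    zero, as in the abstract; coeffs: list of a_i (hypothetical values).
    """
    G = X.T @ X            # (d, d) Gram matrix
    w = np.zeros(X.shape[1])
    term = X.T @ y         # (X^T X)^0 X^T y
    for a in coeffs:
        w += a * term
        term = G @ term    # advance to the next power of X^T X
    return w

# Tiny synthetic demo: 2-D Gaussian-mixture-like data, half the labels
# "missing" (zeroed out), as in the semi-supervised setting above.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
y = np.sign(X[:, 0])
y[4:] = 0.0                # unlabeled demonstrations
w = polynomial_estimator(X, y, coeffs=[1.0, -0.1, 0.01])
print(w.shape)             # (2,)
```

A deeper (or looped) model corresponds to a longer `coeffs` list, i.e., higher powers of $X^\top X$ in the implicit estimator.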

AAAI Conference 2024 Conference Paper

Label-Efficient Few-Shot Semantic Segmentation with Unsupervised Meta-Training

  • Jianwu Li
  • Kaiyue Shi
  • Guo-Sen Xie
  • Xiaofeng Liu
  • Jian Zhang
  • Tianfei Zhou

The goal of this paper is to alleviate the training cost of few-shot semantic segmentation (FSS) models. Although FSS by nature improves model generalization to new concepts using only a handful of test exemplars, it relies on strong supervision from a considerable amount of labeled training data for base classes. However, collecting pixel-level annotations is notoriously expensive and time-consuming, and small-scale training datasets convey low information density that limits test-time generalization. To resolve the issue, we take a pioneering step towards label-efficient training of FSS models from fully unlabeled training data, optionally augmented with a few labeled samples to enhance performance. This motivates an approach based on a novel unsupervised meta-training paradigm. In particular, the approach first distills pre-trained unsupervised pixel embeddings into compact semantic clusters, from which a massive number of pseudo meta-tasks is constructed. To mitigate the noise in the pseudo meta-tasks, we further advocate a robust Transformer-based FSS model with a novel prototype-based cross-attention design. Extensive experiments have been conducted on two standard benchmarks, i.e., PASCAL-5i and COCO-20i, and the results show that our method produces impressive performance without any annotations, and is comparable to fully supervised competitors even when using only 20% of the annotations. Our code is available at: https://github.com/SSSKYue/UMTFSS.

JBHI Journal 2022 Journal Article

Brain MR Atlas Construction Using Symmetric Deep Neural Inpainting

  • Fangxu Xing
  • Xiaofeng Liu
  • C.-C. Jay Kuo
  • Georges El Fakhri
  • Jonghye Woo

Modeling statistical properties of anatomical structures using magnetic resonance imaging is essential for revealing common information of a target population and unique properties of specific subjects. In brain imaging, a statistical brain atlas is often constructed using a number of healthy subjects. When tumors are present, however, it is difficult to either provide a common space for various subjects or align their imaging data due to the unpredictable distribution of lesions. Here we propose a deep learning-based image inpainting method to replace the tumor regions with normal tissue intensities using only a patient population. Our framework has three major innovations: 1) incompletely distributed datasets with random tumor locations can be used for training; 2) irregularly-shaped tumor regions are properly learned, identified, and corrected; and 3) a symmetry constraint between the two brain hemispheres is applied to regularize inpainted regions. Henceforth, regular atlas construction and image registration methods can be applied using inpainted data to obtain tissue deformation, thereby achieving group-specific statistical atlases and patient-to-atlas registration. Our framework was tested using the public database from the Multimodal Brain Tumor Segmentation challenge. Results showed increased similarity scores as well as reduced reconstruction errors compared with three existing image inpainting methods. Patient-to-atlas registration also yielded better results with improved normalized cross-correlation and mutual information and a reduced amount of deformation over the tumor regions.

JBHI Journal 2022 Journal Article

Interpreting Depression From Question-Wise Long-Term Video Recording of SDS Evaluation

  • Wanqing Xie
  • Lizhong Liang
  • Yao Lu
  • Chen Wang
  • Jihong Shen
  • Hui Luo
  • Xiaofeng Liu

The Self-Rating Depression Scale (SDS) questionnaire has frequently been used for efficient preliminary depression screening. However, this uncontrolled self-administered measure can easily be affected by careless or deceptive answering, producing results that differ from the clinician-administered Hamilton Depression Rating Scale (HDRS) and the final diagnosis. Clinically, facial expressions (FEs) and actions play a vital role in clinician-administered evaluation, yet they remain underexplored for self-administered evaluations. In this work, we collect a novel dataset of 200 subjects to assess the validity of self-rating questionnaires using their corresponding question-wise video recordings. To automatically interpret depression from the SDS evaluation and the paired video, we propose an end-to-end hierarchical framework for long-term, variable-length video, which is also conditioned on the questionnaire results and the answering time. Specifically, we resort to a hierarchical model that utilizes a 3D CNN for local temporal pattern exploration and a redundancy-aware self-attention (RAS) scheme for question-wise global feature aggregation. Targeting redundant long-term FE video processing, our RAS is able to effectively exploit the correlations of each video clip within a question set to emphasize the discriminative information and eliminate redundancy based on pair-wise feature affinity. The question-wise video feature is then concatenated with the questionnaire scores for final depression detection. Our thorough evaluations show the validity of fusing the SDS evaluation with its video recording, and the superiority of our framework over conventional state-of-the-art temporal modeling methods.

JBHI Journal 2022 Journal Article

VoxelHop: Successive Subspace Learning for ALS Disease Classification Using Structural MRI

  • Xiaofeng Liu
  • Fangxu Xing
  • Chao Yang
  • Chung-Chieh Jay Kuo
  • Suma Babu
  • Georges El Fakhri
  • Thomas Jenkins
  • Jonghye Woo

Deep learning has great potential for accurate detection and classification of diseases with medical imaging data, but the performance is often limited by the number of training datasets and memory requirements. In addition, many deep learning models are considered a “black box,” thereby often limiting their adoption in clinical applications. To address this, we present a successive subspace learning model, termed VoxelHop, for accurate classification of Amyotrophic Lateral Sclerosis (ALS) using T2-weighted structural MRI data. Compared with popular convolutional neural network (CNN) architectures, VoxelHop has modular and transparent structures with fewer parameters and no backpropagation, so it is well-suited to small dataset sizes and 3D imaging data. Our VoxelHop has four key components, including (1) sequential expansion of near-to-far neighborhoods for multi-channel 3D data; (2) subspace approximation for unsupervised dimension reduction; (3) label-assisted regression for supervised dimension reduction; and (4) concatenation of features and classification between controls and patients. Our experimental results demonstrate that our framework, using a total of 20 controls and 26 patients, achieves an accuracy of 93.48% and an AUC score of 0.9394 in differentiating patients from controls, even with a relatively small number of datasets, showing its robustness and effectiveness. Our thorough evaluations also show its validity and superiority over state-of-the-art 3D CNN classification approaches. Our framework can easily be generalized to other classification tasks using different imaging modalities.

JBHI Journal 2021 Journal Article

A Hierarchical Graph Convolution Network for Representation Learning of Gene Expression Data

  • Kaiwen Tan
  • Weixian Huang
  • Xiaofeng Liu
  • Jinlong Hu
  • Shoubin Dong

The curse of dimensionality, caused by high dimensionality and low sample size, is a major challenge in gene expression data analysis. However, the real situation is even worse: labelling data is laborious and time-consuming, so only a small part of the limited samples will be labelled. Having so few labelled samples further increases the difficulty of training deep learning models. Interpretability is an important requirement in biomedicine. Many existing deep learning methods try to provide interpretability, but rarely apply to gene expression data. Recent semi-supervised graph convolution network methods try to address these problems by smoothing the label information over a graph. However, to the best of our knowledge, these methods only utilize graphs in either the feature space or the sample space, which restricts their performance. We propose a transductive semi-supervised representation learning method called a hierarchical graph convolution network (HiGCN) to aggregate the information of gene expression data in both feature and sample spaces. HiGCN first utilizes external knowledge to construct a feature graph and a similarity kernel to construct a sample graph. Then, two spatial-based GCNs are used to aggregate information on these graphs. To validate the model's performance, synthetic and real datasets are provided to lend empirical support. Compared with two recent models and three traditional models, HiGCN learns better representations of gene expression data, and these representations improve the performance of downstream tasks, especially when the model is trained on a few labelled samples. Important features can be extracted from our model to provide reliable interpretability.

AAAI Conference 2021 Conference Paper

Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models

  • Tong Che
  • Xiaofeng Liu
  • Site Li
  • Yubin Ge
  • Ruixiang Zhang
  • Caiming Xiong
  • Yoshua Bengio

AI safety is a major concern in many deep learning applications such as autonomous driving. Given a trained deep learning model, an important natural problem is how to reliably verify the model's predictions. In this paper, we propose a novel framework, deep verifier networks (DVN), to detect unreliable inputs or predictions of deep discriminative models, using separately trained deep generative models. Our proposed model is based on concise conditional variational auto-encoders with disentanglement constraints to separate the label information from the latent representation. We give both intuitive and theoretical justifications for the model. Our verifier network is trained independently of the prediction model, which eliminates the need to retrain the verifier network for a new model. We test the verifier network on both out-of-distribution detection and adversarial example detection problems, as well as anomaly detection in structured prediction tasks such as image caption generation. We achieve state-of-the-art results on all of these problems.

IJCAI Conference 2021 Conference Paper

Domain Generalization under Conditional and Label Shifts via Variational Bayesian Inference

  • Xiaofeng Liu
  • Bo Hu
  • Linghao Jin
  • Xu Han
  • Fangxu Xing
  • Jinsong Ouyang
  • Jun Lu
  • Georges El Fakhri

In this work, we propose a domain generalization (DG) approach to learn on several labeled source domains and transfer knowledge to a target domain that is inaccessible in training. Considering the inherent conditional and label shifts, we would expect the alignment of p(x|y) and p(y). However, the widely used domain-invariant feature learning (IFL) methods rely on aligning the marginal concept shift w.r.t. p(x), which rests on the unrealistic assumption that p(y) is invariant across domains. We thereby propose a novel variational Bayesian inference framework to enforce the conditional distribution alignment w.r.t. p(x|y) via prior distribution matching in a latent space, which also takes the marginal label shift w.r.t. p(y) into consideration with the posterior alignment. Extensive experiments on various benchmarks demonstrate that our framework is robust to label shift and that the cross-domain accuracy is significantly improved, thereby achieving superior performance over conventional IFL counterparts.

IROS Conference 2021 Conference Paper

PNS: Population-Guided Novelty Search for Reinforcement Learning in Hard Exploration Environments

  • Qihao Liu
  • Yujia Wang
  • Xiaofeng Liu

Reinforcement Learning (RL) has made remarkable achievements, but it still suffers from inadequate exploration strategies, sparse reward signals, and deceptive reward functions. To alleviate these problems, a Population-guided Novelty Search (PNS) parallel learning method is proposed in this paper. In PNS, the population is divided into multiple sub-populations, each of which has one chief agent and several exploring agents. The chief agent evaluates the policies learned by the exploring agents and shares the optimal policy with all sub-populations. The exploring agents learn their policies collaboratively under the guidance of the optimal policy and, simultaneously, upload their policies to the chief agent. To balance exploration and exploitation, Novelty Search (NS) is employed in every chief agent to encourage policies with high novelty while maximizing per-episode performance. We apply PNS to the twin delayed deep deterministic (TD3) policy gradient algorithm. The effectiveness of PNS in promoting exploration and improving performance in continuous control domains is demonstrated in our experiments. Notably, PNS-TD3 achieves rewards that far exceed SOTA methods in environments with sparse or delayed reward signals. We also demonstrate that PNS enables robotic agents to learn control policies directly from pixels for sparse-reward manipulation in both simulated and real-world settings.

AAAI Conference 2021 Conference Paper

Subtype-aware Unsupervised Domain Adaptation for Medical Diagnosis

  • Xiaofeng Liu
  • Xiongchang Liu
  • Bo Hu
  • Wenxuan Ji
  • Fangxu Xing
  • Jun Lu
  • Jane You
  • C.-C. Jay Kuo

Recent advances in unsupervised domain adaptation (UDA) show that transferable prototypical learning presents a powerful means for class-conditional alignment, which encourages the closeness of cross-domain class centroids. However, cross-domain inner-class compactness and the underlying fine-grained subtype structure remain largely underexplored. In this work, we propose to adaptively carry out fine-grained subtype-aware alignment by explicitly enforcing class-wise separation and subtype-wise compactness with intermediate pseudo-labels. Our key insight is that the unlabeled subtypes of a class can be divergent from one another, with different conditional and label shifts, while inheriting the local proximity within a subtype. Cases with and without prior information on subtype numbers are investigated to discover the underlying subtype structure in an online fashion. The proposed subtype-aware dynamic UDA achieves promising results on a medical diagnosis task.

AAAI Conference 2020 Conference Paper

Importance-Aware Semantic Segmentation in Self-Driving with Discrete Wasserstein Training

  • Xiaofeng Liu
  • Yuzhuo Han
  • Song Bai
  • Yi Ge
  • Tianxing Wang
  • Xu Han
  • Site Li
  • Jane You

Semantic segmentation (SS) is an important perception capability for self-driving cars and robotics, classifying each pixel into a pre-determined class. Widely used cross-entropy (CE) loss-based deep networks have achieved significant progress w.r.t. the mean Intersection-over-Union (mIoU). However, the cross-entropy loss cannot take the differing importance of each class in a self-driving system into account. For example, pedestrians in the image should be much more important than the surrounding buildings when making driving decisions, so their segmentation results are expected to be as accurate as possible. In this paper, we propose to incorporate importance-aware inter-class correlation in a Wasserstein training framework by configuring its ground distance matrix. The ground distance matrix can be pre-defined a priori for a specific task, and previous importance-ignored methods become particular cases of it. From an optimization perspective, we also extend our ground metric to a linear, convex, or concave increasing function w.r.t. the pre-defined ground distance. We evaluate our method on the CamVid and Cityscapes datasets with different backbones (SegNet, ENet, FCN, and DeepLab) in a plug-and-play fashion. In our extensive experiments, the Wasserstein loss demonstrates superior segmentation performance on the predefined critical classes for safe driving.
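As a hedged illustration of the idea described in this abstract (not the paper's code): when the target is a one-hot class t, the discrete Wasserstein distance under a ground distance matrix D reduces to the expectation of D[t, j] under the predicted distribution, so class importance can be encoded directly in D. The class set and distance values below are hypothetical.

```python
import numpy as np

def onehot_wasserstein_loss(probs, target, D):
    """Wasserstein loss against a one-hot target: E_{j ~ probs}[ D[target, j] ].

    probs: (C,) predicted class distribution for one pixel;
    target: int ground-truth class index;
    D: (C, C) pre-defined ground distance matrix encoding class importance.
    """
    return float(probs @ D[target])

# Hypothetical 3-class example {road, building, pedestrian}: confusing a
# pedestrian with anything else is penalized far more than road/building
# mix-ups, mimicking an importance-aware ground metric.
D = np.array([
    [0.0, 1.0, 5.0],   # road
    [1.0, 0.0, 5.0],   # building
    [5.0, 5.0, 0.0],   # pedestrian
])
probs = np.array([0.1, 0.1, 0.8])  # prediction for a pedestrian pixel
loss = onehot_wasserstein_loss(probs, target=2, D=D)
print(loss)  # 0.1*5 + 0.1*5 + 0.8*0 = 1.0
```

Setting all off-diagonal entries of D to 1 recovers an importance-ignored loss, matching the abstract's remark that previous methods are particular cases.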