Author name cluster

Hui Xue

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

27 papers

1 author row

AAAI Conference 2026 Conference Paper

Adaptive Hyperbolic Kernels: Modulated Embedding in de Branges-Rovnyak Spaces

Leping Si
Meimei Yang
Hui Xue
Shipeng Zhu
Pengfei Fang

Hierarchical data pervades diverse machine learning applications, including natural language processing, computer vision, and social network analysis. Hyperbolic space, characterized by its negative curvature, has demonstrated strong potential in such tasks due to its capacity to embed hierarchical structures with minimal distortion. Previous evidence indicates that the hyperbolic representation capacity can be further enhanced through kernel methods. However, existing hyperbolic kernels still suffer from mild geometric distortion or lack adaptability. This paper addresses these issues by introducing a curvature-aware de Branges–Rovnyak space, a reproducing kernel Hilbert space (RKHS) that is isometric to a Poincaré ball. We design an adjustable multiplier to select the appropriate RKHS corresponding to the hyperbolic space with any curvature adaptively. Building on this foundation, we further construct a family of adaptive hyperbolic kernels, including the novel adaptive hyperbolic radial kernel, whose learnable parameters modulate hyperbolic features in a task-aware manner. Extensive experiments on visual and language benchmarks demonstrate that our proposed kernels outperform existing hyperbolic kernels in modeling hierarchical dependencies.

AAAI Conference 2026 Conference Paper

HAP: Harmonized Amplitude Perturbation for Cross-Domain Few-Shot Learning

Wenqian Li
Pengfei Fang
Hui Xue

Cross-Domain Few-Shot Learning (CD-FSL) remains a significant challenge due to substantial distribution shifts between source and target domains. While prior approaches primarily focus on spatial alignment, they often overlook discrepancies in the frequency domain. In this paper, we reveal frequency band discretization as a key phenomenon, characterized by intra-domain low-frequency dominance, inter-domain amplitude divergence, and limited high-frequency variation. This spectral disharmony biases models toward low-frequency components, leading to spectral collapse. We quantify spectral collapse via the effective rank, a principled measure of spectral diversity. To mitigate spectral collapse, we propose Harmonized Amplitude Perturbation (HAP), a frequency-domain augmentation strategy that perturbs the amplitude spectrum via frequency-aware gains sampled from Harmonized Distributions, while fixing the phase spectrum to maintain semantic integrity. Extensive experiments on both Cross-Domain Few-Shot Image Classification and Object Detection benchmarks demonstrate that HAP effectively increases spectral diversity and consistently improves generalization, outperforming state-of-the-art methods without introducing extra model complexity.
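
The two ingredients named in the abstract above, amplitude perturbation with a fixed phase spectrum and the effective rank, can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D signal, the uniform gain range (a stand-in for the paper's Harmonized Distributions), and the function names are all assumptions, and the effective rank is taken here as the exponential of the Shannon entropy of the normalized singular values, which may differ from the paper's exact definition.

```python
import cmath
import math
import random


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); fine for short signals."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def perturb_amplitude(signal, gains):
    """Scale each frequency's amplitude by its gain while keeping its phase."""
    spectrum = dft(signal)
    perturbed = [
        cmath.rect(abs(c) * g, cmath.phase(c)) for c, g in zip(spectrum, gains)
    ]
    return idft(perturbed)


def effective_rank(singular_values):
    """exp(Shannon entropy) of the normalized singular values (assumed definition)."""
    total = sum(singular_values)
    probs = [s / total for s in singular_values if s > 0]
    return math.exp(-sum(p * math.log(p) for p in probs))


# Hypothetical example: a short real signal with two frequency components.
rng = random.Random(0)
n = 8
signal = [
    math.sin(2 * math.pi * t / n) + 0.5 * math.cos(2 * math.pi * 3 * t / n)
    for t in range(n)
]
# Gains are mirrored (gains[k] == gains[n - k]) so the output stays real-valued.
half = [rng.uniform(0.5, 1.5) for _ in range(n // 2 + 1)]
gains = [half[min(k, n - k)] for k in range(n)]
augmented = perturb_amplitude(signal, gains)
```

Because only the magnitudes are rescaled, each frequency bin of the augmented signal equals the original bin multiplied by its gain, so the phase is unchanged. For images the same idea would apply to a 2-D transform.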

AAAI Conference 2026 Conference Paper

Intra-Image Mining and Symmetric Maximum Concept Matching for Few Shot Out-of-Distribution Detection

Kaixiang Chen
Pengfei Fang
Hui Xue

Recent vision-language model (VLM)-based methods have achieved promising results in zero-shot out-of-distribution (OOD) detection by effectively leveraging the local patch features. However, the zero-shot nature inherently comes with two limitations: 1) imperfect local feature prototypes; 2) lack of OOD prototypes. In this paper, we propose Intra-Image Mining (IIM), a lightweight framework designed to overcome these limitations in a few-shot manner. IIM is motivated by the fact that local patches within an image often exhibit diverse semantics, with some patches deviating from the main class concept. Therefore, for each image, we first select the top-k class prototype-related patches as positive samples and leverage them to refine and optimize the local feature prototype. Then, the next top-k among the remaining patches are selected as negatives, serving as OOD signals to construct OOD prototypes. This process yields coherent local positives and challenging negatives, effectively enhancing the model’s local feature discrimination. Besides, we propose a novel inference strategy named Symmetric Maximum Concept Matching (S-MCM). While existing approaches typically adopt an image-to-text scheme that compares the image features to textual class prototypes, S-MCM further incorporates a text-to-image perspective, leading to more reliable OOD detection. We also propose two benchmarks to analyze the impact of semantic diversity within the ID dataset. Built on a frozen VLM, IIM, in conjunction with S-MCM, achieves consistent gains in OOD detection on ImageNet-1k and other benchmarks, outperforming prior methods in FPR95 and AUROC across various few-shot settings.
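
The patch-mining step described above (top-k patches as positives, the next top-k as negatives) can be sketched as follows. This is an illustration only: ranking by cosine similarity to a single class prototype is an assumption, and `cosine` and `mine_patches` are hypothetical names, not the paper's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mine_patches(patch_features, class_prototype, k):
    """Return (positive_indices, negative_indices) for one image.

    Positives are the k patches most similar to the class prototype;
    negatives are the next k patches in the same ranking.
    """
    ranked = sorted(
        range(len(patch_features)),
        key=lambda i: cosine(patch_features[i], class_prototype),
        reverse=True,
    )
    return ranked[:k], ranked[k:2 * k]
```

With five toy patches and k = 2, the two patches closest to the prototype become positives and the third and fourth in the ranking become negatives; the least related patch is left unused.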

AAAI Conference 2025 Conference Paper

Exploring the Relationship Between Samples and Masks for Robust Defect Localization

Jiang Lin
Hui Xue
Fanxiu Sun
Yaping Yan

Defect detection aims to detect and localize regions that fall outside the normal distribution. Previous approaches often explicitly incorporate the defect detection concept, such as by utilizing self-supervised ground truth or manually defined feature comparison. These processes involve modeling the distribution of normal samples, and they rely on the modeled normality for accurate inference. This reliance may hinder their ability to generalize to unseen test scenarios or to test sets that deviate from the training distribution. In this paper, we propose a one-stage framework that detects defective patterns directly without the modeling process. This ability is achieved through the joint effort of three components: a generative adversarial network (GAN), a newly proposed scaled pattern loss, and a dynamic correction mechanism that allows the network to self-correct. In training, explicit information that could indicate the position of defects is intentionally excluded to avoid learning any direct mapping. Experimental results show that the proposed method outperforms the previous SOTA methods in various test scenarios.

NeurIPS Conference 2025 Conference Paper

Multi-Modal Interactive Agent Layer for Few-Shot Universal Cross-Domain Retrieval and Beyond

Kaixiang Chen
Pengfei Fang
Hui Xue

This paper is the first to address the challenge of few-shot universal cross-domain retrieval (FS-UCDR), enabling machines trained with limited data to generalize to novel retrieval scenarios, with queries from entirely unknown domains and categories. To achieve this, we first formally define the FS-UCDR task and propose the Multi-Modal Interactive Agent Layer (MAIL), which enhances the cross-modal interaction in vision-language models (VLMs) by aligning the parameter updates of target layer pairs across modalities. Specifically, MAIL freezes the selected target layer pair and introduces a trainable agent layer pair to approximate localized parameter updates. A bridge function is then introduced to couple the agent layer pair, enabling gradient communication across modalities to facilitate update alignment. The proposed MAIL offers four key advantages: 1) its cross-modal interaction mechanism improves knowledge acquisition from limited data, making it highly effective in low-data scenarios; 2) during inference, MAIL integrates seamlessly into the VLM via reparameterization, preserving inference complexity; 3) extensive experiments validate the superiority of MAIL, which achieves substantial performance gains over data-efficient UCDR methods while requiring significantly fewer training samples; 4) beyond UCDR, MAIL also performs competitively on few-shot classification tasks, underscoring its strong generalization ability.

AAAI Conference 2025 Conference Paper

PEARL: Input-Agnostic Prompt Enhancement with Negative Feedback Regulation for Class-Incremental Learning

Yongchun Qin
Pengfei Fang
Hui Xue

Class-incremental learning (CIL) aims to continuously introduce novel categories into a classification system without forgetting previously learned ones, thus adapting to evolving data distributions. Researchers are currently focusing on leveraging the rich semantic information of pre-trained models (PTMs) in CIL tasks. Prompt learning has been adopted in CIL for its ability to adjust data distribution to better align with pre-trained knowledge. This paper critically examines the limitations of existing methods from the perspective of prompt learning, which heavily rely on input information. To address this issue, we propose a novel PTM-based CIL method called Input-Agnostic Prompt Enhancement with NegAtive Feedback ReguLation (PEARL). In PEARL, we implement an input-agnostic global prompt coupled with an adaptive momentum update strategy to reduce the model's dependency on data distribution, thereby effectively mitigating catastrophic forgetting. Guided by negative feedback regulation, this adaptive momentum update addresses the parameter sensitivity inherent in fixed-weight momentum updates. Furthermore, it fosters the continuous enhancement of the prompt for new tasks by harnessing correlations between different tasks in CIL. Experiments on six benchmarks demonstrate that our method achieves state-of-the-art performance.

NeurIPS Conference 2025 Conference Paper

Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models

Chenrui Cao
Liangcheng Song
Zenan Li
Xinyi Le
Xian Zhang
Hui Xue
Fan Yang

Recent advancements, such as DeepSeek-Prover-V2-671B and Kimina-Prover-Preview-72B, demonstrate a prevailing trend in leveraging reinforcement learning (RL)-based large-scale training for automated theorem proving. Surprisingly, we discover that even without any training, careful neuro-symbolic coordination of existing off-the-shelf reasoning models and tactic step provers can achieve comparable performance. This paper introduces DSP+, an improved version of the Draft, Sketch, and Prove framework, featuring a fine-grained and integrated neuro-symbolic enhancement for each phase: (1) In the draft phase, we prompt reasoning models to generate concise natural-language subgoals to benefit the sketch phase, removing thinking tokens and references to human-written proofs; (2) In the sketch phase, subgoals are autoformalized with hypotheses to benefit the proving phase, and sketch lines containing syntactic errors are masked according to predefined rules; (3) In the proving phase, we tightly integrate symbolic search methods like Aesop with step provers to establish proofs for the sketch subgoals. Experimental results show that, without any additional model training or fine-tuning, DSP+ solves 80.7%, 32.8%, and 24 out of 644 problems from miniF2F, ProofNet, and PutnamBench, respectively, while requiring a smaller budget than state-of-the-art methods. DSP+ proves imo_2019_p1, an IMO problem in miniF2F that is not solved by any prior work. Additionally, DSP+ generates proof patterns comprehensible by human experts, facilitating the identification of formalization errors; for example, eight wrongly formalized statements in miniF2F are discovered. Our results highlight the potential of classical reasoning patterns besides the RL-based training. All components will be open-sourced.

AAAI Conference 2025 Conference Paper

SVasP: Self-Versatility Adversarial Style Perturbation for Cross-Domain Few-Shot Learning

Wenqian Li
Pengfei Fang
Hui Xue

Cross-Domain Few-Shot Learning (CD-FSL) aims to transfer knowledge from seen source domains to unseen target domains, which is crucial for evaluating the generalization and robustness of models. Recent studies focus on utilizing visual styles to bridge the gap between domains. However, these style-based CD-FSL methods suffer from gradient instability and convergence to poor local optima. This paper addresses these issues and proposes a novel crop-global style perturbation method, called Self-Versatility Adversarial Style Perturbation (SVasP), which jointly enhances gradient stability and escapes from poor sharp minima. Specifically, SVasP simulates more diverse potential target domain adversarial styles via diversifying input patterns and aggregating localized crop style gradients, to serve as global style perturbation stabilizers within one image, a concept we refer to as self-versatility. Then a novel objective function is proposed to maximize visual discrepancy while maintaining semantic consistency between global, crop, and adversarial features. With the stabilized global style perturbation in the training phase, one can obtain flattened minima in the loss landscape, boosting the transferability of the model to the target domains. Extensive experiments on multiple benchmark datasets demonstrate that our method significantly outperforms existing state-of-the-art methods.

AIIM Journal 2024 Journal Article

Dynamic functional connections analysis with spectral learning for brain disorder detection

Yanfang Xue
Hui Xue
Pengfei Fang
Shipeng Zhu
Lishan Qiao
Yuexuan An

Dynamic functional connections (dFCs) can reveal neural activities, providing an insightful way of mining the temporal patterns within the human brain and further detecting brain disorders. However, most existing studies focus on dFC estimation and identify brain disorders with shallow temporal features and methods, which cannot capture the inherent temporal patterns of dFCs effectively. To address this problem, this study proposes a novel method, named dynamic functional connections analysis with spectral learning (dCSL), to explore the inherent temporal patterns of dFCs and further detect brain disorders. Concretely, dCSL includes two components: a dFC estimation module and a dFC analysis module. In the former, dFCs are estimated via the sliding window technique. In the latter, the spectral kernel mapping is first constructed by combining the Fourier transform with the non-stationary kernel. Subsequently, the spectral kernel mapping is stacked into a deep kernel network to explore higher-order temporal patterns of dFCs through spectral learning. The proposed dCSL, sharing the benefits of deep architecture and non-stationary kernel, can not only calculate the long-range relationship but also explore the higher-order temporal patterns of dFCs. To evaluate the proposed method, a set of brain disorder classification tasks is conducted on several public datasets. As a result, the proposed dCSL achieves a 5% accuracy improvement compared with the widely used approaches for analyzing sequence data and a 1.3% accuracy improvement compared with the state-of-the-art methods for dFCs. In addition, the discriminative brain regions are explored in the ASD detection task. The findings in this study are consistent with the clinical performance in ASD.
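
The sliding-window estimation step mentioned above can be sketched as follows. The abstract does not state which connectivity measure is used inside each window; Pearson correlation is assumed here, and the window length, step size, and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def sliding_window_dfc(series, window, step):
    """Estimate one region-by-region correlation matrix per window position.

    series: list of per-region time series, all of the same length.
    """
    length = len(series[0])
    matrices = []
    for start in range(0, length - window + 1, step):
        segment = [s[start:start + window] for s in series]
        matrices.append([[pearson(a, b) for b in segment] for a in segment])
    return matrices
```

The result is a sequence of correlation matrices, one per window, which is the kind of time-indexed input a downstream temporal model would consume.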

YNIMG Journal 2024 Journal Article

Homotopic functional connectivity disruptions in schizophrenia and their associated gene expression

Mengjing Cai
Yuan Ji
Qiyu Zhao
Hui Xue
Zuhao Sun
He Wang
Yijing Zhang
Yayuan Chen

It has been revealed that abnormal voxel-mirrored homotopic connectivity (VMHC) is present in patients with schizophrenia, yet there are inconsistencies in the relevant findings. Moreover, little is known about their association with brain gene expression profiles. In this study, transcription-neuroimaging association analyses using gene expression data from Allen Human Brain Atlas and case-control VMHC differences from both the discovery (meta-analysis, including 9 studies with a total of 386 patients and 357 controls) and replication (separate group-level comparisons within two datasets, including a total of 258 patients and 287 controls) phases were performed to identify genes associated with VMHC alterations. Enrichment analyses were conducted to characterize the biological functions and specific expression of identified genes, and Neurosynth decoding analysis was performed to examine the correlation between cognitive-related processes and VMHC alterations in schizophrenia. In the discovery and replication phases, patients with schizophrenia exhibited consistent VMHC changes compared to controls, which were correlated with a series of cognitive-related processes; meta-regression analysis revealed that illness duration was negatively correlated with VMHC abnormalities in the cerebellum and postcentral/precentral gyrus. The abnormal VMHC patterns were stably correlated with 1287 genes enriched for fundamental biological processes like regulation of cell communication, nervous system development, and cell communication. In addition, these genes were overexpressed in astrocytes and immune cells, enriched in extensive cortical regions and wide developmental time windows. The present findings may contribute to a more comprehensive understanding of the molecular mechanisms underlying VMHC alterations in patients with schizophrenia.

AAAI Conference 2024 Conference Paper

Text Image Inpainting via Global Structure-Guided Diffusion Models

Shipeng Zhu
Pengfei Fang
Chenjie Zhu
Zuoyan Zhao
Qiang Xu
Hui Xue

Real-world text can be damaged by corrosion issues caused by environmental or human factors, which hinder the preservation of the complete styles of texts, e.g., texture and structure. These corrosion issues, such as graffiti signs and incomplete signatures, bring difficulties in understanding the texts, thereby posing significant challenges to downstream applications, e.g., scene text recognition and signature identification. Notably, current inpainting techniques often fail to adequately address this problem and have difficulties restoring accurate text images along with reasonable and consistent styles. Formulating this as an open problem of text image inpainting, this paper aims to build a benchmark to facilitate its study. In doing so, we establish two specific text inpainting datasets which contain scene text images and handwritten text images, respectively. Each of them includes images revamped by real-life and synthetic datasets, featuring pairs of original images, corrupted images, and other assistant information. On top of the datasets, we further develop a novel neural framework, Global Structure-guided Diffusion Model (GSDM), as a potential solution. Leveraging the global structure of the text as a prior, the proposed GSDM develops an efficient diffusion model to recover clean texts. The efficacy of our approach is demonstrated by thorough empirical study, including a substantial boost in both recognition accuracy and image quality. These findings not only highlight the effectiveness of our method but also underscore its potential to enhance the broader field of text image understanding and processing. Code and datasets are available at: https://github.com/blackprotoss/GSDM.

IJCAI Conference 2023 Conference Paper

Boosting Few-Shot Open-Set Recognition with Multi-Relation Margin Loss

Yongjuan Che
Yuexuan An
Hui Xue

Few-shot open-set recognition (FSOSR) has become a great challenge, which requires classifying known classes and rejecting the unknown ones with only limited samples. Existing FSOSR methods mainly construct an ambiguous distribution of known classes from scarce known samples without considering the latent distribution information of unknowns, which degrades the performance of open-set recognition. To address this issue, we propose a novel loss function called multi-relation margin (MRM) loss that can be plugged into few-shot methods to boost the performance of FSOSR. MRM enlarges the margin between different classes by extracting the multi-relationship of paired samples to dynamically refine the decision boundary for known classes and implicitly delineate the distribution of unknowns. Specifically, MRM separates the classes by enforcing a margin while concentrating samples of the same class on a hypersphere with a learnable radius. In order to better capture the distribution information of each class, MRM extracts the similarity and correlations among paired samples, ameliorating the optimization of the margin and radius. Experiments on public benchmarks reveal that methods with MRM loss can improve the AUROC of unknown detection by a significant margin while correctly classifying the known classes.

NeurIPS Conference 2023 Conference Paper

CosNet: A Generalized Spectral Kernel Network

Yanfang Xue
Pengfei Fang
Jinyue Tian
Shipeng Zhu
Hui Xue

Complex-valued representation exists inherently in the time-sequential data that can be derived from the integration of harmonic waves. The non-stationary spectral kernel, realizing a complex-valued feature mapping, has shown its potential to analyze the time-varying statistical characteristics of the time-sequential data, as a result of modeling the frequency parameters. However, most existing spectral kernel-based methods eliminate the imaginary part, thereby limiting the representation power of the spectral kernel. To tackle this issue, we propose a generalized spectral kernel network, namely, Complex-valued spectral kernel Network (CosNet), which includes a spectral kernel mapping generalization (SKMG) module and a complex-valued spectral kernel embedding (CSKE) module. Concretely, the SKMG module is devised to generalize the spectral kernel mapping in the real number domain to the complex number domain, recovering the inherent complex-valued representation for the real-valued data. Then a following CSKE module is further developed to combine the complex-valued spectral kernels and neural networks to effectively capture long-range or periodic relations of the data. Along with the CosNet, we study the effect of the complex-valued spectral kernel mapping via theoretically analyzing the bound of covering number and generalization error. Extensive experiments demonstrate that CosNet performs better than the mainstream kernel methods and complex-valued neural networks.

IJCAI Conference 2023 Conference Paper

Expanding the Hyperbolic Kernels: A Curvature-aware Isometric Embedding View

Meimei Yang
Pengfei Fang
Hui Xue

Modeling data relations as a hierarchical structure has proven beneficial for many learning scenarios, and the hyperbolic space, with negative curvature, can encode such data hierarchy without distortion. Several recent studies also show that the representation power of the hyperbolic space can be further improved by endowing it with kernel methods. Unfortunately, the known kernel methods developed in hyperbolic space are limited by adaptation capacity or distortion issues. This paper addresses the issues through a novel embedding function. To this end, we propose a curvature-aware isometric embedding, which establishes an isometry from the Poincaré model to a special reproducing kernel Hilbert space (RKHS). Then we can further define a series of kernels on this RKHS, including several positive definite kernels and an indefinite kernel. Thorough experiments are conducted to demonstrate the superiority of our proposals over existing hyperbolic and Euclidean kernels in various learning tasks, e.g., graph learning and zero-shot learning.

AAAI Conference 2023 Conference Paper

Improving Scene Text Image Super-resolution via Dual Prior Modulation Network

Shipeng Zhu
Zuoyan Zhao
Pengfei Fang
Hui Xue

Scene text image super-resolution (STISR) aims to simultaneously increase the resolution and legibility of the text images, and the resulting images will significantly affect the performance of downstream tasks. Although considerable progress has been made, existing approaches raise two crucial issues: (1) They neglect the global structure of the text, which bounds the semantic determinism of the scene text. (2) The priors, e.g., text prior or stroke prior, employed in existing works, are extracted from pre-trained text recognizers. That said, such priors suffer from the domain gap including low resolution and blurriness caused by poor imaging conditions, leading to incorrect guidance. Our work addresses these gaps and proposes a plug-and-play module dubbed Dual Prior Modulation Network (DPMN), which leverages dual image-level priors to bring performance gain over existing approaches. Specifically, two types of prior-guided refinement modules, each using the text mask or graphic recognition result of the low-quality SR image from the preceding layer, are designed to improve the structural clarity and semantic accuracy of the text, respectively. The following attention mechanism hence modulates two quality-enhanced images to attain a superior SR result. Extensive experiments validate that our method improves the image quality and boosts the performance of downstream tasks over five typical approaches on the benchmark. Substantial visualizations and ablation studies demonstrate the advantages of the proposed DPMN. Code is available at: https://github.com/jdfxzzy/DPMN.

IJCAI Conference 2023 Conference Paper

Learning to Learn from Corrupted Data for Few-Shot Learning

Yuexuan An
Xingyu Zhao
Hui Xue

Few-shot learning, which aims to generalize knowledge learned from annotated base training data to recognize unseen novel classes, has attracted considerable attention. Existing few-shot methods rely on completely clean training data. However, in the real world, the training data are always corrupted and accompanied by noise due to the disturbance in data transmission and low-quality annotation, which severely degrades the performance and generalization capability of few-shot models. To address the problem, we propose a unified peer-collaboration learning (PCL) framework to extract valid knowledge from corrupted data for few-shot learning. PCL leverages two modules to mimic the peer collaboration process which cooperatively evaluates the importance of each sample. Specifically, each module first estimates the importance weights of different samples by encoding the information provided by the other module from both global and local perspectives. Then, both modules leverage the obtained importance weights to guide the reevaluation of the loss value of each sample. In this way, the peers can mutually absorb knowledge to improve the robustness of few-shot models. Experiments verify that our framework combined with different few-shot methods can significantly improve the performance and robustness of original models.

IJCAI Conference 2022 Conference Paper

Automatically Gating Multi-Frequency Patterns through Rectified Continuous Bernoulli Units with Theoretical Principles

Zheng-Fan Wu
Yi-Nan Feng
Hui Xue

Different nonlinearities are only suitable for responding to different frequency signals. The locally-responding ReLU is incapable of modeling high-frequency features due to the spectral bias, whereas the globally-responding sinusoidal function is intractable to represent low-frequency concepts cheaply owing to the optimization dilemma. Moreover, nearly all the practical tasks are composed of complex multi-frequency patterns, whereas there is little prospect of designing or searching a heterogeneous network containing various types of neurons matching the frequencies, because of their exponentially-increasing combinatorial states. In this paper, our contributions are three-fold: 1) we propose a general Rectified Continuous Bernoulli (ReCB) unit paired with an efficient variational Bayesian learning paradigm, to automatically detect/gate/represent different frequency responses; 2) our numerically-tight theoretical framework proves that ReCB-based networks can achieve the optimal representation ability, which is O(m^{η/(d^2)}) times better than that of popular neural networks, for a hidden dimension of m, an input dimension of d, and a Lipschitz constant of η; 3) we provide comprehensive empirical evidence showing that ReCB-based networks can keenly learn multi-frequency patterns and push the state-of-the-art performance.

IJCAI Conference 2021 Conference Paper

Adversarial Spectral Kernel Matching for Unsupervised Time Series Domain Adaptation

Qiao Liu
Hui Xue

Unsupervised domain adaptation (UDA) has received increasing attention since it does not require labels in the target domain. Most existing UDA methods learn domain-invariant features by minimizing a discrepancy distance computed by a certain metric between domains. However, these discrepancy-based methods cannot be robustly applied to unsupervised time series domain adaptation (UTSDA). That is because discrepancy metrics in these methods contain only low-order and local statistics, which have limited expression for time series distributions and therefore result in failure of domain matching. Actually, real-world time series are always non-local distributions, i.e., with non-stationary and non-monotonic statistics. In this paper, we propose an Adversarial Spectral Kernel Matching (AdvSKM) method, where a hybrid spectral kernel network is specifically designed as the inner kernel to reform the Maximum Mean Discrepancy (MMD) metric for UTSDA. The hybrid spectral kernel network can precisely characterize non-stationary and non-monotonic statistics in time series distributions. Embedding the hybrid spectral kernel network into MMD not only guarantees a precise discrepancy metric but also benefits domain matching. Besides, the differentiable architecture of the spectral kernel network enables adversarial kernel learning, which brings more discriminatory expression for discrepancy matching. The results of extensive experiments on several real-world UTSDA tasks verify the effectiveness of our proposed method.
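
For readers unfamiliar with the Maximum Mean Discrepancy mentioned above, the following is a minimal sketch of the standard biased estimate of squared MMD with a pluggable inner kernel. It uses a plain Gaussian RBF kernel as a stand-in; the paper's hybrid spectral kernel network and its adversarial training are not reproduced here, and the function names and bandwidth are assumptions.

```python
import math


def rbf(x, y, bandwidth=1.0):
    """Gaussian RBF kernel between two equal-length vectors (stand-in kernel)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2 * bandwidth ** 2))


def mmd_squared(source, target, kernel=rbf):
    """Biased estimate of squared MMD between two samples of feature vectors."""

    def mean_kernel(xs, ys):
        # Average kernel value over all pairs drawn from xs and ys.
        return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2 * mean_kernel(source, target)
    )
```

Passing a different `kernel` callable is where a learned kernel would be substituted; the estimate is zero for identical samples and grows as the two samples move apart under the chosen kernel.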

AAAI Conference 2021 Conference Paper

Composite Adversarial Attacks

Xiaofeng Mao
Yuefeng Chen
Shuhui Wang
Hang Su
Yuan He
Hui Xue

Adversarial attack is a technique for deceiving Machine Learning (ML) models, which provides a way to evaluate adversarial robustness. In practice, attack algorithms are manually selected and tuned by human experts to break an ML system. However, manual selection of attackers tends to be sub-optimal, leading to a mistaken assessment of model security. In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching the best combination of attack algorithms and their hyperparameters from a candidate pool of 32 base attackers. We design a search space where an attack policy is represented as an attacking sequence, i.e., the output of the previous attacker is used as the initialization input for its successors. The multiobjective NSGA-II genetic algorithm is adopted for finding the strongest attack policy with minimum complexity. The experimental results show that CAA beats 10 top attackers on 11 diverse defenses with less elapsed time (6× faster than AutoAttack), and achieves the new state-of-the-art on l∞, l2 and unrestricted adversarial attacks.

IJCAI Conference 2021 Conference Paper

Conditional Self-Supervised Learning for Few-Shot Classification

Yuexuan An
Hui Xue
Xingyu Zhao
Lu Zhang

How to learn a transferable feature representation from limited examples is a key challenge for few-shot classification. Self-supervision as an auxiliary task to the main supervised few-shot task is considered a feasible way to solve the problem, since self-supervision can provide additional structural information easily ignored by the main task. However, learning a good representation by traditional self-supervised methods usually depends on large training samples. In few-shot scenarios, due to the lack of sufficient samples, these self-supervised methods might learn a biased representation, which is more likely to give wrong guidance to the main tasks and ultimately degrade performance. In this paper, we propose conditional self-supervised learning (CSS) to use auxiliary information to guide the representation learning of self-supervised tasks. Specifically, CSS leverages supervised information as prior knowledge to shape and improve the learning feature manifold of self-supervision without auxiliary unlabeled data, so as to reduce representation bias and mine more effective semantic information. Moreover, CSS exploits more meaningful information through supervised and the improved self-supervised learning respectively and integrates the information into a unified distribution, which can further enrich and broaden the original representation. Extensive experiments demonstrate that our proposed method without any fine-tuning can achieve a significant accuracy improvement on the few-shot classification scenarios compared to the state-of-the-art few-shot learning methods.

PDF Details DOI

IJCAI Conference 2021 Conference Paper

Learning Deeper Non-Monotonic Networks by Softly Transferring Solution Space

Zheng-Fan Wu
Hui Xue
Weimin Bai

Different from popular neural networks using quasiconvex activations, non-monotonic networks activated by periodic nonlinearities have emerged as a more competitive paradigm, offering revolutionary benefits: 1) compactly characterizing high-frequency patterns; 2) precisely representing high-order derivatives. Nevertheless, they are also well known for being hard to train, because they easily over-fit dissonant noise and only allow for tiny architectures (shallower than 5 layers). The fundamental bottleneck is that the periodicity leads to many poor and dense local minima in the solution space. The direction and norm of the gradient oscillate continually during error backpropagation. Thus non-monotonic networks get stuck prematurely in these local minima and miss out on effective error feedback. To alleviate this optimization dilemma, we propose a non-trivial soft transfer approach. It smooths their solution space to be close to that of monotonic networks in the beginning, and then improves their representational properties by transferring the solutions from the neural space of monotonic neurons to the Fourier space of non-monotonic neurons as training continues. The soft transfer consists of two core components: 1) a rectified concrete gate is constructed to characterize the state of each neuron; 2) a variational Bayesian learning framework is proposed to dynamically balance the empirical risk and the intensity of transfer. We provide comprehensive empirical evidence showing that the soft transfer not only reduces the risk of non-monotonic networks over-fitting noise, but also helps them scale to much deeper architectures (more than 100 layers), achieving new state-of-the-art performance.

PDF Details DOI

IJCAI Conference 2020 Conference Paper

BaKer-Nets: Bayesian Random Kernel Mapping Networks

Hui Xue
Zheng-Fan Wu

Recently, deep spectral kernel networks (DSKNs) have attracted wide attention. They consist of periodic computational elements that can be activated across the whole feature space. In theory, DSKNs have the potential to reveal input-dependent and long-range characteristics, and thus are expected to perform more competitively than prevailing networks. But in practice, they are still unable to achieve the desired effects. The structural superiority of DSKNs comes at the cost of difficult optimization. The periodicity of computational elements leads to many poor and dense local minima in loss landscapes. DSKNs are more likely to get stuck in these local minima, and perform worse than expected. Hence, in this paper, we propose the novel Bayesian random Kernel mapping Networks (BaKer-Nets) with preferable learning processes that escape randomly from most local minima. Specifically, BaKer-Nets consist of two core components: 1) a prior-posterior bridge is derived to introduce uncertainty into the computational elements in a reasonable way; 2) a Bayesian learning paradigm is presented to optimize the prior-posterior bridge efficiently. With well-tuned uncertainty, BaKer-Nets can not only explore more potential solutions to avoid local minima, but also exploit these ensemble solutions to strengthen their robustness. Systematic experiments demonstrate the significance of BaKer-Nets in improving learning processes while preserving the structural superiority.

PDF Details DOI

AAAI Conference 2020 Conference Paper

Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network

Zhe Ma
Jianfeng Dong
Zhongzi Long
Yao Zhang
Yuan He
Hui Xue
Shouling Ji

This paper strives to learn fine-grained fashion similarity. In this similarity paradigm, one should pay more attention to the similarity in terms of a specific design/attribute among fashion items, which has potential value in many fashion-related applications such as fashion copyright protection. To this end, we propose an Attribute-Specific Embedding Network (ASEN) to jointly learn multiple attribute-specific embeddings in an end-to-end manner, and thus measure the fine-grained similarity in the corresponding space. With two attention modules, i.e., Attribute-aware Spatial Attention and Attribute-aware Channel Attention, ASEN is able to locate the related regions and capture the essential patterns under the guidance of the specified attribute, thus making the learned attribute-specific embeddings better reflect the fine-grained similarity. Extensive experiments on four fashion-related datasets show the effectiveness of ASEN for fine-grained fashion similarity learning and its potential for fashion reranking. Code and data are available at https://github.com/Maryeon/asen.

PDF Details

IJCAI Conference 2019 Conference Paper

Deep Spectral Kernel Learning

Hui Xue
Zheng-Fan Wu
Wei-Xiang Sun

Recently, spectral kernels have attracted wide attention in complex dynamic environments. These advanced kernels mainly focus on breaking through the crucial limitation of locality, that is, stationarity and monotonicity. In practice, however, owing to the inefficiency of shallow models in computational elements, they often fail to accurately reveal dynamic and potential variations. In this paper, we propose a novel deep spectral kernel network (DSKN) to naturally integrate non-stationary and non-monotonic spectral kernels into elegant deep architectures in an interpretable way, which can be further generalized to cover most kernels. Concretely, we first derive the general form of spectral kernels via the inverse Fourier transform. Second, DSKN is constructed by embedding the preeminent spectral kernels into each layer to boost the efficiency in computational elements, which can effectively reveal the dynamic input-dependent characteristics and potential long-range correlations by compactly representing complex advanced concepts. Third, detailed analyses of DSKN are presented. Owing to its universality, we propose a unified spectral transform technique to flexibly extend and reasonably initialize domain-related DSKN. Furthermore, the representer theorem of DSKN is given. Systematic experiments demonstrate the superiority of DSKN compared to state-of-the-art relevant algorithms on a variety of standard real-world tasks.

PDF Details

IJCAI Conference 2017 Conference Paper

Multiple Indefinite Kernel Learning for Feature Selection

Hui Xue
Yu Song
Hai-Ming Xu

Multiple kernel learning for feature selection (MKL-FS) utilizes kernels to explore complex properties of features and performs better in embedded methods. However, the kernels in MKL-FS are generally restricted to be positive definite. In fact, indefinite kernels often emerge in actual applications and can achieve better empirical performance. But due to the non-convexity of indefinite kernels, existing MKL-FS methods are usually inapplicable, and the corresponding research is relatively scarce. In this paper, we propose a novel multiple indefinite kernel feature selection method (MIK-FS) based on the primal framework of indefinite kernel support vector machine (IKSVM), which applies an indefinite base kernel for each feature and then imposes an l1-norm constraint on the kernel combination coefficients to select features automatically. A two-stage algorithm is further presented to optimize the coefficients of IKSVM and the kernel combination alternately. In the algorithm, we reformulate the non-convex optimization problem of primal IKSVM as a difference of convex functions (DC) program and transform the non-convex problem into a convex one with the affine minorization approximation. Experiments on real-world datasets demonstrate that MIK-FS is superior to some related state-of-the-art methods in both feature selection and classification performance.

PDF Details

AAAI Conference 2017 Conference Paper

Solving Indefinite Kernel Support Vector Machine with Difference of Convex Functions Programming

Hai-Ming Xu
Hui Xue
Xiao-Hong Chen
Yun-Yun Wang

Indefinite kernel support vector machine (IKSVM) has recently attracted increasing attention in machine learning. Different from traditional SVMs, IKSVM is essentially a non-convex optimization problem. Some algorithms directly change the spectrum of the indefinite kernel matrix, at the cost of losing some valuable information contained in the kernels, so as to transform the non-convex problem into a convex one. Other algorithms aim to solve the dual form of IKSVM, but suffer from the dual gap between the primal and dual problems in the case of indefinite kernels. In this paper, we directly focus on the non-convex primal form of IKSVM and propose a novel algorithm termed IKSVM-DC. According to the characteristics of the spectrum of the indefinite kernel matrix, IKSVM-DC decomposes the objective function into the difference of two convex functions and thus reformulates the primal problem as a difference of convex functions (DC) program, which can be optimized by the DC algorithm (DCA). To accelerate convergence, IKSVM-DC further combines the classical DCA with a line search step along the descent direction at each iteration. A theoretical analysis is then presented to validate that IKSVM-DC can converge to a local minimum. Systematic experiments on real-world datasets demonstrate the superiority of IKSVM-DC compared to state-of-the-art IKSVM-related algorithms.
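
One standard way to obtain a spectrum-based split of this kind is to separate the positive and negative eigenvalues of the kernel matrix, so that a quadratic form in K becomes a difference of two convex quadratic forms. The sketch below illustrates only that step, under the assumption that the non-convexity enters through a term like a^T K a; the abstract does not spell out the exact decomposition, and `split_indefinite_kernel` is a hypothetical helper name.

```python
import numpy as np


def split_indefinite_kernel(K):
    """Return PSD matrices (K_pos, K_neg) such that K = K_pos - K_neg."""
    K = 0.5 * (K + K.T)  # symmetrize to guard against numerical noise
    eigvals, eigvecs = np.linalg.eigh(K)
    pos = np.clip(eigvals, 0.0, None)   # keep the positive part of the spectrum
    neg = np.clip(-eigvals, 0.0, None)  # magnitudes of the negative eigenvalues
    K_pos = (eigvecs * pos) @ eigvecs.T
    K_neg = (eigvecs * neg) @ eigvecs.T
    return K_pos, K_neg


# A small indefinite example: eigenvalues are 3 and -1.
K = np.array([[1.0, 2.0], [2.0, 1.0]])
K_pos, K_neg = split_indefinite_kernel(K)

# For any vector a, a @ K @ a == a @ K_pos @ a - a @ K_neg @ a,
# and both terms on the right are convex in a.
a = np.array([0.5, -1.5])
quad_full = a @ K @ a
quad_split = a @ K_pos @ a - a @ K_neg @ a
```

Unlike spectrum-modification approaches, this split keeps all of the information in K: nothing is clipped or discarded, the negative part is only moved into the subtracted term.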

PDF Details

YNIMG Journal 2007 Journal Article

Automatic segmentation and reconstruction of the cortex from neonatal MRI

Hui Xue
Latha Srinivasan
Shuzhou Jiang
Mary Rutherford
A. David Edwards
Daniel Rueckert
Joseph V. Hajnal

Segmentation and reconstruction of cortical surfaces from magnetic resonance (MR) images are more challenging for developing neonates than adults. This is mainly due to the dynamic changes in the contrast between gray matter (GM) and white matter (WM) in both T1- and T2-weighted images (T1w and T2w) during brain maturation. In particular, in neonatal T2w images, WM typically has higher signal intensity than GM. This causes mislabeled voxels during cortical segmentation, especially in the cortical regions of the brain and in particular at the interface between GM and cerebrospinal fluid (CSF). We propose an automatic segmentation algorithm detecting these mislabeled voxels and correcting errors caused by partial volume effects. Our results show that the proposed algorithm corrects errors in the segmentation of both GM and WM compared to the classic expectation maximization (EM) scheme. Quantitative validation against manual segmentation demonstrates good performance (the mean Dice value: 0.758±0.037 for GM and 0.794±0.078 for WM). The inner, central and outer cortical surfaces are then reconstructed using implicit surface evolution. A landmark study is performed to verify the accuracy of the reconstructed cortex (the mean surface reconstruction error: 0.73 mm for the inner surface and 0.63 mm for the outer). Both segmentation and reconstruction have been tested on 25 neonates with gestational ages ranging from ∼27 to 45 weeks. This preliminary analysis confirms previous findings that cortical surface area and curvature increase with age, and that surface area scales to cerebral volume according to a power law, while cortical thickness is not related to age or brain growth.
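
For readers unfamiliar with the Dice value quoted in this abstract, it measures the overlap between two binary masks as twice the size of their intersection divided by the sum of their sizes. The following is a minimal sketch on made-up toy masks, not the paper's evaluation code; `dice_coefficient` and the example arrays are illustrative only.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


# Toy 1-D "segmentations": the masks agree on 2 of their 3 labeled voxels.
automatic = np.array([1, 1, 0, 0, 1])
manual = np.array([1, 0, 0, 1, 1])
score = dice_coefficient(automatic, manual)  # 2 * 2 / (3 + 3)
```

A value of 1 means the automatic and manual masks coincide exactly, and 0 means they do not overlap at all.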

Details DOI