Arrow Research search

Author name cluster

Feng Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

44 papers
2 author rows

Possible papers


AAAI Conference 2026 Conference Paper

Exploring Position Encoding Mechanism in Diffusion U-Net for Training-free High-resolution Image Generation

  • Feng Zhou
  • Pu Cao
  • Yiyang Ma
  • Lu Yang
  • Yonghao Dang
  • Jianqin Yin

Denoising higher-resolution latents using a pre-trained U-Net often results in repetitive and disordered image patterns. In this work, we are motivated to reveal the intrinsic cause of such pattern disruption in high-resolution image generation. Through theoretical analysis and empirical studies, we reveal that the pre-trained U-Net fails to provide sufficient positional information for tokens at high-resolution. Specifically, 1) zero-padding serves as a critical mechanism for position encoding but lacks robustness across varying resolutions; and 2) tokens located farther from the feature map boundaries have increasing difficulty acquiring positional awareness, leading to pattern disruptions. Inspired by these findings, we propose a novel training-free approach for high-resolution generation, introducing a Progressive Boundary Complement (PBC) method. It creates dynamic virtual image boundaries inside the feature map to supplement position information at high resolution, enabling high-quality and rich-content high-resolution image synthesis. Extensive experiments show that our method significantly improves high-resolution image synthesis in terms of visual quality and content richness, achieving state-of-the-art performance.
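The claim that zero-padding acts as an implicit position signal, and that this signal weakens away from the boundary, can be illustrated with a toy 1D example. The sketch below is only an illustration under simplifying assumptions (a fixed smoothing kernel, a constant input, hypothetical function names); it is not the paper's analysis and does not implement PBC.

```python
def conv1d_same_zero_pad(signal, kernel):
    """'Same'-size 1D cross-correlation with zero-padding at both ends."""
    half = len(kernel) // 2  # assumes an odd-length kernel
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def stacked_response(length, layers, kernel=(0.25, 0.5, 0.25)):
    """Apply `layers` zero-padded convolutions to a constant input of ones."""
    out = [1.0] * length
    for _ in range(layers):
        out = conv1d_same_zero_pad(out, kernel)
    return out


# The input carries no positional information, yet the output differs
# near the boundary because the padding zeros leak in there.
response = stacked_response(length=9, layers=2)
print(response)
```

With two layers of a width-3 kernel, only the two positions nearest each boundary deviate from 1.0. Interior positions are indistinguishable from one another, so the larger the feature map, the larger the share of positions that receive no boundary cue.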

AAAI Conference 2026 Conference Paper

Fair Bayesian Data Selection via Generalized Discrepancy Measures

  • Yixuan Zhang
  • Jiabin Luo
  • Zhenggang Wang
  • Feng Zhou
  • Quyu Kong

Fairness concerns are increasingly critical as machine learning models are deployed in high-stakes applications. While existing fairness-aware methods typically intervene at the model level, they often suffer from high computational costs, limited scalability, and poor generalization. To address these challenges, we propose a Bayesian data selection framework that ensures fairness by aligning group-specific posterior distributions of model parameters and sample weights with a shared central distribution. Our framework supports flexible alignment via various distributional discrepancy measures, including Wasserstein distance, maximum mean discrepancy, and f-divergence, allowing geometry-aware control without imposing explicit fairness constraints. This data-centric approach mitigates group-specific biases in training data and improves fairness in downstream tasks, with theoretical guarantees. Experiments on benchmark datasets show that our method consistently outperforms existing data selection and model-based fairness methods in both fairness and accuracy.

NeurIPS Conference 2025 Conference Paper

DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

  • Yue Jiang
  • Jichu Li
  • Yang Liu
  • Dingkang Yang
  • Feng Zhou
  • Quyu Kong

We introduce DanmakuTPPBench, a comprehensive benchmark designed to advance multi-modal Temporal Point Process (TPP) modeling in the era of Large Language Models (LLMs). While TPPs have been widely studied for modeling temporal event sequences, existing datasets are predominantly unimodal, hindering progress in models that require joint reasoning over temporal, textual, and visual information. To address this gap, DanmakuTPPBench comprises two complementary components: (1) DanmakuTPP-Events, a novel dataset derived from the Bilibili video platform, where user-generated bullet comments (Danmaku) naturally form multi-modal events annotated with precise timestamps, rich textual content, and corresponding video frames; (2) DanmakuTPP-QA, a challenging question-answering dataset constructed via a novel multi-agent pipeline powered by state-of-the-art LLMs and multi-modal LLMs (MLLMs), targeting complex temporal-textual-visual reasoning. We conduct extensive evaluations using both classical TPP models and recent MLLMs, revealing significant performance gaps and limitations in current methods’ ability to model multi-modal event dynamics. Our benchmark establishes strong baselines and calls for further integration of TPP modeling into the multi-modal language modeling landscape. Project page: https://github.com/FRENKIE-CHIANG/DanmakuTPPBench.

ICLR Conference 2025 Conference Paper

Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows

  • Xiangxin Zhou
  • Yi Xiao
  • Haowei Lin
  • Xinheng He
  • Jiaqi Guan
  • Yang Wang 0103
  • Qiang Liu 0006
  • Feng Zhou

The dynamic nature of proteins, influenced by ligand interactions, is essential for comprehending protein function and progressing drug discovery. Traditional structure-based drug design (SBDD) approaches typically target binding sites with rigid structures, limiting their practical application in drug development. While molecular dynamics simulation can theoretically capture all the biologically relevant conformations, the transition rate is dictated by the intrinsic energy barrier between them, making the sampling process computationally expensive. To overcome the aforementioned challenges, we propose to use generative modeling for SBDD considering conformational changes of protein pockets. We curate a dataset of apo and multiple holo states of protein-ligand complexes, simulated by molecular dynamics, and propose a full-atom flow model (and a stochastic version), named DynamicFlow, that learns to transform apo pockets and noisy ligands into holo pockets and corresponding 3D ligand molecules. Our method uncovers promising ligand molecules and corresponding holo conformations of pockets. Additionally, the resultant holo-like states provide superior inputs for traditional SBDD approaches, playing a significant role in practical drug discovery.

AAAI Conference 2025 Conference Paper

Navigating Towards Fairness with Data Selection

  • Yixuan Zhang
  • Zhidong Li
  • Yang Wang
  • Fang Chen
  • Xuhui Fan
  • Feng Zhou

Machine learning algorithms often struggle to eliminate inherent data biases, particularly those arising from unreliable labels, which poses a significant challenge in ensuring fairness. Existing fairness techniques that address label bias typically involve modifying models and intervening in the training process, but these lack flexibility for large-scale datasets. To address this limitation, we introduce a data selection method designed to efficiently and flexibly mitigate label bias, tailored to more practical needs. Our approach utilizes a zero-shot predictor as a proxy model that simulates training on a clean holdout set. This strategy, supported by peer predictions, ensures the fairness of the proxy model and eliminates the need for an additional holdout set, which is a common requirement in previous methods. Without altering the classifier's architecture, our modality-agnostic method effectively selects appropriate training data and has proven efficient and effective in handling label bias and improving fairness across diverse datasets in experimental evaluations.

NeurIPS Conference 2025 Conference Paper

Personalized Bayesian Federated Learning with Wasserstein Barycenter Aggregation

  • Ting Wei
  • Biao Mei
  • Junliang Lyu
  • Renquan Zhang
  • Feng Zhou
  • Yifan Sun

Personalized Bayesian federated learning (PBFL) handles non-i.i.d. client data and quantifies uncertainty by combining personalization with Bayesian inference. However, current PBFL methods face two main limitations: posterior inference on clients often assumes restrictive parametric forms, and server-side posterior aggregation typically relies on naive parameter averaging. To overcome these issues, we propose FedWBA, a novel PBFL method that enhances both local inference and global aggregation. At the client level, we use particle-based variational inference for nonparametric posterior representation. At the server level, we introduce particle-based Wasserstein barycenter aggregation, offering a more geometrically meaningful approach. Theoretically, we provide local and global convergence guarantees for FedWBA. Locally, we prove a KL divergence decrease lower bound per iteration for variational inference convergence. Globally, we show that the Wasserstein barycenter converges to the true parameter as the client data size increases. Empirically, experiments show that FedWBA outperforms baselines in prediction accuracy, uncertainty calibration, and convergence rate, with ablation studies confirming its robustness.
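To give a feel for what a particle-based Wasserstein barycenter computes, the sketch below covers only the one-dimensional special case, where the 2-Wasserstein barycenter of equally sized, uniformly weighted particle sets is obtained by averaging sorted particles (quantile averaging). This is an illustrative assumption-laden example with a hypothetical function name, not the FedWBA aggregation rule, which operates on high-dimensional particles where no such closed form applies.

```python
def barycenter_1d(particle_sets, weights=None):
    """W2 barycenter of 1D empirical distributions with equal particle counts.

    In 1D the barycenter's quantile function is the weighted average of the
    input quantile functions, so it suffices to average order statistics.
    """
    n = len(particle_sets[0])
    if any(len(p) != n for p in particle_sets):
        raise ValueError("all particle sets must have the same size")
    if weights is None:
        weights = [1.0 / len(particle_sets)] * len(particle_sets)
    sorted_sets = [sorted(p) for p in particle_sets]
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_sets))
        for i in range(n)
    ]


# Two hypothetical clients whose particles are stored in arbitrary order.
client_a = [0.0, 2.0, 1.0]
client_b = [12.0, 10.0, 11.0]
print(barycenter_1d([client_a, client_b]))  # [5.0, 6.0, 7.0]
```

Averaging the particles index by index without sorting would depend on the arbitrary storage order of each client's particles, whereas the barycenter depends only on the distributions they represent.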

AAAI Conference 2025 Conference Paper

Position-Aware Guided Point Cloud Completion with CLIP Model

  • Feng Zhou
  • Qi Zhang
  • Ju Dai
  • Lei Li
  • Qing Fan
  • Junliang Xing

Point cloud completion aims to recover partial geometric and topological shapes caused by equipment defects or limited viewpoints. Current methods either solely rely on the 3D coordinates of the point cloud to complete it or incorporate additional images with well-calibrated intrinsic parameters to guide the geometric estimation of the missing parts. Although these methods have achieved excellent performance by directly predicting the location of complete points, the extracted features lack fine-grained information regarding the location of the missing area. To address this issue, we propose a rapid and efficient method to expand a unimodal framework into a multimodal framework. This approach incorporates a position-aware module designed to enhance the spatial information of the missing parts through a weighted map learning mechanism. In addition, we establish a Point-Text-Image triplet corpus PCI-TI and MVP-TI based on the existing unimodal point cloud completion dataset and use the pre-trained vision-language model CLIP to provide richer detail information for 3D shapes, thereby enhancing performance. Extensive quantitative and qualitative experiments demonstrate that our method outperforms state-of-the-art point cloud completion methods.

IROS Conference 2025 Conference Paper

Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation

  • Meng Chen
  • Jiawei Tu
  • Chao Qi
  • Yonghao Dang
  • Feng Zhou
  • Wei Wei
  • Jianqin Yin

The significant advancements in embodied vision navigation have raised concerns about its susceptibility to adversarial attacks exploiting deep neural networks. Investigating the adversarial robustness of embodied vision navigation is crucial, especially given the threat of 3D physical attacks that could pose risks to human safety. However, existing attack methods for embodied vision navigation often lack physical feasibility due to challenges in transferring digital perturbations into the physical world. Moreover, current physical attacks for object detection struggle to achieve both multi-view effectiveness and visual naturalness in navigation scenarios. To address this, we propose a practical attack method for embodied navigation by attaching adversarial patches to objects, where both opacity and textures are learnable. Specifically, to ensure effectiveness across varying viewpoints, we employ a multi-view optimization strategy based on object-aware sampling, which optimizes the patch’s texture based on feedback from the vision-based perception model used in navigation. To make the patch inconspicuous to human observers, we introduce a two-stage opacity optimization mechanism, in which opacity is fine-tuned after texture optimization. Experimental results demonstrate that our adversarial patches decrease the navigation success rate by an average of 22.39%, outperforming previous methods in practicality, effectiveness, and naturalness. Code is available at: github.com/chen37058/Physical-Attacks-in-Embodied-Nav.

NeurIPS Conference 2025 Conference Paper

TPP-SD: Accelerating Transformer Point Process Sampling with Speculative Decoding

  • Shukai Gong
  • Yiyang Fu
  • Fengyuan Ran
  • Quyu Kong
  • Feng Zhou

We propose TPP-SD, a novel approach that accelerates Transformer temporal point process (TPP) sampling by adapting speculative decoding (SD) techniques from language models. By identifying the structural similarities between thinning algorithms for TPPs and speculative decoding for language models, we develop an efficient sampling framework that leverages a smaller draft model to generate multiple candidate events, which are then verified by the larger target model. TPP-SD maintains the same output distribution as autoregressive sampling while achieving significant acceleration. Experiments on both synthetic and real datasets demonstrate that our approach produces samples from the same distribution as standard methods, but with a 2-6× speedup. Our ablation studies analyze the impact of hyperparameters such as draft length and draft model size on sampling efficiency. TPP-SD bridges the gap between powerful Transformer TPP models and the practical need for rapid sequence generation.
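The thinning algorithm mentioned in this abstract follows a propose-then-verify pattern: candidates are drawn from a simple process and each is accepted with a probability given by the target model. The sketch below shows classic thinning for a point process whose intensity depends only on time and is bounded above by a known constant. It is a minimal illustration with hypothetical names and an assumed intensity function; it is not TPP-SD and involves no draft or target Transformer.

```python
import math
import random


def thinning_sample(intensity, upper_bound, horizon, rng):
    """Sample event times on (0, horizon] by thinning.

    Candidates come from a homogeneous Poisson process with rate
    `upper_bound`; each is kept with probability intensity(t) / upper_bound.
    Requires 0 <= intensity(t) <= upper_bound for all t.
    """
    t, events = 0.0, []
    while True:
        t += rng.expovariate(upper_bound)  # next candidate time
        if t > horizon:
            return events
        if rng.random() * upper_bound < intensity(t):  # verification step
            events.append(t)


# Assumed example intensity, bounded above by 2.0.
rng = random.Random(0)
events = thinning_sample(lambda t: 1.0 + math.sin(t), 2.0, 50.0, rng)
print(len(events), events[:3])
```

The accept-or-reject test here plays a role loosely comparable to the verification step in speculative decoding, where draft proposals are checked against the target distribution.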

TAAS Journal 2025 Journal Article

Web 3.0-Enabled Microservice Re-Scheduling for Heterogenous Resources Co-Optimization in Metaverse-Integrated Edge Networks

  • Yihong Yang
  • Zhangbing Zhou
  • Lei Shu
  • Feng Zhou
  • Walid Gaaloul
  • Arif Ali Khan

Web 3.0 and the metaverse can empower intelligent applications of Connected Autonomous Vehicles (CAVs). The adoption of edge computing can contribute to low-latency interaction between CAVs and the metaverse. Microservices are widely deployed on edge networks and the cloud nowadays. User requests from CAVs are typically fulfilled through the composition of microservices, which may be hosted by contiguous edge nodes. Requests may differ in the resources they require at runtime. Consequently, when requests are continuously injected into edge networks, the usage of heterogeneous resources, including CPU, memory, and network bandwidth, may not be the same, or may differ significantly, on certain edge nodes. This happens especially when burst requests are injected into the network to be satisfied concurrently. Therefore, the usage of heterogeneous resources provided by edge nodes should be co-optimized through re-scheduling microservices. To address this challenge, this article proposes a Web 3.0-enabled Microservice Re-Scheduling approach (called MRS), which is a migration-based mechanism integrating a placement strategy. Specifically, we formulate the MRS task as a multi-objective and multi-constraint optimization problem, which can be solved through a penalty signal-integrated framework and an improved pointer network. Extensive experiments are conducted on two real-world datasets. Evaluation results show that our MRS outperforms its counterparts with improvements of at least 7.7%, 2.4%, and 2.2% in terms of network throughput, latency, and energy consumption, respectively.

ICML Conference 2024 Conference Paper

Accelerating Convergence in Bayesian Few-Shot Classification

  • Tianjun Ke
  • Haoqun Cao
  • Feng Zhou

Bayesian few-shot classification has been a focal point in the field of few-shot learning. This paper seamlessly integrates mirror descent-based variational inference into Gaussian process-based few-shot classification, addressing the challenge of non-conjugate inference. By leveraging non-Euclidean geometry, mirror descent achieves accelerated convergence by providing the steepest descent direction along the corresponding manifold. It also exhibits the parameterization invariance property concerning the variational distribution. Experimental results demonstrate competitive classification accuracy, improved uncertainty quantification, and faster convergence compared to baseline models. Additionally, we investigate the impact of hyperparameters and components. Code is publicly available at https://github.com/keanson/MD-BSFC.

TMLR Journal 2024 Journal Article

Calibrating Deep Ensemble through Functional Variational Inference

  • Zhijie Deng
  • Feng Zhou
  • Jianfei Chen
  • Guoqiang Wu
  • Jun Zhu

Deep Ensemble (DE) is an effective and practical uncertainty quantification approach in deep learning. The uncertainty of DE is usually manifested by the functional inconsistency among the ensemble members, which, yet, originates from unmanageable randomness in the initialization and optimization of neural networks (NNs), and may easily collapse in specific cases. To tackle this issue, we advocate characterizing the functional inconsistency with the empirical covariance of the functions dictated by the ensemble members, and defining a Gaussian process (GP) with it. We perform functional variational inference to tune such a probabilistic model w.r.t. training data and specific prior beliefs. This way, we can explicitly manage the uncertainty of the ensemble of NNs. We further provide strategies to make the training efficient. The proposed approach achieves better uncertainty quantification than DE and its variants across diverse scenarios, while consuming only marginally added training cost compared to standard DE. The code is available at https://github.com/thudzj/DE-GP.

NeurIPS Conference 2024 Conference Paper

Conjugate Bayesian Two-step Change Point Detection for Hawkes Process

  • Zeyue Zhang
  • Xiaoling Lu
  • Feng Zhou

The Bayesian two-step change point detection method is popular for the Hawkes process due to its simplicity and intuitiveness. However, the non-conjugacy between the point process likelihood and the prior requires most existing Bayesian two-step change point detection methods to rely on non-conjugate inference methods. These methods lack analytical expressions, leading to low computational efficiency and impeding timely change point detection. To address this issue, this work employs data augmentation to propose a conjugate Bayesian two-step change point detection method for the Hawkes process, which proves to be more accurate and efficient. Extensive experiments on both synthetic and real data demonstrate the superior effectiveness and efficiency of our method compared to baseline methods. Additionally, we conduct ablation studies to explore the robustness of our method concerning various hyperparameters.

NeurIPS Conference 2024 Conference Paper

Is Score Matching Suitable for Estimating Point Processes?

  • Haoqun Cao
  • Zizhuo Meng
  • Tianjun Ke
  • Feng Zhou

Score matching estimators for point processes have gained widespread attention in recent years because they do not require the calculation of intensity integrals, thereby effectively addressing the computational challenges in maximum likelihood estimation (MLE). Some existing works have proposed score matching estimators for point processes. However, this work demonstrates that the incompleteness of the estimators proposed in those works renders them applicable only to specific problems, and they fail for more general point processes. To address this issue, this work introduces the weighted score matching estimator to point processes. Theoretically, we prove the consistency of the estimator we propose. Experimental results indicate that our estimator accurately estimates model parameters on synthetic data and yields results consistent with MLE on real data. In contrast, existing score matching estimators fail to perform effectively. Code is publicly available at https://github.com/KenCao2007/WSM_TPP.

AAAI Conference 2024 Conference Paper

Lifting by Image – Leveraging Image Cues for Accurate 3D Human Pose Estimation

  • Feng Zhou
  • Jianqin Yin
  • Peiyang Li

The "lifting from 2D pose" method has been the dominant approach to 3D Human Pose Estimation (3DHPE) due to the powerful visual analysis ability of 2D pose estimators. It is widely known that a depth ambiguity problem arises when estimating solely from 2D pose, since one 2D pose can be mapped to multiple 3D poses. Intuitively, the rich semantic and texture information in images can contribute to a more accurate "lifting" procedure. Yet, existing research encounters two primary challenges. Firstly, the distribution of image data in 3D motion capture datasets is too narrow because of the laboratory environment, which leads to poor generalization ability of methods trained with image information. Secondly, effective strategies for leveraging image information are lacking. In this paper, we give new insight into the cause of poor generalization problems and the effectiveness of image features. Based on that, we propose an advanced framework. Specifically, the framework consists of two stages. First, we enable the keypoints to query and select the beneficial features from all image patches. To reduce the keypoints' attention to inconsequential background features, we design a novel Pose-guided Transformer Layer, which adaptively limits the updates to unimportant image patches. Then, through a designed Adaptive Feature Selection Module, we prune less significant image patches from the feature map. In the second stage, we allow the keypoints to further emphasize the retained critical image features. This progressive learning approach prevents further training on insignificant image features. Experimental results show that our model achieves state-of-the-art performance on both the Human3.6M dataset and the MPI-INF-3DHP dataset.

AAAI Conference 2024 Conference Paper

Mitigating Label Bias in Machine Learning: Fairness through Confident Learning

  • Yixuan Zhang
  • Boyu Li
  • Zenan Ling
  • Feng Zhou

Discrimination can occur when the underlying unbiased labels are overwritten by an agent with potential bias, resulting in biased datasets that unfairly harm specific groups and cause classifiers to inherit these biases. In this paper, we demonstrate that despite only having access to the biased labels, it is possible to eliminate bias by filtering the fairest instances within the framework of confident learning. In the context of confident learning, low self-confidence usually indicates potential label errors; however, this is not always the case. Instances, particularly those from underrepresented groups, might exhibit low confidence scores for reasons other than labeling errors. To address this limitation, our approach employs truncation of the confidence score and extends the confidence interval of the probabilistic threshold. Additionally, we incorporate the co-teaching paradigm to provide a more robust and reliable selection of fair instances and to effectively mitigate the adverse effects of biased labels. Through extensive experimentation and evaluation on various datasets, we demonstrate the efficacy of our approach in promoting fairness and reducing the impact of label bias in machine learning models.

JBHI Journal 2024 Journal Article

Multi-Kernel Graph Attention Deep Autoencoder for MiRNA-Disease Association Prediction

  • Cui-Na Jiao
  • Feng Zhou
  • Bao-Min Liu
  • Chun-Hou Zheng
  • Jin-Xing Liu
  • Ying-Lian Gao

Accumulating evidence indicates that microRNAs (miRNAs) can control and coordinate various biological processes. Consequently, abnormal expressions of miRNAs have been linked to various complex diseases. Recognizable proof of miRNA-disease associations (MDAs) will contribute to the diagnosis and treatment of human diseases. Nevertheless, traditional experimental verification of MDAs is laborious and limited to small-scale. Therefore, it is necessary to develop reliable and effective computational methods to predict novel MDAs. In this work, a multi-kernel graph attention deep autoencoder (MGADAE) method is proposed to predict potential MDAs. In detail, MGADAE first employs the multiple kernel learning (MKL) algorithm to construct an integrated miRNA similarity and disease similarity, providing more biological information for further feature learning. Second, MGADAE combines the known MDAs, disease similarity, and miRNA similarity into a heterogeneous network, then learns the representations of miRNAs and diseases through graph convolution operation. After that, an attention mechanism is introduced into MGADAE to integrate the representations from multiple graph convolutional network (GCN) layers. Lastly, the integrated representations of miRNAs and diseases are input into the bilinear decoder to obtain the final predicted association scores. Corresponding experiments prove that the proposed method outperforms existing advanced approaches in MDA prediction. Furthermore, case studies related to two human cancers provide further confirmation of the reliability of MGADAE in practice.

NeurIPS Conference 2024 Conference Paper

Nonstationary Sparse Spectral Permanental Process

  • Zicheng Sun
  • Yixuan Zhang
  • Zenan Ling
  • Xuhui Fan
  • Feng Zhou

Existing permanental processes often impose constraints on kernel types or stationarity, limiting the model's expressiveness. To overcome these limitations, we propose a novel approach utilizing the sparse spectral representation of nonstationary kernels. This technique relaxes the constraints on kernel types and stationarity, allowing for more flexible modeling while reducing computational complexity to the linear level. Additionally, we introduce a deep kernel variant by hierarchically stacking multiple spectral feature mappings, further enhancing the model's expressiveness to capture complex patterns in data. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of our approach, particularly in scenarios with pronounced data nonstationarity. Additionally, ablation studies are conducted to provide insights into the impact of various hyperparameters on model performance.

AAAI Conference 2023 Conference Paper

Improving Crowded Object Detection via Copy-Paste

  • Jiangfan Deng
  • Dewen Fan
  • Xiaosong Qiu
  • Feng Zhou

Crowdedness caused by overlapping among similar objects is a ubiquitous challenge in the field of 2D visual object detection. In this paper, we first underline two main effects of the crowdedness issue: 1) IoU-confidence correlation disturbances (ICD) and 2) confused de-duplication (CDD). Then we explore a pathway of cracking these nuts from the perspective of data augmentation. Primarily, a particular copy-paste scheme is proposed for creating crowded scenes. Based on this operation, we first design a "consensus learning" method to further resist the ICD problem and then find that the pasting process naturally reveals a pseudo "depth" of objects in the scene, which can potentially be used to alleviate the CDD dilemma. Both methods are derived from the magical use of copy-pasting without extra cost for hand-labeling. Experiments show that our approach can easily improve the state-of-the-art detector in typical crowded detection tasks by more than 2% without any bells and whistles. Moreover, this work can outperform existing data augmentation strategies in crowded scenarios.

NeurIPS Conference 2023 Conference Paper

Integration-free Training for Spatio-temporal Multimodal Covariate Deep Kernel Point Processes

  • Yixuan Zhang
  • Quyu Kong
  • Feng Zhou

In this study, we propose a novel deep spatio-temporal point process model, Deep Kernel Mixture Point Processes (DKMPP), that incorporates multimodal covariate information. DKMPP is an enhanced version of Deep Mixture Point Processes (DMPP), which uses a more flexible deep kernel to model complex relationships between events and covariate data, improving the model's expressiveness. To address the intractable training procedure of DKMPP due to the non-integrable deep kernel, we utilize an integration-free method based on score matching, and further improve efficiency by adopting a scalable denoising score matching method. Our experiments demonstrate that DKMPP and its corresponding score-based estimators outperform baseline models, showcasing the advantages of incorporating covariate information, utilizing a deep kernel, and employing score-based estimators.

NeurIPS Conference 2023 Conference Paper

Revisiting Logistic-softmax Likelihood in Bayesian Meta-Learning for Few-Shot Classification

  • Tianjun Ke
  • Haoqun Cao
  • Zenan Ling
  • Feng Zhou

Meta-learning has demonstrated promising results in few-shot classification (FSC) by learning to solve new problems using prior knowledge. Bayesian methods are effective at characterizing uncertainty in FSC, which is crucial in high-risk fields. In this context, the logistic-softmax likelihood is often employed as an alternative to the softmax likelihood in multi-class Gaussian process classification due to its conditional conjugacy property. However, the theoretical property of logistic-softmax is not clear and previous research indicated that the inherent uncertainty of logistic-softmax leads to suboptimal performance. To mitigate these issues, we revisit and redesign the logistic-softmax likelihood, which enables control of the a priori confidence level through a temperature parameter. Furthermore, we theoretically and empirically show that softmax can be viewed as a special case of logistic-softmax and logistic-softmax induces a larger family of data distribution than softmax. Utilizing modified logistic-softmax, we integrate the data augmentation technique into the deep kernel based Gaussian process meta-learning framework, and derive an analytical mean-field approximation for task-specific updates. Our approach yields well-calibrated uncertainty estimates and achieves comparable or superior results on standard benchmark datasets. Code is publicly available at https://github.com/keanson/revisit-logistic-softmax.

NeurIPS Conference 2022 Conference Paper

Accelerated Linearized Laplace Approximation for Bayesian Deep Learning

  • Zhijie Deng
  • Feng Zhou
  • Jun Zhu

Laplace approximation (LA) and its linearized variant (LLA) enable effortless adaptation of pretrained deep neural networks to Bayesian neural networks. The generalized Gauss-Newton (GGN) approximation is typically introduced to improve their tractability. However, LA and LLA are still confronted with non-trivial inefficiency issues and should rely on Kronecker-factored, diagonal, or even last-layer approximate GGN matrices in practical use. These approximations are likely to harm the fidelity of learning outcomes. To tackle this issue, inspired by the connections between LLA and neural tangent kernels (NTKs), we develop a Nyström approximation to NTKs to accelerate LLA. Our method benefits from the capability of popular deep learning libraries for forward mode automatic differentiation, and enjoys reassuring theoretical guarantees. Extensive studies reflect the merits of the proposed method in aspects of both scalability and performance. Our method can even scale up to architectures like vision transformers. We also offer valuable ablation studies to diagnose our method. Code is available at https://github.com/thudzj/ELLA.

JMLR Journal 2022 Journal Article

Efficient Inference for Dynamic Flexible Interactions of Neural Populations

  • Feng Zhou
  • Quyu Kong
  • Zhijie Deng
  • Jichao Kan
  • Yixuan Zhang
  • Cheng Feng
  • Jun Zhu

The Hawkes process provides an effective statistical framework for analyzing the interactions of neural spiking activities. Although utilized in many real applications, the classic Hawkes process is incapable of modeling inhibitory interactions among neural populations. Instead, the nonlinear Hawkes process allows for modeling a more flexible influence pattern with excitatory or inhibitory interactions. This work proposes a flexible nonlinear Hawkes process variant based on sigmoid nonlinearity. To ease inference, three sets of auxiliary latent variables (Polya-Gamma variables, latent marked Poisson processes and sparsity variables) are augmented to make functional connection weights appear in a Gaussian form, which enables simple iterative algorithms with analytical updates. As a result, the efficient Gibbs sampler, expectation-maximization algorithm and mean-field approximation are derived to estimate the interactions among neural populations. Furthermore, to reconcile with time-varying neural systems, the proposed time-invariant model is extended to a dynamic version by introducing a Markov state process. Similarly, three analytical iterative inference algorithms (Gibbs sampler, EM algorithm and mean-field approximation) are derived. We compare the accuracy and efficiency of these inference algorithms on synthetic data, and further experiment on real neural recordings to demonstrate that the developed models achieve superior performance over the state-of-the-art competitors.
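The reason a sigmoid nonlinearity permits inhibition can be shown with a small sketch: past events shift an activation up or down, and the sigmoid keeps the resulting intensity positive and bounded. The functional form below (an exponential-decay influence with a single scalar weight, and the names `lam_max`, `mu`, `weight`, `decay`) is an assumption chosen for illustration; the paper's exact parameterization and its inference algorithms are not reproduced here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_hawkes_intensity(t, history, lam_max, mu, weight, decay):
    """Intensity lam_max * sigmoid(mu + sum_i weight * exp(-decay * (t - t_i))).

    A negative `weight` lowers the intensity after each past event
    (inhibition); a positive `weight` raises it (excitation).
    """
    activation = mu + sum(
        weight * math.exp(-decay * (t - s)) for s in history if s < t
    )
    return lam_max * sigmoid(activation)


baseline = sigmoid_hawkes_intensity(1.0, [], 5.0, 0.0, -2.0, 1.0)
inhibited = sigmoid_hawkes_intensity(1.0, [0.5], 5.0, 0.0, -2.0, 1.0)
excited = sigmoid_hawkes_intensity(1.0, [0.5], 5.0, 0.0, 2.0, 1.0)
print(baseline, inhibited, excited)
```

A linear Hawkes intensity with a negative weight could become negative, which is invalid; passing the activation through a bounded positive link avoids this.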

NeurIPS Conference 2021 Conference Paper

Continuous-time edge modelling using non-parametric point processes

  • Xuhui Fan
  • Bin Li
  • Feng Zhou
  • Scott Sisson

The mutually-exciting Hawkes process (ME-HP) is a natural choice to model reciprocity, which is an important attribute of continuous-time edge (dyadic) data. However, existing ways of implementing the ME-HP for such data are either inflexible, as the exogenous (background) rate functions are typically constant and the endogenous (excitation) rate functions are specified parametrically, or inefficient, as inference usually relies on Markov chain Monte Carlo methods with high computational costs. To address these limitations, we discuss various approaches to model design, and develop three variants of non-parametric point processes for continuous-time edge modelling (CTEM). The resulting models are highly adaptable as they generate intensity functions through sigmoidal Gaussian processes, and so provide greater modelling flexibility than parametric forms. The models are implemented via a fast variational inference method enabled by a novel edge modelling construction. The superior performance of the proposed CTEM models is demonstrated through extensive experimental evaluations on four real-world continuous-time edge data sets.

AAAI Conference 2021 Conference Paper

Robust Lightweight Facial Expression Recognition Network with Label Distribution Training

  • Zengqun Zhao
  • Qingshan Liu
  • Feng Zhou

This paper presents an efficient and robust facial expression recognition (FER) network, named EfficientFace, which has far fewer parameters yet is more accurate and robust for FER in the wild. Firstly, to improve the robustness of the lightweight network, a local-feature extractor and a channel-spatial modulator are designed, in which depthwise convolution is employed. As a result, the network is aware of local and global-salient facial features. Then, considering the fact that most emotions occur as combinations, mixtures, or compounds of the basic emotions, we introduce a simple but efficient label distribution learning (LDL) method as a novel training strategy. Experiments conducted on realistic occlusion and pose variation datasets demonstrate that the proposed EfficientFace is robust under occlusion and pose variation conditions. Moreover, the proposed method achieves state-of-the-art results on the RAF-DB, CAER-S, and AffectNet-7 datasets with accuracies of 88.36%, 85.87%, and 63.70%, respectively, and a comparable result on the AffectNet-8 dataset with an accuracy of 59.89%. The code and training logs are available at https://github.com/zengqunzhao/EfficientFace.

AAAI Conference 2021 Conference Paper

Searching for Alignment in Face Recognition

  • Xiaqing Xu
  • Qiang Meng
  • Yunxiao Qin
  • Jianzhu Guo
  • Chenxu Zhao
  • Feng Zhou
  • Zhen Lei

A standard pipeline of current face recognition frameworks consists of four individual steps: locating a face with a rough bounding box and several fiducial landmarks, aligning the face image using a pre-defined template, extracting representations and comparing. Among them, face detection, landmark detection and representation learning have long been studied and a lot of works have been proposed. As an essential step with a significant impact on recognition performance, the alignment step has attracted little attention. In this paper, we first explore and highlight the effects of different alignment templates on face recognition. Then, for the first time, we try to search for the optimal template automatically. We construct a well-defined searching space by decomposing the template searching into the crop size and vertical shift, and propose an efficient method Face Alignment Policy Search (FAPS). Besides, a well-designed benchmark is proposed to evaluate the searched policy. Experiments on our proposed benchmark validate the effectiveness of our method to improve face recognition performance.

JMLR Journal 2020 Journal Article

Efficient Inference for Nonparametric Hawkes Processes Using Auxiliary Latent Variables

  • Feng Zhou
  • Zhidong Li
  • Xuhui Fan
  • Yang Wang
  • Arcot Sowmya
  • Fang Chen

The expressive ability of classic Hawkes processes is limited due to the parametric assumption on the baseline intensity and triggering kernel. Therefore, it is desirable to perform inference in a data-driven, nonparametric approach. Many recent works have proposed nonparametric Hawkes process models based on Gaussian processes (GP). However, the likelihood is non-conjugate to the prior, resulting in a complicated and time-consuming inference procedure. To address the problem, we present the sigmoid Gaussian Hawkes process model in this paper: the baseline intensity and triggering kernel are both modeled as the sigmoid transformation of random trajectories drawn from a GP. By introducing auxiliary latent random variables (branching structure, Pólya-Gamma random variables and latent marked Poisson processes), the likelihood is converted to two decoupled components with a Gaussian form, which allows for efficient conjugate analytical inference. Using the augmented likelihood, we derive an efficient Gibbs sampling algorithm to sample from the posterior, an efficient expectation-maximization (EM) algorithm to obtain the maximum a posteriori (MAP) estimate, and an efficient mean-field variational inference algorithm to approximate the posterior. To further accelerate the inference, a sparse GP approximation is introduced to reduce complexity. We demonstrate the performance of our three algorithms on both simulated and real data. The experiments show that our proposed inference algorithms can efficiently recover the underlying prompting characteristics.

AAAI Conference 2020 Conference Paper

Learning Meta Model for Zero- and Few-Shot Face Anti-Spoofing

  • Yunxiao Qin
  • Chenxu Zhao
  • Xiangyu Zhu
  • Zezheng Wang
  • Zitong Yu
  • Tianyu Fu
  • Feng Zhou
  • Jingping Shi

Face anti-spoofing is crucial to the security of face recognition systems. Most previous methods formulate face anti-spoofing as a supervised learning problem to detect various predefined presentation attacks, which requires large-scale training data to cover as many attacks as possible. However, the trained model tends to overfit to several common attacks and is still vulnerable to unseen attacks. To overcome this challenge, the detector should: 1) learn discriminative features that can generalize to unseen spoofing types from predefined presentation attacks; 2) quickly adapt to new spoofing types by learning from both the predefined attacks and a few examples of the new spoofing types. Therefore, we define face anti-spoofing as a zero- and few-shot learning problem. In this paper, we propose a novel Adaptive Inner-update Meta Face Anti-Spoofing (AIM-FAS) method to tackle this problem through meta-learning. Specifically, AIM-FAS trains a meta-learner focusing on the task of detecting unseen spoofing types by learning from predefined living and spoofing faces and a few examples of new attacks. To assess the proposed approach, we propose several benchmarks for zero- and few-shot FAS. Experiments show its superior performance on the presented benchmarks compared with existing methods in existing zero-shot FAS protocols.

JBHI Journal 2019 Journal Article

Dense Deconvolutional Network for Skin Lesion Segmentation

  • Hang Li
  • Xinzi He
  • Feng Zhou
  • Zhen Yu
  • Dong Ni
  • Siping Chen
  • Tianfu Wang
  • Baiying Lei

Automatic delineation of skin lesion contours from dermoscopy images is a basic step in the process of diagnosis and treatment of skin lesions. However, it is a challenging task due to the high variation of appearances and sizes of skin lesions. In order to deal with such challenges, we propose a new dense deconvolutional network (DDN) for skin lesion segmentation based on residual learning. Specifically, the proposed network consists of dense deconvolutional layers (DDLs), chained residual pooling (CRP), and hierarchical supervision (HS). First, unlike traditional deconvolutional layers, DDLs are adopted to maintain the dimensions of the input and output images unchanged. The DDNs are trained in an end-to-end manner without the need of prior knowledge or complicated postprocessing procedures. Second, the CRP aims to capture rich contextual background information and to fuse multilevel features. By combining the local and global contextual information via multilevel feature fusion, the high-resolution prediction output is obtained. Third, HS is added to serve as an auxiliary loss and to refine the prediction mask. Extensive experiments based on the public ISBI 2016 and 2017 skin lesion challenge datasets demonstrate the superior segmentation results of our proposed method over the state-of-the-art methods.

JBHI Journal 2019 Journal Article

Neuroimaging Retrieval via Adaptive Ensemble Manifold Learning for Brain Disease Diagnosis

  • Baiying Lei
  • Peng Yang
  • Yinan Zhuo
  • Feng Zhou
  • Dong Ni
  • Siping Chen
  • Xiaohua Xiao
  • Tianfu Wang

Alzheimer's disease (AD) is a neurodegenerative and non-curable disease, with serious cognitive impairment, such as dementia. Clinically, it is critical to study the disease with multi-source data in order to capture a global picture of it. In this respect, an adaptive ensemble manifold learning (AEML) algorithm is proposed to retrieve multi-source neuroimaging data. Specifically, an objective function based on manifold learning is formulated to impose geometrical constraints by similarity learning. The complementary characteristics of various sources of brain disease data for disorder discovery are investigated by tuning weights from ensemble learning. In addition, a generalized norm is explicitly explored for adaptive sparseness degree control. The proposed AEML algorithm is evaluated by the public AD neuroimaging initiative database. Results obtained from the extensive experiments demonstrate that our algorithm outperforms the traditional methods.

JBHI Journal 2019 Journal Article

Parkinson's Disease Diagnosis via Joint Learning From Multiple Modalities and Relations

  • Haijun Lei
  • Zhongwei Huang
  • Feng Zhou
  • Ahmed Elazab
  • Ee-Leng Tan
  • Hancong Li
  • Jing Qin
  • Baiying Lei

Parkinson's disease (PD) is a neurodegenerative progressive disease that mainly affects the motor systems of patients. Early and accurate diagnosis of PD is an effective way to slow the deterioration of the disease, since it enables clinical intervention that alleviates mental and physical suffering. In this paper, we propose a joint regression and classification framework for PD diagnosis via magnetic resonance and diffusion tensor imaging data. Specifically, we devise a unified multitask feature selection model to explore multiple relationships among features, samples, and clinical scores. We regress four clinical variables (depression, sleep, olfaction, and cognition scores) and perform the classification of PD from the multimodal data. The multitask model explores the relationships at the level of clinical scores, image features, and subjects, to select the most informative and disease-related features for diagnosis. The proposed method is evaluated on the public Parkinson's progression markers initiative dataset. The extensive experimental results show that the multitask framework can effectively boost the performance of regression and classification and outperforms other state-of-the-art methods. The computerized predictions of clinical scores and labels for PD diagnosis may also offer a quantitative reference for decision support.

NeurIPS Conference 2018 Conference Paper

Compact Generalized Non-local Network

  • Kaiyu Yue
  • Ming Sun
  • Yuchen Yuan
  • Feng Zhou
  • Errui Ding
  • Fuxin Xu

The non-local module is designed for capturing long-range spatio-temporal dependencies in images and videos. Although having shown excellent performance, it lacks the mechanism to model the interactions between positions across channels, which are of vital importance in recognizing fine-grained objects and actions. To address this limitation, we generalize the non-local module and take the correlations between the positions of any two channels into account. This extension utilizes the compact representation for multiple kernel functions with Taylor expansion that makes the generalized non-local module in a fast and low-complexity computation flow. Moreover, we implement our generalized non-local method within channel groups to ease the optimization. Experimental results illustrate the clear-cut improvements and practical applicability of the generalized non-local module on both fine-grained object recognition and video classification. Code is available at: https://github.com/KaiyuYue/cgnl-network.pytorch.

JBHI Journal 2017 Journal Article

Segmentation, Splitting, and Classification of Overlapping Bacteria in Microscope Images for Automatic Bacterial Vaginosis Diagnosis

  • Youyi Song
  • Liang He
  • Feng Zhou
  • Siping Chen
  • Dong Ni
  • Baiying Lei
  • Tianfu Wang

Quantitative analysis of bacterial morphotypes in microscope images plays a vital role in the diagnosis of bacterial vaginosis (BV) based on the Nugent score criterion. However, there are two main challenges for this task: 1) It is quite difficult to identify the bacterial regions due to varied appearance, faint boundaries, heterogeneous shapes, low contrast with the background, and small bacteria sizes relative to the image. 2) Numerous bacteria overlap each other, which hinders accurate analysis of individual bacteria. To overcome these challenges, we propose an automatic method in this paper to diagnose BV by quantitative analysis of bacterial morphotypes, which consists of a three-step approach, i.e., bacteria region segmentation, overlapping bacteria splitting, and bacterial morphotype classification. Specifically, we first segment the bacteria regions via saliency cut, which simultaneously evaluates the global contrast and spatial weighted coherence. A Markov random field model is then applied for high-quality unsupervised segmentation of small objects. We then decompose overlapping bacteria clumps into markers, and associate each pixel with markers to identify evidence for eventual individual bacterium splitting. Next, we extract morphotype features from each bacterium to learn the descriptors and to characterize the types of bacteria using an Adaptive Boosting machine learning framework. Finally, BV diagnosis is implemented based on the Nugent score criterion. Experiments demonstrate that our proposed method achieves high accuracy and computational efficiency for BV diagnosis.

NeurIPS Conference 2009 Conference Paper

Canonical Time Warping for Alignment of Human Behavior

  • Feng Zhou
  • Fernando Torre

Alignment of time series is an important problem to solve in many scientific disciplines. In particular, temporal alignment of two or more subjects performing similar activities is a challenging problem due to the large temporal scale difference between human actions as well as the inter/intra subject variability. In this paper we present canonical time warping (CTW), an extension of canonical correlation analysis (CCA) for spatio-temporal alignment of the behavior between two subjects. CTW extends previous work on CCA in two ways: (i) it combines CCA with dynamic time warping for temporal alignment; and (ii) it extends CCA to allow local spatial deformations. We show CTW's effectiveness in three experiments: alignment of synthetic data, alignment of motion capture data of two subjects performing similar actions, and alignment of two people with similar facial expressions. Our results demonstrate that CTW provides both visually and qualitatively better alignment than state-of-the-art techniques based on dynamic time warping.