Arrow Research search

Author name cluster

Peng Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers
2 author rows

Possible papers

21

AAAI Conference 2026 Conference Paper

SecMoE: Communication-Efficient Secure MoE Inference via Select-Then-Compute

  • Bowen Shen
  • Yuyue Chen
  • Peng Yang
  • Bin Zhang
  • Xi Zhang
  • Zoe L. Jiang

Privacy-preserving Transformer inference has gained attention due to the potential leakage of private information. Despite recent progress, existing frameworks still fall short of practical model scales, with gaps up to a hundredfold. A possible way to close this gap is the Mixture of Experts (MoE) architecture, which has emerged as a promising technique for scaling up model capacity with minimal overhead. However, because current secure two-party (2-PC) protocols allow the server to homomorphically compute the FFN layer with its plaintext model weights, under the MoE setting this could reveal to the server which expert is activated, exposing token-level privacy about the client's input. While naively evaluating all the experts before selection could protect privacy, it nullifies MoE sparsity and incurs the heavy computational overhead that sparse MoE seeks to avoid. To address the privacy and efficiency limitations above, we propose a 2-PC privacy-preserving inference framework, SecMoE. Unifying per-entry circuits in both the MoE layer and piecewise polynomial functions, SecMoE obliviously selects the extracted parameters from circuits and computes only one encrypted entry, an approach we refer to as Select-Then-Compute. This enables private inference on models 63× larger with only a 15.2× increase in end-to-end runtime. Extensive experiments show that, under five-expert settings, SecMoE lowers end-to-end private inference communication by 1.8–7.1× and achieves a 1.3–3.8× speedup compared to the state-of-the-art (SOTA) protocols.

AAAI Conference 2025 Conference Paper

Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models

  • Bingdong Li
  • Zixiang Di
  • Yongfan Lu
  • Hong Qian
  • Feng Wang
  • Peng Yang
  • Ke Tang
  • Aimin Zhou

Multi-objective Bayesian optimization (MOBO) has shown promising performance on various expensive multi-objective optimization problems (EMOPs). However, effectively modeling complex distributions of the Pareto optimal solutions is difficult with limited function evaluations. Existing Pareto set learning algorithms may exhibit considerable instability in such expensive scenarios, leading to significant deviations between the obtained solution set and the Pareto set (PS). In this paper, we propose a novel Composite Diffusion Model based Pareto Set Learning algorithm (CDM-PSL) for expensive MOBO. CDM-PSL includes both unconditional and conditional diffusion models for generating high-quality samples efficiently. In addition, we introduce a weighting method based on information entropy to balance different objectives. This method is integrated with a guiding strategy to appropriately balance the objectives during the optimization process. Experimental results on both synthetic and real-world problems demonstrate that CDM-PSL attains superior performance compared with state-of-the-art MOBO algorithms.

AAAI Conference 2023 Conference Paper

Defending Backdoor Attacks on Vision Transformer via Patch Processing

  • Khoa D. Doan
  • Yingjie Lao
  • Peng Yang
  • Ping Li

Vision Transformers (ViTs) have a radically different architecture with significantly less inductive bias than Convolutional Neural Networks. Along with the improvement in performance, the security and robustness of ViTs are also of great importance to study. In contrast to many recent works that exploit the robustness of ViTs against adversarial examples, this paper investigates a representative causative attack, i.e., backdoor. We first examine the vulnerability of ViTs against various backdoor attacks and find that ViTs are also quite vulnerable to existing attacks. However, we observe that the clean-data accuracy and backdoor attack success rate of ViTs respond distinctively to patch transformations before the positional encoding. Based on this finding, we propose an effective method for ViTs to defend against both patch-based and blending-based trigger backdoor attacks via patch processing. The performance is evaluated on several benchmark datasets, including CIFAR10, GTSRB, and TinyImageNet, showing that the proposed defense is very successful in mitigating backdoor attacks for ViTs. To the best of our knowledge, this paper presents the first defensive strategy that utilizes a unique characteristic of ViTs against backdoor attacks.
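A toy sketch of the patch-transformation idea described above (the `shuffle_patches` helper and its parameters are hypothetical, not the paper's actual defense): shuffling non-overlapping patches preserves the pixel content of an image but destroys patch positions before the positional encoding, which affects clean accuracy and attack success rate differently.

```python
import numpy as np

def shuffle_patches(image, patch_size, rng=None):
    """Split an image (H, W, C) into non-overlapping patches and
    return a copy with the patch order randomly permuted."""
    if rng is None:
        rng = np.random.default_rng()
    h, w, c = image.shape
    ph, pw = patch_size
    assert h % ph == 0 and w % pw == 0, "image must tile evenly"
    # Reshape into a (H//ph, W//pw, ph, pw, C) grid of patches.
    grid = image.reshape(h // ph, ph, w // pw, pw, c).transpose(0, 2, 1, 3, 4)
    flat = grid.reshape(-1, ph, pw, c)
    flat = flat[rng.permutation(len(flat))]  # shuffle patch order
    grid = flat.reshape(h // ph, w // pw, ph, pw, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(h, w, c)

img = np.arange(4 * 4 * 1).reshape(4, 4, 1)
out = shuffle_patches(img, (2, 2), np.random.default_rng(0))
# The pixel multiset is preserved; only patch positions change.
assert sorted(out.ravel()) == sorted(img.ravel())
```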

AAAI Conference 2022 Conference Paper

DeepAuth: A DNN Authentication Framework by Model-Unique and Fragile Signature Embedding

  • Yingjie Lao
  • Weijie Zhao
  • Peng Yang
  • Ping Li

Along with the evolution of deep neural networks (DNNs) in many real-world applications, the complexity of model building has also dramatically increased. It is thus vital to protect the intellectual property (IP) of the model builder and ensure the trustworthiness of the deployed models. Meanwhile, adversarial attacks on DNNs (e.g., backdoor and poisoning attacks) that seek to inject malicious behaviors have been investigated recently, demanding a means for verifying the integrity of the deployed model to protect the users. In this paper, we present a novel DNN authentication framework, DeepAuth, which embeds a unique and fragile signature into each protected DNN model. Our approach exploits sensitive key samples that are well crafted from the input space to latent space and then to logit space for producing signatures. After embedding, each model will respond distinctively to these key samples, which creates a model-unique signature as a strong tool for authentication and user identification. The signature embedding process is also designed to ensure the fragility of the signature, which can be used to detect malicious modifications such that an illegitimate user or an altered model should not have the intact signature. Extensive evaluations on various models over a wide range of datasets demonstrate the effectiveness and efficiency of the proposed DeepAuth.

NeurIPS Conference 2021 Conference Paper

Mitigating Forgetting in Online Continual Learning with Neuron Calibration

  • Haiyan Yin
  • Peng Yang
  • Ping Li

Inspired by human intelligence, the research on online continual learning aims to push the limits of machine learning models to constantly learn from sequentially encountered tasks, with the data from each task being observed in an online fashion. Though recent studies have achieved remarkable progress in improving online continual learning performance empowered by deep neural networks-based models, many of today's approaches still suffer considerably from catastrophic forgetting, a persistent challenge for continual learning. In this paper, we present a novel method which attempts to mitigate catastrophic forgetting in online continual learning from a new perspective, i.e., neuron calibration. In particular, we model the neurons in the deep neural networks-based models as calibrated units under a general formulation. Then we formalize a learning framework to effectively train the calibrated model, where neuron calibration could give ubiquitous benefit to balance the stability and plasticity of online continual learning algorithms through influencing both their forward inference path and backward optimization path. Our proposed formulation for neuron calibration is lightweight and applicable to general feed-forward neural networks-based models. We perform extensive experiments to evaluate our method on four benchmark continual learning datasets. The results show that neuron calibration plays a vital role in improving online continual learning performance and our method could substantially improve the state-of-the-art performance on all the evaluated datasets.

AAAI Conference 2020 Conference Paper

Distributed Primal-Dual Optimization for Online Multi-Task Learning

  • Peng Yang
  • Ping Li

Conventional online multi-task learning algorithms suffer from two critical limitations: 1) heavy communication caused by delivering high-velocity sequential data to a central machine; 2) expensive runtime complexity for building task relatedness. To address these issues, in this paper we consider a setting where multiple tasks are geographically located in different places, where one task can synchronize data with others to leverage knowledge of related tasks. Specifically, we propose an adaptive primal-dual algorithm, which not only captures task-specific noise in adversarial learning but also carries out a projection-free update with runtime efficiency. Moreover, our model is well-suited to decentralized periodic-connected tasks as it allows the energy-starved or bandwidth-constrained tasks to postpone the update. Theoretical results demonstrate the convergence guarantee of our distributed algorithm with an optimal regret. Empirical results confirm that the proposed model is highly effective on various real-world datasets.

AAAI Conference 2019 Conference Paper

Approximate Kernel Selection with Strong Approximate Consistency

  • Lizhong Ding
  • Yong Liu
  • Shizhong Liao
  • Yu Li
  • Peng Yang
  • Yijie Pan
  • Chao Huang
  • Ling Shao

Kernel selection is fundamental to the generalization performance of kernel-based learning algorithms. Approximate kernel selection is an efficient kernel selection approach that exploits the convergence property of the kernel selection criteria and the computational virtue of kernel matrix approximation. The convergence property is measured by the notion of approximate consistency. For the existing Nyström approximations, whose sampling distributions are independent of the specific learning task at hand, it is difficult to establish strong approximate consistency: they mainly focus on the quality of the low-rank matrix approximation rather than on the performance of the kernel selection criterion used in conjunction with the approximate matrix. In this paper, we propose a novel Nyström approximate kernel selection algorithm by customizing a criterion-driven adaptive sampling distribution for the Nyström approximation, which adaptively reduces the error between the approximate and accurate criteria. We theoretically derive the strong approximate consistency of the proposed Nyström approximate kernel selection algorithm. Finally, we empirically evaluate the approximate consistency of our algorithm as compared to state-of-the-art methods.
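As background for the Nyström approximation this abstract builds on, here is a minimal numpy sketch (the `rbf_kernel` and `nystrom` helpers are illustrative, and uniform landmark sampling stands in for the paper's criterion-driven adaptive distribution): a kernel matrix K is approximated from m sampled landmark columns as C W⁺ Cᵀ.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(X, landmarks_idx, gamma=1.0):
    """Rank-m Nystrom approximation K_hat = C W^+ C^T built from
    m sampled landmark columns of the full kernel matrix."""
    C = rbf_kernel(X, X[landmarks_idx], gamma)  # n x m sampled columns
    W = C[landmarks_idx]                        # m x m landmark block
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
K = rbf_kernel(X, X)
K_hat = nystrom(X, rng.choice(50, size=20, replace=False))
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)  # relative error
```

The paper's contribution is choosing `landmarks_idx` adaptively so that the *criterion* computed from `K_hat`, not just `K_hat` itself, stays close to the exact one.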

AAAI Conference 2019 Conference Paper

Confidence Weighted Multitask Learning

  • Peng Yang
  • Peilin Zhao
  • Jiayu Zhou
  • Xin Gao

Traditional online multitask learning only utilizes the first-order information of the data stream. To remedy this issue, we propose a confidence weighted multitask learning algorithm, which maintains a Gaussian distribution over each task model to guide the online learning process. The mean (covariance) of the Gaussian distribution is a sum of a local component and a global component that is shared among all the tasks. In addition, this paper also addresses the challenge of active learning in the online multitask setting. Instead of requiring labels of all the instances, the proposed algorithm determines whether the learner should acquire a label by considering the confidence from its related tasks over label prediction. Theoretical results show that the regret bounds can be significantly reduced. Empirical results demonstrate that the proposed algorithm is able to achieve promising learning efficacy, while simultaneously minimizing the labeling cost.

AAAI Conference 2019 Conference Paper

Linear Kernel Tests via Empirical Likelihood for High-Dimensional Data

  • Lizhong Ding
  • Zhi Liu
  • Yu Li
  • Shizhong Liao
  • Yong Liu
  • Peng Yang
  • Ge Yu
  • Ling Shao

We propose a framework for analyzing and comparing distributions without imposing any parametric assumptions via empirical likelihood methods. Our framework is used to study two fundamental statistical test problems: the two-sample test and the goodness-of-fit test. For the two-sample test, we need to determine whether two groups of samples are from different distributions; for the goodness-of-fit test, we examine how likely it is that a set of samples is generated from a known target distribution. Specifically, we propose empirical likelihood ratio (ELR) statistics for the two-sample test and the goodness-of-fit test, both of which are of linear time complexity and show higher power (i.e., the probability of correctly rejecting the null hypothesis) than the existing linear statistics for high-dimensional data. We prove the nonparametric Wilks’ theorems for the ELR statistics, which illustrate that the limiting distributions of the proposed ELR statistics are chi-square distributions. With these limiting distributions, we can avoid bootstraps or simulations to determine the threshold for rejecting the null hypothesis, which makes the ELR statistics more efficient than the recently proposed linear statistic, finite set Stein discrepancy (FSSD). We also prove the consistency of the ELR statistics, which guarantees that the test power goes to 1 as the number of samples goes to infinity. In addition, we experimentally demonstrate and theoretically analyze that FSSD has poor performance or even fails to test for high-dimensional data. Finally, we conduct a series of experiments to evaluate the performance of our ELR statistics as compared to state-of-the-art linear statistics.
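To illustrate why a known chi-square limiting distribution removes the need for bootstraps or simulations, a minimal stdlib sketch (the df=1 setting is an assumption for illustration, not necessarily the paper's degrees of freedom): the rejection threshold is just a quantile of the limiting distribution.

```python
from statistics import NormalDist

# A chi-square(1) variable is the square of a standard normal, so the
# alpha-level rejection threshold for a df=1 chi-square limiting
# distribution is a squared normal quantile -- no bootstrap needed.
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal quantile
threshold = z ** 2                        # chi-square(1) 95% quantile, ~3.841

def reject(elr_stat):
    """Reject the null hypothesis when the observed statistic
    exceeds the asymptotic threshold."""
    return elr_stat > threshold
```

A bootstrap would instead resample the data hundreds of times to estimate this same cutoff, which is exactly the cost the Wilks-type result avoids.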

JBHI Journal 2019 Journal Article

Neuroimaging Retrieval via Adaptive Ensemble Manifold Learning for Brain Disease Diagnosis

  • Baiying Lei
  • Peng Yang
  • Yinan Zhuo
  • Feng Zhou
  • Dong Ni
  • Siping Chen
  • Xiaohua Xiao
  • Tianfu Wang

Alzheimer's disease (AD) is an incurable neurodegenerative disease with serious cognitive impairment, such as dementia. Clinically, it is critical to study the disease with multi-source data in order to capture a global picture of it. In this respect, an adaptive ensemble manifold learning (AEML) algorithm is proposed to retrieve multi-source neuroimaging data. Specifically, an objective function based on manifold learning is formulated to impose geometrical constraints by similarity learning. The complementary characteristics of various sources of brain disease data for disorder discovery are investigated by tuning weights from ensemble learning. In addition, a generalized norm is explicitly explored for adaptive sparseness degree control. The proposed AEML algorithm is evaluated on the public AD neuroimaging initiative database. Results obtained from the extensive experiments demonstrate that our algorithm outperforms the traditional methods.

NeurIPS Conference 2019 Conference Paper

Optimal Stochastic and Online Learning with Individual Iterates

  • Yunwen Lei
  • Peng Yang
  • Ke Tang
  • Ding-Xuan Zhou

Stochastic composite mirror descent (SCMD) is a simple and efficient method able to capture both geometric and composite structures of optimization problems in machine learning. Existing strategies require taking either an average or a random selection of iterates to achieve optimal convergence rates, which, however, can either destroy the sparsity of solutions or slow down the practical training speed. In this paper, we propose a theoretically sound strategy to select an individual iterate of the vanilla SCMD, which is able to achieve optimal rates for both convex and strongly convex problems in a non-smooth learning setting. This strategy of outputting an individual iterate can preserve the sparsity of solutions, which is crucial for a proper interpretation in sparse learning problems. We report experimental comparisons with several baseline methods to show the effectiveness of our method in achieving a fast training speed as well as in outputting sparse solutions.
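A minimal sketch of why an individual iterate can stay sparse while averaged iterates cannot, using the l1 proximal (soft-thresholding) step common to composite mirror descent methods (the example weights below are made up for illustration):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 -- the composite step that
    zeroes out small coordinates and keeps iterates exactly sparse."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

w = np.array([0.8, -0.05, 0.3, 0.02])
w_new = soft_threshold(w, 0.1)
# Small coordinates are set exactly to zero; averaging many such
# iterates would generally smear these exact zeros away.
```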

IJCAI Conference 2018 Conference Paper

Bandit Online Learning on Graphs via Adaptive Optimization

  • Peng Yang
  • Peilin Zhao
  • Xin Gao

Traditional online learning on graphs adapts the graph Laplacian into ridge regression, which may not guarantee reasonable accuracy when the data are adversarially generated. To solve this issue, we exploit an adaptive optimization framework for online classification on graphs. The derived model can achieve a min-max regret under an adversarial mechanism of data generation. To take advantage of the informative labels, we propose an adaptive large-margin update rule, which enjoys a lower regret than the algorithms using error-driven update rules. However, this algorithm assumes that the full-information label is provided for each node, which is violated in many practical applications where labeling is expensive and the oracle may only tell whether the prediction is correct or not. To address this issue, we propose a bandit online algorithm on graphs. It derives a per-instance confidence region of the prediction, from which the model can be learned adaptively to minimize the online regret. Experiments on benchmark graph datasets show that the proposed bandit algorithm outperforms state-of-the-art competitors, sometimes even beating algorithms that use full-information label feedback.

AAAI Conference 2018 Conference Paper

Randomized Kernel Selection With Spectra of Multilevel Circulant Matrices

  • Lizhong Ding
  • Shizhong Liao
  • Yong Liu
  • Peng Yang
  • Xin Gao

Kernel selection aims at choosing an appropriate kernel function for kernel-based learning algorithms to avoid either underfitting or overfitting of the resulting hypothesis. One of the main problems faced by kernel selection is the evaluation of the goodness of a kernel, which is typically difficult and computationally expensive. In this paper, we propose a randomized kernel selection approach to evaluate and select the kernel with the spectra of the specifically designed multilevel circulant matrices (MCMs), which is statistically sound and computationally efficient. Instead of constructing the kernel matrix, we construct the randomized MCM to encode the kernel function and all data points together with labels. We build a one-to-one correspondence between all candidate kernel functions and the spectra of the randomized MCMs by Fourier transform. We prove the statistical properties of the randomized MCMs and the randomized kernel selection criteria, which theoretically qualify the utility of the randomized criteria in kernel selection. With the spectra of the randomized MCMs, we derive a series of randomized criteria to conduct kernel selection, which can be computed in log-linear time and linear space complexity by fast Fourier transform (FFT). Experimental results demonstrate that our randomized kernel selection criteria are significantly more efficient than the existing classic and widely-used criteria while preserving similar predictive performance.
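The FFT-spectrum connection the abstract above relies on can be checked in a few lines: the eigenvalues of any circulant matrix are the DFT of its first column, so the entire spectrum costs O(n log n) instead of O(n³). This is a generic circulant example, not the paper's randomized MCM construction.

```python
import numpy as np

def circulant(col):
    """Build a circulant matrix whose first column is `col`."""
    n = len(col)
    return np.array([[col[(i - j) % n] for j in range(n)] for i in range(n)])

c = np.array([4.0, 1.0, 0.0, 1.0])
C = circulant(c)
# Eigenvalues of a circulant matrix are the DFT of its first column.
spec_fft = np.fft.fft(c)
spec_eig = np.linalg.eigvals(C)
assert np.allclose(np.sort(spec_fft.real), np.sort(spec_eig.real))
```

Multilevel circulant matrices generalize this identity to multi-dimensional FFTs, which is what makes the paper's randomized criteria computable in log-linear time.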

AAAI Conference 2017 Conference Paper

Incorporating Expert Knowledge into Keyphrase Extraction

  • Sujatha Das Gollapalli
  • Xiao-li Li
  • Peng Yang

Keyphrases that efficiently summarize a document’s content are used in various document processing and retrieval tasks. Current state-of-the-art techniques for keyphrase extraction operate at a phrase-level and involve scoring candidate phrases based on features of their component words. In this paper, we learn keyphrase taggers for research papers using token-based features incorporating linguistic, surface-form, and document-structure information through sequence labeling. We experimentally illustrate that using within-document features alone, our tagger trained with Conditional Random Fields performs on-par with existing state-of-the-art systems that rely on information from Wikipedia and citation networks. In addition, we are also able to harness recent work on feature labeling to seamlessly incorporate expert knowledge and predictions from existing systems to enhance the extraction performance further. We highlight the modeling advantages of our keyphrase taggers and show significant performance improvements on two recently-compiled datasets of keyphrases from Computer Science research papers.

ICRA Conference 2007 Conference Paper

Distributed Cooperative Active Sensing Using Consensus Filters

  • Peng Yang
  • Randy A. Freeman
  • Kevin M. Lynch

We consider the problem of multiple mobile sensor agents tracking the position of one or more moving targets. In our formulation, each agent maintains a target estimate, and each agent moves so as to maximize the expected information from its sensor, relative to the current uncertainty in the estimate. The novelty of our approach is that each agent need only communicate with one-hop neighbors in a communication network, resulting in a fully distributed and scalable algorithm, yet the performance of the system approximates that of a centralized optimal solution to the same problem. We provide two fully distributed algorithms based on one-time measurements and a Kalman filter approach, and we validate the algorithms with simulations.

ICRA Conference 2006 Conference Paper

Optimal Information Propagation in Sensor Networks

  • Peng Yang
  • Randy A. Freeman
  • Kevin M. Lynch

This paper studies the effect of communication link weighting schemes and long-range connections in speeding convergence to consensus on the average of the inputs to agents in a sensor network. Linear programming and LMI solutions are provided for the problem of centralized optimization of communication weights considering the possibility of intermittent bounded communication delays. Heuristic weighting schemes that can be implemented in a distributed fashion are also studied. Consensus convergence time may also be reduced by introducing long-range interactions, creating a "small world" network.
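A minimal sketch of the average-consensus setting studied above: nodes on a ring repeatedly average with their neighbors under a doubly stochastic weight matrix W, and convergence speed is governed by W's second-largest eigenvalue modulus. The ring topology and the specific weights here are illustrative, not the paper's optimized schemes.

```python
import numpy as np

# Average consensus on a 4-node ring: each node repeatedly replaces its
# value with a weighted average of its own and its two neighbors' values.
n = 4
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5                 # self-weight
    W[i, (i - 1) % n] = 0.25      # left neighbor
    W[i, (i + 1) % n] = 0.25      # right neighbor

x = np.array([1.0, 5.0, 3.0, 7.0])
avg = x.mean()
for _ in range(100):
    x = W @ x                     # one round of local averaging
# All nodes converge to the network-wide average.
assert np.allclose(x, avg)
```

Optimizing the entries of W (the paper's LP/LMI formulations) or adding long-range links shrinks the second-largest eigenvalue modulus, which directly reduces the number of rounds needed.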