Arrow Research search

Author name cluster

Peng Cheng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
1 author row

Possible papers

14

AAAI Conference 2026 Conference Paper

Beyond Content: A Comprehensive Speech Toxicity Dataset and Detection Framework Incorporating Paralinguistic Cues

  • Zhongjie Ba
  • Liang Yi
  • Peng Cheng
  • Qingcao Li
  • Qinglong Wang
  • Li Lu

Toxic speech detection has become a crucial challenge in maintaining safe online communication environments. However, existing approaches to toxic speech detection often neglect the contribution of paralinguistic cues, such as emotion, intonation, and speech rate, which are key to detecting speech toxicity. Moreover, current toxic speech datasets are predominantly text-based, limiting the development of models that can capture paralinguistic cues. To address these challenges, we present ToxiAlert-Bench, a large-scale audio dataset comprising over 30,000 audio clips annotated with seven major toxic categories and twenty fine-grained toxic labels. Uniquely, our dataset annotates toxicity sources, distinguishing between textual content and paralinguistic origins, for comprehensive toxic speech analysis. Furthermore, we propose a dual-head neural network with a multi-stage training strategy tailored for toxic speech detection. This architecture features two task-specific classification heads: one for identifying the source of toxicity (textual or paralinguistic), and the other for categorizing the specific toxic type. The training process involves independent head training followed by joint fine-tuning to reduce task interference. To mitigate class imbalance in the data, we incorporate class-balanced sampling and weighted loss functions. Our experimental results show that leveraging paralinguistic features significantly improves detection performance. Our method consistently outperforms existing baselines across multiple evaluation metrics, with a 21.1% relative improvement in Macro-F1 score and a 13.0% relative gain in accuracy over the strongest baseline, highlighting its enhanced effectiveness and practical applicability.
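
The class-balanced sampling and weighted loss mentioned in the abstract follow a standard recipe for skewed label sets. A minimal sketch of inverse-frequency class weighting for a cross-entropy loss (the function names and the exact weighting formula are illustrative, not taken from the paper):

```python
import numpy as np

def class_weights(labels, num_classes):
    """Inverse-frequency class weights: total / (num_classes * count).
    Rarer classes receive larger weights."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts.sum() / (num_classes * np.maximum(counts, 1))

def weighted_cross_entropy(probs, labels, weights):
    """Per-sample negative log-likelihood scaled by the weight
    of each sample's true class, then averaged."""
    per_sample = -np.log(probs[np.arange(len(labels)), labels])
    return float(np.mean(weights[labels] * per_sample))

# Skewed toy batch: class 0 dominates, classes 1 and 2 are rare.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 2])
w = class_weights(labels, num_classes=3)
# The rare classes end up with larger weights than the majority class.
```

The same weight vector can also drive class-balanced sampling, by drawing each training example with probability proportional to its class weight.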

AAAI Conference 2025 Conference Paper

Enhancing Large Language Model Performance with Gradient-Based Parameter Selection

  • Haoling Li
  • Xin Zhang
  • Xiao Liu
  • Yeyun Gong
  • Yifan Wang
  • Qi Chen
  • Peng Cheng

Large language models (LLMs) have revolutionized numerous fields of research, driving significant advancements in natural language processing, machine translation, and beyond. Although the large number of parameters contributes substantially to this success, existing studies indicate that not all model parameters hold equal importance, which leads to redundancy during the parameter update process. Recent works on reducing redundant parameter updates for LLMs either lack task-specific data information, potentially leading to suboptimal model performance, or discard transformer components or insignificant parameters, limiting the model's scalability across different tasks and potentially compromising the LLM structure. To address these issues and further enhance the performance of LLMs, we propose Gradient-Mask Tuning (GMT), a method that selectively updates parameters based on gradient information specific to the target tasks. Specifically, after calculating gradients during backpropagation, we measure their absolute values and mask those with small absolute values. Our empirical results across training paradigms such as SFT and DPO, and across various task domains, demonstrate that GMT not only preserves the original network structure but also enhances the potential performance of LLMs. Further analysis indicates that GMT is insensitive to the mask ratio and has computational efficiency comparable to vanilla training.
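
The gradient-masking step described above can be illustrated in a few lines. The NumPy sketch below applies a plain SGD update only to entries whose gradient magnitude exceeds a mask-ratio quantile; the function and parameter names are illustrative, not the paper's implementation:

```python
import numpy as np

def gmt_step(params, grads, lr=0.01, mask_ratio=0.5):
    """One SGD step that keeps only the gradient entries whose
    absolute value is at or above the mask-ratio quantile,
    zeroing out the rest (GMT-style selective update)."""
    # Threshold chosen so that roughly `mask_ratio` of entries are masked.
    threshold = np.quantile(np.abs(grads).ravel(), mask_ratio)
    mask = np.abs(grads) >= threshold
    return params - lr * grads * mask

params = np.array([1.0, -2.0, 0.5, 3.0])
grads = np.array([0.01, -0.5, 0.02, 0.8])
updated = gmt_step(params, grads, lr=0.1, mask_ratio=0.5)
# Only the two largest-magnitude gradients (-0.5 and 0.8) are applied;
# the parameters with tiny gradients are left untouched.
```

Because the masking is purely element-wise on the gradients, the network architecture itself is unchanged, which matches the abstract's claim that GMT preserves the original structure.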

AAMAS Conference 2025 Conference Paper

Enhancing Offline Safe Reinforcement Learning with Trajectory-Constrained Diffusion Planning

  • Hengrui Zhang
  • Youfang Lin
  • Shuo Shen
  • Hanfeng Lin
  • Peng Cheng
  • Sheng Han
  • Kai Lv

Recent approaches have utilized the RL via Supervised Learning (RvS) framework to model offline safe RL. However, these methods overlook the fundamental differences between reward maximization and constraint satisfaction, treating them identically with guidance sampling, and require different hyperparameters for different constraint conditions. To address these limitations, we propose a novel framework, the Trajectory-Constrained Diffusion Planner (TCDP), which reframes offline safe RL as a product of trajectory conditional probabilities and energy functions. Additionally, we introduce Cost-returns-To-Go relabeling with Data Augmentation (CTGDA) and the Quantile Normalization (QN) technique, enabling adaptation to various constraints without retraining or extensive hyperparameter adjustments.

AAAI Conference 2025 Conference Paper

Fed-DFA: Federated Distillation for Heterogeneous Model Fusion Through the Adversarial Lens

  • Zichen Wang
  • Feng Yan
  • Tianyi Wang
  • Cong Wang
  • Yuanchao Shu
  • Peng Cheng
  • Jiming Chen

Most federated learning techniques are limited to homogeneous model fusion. With the rapid growth of smart applications on resource-constrained edge devices, this becomes a barrier to accommodating heterogeneous computing power and memory in the real world. Federated Distillation is a promising alternative that enables aggregation across heterogeneous models. However, the effectiveness of knowledge transfer remains elusive under the shadow of the distinct representation power of heterogeneous models. In this paper, we take an adversarial perspective to characterize the decision boundaries during distillation. By leveraging K-step PGD attacks, we successfully model the dynamics of the closest boundary points and establish a quantitative connection between predictive uncertainty and boundary margin. Based on these findings, we further propose a new loss function that makes the distillation attend to samples close to the decision boundaries, thus learning from more informed logit distributions. Extensive experiments on CIFAR-10/100 and Tiny-ImageNet demonstrate accuracy improvements of about 0.5-3.5% under different IID and non-IID settings, with only a small increase in computational overhead.
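
A boundary-attentive distillation loss of the kind described above can be caricatured as a per-sample re-weighting of the usual distillation KL term. The sketch below uses the teacher's predictive entropy as a stand-in for boundary proximity; the paper instead derives its weighting from PGD-based boundary margins, so entropy here is purely an illustrative proxy:

```python
import numpy as np

def boundary_weighted_distillation(student_logp, teacher_p):
    """Per-sample KL(teacher || student), re-weighted so that samples
    with higher teacher uncertainty (taken here as a proxy for being
    close to the decision boundary) contribute more to the loss."""
    kl = np.sum(teacher_p * (np.log(teacher_p) - student_logp), axis=1)
    entropy = -np.sum(teacher_p * np.log(teacher_p), axis=1)
    weights = entropy / entropy.sum()  # normalize to a distribution
    return float(np.sum(weights * kl))

# Toy batch: the first sample is confident, the second sits near the boundary.
teacher_p = np.array([[0.9, 0.1], [0.5, 0.5]])
student_logp = np.log(np.array([[0.8, 0.2], [0.6, 0.4]]))
loss = boundary_weighted_distillation(student_logp, teacher_p)
```

The second (maximally uncertain) sample receives the larger weight, so mismatches near the boundary dominate the distillation signal.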

IJCAI Conference 2025 Conference Paper

FedSaaS: Class-Consistency Federated Semantic Segmentation via Global Prototype Supervision and Local Adversarial Harmonization

  • Xiaoyang Yu
  • Xiaoming Wu
  • Xin Wang
  • Dongrun Li
  • Ming Yang
  • Peng Cheng

Federated semantic segmentation enables pixel-level classification in images through collaborative learning while maintaining data privacy. However, existing research commonly overlooks the fine-grained class relationships within the semantic space when addressing heterogeneous problems, particularly domain shift. This oversight results in ambiguities between class representations. To overcome this challenge, we propose a novel federated segmentation framework that enforces class consistency, termed FedSaaS. Specifically, we introduce class exemplars as a criterion for both local- and global-level class representations. On the server side, the uploaded class exemplars are leveraged to model class prototypes, which supervise the global branch of clients, ensuring alignment with the global-level representation. On the client side, we incorporate an adversarial mechanism to harmonize the contributions of the global and local branches, leading to consistent outputs. Moreover, multilevel contrastive losses are employed on both sides to enforce consistency between the two levels of representation in the same semantic space. Extensive experiments on five driving scene segmentation datasets demonstrate that our framework outperforms state-of-the-art methods, significantly improving average segmentation accuracy and effectively addressing the class-consistency representation problem.

NeurIPS Conference 2025 Conference Paper

WMCopier: Forging Invisible Watermarks on Arbitrary Images

  • Ziping Dong
  • Chao Shuai
  • Zhongjie Ba
  • Peng Cheng
  • Zhan Qin
  • Qinglong Wang
  • Kui Ren

Invisible Image Watermarking is crucial for ensuring content provenance and accountability in generative AI. While Gen-AI providers are increasingly integrating invisible watermarking systems, the robustness of these schemes against forgery attacks remains poorly characterized. This is critical, as forging traceable watermarks onto illicit content leads to false attribution, potentially harming the reputation and legal standing of Gen-AI service providers who are not responsible for the content. In this work, we propose WMCopier, an effective watermark forgery attack that operates without requiring any prior knowledge of or access to the target watermarking algorithm. Our approach first models the target watermark distribution using an unconditional diffusion model, and then seamlessly embeds the target watermark into a non-watermarked image via a shallow inversion process. We also incorporate an iterative optimization procedure that refines the reconstructed image to further trade off the fidelity and forgery efficiency. Experimental results demonstrate that WMCopier effectively deceives both open-source and closed-source watermark systems (e.g., Amazon's system), achieving a significantly higher success rate than existing methods. Additionally, we evaluate the robustness of forged samples and discuss the potential defense against our attack. Code is available at: https://github.com/holdrain/WMCopier.

EWRL Workshop 2024 Workshop Paper

Private Online Learning in Adversarial MDPs: Full-Information and Bandit

  • Shaojie Bai
  • Lanting Zeng
  • Chengcheng Zhao
  • Xiaoming Duan
  • Mohammad Sadegh Talebi
  • Peng Cheng
  • Jiming Chen

We study learning in adversarial Markov decision processes (MDPs) in the episodic setting under the constraint of differential privacy (DP). This is motivated by the widespread applications of reinforcement learning (RL) in non-stationary and even adversarial scenarios, where protecting users' sensitive information is vital. We first propose two efficient frameworks for adversarial MDPs, spanning the full-information and bandit settings. Within each framework, we consider both Joint DP (JDP), where a central agent is trusted to protect the sensitive data, and Local DP (LDP), where the information is protected directly on the user side. Then, we design novel privacy mechanisms to privatize the stochastic transitions and adversarial losses. By instantiating such privacy mechanisms to satisfy JDP and LDP requirements, we obtain near-optimal regret guarantees for both frameworks. To our knowledge, these are the first algorithms to tackle the challenge of private learning in adversarial MDPs.

AAMAS Conference 2024 Conference Paper

Stability of Weighted Majority Voting under Estimated Weights

  • Shaojie Bai
  • Dongxia Wang
  • Tim Muller
  • Peng Cheng
  • Jiming Chen

Weighted Majority Voting (WMV) is a well-known decision making rule. The weights of sources are determined by the probabilities that sources provide accurate information (trustworthiness). However, in reality, the trustworthiness is usually not a known quantity to the decision maker, who has to rely on an estimate called trust. An algorithm that computes trust is called unbiased when it has the property that it does not systematically overestimate or underestimate the trustworthiness. To formally analyze the uncertainty brought to the decision process by such unbiased trust values, we introduce and analyze two important properties of WMV: Stability of Correctness and Stability of Optimality. Stability of Correctness measures the difference between the decision accuracy that the decision maker believes they can achieve and the accuracy they actually achieve. We prove that Stability of Correctness holds exactly for WMV: the difference is 0. Stability of Optimality measures the difference between the actual accuracy of decisions made using trust values and those made using trustworthiness values. We find a relatively tight upper bound on the Stability of Optimality, meaning that, although using (unbiased) trust values is suboptimal compared to using the true trustworthiness values, the difference is small. Meanwhile, a counter-intuitive observation is that while distributions of trustworthiness influence the Stability of Optimality, the number of sources barely influences it. We also provide an overview of how sensitive decision accuracy is to changes in trust and trustworthiness.
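
For a binary decision, weighted majority voting with (estimated) trust values can be sketched as follows, using the classical log-odds weights for independent sources; this is the textbook weighting scheme, not necessarily the exact formulation analyzed in the paper:

```python
import math

def weighted_majority_vote(votes, trust):
    """Binary WMV: each source votes +1 or -1 and is weighted by the
    log-odds log(t / (1 - t)) of its (estimated) trust, the classical
    optimal weight for independent sources. Returns +1 or -1."""
    score = sum(v * math.log(t / (1 - t)) for v, t in zip(votes, trust))
    return 1 if score >= 0 else -1

# Two moderately trusted sources are outvoted by one highly trusted source.
votes = [1, 1, -1]
trust = [0.6, 0.6, 0.95]
decision = weighted_majority_vote(votes, trust)
```

Substituting estimated trust for true trustworthiness in the log-odds weights is precisely the setting the paper studies: the decision rule stays the same, but the weights are computed from estimates.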

NeurIPS Conference 2023 Conference Paper

Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL

  • Peng Cheng
  • Xianyuan Zhan
  • Zhihao Wu
  • Wenjia Zhang
  • Youfang Lin
  • Shoucheng Song
  • Han Wang
  • Li Jiang

Offline reinforcement learning (RL) offers an appealing approach to real-world tasks by learning policies from pre-collected datasets without interacting with the environment. However, the performance of existing offline RL algorithms heavily depends on the scale and state-action space coverage of datasets. Real-world data collection is often expensive and uncontrollable, leading to small and narrowly covered datasets and posing significant challenges for practical deployments of offline RL. In this paper, we provide a new insight that leveraging the fundamental symmetry of system dynamics can substantially enhance offline RL performance under small datasets. Specifically, we propose a Time-reversal symmetry (T-symmetry) enforced Dynamics Model (TDM), which establishes consistency between a pair of forward and reverse latent dynamics. TDM provides both well-behaved representations for small datasets and a new reliability measure for OOD samples based on compliance with the T-symmetry. These can be readily used to construct a new offline RL algorithm (TSRL) with less conservative policy constraints and a reliable latent space data augmentation procedure. Based on extensive experiments, we find TSRL achieves great performance on small benchmark datasets with as few as 1% of the original samples, which significantly outperforms the recent offline RL algorithms in terms of data efficiency and generalizability. Code is available at: https://github.com/pcheng2/TSRL

AAMAS Conference 2023 Conference Paper

Stability of Weighted Majority Voting under Estimated Weights

  • Shaojie Bai
  • Dongxia Wang
  • Tim Muller
  • Peng Cheng
  • Jiming Chen

Weighted Majority Voting (WMV) is a well-known decision making rule. The weights of sources are determined by the probabilities that sources provide accurate information (trustworthiness). However, in reality, the trustworthiness is usually not a known quantity to the decision maker, who has to rely on an estimate called trust. An algorithm that computes trust is called unbiased when it has the property that it does not systematically overestimate or underestimate the trustworthiness. To formally analyze the uncertainty brought to the decision process by such unbiased trust values, we introduce and analyze two important properties of WMV: stability of correctness and stability of optimality. We also provide an overview of how sensitive decision accuracy is to changes in trust and trustworthiness.

NeurIPS Conference 2022 Conference Paper

An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context

  • Xiaoyu Chen
  • Xiangming Zhu
  • Yufeng Zheng
  • Pushi Zhang
  • Li Zhao
  • Wenxue Cheng
  • Peng Cheng
  • Yongqiang Xiong

One of the key challenges in deploying RL to real-world applications is adapting to variations in unknown environment contexts, such as changing terrains in robotic tasks and fluctuating bandwidth in congestion control. Existing works on adaptation to unknown environment contexts either assume the contexts are the same for the whole episode or assume the context variables are Markovian. However, in many real-world applications, the environment context usually stays stable for a stochastic period and then changes in an abrupt and unpredictable manner within an episode, resulting in a segment structure, which existing works fail to address. To leverage the segment structure of piecewise stable context in real-world applications, in this paper, we propose a Segmented Context Belief Augmented Deep (SeCBAD) RL method. Our method can jointly infer the belief distribution over the latent context with the posterior over segment length, and performs more accurate belief context inference with observed data within the current context segment. The inferred belief context can be leveraged to augment the state, leading to a policy that can adapt to abrupt variations in context. We demonstrate empirically that SeCBAD can infer context segment length accurately and outperforms existing methods on a toy grid-world environment and MuJoCo tasks with piecewise-stable context.

AAAI Conference 2016 Conference Paper

BRBA: A Blocking-Based Association Rule Hiding Method

  • Peng Cheng
  • Ivan Lee
  • Li Li
  • Kuo-Kun Tseng
  • Jeng-Shyang Pan

Privacy preservation in association rule mining is an important research topic in the database security field. This paper proposes a blocking-based method to solve the association rule hiding problem for data sharing. It aims at reducing undesirable side effects and increasing desirable side effects, while ensuring that all sensitive rules are concealed. Candidate transactions are selected for sanitization based on their relations with border rules. Comparative experiments on real datasets demonstrate that the proposed method achieves its goals.
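
Blocking-based hiding replaces selected item occurrences with an unknown mark ('?') rather than deleting them, so the support of a sensitive rule becomes uncertain instead of provably false. A toy sketch of this sanitization step (illustrative only; BRBA additionally uses border rules to choose which transactions to sanitize):

```python
def block_item(transactions, sensitive_item, rule_lhs, min_support_count):
    """Blocking-style sanitization sketch: in transactions that fully
    support the sensitive rule (contain the rule's left-hand side and
    the sensitive item), replace the sensitive item with the unknown
    mark '?' until the rule's certain support falls below the
    threshold. Transactions are represented as Python sets."""
    supporting = [t for t in transactions
                  if rule_lhs <= t and sensitive_item in t]
    for t in supporting:
        certain = [u for u in transactions
                   if rule_lhs <= u and sensitive_item in u]
        if len(certain) < min_support_count:
            break  # sensitive rule is no longer certainly supported
        t.discard(sensitive_item)
        t.add('?')
    return transactions

# Hide the rule {a} -> {b}: 'b' is blocked in transactions supporting it.
transactions = [{'a', 'b', 'c'}, {'a', 'b'}, {'a', 'c'}, {'b', 'c'}]
result = block_item(transactions, 'b', {'a'}, min_support_count=1)
```

Because '?' is ambiguous, a miner can no longer establish the rule's support with certainty, while non-sensitive rules over untouched transactions remain minable.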

AAAI Conference 2014 Conference Paper

Association Rule Hiding Based on Evolutionary Multi-Objective Optimization by Removing Items

  • Peng Cheng
  • Jeng-Shyang Pan

Today, people benefit from utilizing data mining technologies, such as association rule mining methods, to find valuable knowledge residing in large amounts of data. However, they also face the risk of exposing sensitive or confidential information when data is shared among different organizations. Thus, a question arises: how can we prevent sensitive knowledge from being discovered, while ensuring that ordinary non-sensitive knowledge can still be mined to the maximum extent possible? In this paper, we address the problem of privacy preservation in association rule mining from the perspective of multi-objective optimization. A new hiding method based on evolutionary multi-objective optimization (EMO) is proposed, and the side effects generated by the hiding process are formulated as optimization goals. EMO is used to find candidate transactions to modify so that side effects are minimized. Comparative experiments with exact methods on real datasets demonstrate that the proposed method can hide sensitive rules with fewer side effects.