Arrow Research search

Author name cluster

Hui Zeng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
2 author rows

Possible papers

10

AAAI Conference 2026 Conference Paper

Lethe: Layer- and Time-Adaptive KV Cache Pruning for Reasoning-Intensive LLM Serving

  • Hui Zeng
  • Daming Zhao
  • Pengfei Yang
  • WenXuan Hou
  • Tianyang Zheng
  • Hui Li
  • Weiye Ji
  • Jidong Zhai

Generative reasoning with large language models (LLMs) often involves long decoding sequences, leading to substantial memory and latency overheads from accumulating key-value (KV) caches. While existing KV compression methods primarily focus on reducing prefill memory from long input sequences, they fall short in addressing the dynamic and layer-sensitive nature of long-form generation, which is central to reasoning tasks. We propose Lethe, a dynamic KV cache management framework that introduces adaptivity along both the spatial and temporal dimensions of decoding. Along the spatial dimension, Lethe performs layerwise sparsity-aware allocation, assigning token pruning budgets to each transformer layer based on estimated attention redundancy. Along the temporal dimension, Lethe conducts multi-round token pruning during generation, driven by a Recency-Aware Selective Retention (RASR) mechanism. RASR extends traditional recency-based heuristics by also considering token relevance derived from evolving attention patterns, enabling informed decisions about which tokens to retain or evict. Empirical results demonstrate that Lethe achieves a favorable balance between efficiency and generation quality across diverse models and tasks, increasing throughput by up to 2.56×.
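The layerwise allocation idea in the abstract can be illustrated with a toy sketch: given a per-layer redundancy estimate, distribute a global pruning budget proportionally. This is not the paper's implementation; `allocate_budgets` and the redundancy scores are hypothetical stand-ins.

```python
def allocate_budgets(redundancy, total_budget):
    """Toy layerwise allocation: layers with higher estimated attention
    redundancy receive proportionally larger token-pruning budgets."""
    total = sum(redundancy)
    return [round(r / total * total_budget) for r in redundancy]

# e.g. 4 transformer layers, a global budget of 1000 cached tokens to prune
budgets = allocate_budgets([0.1, 0.3, 0.4, 0.2], 1000)
```

A real system would refresh the redundancy estimates as attention patterns evolve during decoding, which is where the multi-round temporal pruning comes in.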

AAAI Conference 2026 Conference Paper

ObjectAdv: Object-Level Unrestricted Adversarial Attacks via Diffusion Models

  • Shijie Zhao
  • Zhenyu Liang
  • Xing Yang
  • Haoqi Gao
  • Anjie Peng
  • Hui Zeng

Unrestricted adversarial attacks aim to fool DNNs by generating effective yet photorealistic examples. However, previous methods usually rely on global perturbations to enhance attack performance, which inevitably introduces visual distortions. To reduce visual distortions in the background, we propose a diffusion-based framework that focuses on local perturbations to generate object-level unrestricted adversarial examples (ObjectAdv). First, since the cross-attention maps of Stable Diffusion contain object information, we directly leverage the attention maps to localize the semantic region of the object to attack. Second, a prompt-switching strategy is proposed for both imperceptibility and attack capacity. Specifically, to preserve the layout and object shape of the clean image, a prompt of the true category is used at early denoising steps. At later steps, a carefully designed prompt guides the diffusion model to generate transferable adversarial examples. This local attack may cause inconsistency between the perturbed object and the background in adversarial examples, so an FFT-based edge smoother is utilized to ensure seamless blending of the edges. ObjectAdv achieves an average ASR of 99.2% in the white-box test on the ImageNet-compatible dataset, and outperforms existing methods on defense performance (+5%) and image quality metrics, e.g., SSIM of 0.9140 (+0.1048) and FID of 25.63 (-19.27).
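The prompt-switching schedule described above can be sketched as a simple step-dependent choice between two prompts. The function name and the 30% switch point are illustrative assumptions, not values from the paper.

```python
def pick_prompt(step, total_steps, true_prompt, adv_prompt, switch_frac=0.3):
    """Early denoising steps use the true-category prompt to preserve the
    clean image's layout and object shape; after switch_frac of the steps,
    switch to the adversarial prompt to drive the attack."""
    return true_prompt if step < switch_frac * total_steps else adv_prompt
```

In a diffusion pipeline this selector would be queried once per denoising step to pick the text conditioning passed to the model.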

NeurIPS Conference 2025 Conference Paper

BurstDeflicker: A Benchmark Dataset for Flicker Removal in Dynamic Scenes

  • Lishen Qu
  • Zhihao Liu
  • Shihao Zhou
  • LUO YAQI
  • Jie Liang
  • Hui Zeng
  • Lei Zhang
  • Jufeng Yang

Flicker artifacts in short-exposure images are caused by the interplay between the row-wise exposure mechanism of rolling shutter cameras and the temporal intensity variations of alternating current (AC)-powered lighting. These artifacts typically appear as uneven brightness distribution across the image, forming noticeable dark bands. Beyond compromising image quality, this structured noise also affects high-level tasks, such as object detection and tracking, where reliable lighting is crucial. Despite the prevalence of flicker, the lack of a large-scale, realistic dataset has been a significant barrier to advancing research in flicker removal. To address this issue, we present BurstDeflicker, a scalable benchmark constructed using three complementary data acquisition strategies. First, we develop a Retinex-based synthesis pipeline that redefines the goal of flicker removal and enables controllable manipulation of key flicker-related attributes (e.g., intensity, area, and frequency), thereby facilitating the generation of diverse flicker patterns. Second, we capture 4,000 real-world flicker images from different scenes, which help the model better understand the spatial and temporal characteristics of real flicker artifacts and generalize more effectively to wild scenarios. Finally, due to the non-repeatable nature of dynamic scenes, we propose a green-screen method to incorporate motion into image pairs while preserving real flicker degradation. Comprehensive experiments demonstrate the effectiveness of our dataset and its potential to advance research in flicker removal.
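The physical mechanism in the first sentence can be sketched directly: AC lighting varies in intensity at twice the mains frequency, and a rolling shutter samples that variation row by row, producing horizontal bands. This toy model is not the paper's Retinex-based pipeline; the function name and the line-readout time are illustrative assumptions.

```python
import math

def flicker_gain(row, ac_hz=50.0, line_time_s=1e-5, phase=0.0):
    """Row-wise brightness gain from AC lighting sampled by a rolling
    shutter. Light intensity oscillates at 2x the mains frequency, so
    each row, read out at a slightly later time, sees a different gain."""
    t = row * line_time_s  # time at which this row is exposed
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * (2.0 * ac_hz) * t + phase))

# per-row gain profile for a 480-row frame: the banding pattern
gains = [flicker_gain(r) for r in range(480)]
```

Multiplying an image's rows by such a gain profile yields the dark-band pattern the dataset targets.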

ICML Conference 2025 Conference Paper

Does One-shot Give the Best Shot? Mitigating Model Inconsistency in One-shot Federated Learning

  • Hui Zeng
  • Wenke Huang 0003
  • Tongqing Zhou
  • Xinyi Wu
  • Guancheng Wan
  • Yingwen Chen 0001
  • Zhiping Cai

Turning the multi-round vanilla Federated Learning into one-shot FL (OFL) significantly reduces the communication burden and makes a big leap toward practical deployment. However, this work empirically and theoretically unravels that existing OFL falls into a garbage-in (inconsistent one-shot local models), garbage-out (degraded global model) pitfall. The inconsistency manifests as divergent feature representations and sample predictions. This work presents a novel OFL framework FAFI that enhances the one-shot training on the client side to essentially overcome inferior local uploading. Specifically, unsupervised feature alignment and category-wise prototype learning are adopted for clients' local training to be consistent in representing local samples. On this basis, FAFI uses informativeness-aware feature fusion and prototype aggregation for global inference. Extensive experiments on three datasets demonstrate the effectiveness of FAFI, which facilitates superior performance compared with 11 OFL baselines (+10.86% accuracy). Code available at https://github.com/zenghui9977/FAFI_ICML25
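The prototype-aggregation step on the server side can be illustrated with a minimal sketch: average each category's prototype across clients, weighted by sample counts. The paper's informativeness-aware weighting is more sophisticated; this count-weighted version and the `aggregate_prototypes` name are illustrative stand-ins.

```python
def aggregate_prototypes(client_protos):
    """Count-weighted average of per-category prototypes across clients.
    client_protos: list of {category: (prototype_vector, sample_count)}."""
    sums, counts = {}, {}
    for protos in client_protos:
        for cat, (vec, n) in protos.items():
            acc = sums.setdefault(cat, [0.0] * len(vec))
            for i, v in enumerate(vec):
                acc[i] += n * v  # accumulate count-weighted features
            counts[cat] = counts.get(cat, 0) + n
    return {cat: [x / counts[cat] for x in acc] for cat, acc in sums.items()}

# two clients, one shared category, unequal sample counts
merged = aggregate_prototypes([
    {"dog": ([1.0, 0.0], 3)},
    {"dog": ([0.0, 1.0], 1)},
])
```

At inference time, each sample's features would be compared against these global prototypes (e.g. by cosine similarity) to produce a prediction.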

AAAI Conference 2025 Conference Paper

Everywhere Attack: Attacking Locally and Globally to Boost Targeted Transferability

  • Hui Zeng
  • Sanshuai Cui
  • Biwei Chen
  • Anjie Peng

Adversarial examples' (AE) transferability refers to the phenomenon that AEs crafted with one surrogate model can also fool other models. Notwithstanding remarkable progress in untargeted transferability, its targeted counterpart remains challenging. This paper proposes an everywhere scheme to boost targeted transferability. Our idea is to attack a victim image both globally and locally, optimizing 'an army of targets' in every local image region rather than a single high-confidence target for the whole image, as in previous works. Specifically, we split a victim image into non-overlapping blocks and jointly mount a targeted attack on each block. Such a strategy mitigates transfer failures caused by attention inconsistency between surrogate and victim models and thus results in stronger transferability. Our approach is method-agnostic, which means it can be easily combined with existing transferable attacks for even higher transferability. Extensive experiments on ImageNet demonstrate that the proposed approach universally improves the state-of-the-art targeted attacks by a clear margin, e.g., the transferability of the widely adopted Logit attack can be improved by 28.8%-300%. We also evaluate the crafted AEs on a real-world platform: Google Cloud Vision. Results further support the superiority of the proposed method.
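The block-splitting step can be sketched as a simple grid partition of the image into non-overlapping regions, each of which would then receive its own targeted loss term. The function name, the 3x3 grid, and the silent dropping of remainder pixels when dimensions do not divide evenly are all illustrative assumptions, not details from the paper.

```python
def split_blocks(h, w, rows, cols):
    """Split an h x w image into a rows x cols grid of non-overlapping
    blocks, returned as (top, left, height, width) tuples. Remainder
    pixels at the right/bottom edges are ignored in this toy version."""
    bh, bw = h // rows, w // cols
    return [(r * bh, c * bw, bh, bw) for r in range(rows) for c in range(cols)]

# a 224x224 ImageNet-sized image split into a 3x3 grid of blocks
blocks = split_blocks(224, 224, 3, 3)
```

A joint attack would sum a targeted classification loss over every block in this list alongside the global loss.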