Arrow Research search

Author name cluster

Dong Zheng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers

7

TCS Journal 2025 Journal Article

A continuous leakage-amplified IBE scheme with perfect key update

  • Zirui Qiao
  • Yong Yu
  • Yanwei Zhou
  • Dong Zheng

In the practical deployment of cryptographic solutions, diverse applications pose unique challenges in terms of leakage resilience. The one-size-fits-all approach of traditional cryptographic primitives, with their fixed leakage resilience, often fails to satisfy the nuanced security demands of different scenarios. Recognizing this, the concept of a continuous leakage-amplified Identity-based Encryption (IBE) scheme has been introduced: it allows the private key length to be adjusted dynamically, tailoring the system's resistance to leakage attacks to the specific needs of an application. Despite these strides, existing implementations exhibit limitations in key updates: the current update method is incomplete, refreshing only some elements of the key, and it relies on a trapdoor mechanism, which results in suboptimal storage efficiency. This paper introduces a new continuous leakage-amplified IBE scheme to address these shortcomings. The improved model features a comprehensive key-update mechanism that enables users to refresh every element of the private key without utilizing a trapdoor. Our analysis confirms that this enhanced IBE scheme stands out for its security, efficiency, and practicality. Moreover, in pursuit of optimizing performance, we present a novel general construction, which shows that an IBE scheme resilient against chosen-ciphertext attacks and equipped with a complete update function can be built upon any semantically secure IBE scheme.

NeurIPS Conference 2025 Conference Paper

ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation

  • Lingfeng Wang
  • Hualing Lin
  • Senda Chen
  • Tao Wang
  • Changxu Cheng
  • Yangyang Zhong
  • Dong Zheng
  • Wuyue Zhao

While humans effortlessly draw visual objects and shapes by adaptively allocating attention based on their complexity, existing multimodal large language models (MLLMs) remain constrained by rigid token representations. Bridging this gap, we propose ALTo, an adaptive-length tokenizer for autoregressive mask generation. To achieve this, we design a novel token length predictor, along with a length regularization term and a differentiable token chunking strategy. We further build ALToLLM, which seamlessly integrates ALTo into an MLLM. Preferences on the trade-off between mask quality and efficiency are implemented via group relative policy optimization (GRPO). Experiments demonstrate that ALToLLM achieves state-of-the-art performance with adaptive token cost on popular segmentation benchmarks. Code and models will be released.

ICML Conference 2025 Conference Paper

Event-Customized Image Generation

  • Zhen Wang 0004
  • Yilei Jiang
  • Dong Zheng
  • Jun Xiao 0001
  • Long Chen 0016

Customized image generation, i.e., generating customized images with user-specified concepts, has attracted significant attention due to its creativity and novelty. With impressive progress achieved in subject customization, some pioneering works have further explored the customization of actions and interactions beyond entity (i.e., human, animal, and object) appearance. However, these approaches only focus on basic actions and interactions between two entities, and their effects are limited by the scarcity of "exactly the same" reference images. To extend customized image generation to more complex scenes for general real-world applications, we propose a new task: event-customized image generation. Given a single reference image, we define the "event" as all specific actions, poses, relations, and interactions between the different entities in the scene. This task aims at accurately capturing the complex event and generating customized images with various target entities. To solve this task, we propose a novel training-free event-customization method, FreeEvent. Specifically, FreeEvent introduces two extra paths alongside the standard diffusion denoising process: 1) an entity switching path, which applies cross-attention guidance and regulation for target entity generation; and 2) an event transferring path, which injects the spatial features and self-attention maps from the reference image into the target image for event generation. To further facilitate this new task, we collected two evaluation benchmarks: SWiG-Event and Real-Event. Extensive experiments and ablations demonstrate the effectiveness of FreeEvent.

TCS Journal 2024 Journal Article

Further construction of even-variable balanced rotation symmetric Boolean functions with optimal algebraic immunity

  • Qinglan Zhao
  • Pan Li
  • Dong Zheng
  • Luyang Li
  • Baodong Qin

Rotation symmetric Boolean functions (RSBFs) have received considerable attention in cryptography because they possess special structures and include many functions with good cryptographic properties. Within this line of research, constructing even-variable balanced RSBFs with optimal algebraic immunity is difficult. Recently, Mesnager et al. proposed the first, and so far only, construction of balanced RSBFs with optimal algebraic immunity for an arbitrary even number of variables (Des. Codes Cryptogr. 89 (1) (2021) 1-17); however, the nonlinearity of their functions is not high. In this paper, we build on their construction and present a fresh design of n-variable balanced RSBFs with optimal algebraic immunity for arbitrary even n. Our functions achieve higher nonlinearity than theirs. Furthermore, the algebraic degree and fast algebraic immunity of our functions are at least n − 2 for n ≤ 16.
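The two structural properties named in the abstract, rotation symmetry and balancedness, can be checked directly by enumeration: a function is rotation symmetric exactly when it is constant on each orbit of cyclic rotations of its input bits. The sketch below is illustrative only; the helper names are our own and it does not implement the paper's construction.

```python
from itertools import product

def is_rotation_symmetric(f, n):
    """Check whether an n-variable Boolean function f (mapping a 0/1
    tuple to 0 or 1) takes the same value on every cyclic rotation
    of each input, i.e. whether f is rotation symmetric."""
    for x in product((0, 1), repeat=n):
        # Collect f's values over the whole rotation orbit of x.
        if len({f(x[i:] + x[:i]) for i in range(n)}) > 1:
            return False
    return True

def is_balanced(f, n):
    """A Boolean function is balanced when it outputs 1 on exactly
    half of the 2^n inputs."""
    return sum(f(x) for x in product((0, 1), repeat=n)) == 2 ** (n - 1)
```

For example, the 3-variable majority function `lambda x: int(sum(x) >= 2)` is both rotation symmetric and balanced, while the projection `lambda x: x[0]` is balanced but not rotation symmetric.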

NeurIPS Conference 2023 Conference Paper

KuaiSim: A Comprehensive Simulator for Recommender Systems

  • Kesen Zhao
  • Shuchang Liu
  • Qingpeng Cai
  • Xiangyu Zhao
  • Ziru Liu
  • Dong Zheng
  • Peng Jiang
  • Kun Gai

Reinforcement Learning (RL)-based recommender systems (RSs) have garnered considerable attention due to their ability to learn optimal recommendation policies and maximize long-term user rewards. However, deploying RL models directly in online environments and generating authentic data through A/B tests can pose challenges and require substantial resources. Simulators offer an alternative by providing training and evaluation environments for RS models, reducing reliance on real-world data. Existing simulators have shown promising results but also have limitations, such as simplified user feedback, a lack of consistency with real-world data, the challenge of simulator evaluation, and difficulties in migration and expansion across RSs. To address these challenges, we propose KuaiSim, a comprehensive user environment that provides user feedback with multi-behavior and cross-session responses. The resulting simulator supports three levels of recommendation problems: the request-level list-wise recommendation task, the whole-session-level sequential recommendation task, and the cross-session-level retention optimization task. For each task, KuaiSim also provides evaluation protocols and baseline recommendation algorithms that further serve as benchmarks for future research. We also restructure existing competitive simulators on the KuaiRand dataset and compare them against KuaiSim to further assess their performance and behavioral differences. Furthermore, to showcase KuaiSim's flexibility in accommodating different datasets, we demonstrate its versatility and robustness when deploying it on the ML-1M dataset. The implementation code is available online at https://github.com/Applied-Machine-Learning-Lab/KuaiSim to ease reproducibility.

ICLR Conference 2023 Conference Paper

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

  • Wanqi Xue
  • Qingpeng Cai 0001
  • Ruohan Zhan
  • Dong Zheng
  • Peng Jiang 0002
  • Kun Gai
  • Bo An 0001

Long-term engagement is preferred over immediate engagement in sequential recommendation because it directly affects product operational metrics such as daily active users (DAUs) and dwell time. Meanwhile, reinforcement learning (RL) is widely regarded as a promising framework for optimizing long-term engagement in sequential recommendation. However, due to expensive online interactions, it is very difficult for RL algorithms to perform state-action value estimation, exploration, and feature extraction when optimizing long-term engagement. In this paper, we propose ResAct, which seeks a policy that is close to, but better than, the online-serving policy. In this way, we can collect sufficient data near the learned policy so that state-action values can be properly estimated, and there is no need to perform online exploration. ResAct optimizes the policy by first reconstructing the online behaviors and then improving them via a Residual Actor. To extract long-term information, ResAct utilizes two information-theoretic regularizers to ensure the expressiveness and conciseness of features. We conduct experiments on a benchmark dataset and a large-scale industrial dataset consisting of tens of millions of recommendation requests. Experimental results show that our method significantly outperforms state-of-the-art baselines on various long-term engagement optimization tasks.
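The core decomposition in the abstract, a policy that reconstructs the online-serving behavior and then adds a learned residual correction, can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: `reconstruct_online_action`, `residual_correction`, and the linear forms are placeholders.

```python
import numpy as np

def reconstruct_online_action(state):
    # Stand-in for a model trained to imitate the online-serving policy.
    return 0.5 * state

def residual_correction(state, w):
    # Small learned correction applied on top of the reconstructed action.
    return w * state

def resact_policy(state, w=0.1):
    # Final action = reconstructed online action + residual, so the
    # learned policy stays close to, but can improve on, the online one.
    return reconstruct_online_action(state) + residual_correction(state, w)
```

Because the residual is small, the data collected by `resact_policy` stays near the online-serving policy's distribution, which is what makes state-action value estimation feasible without online exploration.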

NeurIPS Conference 2023 Conference Paper

State Regularized Policy Optimization on Data with Dynamics Shift

  • Zhenghai Xue
  • Qingpeng Cai
  • Shuchang Liu
  • Dong Zheng
  • Peng Jiang
  • Kun Gai
  • Bo An

In many real-world scenarios, Reinforcement Learning (RL) algorithms are trained on data with dynamics shift, i.e., with different underlying environment dynamics. Most current methods address this issue by training context encoders to identify environment parameters: data with dynamics shift are separated according to their environment parameters to train the corresponding policy. However, these methods can be sample inefficient because data are used ad hoc, and a policy trained for one dynamics cannot benefit from data collected in other environments with different dynamics. In this paper, we find that in many environments with similar structures but different dynamics, optimal policies have similar stationary state distributions. We exploit this property and learn the stationary state distribution from data with dynamics shift for efficient data reuse. This distribution is used to regularize the policy trained in a new environment, leading to the SRPO (State Regularized Policy Optimization) algorithm. To conduct theoretical analyses, the intuition of similar environment structures is characterized by the notion of homomorphous MDPs. We then demonstrate a lower-bound performance guarantee on policies regularized by the stationary state distribution. In practice, SRPO can serve as an add-on module to context-based algorithms in both online and offline RL settings. Experimental results show that SRPO makes several context-based algorithms far more data efficient and significantly improves their overall performance.
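The regularization described in the abstract amounts to penalizing the gap between the state distribution a policy actually visits and the stationary state distribution learned from data under other dynamics. A minimal numeric sketch follows, with illustrative names and a KL penalty chosen here for concreteness; the paper's precise objective and divergence may differ.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-8):
    # KL(p || q) between two discrete distributions over states.
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def srpo_objective(task_return, visited_dist, stationary_dist, beta=0.5):
    # Maximize the task return while keeping the visited state
    # distribution close to the shared stationary distribution.
    return task_return - beta * kl_divergence(visited_dist, stationary_dist)
```

A policy whose visited distribution already matches the stationary one pays no penalty; any mismatch lowers the regularized objective in proportion to `beta`.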