Arrow Research

Author name cluster

Yuzhen Mao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers (4)

TMLR 2026 (Journal Article)

ADiff4TPP: Asynchronous Diffusion Models for Temporal Point Processes

  • Amartya Mukherjee
  • Ruizhi Deng
  • He Zhao
  • Yuzhen Mao
  • Leonid Sigal
  • Frederick Tung

This work introduces a diffusion model-based approach to modelling temporal point processes via an asynchronous noise schedule. Existing methods typically rely on parametric conditional intensity functions or autoregressive next-event prediction, which can limit distributional expressivity and make long-horizon forecasting computationally expensive. We address this limitation by using diffusion models to learn the joint distribution of event sequences in latent space without imposing restrictive parametric assumptions. At each step of the diffusion process, the noise schedule injects noise of varying scales into different parts of the data. With careful design of the noise schedule, earlier events are generated faster than later ones, thus providing stronger conditioning for forecasting the more distant future. We derive an objective to effectively train these models for a general family of noise schedules based on conditional flow matching. Our method models the joint distribution of the latent representations of events in a sequence and achieves state-of-the-art results in predicting both the next inter-event time and event type on benchmark datasets. Additionally, it flexibly accommodates varying lengths of observation and prediction windows in different forecasting settings by adjusting the starting and ending points of the generation process. Finally, our method shows strong performance in long-horizon prediction tasks, outperforming existing baseline methods.
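
A rough sketch of the asynchronous-schedule idea, in Python: each event position gets its own noise level, so earlier events reach clean data sooner as the reverse process runs. The linear per-position offset, the interpolation path, and all names below are illustrative assumptions, not the paper's exact parameterization.

    import numpy as np

    def per_event_noise_level(s: float, n_events: int, offset: float = 0.5):
        """Map a global time s in [0, 1] to one noise level per event.

        Later positions lag behind earlier ones, so position 0 is the
        first to reach level 0 during reverse-time generation.
        """
        positions = np.arange(n_events) / max(n_events - 1, 1)  # 0 .. 1
        return np.clip(s * (1.0 + offset) - offset * (1.0 - positions), 0.0, 1.0)

    def noise_latents(x0, s, rng):
        """Forward noising along a linear interpolation path, as in
        conditional flow matching, with per-event noise levels."""
        t = per_event_noise_level(s, x0.shape[0])[:, None]      # (N, 1)
        eps = rng.standard_normal(x0.shape)
        return (1.0 - t) * x0 + t * eps, t

    rng = np.random.default_rng(0)
    x0 = rng.standard_normal((6, 4))          # 6 events, 4-dim latents
    _, t = noise_latents(x0, s=0.5, rng=rng)
    print(t.ravel())   # earlier events carry less noise at the same s

At any global step s the printed levels increase with position, which is the property that lets already-denoised early events condition the generation of later ones.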

NeurIPS 2025 (Conference Paper)

Performative Risk Control: Calibrating Models for Reliable Deployment under Performativity

  • Victor Li
  • Baiting Chen
  • Yuzhen Mao
  • Qi Lei
  • Zhun Deng

Calibrating black-box machine learning models to achieve risk control is crucial for reliable decision-making. A rich line of literature has studied how to calibrate a model so that its predictions satisfy explicit finite-sample statistical guarantees under a fixed, static, and unknown data-generating distribution. However, prediction-supported decisions may influence the outcomes they aim to predict, a phenomenon known as the performativity of predictions, which is common in social science and economics. In this paper, we introduce Performative Risk Control, a framework for calibrating models to achieve risk control under performativity with provable theoretical guarantees. Specifically, we provide an iteratively refined calibration process that ensures the predictions are improved and risk-controlled throughout. We also study different types of risk measures and choices of tail bounds. Lastly, we demonstrate the effectiveness of our framework through numerical experiments on the task of predicting credit default risk. To the best of our knowledge, this work is the first to study statistically rigorous risk control under performativity, and it can serve as an important safeguard against a wide range of strategic manipulation in decision-making processes.
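
A hedged toy version of the iterative calibration loop: a threshold is calibrated with a finite-sample tail bound, deployed, and the population reacts to the deployment before the next round. The credit-style loss, the drift model in population(), and the Hoeffding bound are stand-in assumptions; the paper's actual procedure, risk measures, and guarantees differ.

    import numpy as np

    def calibrate(scores, bad, alpha, delta):
        """Most permissive threshold whose empirical risk plus a Hoeffding
        tail bound (per-sample losses in [0, 1]) stays below alpha."""
        slack = np.sqrt(np.log(1.0 / delta) / (2.0 * len(scores)))
        for lam in np.linspace(1.0, 0.0, 201):       # permissive -> strict
            risk = ((scores <= lam) & bad).mean()    # approved and defaulted
            if risk + slack <= alpha:
                return lam
        return 0.0                                   # approve no one

    def population(lam, rng, n=5000):
        """Toy performative response: a stricter deployed threshold shifts
        the applicant score distribution (illustrative assumption)."""
        scores = np.clip(rng.beta(2, 5, n) + 0.2 * (1.0 - lam), 0.0, 1.0)
        bad = rng.random(n) < scores                 # true default events
        return scores, bad

    rng = np.random.default_rng(0)
    lam = 1.0
    for step in range(5):                            # iterative refinement
        scores, bad = population(lam, rng)           # world reacts to lam
        lam = calibrate(scores, bad, alpha=0.1, delta=0.05)
        print(f"round {step}: lambda = {lam:.3f}")

Each round re-certifies the threshold on data drawn after the previous deployment, so the risk guarantee is stated with respect to the distribution the decision itself induces.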

ICML 2024 (Conference Paper)

EvoluNet: Advancing Dynamic Non-IID Transfer Learning on Graphs

  • Haohui Wang
  • Yuzhen Mao
  • Yujun Yan
  • Yaoqing Yang 0002
  • Jianhui Sun
  • Kevin Choi
  • Balaji Veeramani
  • Alison Hu

Non-IID transfer learning on graphs is crucial in many high-stakes domains. The majority of existing works assume a stationary distribution for both source and target domains. However, real-world graphs are intrinsically dynamic, presenting challenges in terms of domain evolution and dynamic discrepancy between source and target domains. To bridge the gap, we shift the problem to the dynamic setting and pose the question: given the label-rich source graphs and the label-scarce target graphs both observed in previous $T$ timestamps, how can we effectively characterize the evolving domain discrepancy and optimize the generalization performance of the target domain at the incoming $T+1$ timestamp? To answer this question, we propose a generalization bound for dynamic non-IID transfer learning on graphs, which implies that the generalization performance is dominated by domain evolution and domain discrepancy between source and target graphs. Inspired by the theoretical results, we introduce a novel generic framework named EvoluNet. It leverages a transformer-based temporal encoding module to model temporal information of the evolving domains and then uses a dynamic domain unification module to efficiently learn domain-invariant representations across the source and target domains. Finally, EvoluNet outperforms the state-of-the-art models by up to 12.1%, demonstrating its effectiveness in transferring knowledge from dynamic source graphs to dynamic target graphs.
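
A minimal sketch of the two ingredients named above, under stand-in design choices: per-timestamp graph snapshots are mean-pooled into tokens and passed through a transformer over time, and a linear-kernel MMD penalty stands in for the domain unification module. Neither choice is claimed to match EvoluNet's actual architecture.

    import torch
    import torch.nn as nn

    class TemporalGraphEncoder(nn.Module):
        """Transformer over per-timestamp graph representations."""
        def __init__(self, d_node, d_model=64, n_layers=2):
            super().__init__()
            self.proj = nn.Linear(d_node, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.temporal = nn.TransformerEncoder(layer, num_layers=n_layers)

        def forward(self, snapshots):                # list of (N_t, d_node)
            # Mean-pool each snapshot into one token, then attend over time.
            tokens = torch.stack([self.proj(x).mean(0) for x in snapshots])
            return self.temporal(tokens.unsqueeze(0)).squeeze(0)  # (T, d_model)

    def mmd(a, b):
        """Linear-kernel MMD between source and target representations,
        standing in for the dynamic domain unification loss."""
        return (a.mean(0) - b.mean(0)).pow(2).sum()

    torch.manual_seed(0)
    enc = TemporalGraphEncoder(d_node=16)
    src = [torch.randn(30, 16) for _ in range(5)]    # T = 5 source snapshots
    tgt = [torch.randn(20, 16) for _ in range(5)]    # T = 5 target snapshots
    align_loss = mmd(enc(src), enc(tgt))             # encourage invariance
    print(align_loss.item())

The two pieces mirror the two terms the bound highlights: the transformer tracks domain evolution over the $T$ timestamps, while an alignment penalty like align_loss targets the source-target discrepancy during training.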

ICLR 2024 (Conference Paper)

IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs

  • Yuzhen Mao
  • Martin Ester
  • Ke Li 0011

One limitation of existing Transformer-based models is that they cannot handle very long sequences as input since their self-attention operations exhibit quadratic time and space complexity. This problem becomes especially acute when Transformers are deployed on hardware platforms equipped only with CPUs. To address this issue, we propose a novel method for accelerating self-attention at inference time that works with pretrained Transformer models out-of-the-box without requiring retraining. We experiment using our method to accelerate various long-sequence Transformers, including a leading LLaMA 2-based LLM, on various benchmarks and demonstrate speedups of $2.73\times$ to $7.63\times$ while retaining $98.6\%$ to $99.6\%$ of the accuracy of the original pretrained models. The code is available on our project website at https://yuzhenmao.github.io/IceFormer/.
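
The underlying idea, restricting softmax attention to each query's strongest keys, can be sketched as follows. IceFormer retrieves those keys with an approximate nearest-neighbour index so the full score matrix is never materialized; the exact argpartition top-k below is a simplification for clarity, so it illustrates the approximation rather than the CPU speedup.

    import numpy as np

    def topk_attention(Q, K, V, k=32):
        """Softmax attention approximated over only the top-k keys per
        query; weights are renormalized over the retained scores."""
        logits = Q @ K.T / np.sqrt(Q.shape[-1])             # (n_q, n_k)
        idx = np.argpartition(logits, -k, axis=1)[:, -k:]   # top-k key indices
        top = np.take_along_axis(logits, idx, axis=1)
        w = np.exp(top - top.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)                   # renormalize over k
        return np.einsum("qk,qkd->qd", w, V[idx])           # (n_q, d)

    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((512, 64)) for _ in range(3))
    print(topk_attention(Q, K, V).shape)   # (512, 64)

Because softmax weights decay quickly below the largest logits, keeping only the top k keys loses little, which is consistent with the retained-accuracy figures reported above.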