Arrow Research search

Author name cluster

Yu Bai

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

28 papers
2 author rows

Possible papers

28

AAAI Conference 2026 Conference Paper

Identifying and Analyzing Performance-Critical Tokens in Large Language Models

  • Yu Bai
  • Heyan Huang
  • Cesare Spinoso-Di Piano
  • Sanxing Chen
  • Marc-Antoine Rondeau
  • Yang Gao
  • Jackie Chi Kit Cheung

In-context learning (ICL) has emerged as an effective solution for few-shot learning with large language models (LLMs). However, how LLMs leverage demonstrations to specify a task and learn a corresponding computational function through ICL is underexplored. Drawing from the way humans learn from content-label mappings in demonstrations, we categorize the tokens in an ICL prompt into content, stopword, and template tokens. Our goal is to identify the types of tokens whose representations directly influence LLM's performance, a property we refer to as being performance-critical. By ablating representations from the attention of the test example, we find that the representations of informative content tokens have less influence on performance compared to template and stopword tokens, which contrasts with the human attention to informative words. We give evidence that the representations of performance-critical tokens aggregate information from the content tokens. Moreover, we demonstrate experimentally that lexical meaning, repetition, and structural cues are the main distinguishing characteristics of these tokens. Our work sheds light on how LLMs learn to perform tasks from demonstrations and deepens our understanding of the roles different types of tokens play in LLMs.
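
The ablation described above can be illustrated with a toy sketch (our own illustration, not the paper's code; the token labels and attention scores are invented): remove one class of tokens from the test example's attention by masking its logits, and compare the resulting attention distribution to the unablated one.

```python
import numpy as np

np.random.seed(0)

# A miniature ICL prompt, with each token tagged by the paper's three classes.
tokens = ["great", "movie", ":", "positive", "\n", "boring", "plot", ":", "negative"]
token_type = ["content", "content", "template", "label", "stopword",
              "content", "content", "template", "label"]

n = len(tokens)
scores = np.random.randn(n)  # pre-softmax attention logits from the test query

def attention_with_ablation(scores, token_type, ablate_type=None):
    """Softmax attention from the test example, optionally masking one token class."""
    logits = scores.copy()
    if ablate_type is not None:
        mask = np.array([t == ablate_type for t in token_type])
        logits[mask] = -np.inf  # these representations can no longer be attended to
    w = np.exp(logits - logits[np.isfinite(logits)].max())
    return w / w.sum()

full = attention_with_ablation(scores, token_type)
no_template = attention_with_ablation(scores, token_type, "template")
# The mass previously on template tokens is redistributed over the rest;
# the paper measures how much task performance drops under such ablations.
print(no_template.round(3))
```

In the paper's finding, ablating template and stopword tokens in this way hurts performance more than ablating content tokens.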

NeurIPS Conference 2025 Conference Paper

Accelerated Vertical Federated Adversarial Learning through Decoupling Layer-Wise Dependencies

  • Tianxing Man
  • Yu Bai
  • Ganyu Wang
  • Jinjie Fang
  • Haoran Fang
  • Bin Gu
  • Yi Chang

Vertical Federated Learning (VFL) enables participants to collaboratively train models on aligned samples while keeping their heterogeneous features private and distributed. Despite their utility, VFL models remain vulnerable to adversarial attacks during inference. Adversarial Training (AT), which generates adversarial examples at each training iteration, stands as the most effective defense for improving model robustness. However, applying AT in VFL settings (VFAL) faces significant computational efficiency challenges, as the distributed training framework necessitates iterative propagations across participants. To this end, we propose the **_DecVFAL_** framework, which substantially accelerates **_VFAL_** training through a dual-level ***Dec***oupling mechanism applied during adversarial sample generation. Specifically, we first decouple the bottom modules of clients (directly responsible for adversarial updates) from the remaining networks, enabling efficient _lazy sequential propagations_ that reduce communication frequency through delayed gradients. We further introduce _decoupled parallel backpropagation_ to accelerate delayed gradient computation by eliminating idle waiting through parallel processing across modules. Additionally, we are the first to establish convergence analysis for VFAL, rigorously characterizing how our decoupling mechanism interacts with existing VFL dynamics, and prove that _DecVFAL_ achieves an $\mathcal{O}(1/\sqrt{K})$ convergence rate matching that of standard VFLs. Experimental results show that _DecVFAL_ ensures competitive robustness while achieving a roughly $3\sim10\times$ speedup.

AAAI Conference 2025 Conference Paper

Excluding the Impossible for Open Vocabulary Semantic Segmentation

  • Shiyuan Zhao
  • Baodi Liu
  • Yu Bai
  • Weifeng Liu
  • Shuai Shao

Open vocabulary semantic segmentation is a hot topic in research, focusing on segmenting and recognizing a diverse array of categories in varied environments, including those previously unknown, thereby holding significant practical value. Mainstream studies utilize the CLIP model for direct semantic segmentation (denoted as “forward methods”), which often struggles to represent underrepresented categories effectively. To address this issue, this paper introduces a novel approach, the Excluding the ImpossibLe Semantic Segmentation Network (ELSE-Net), based on reverse thinking. By excluding improbable categories, ELSE-Net narrows the selection range for forward methods, significantly reducing the risk of misclassification. In implementation, we initially draw on leading research to design the General Processing Block (GP-Block), which generates inclusion probabilities (the likelihood of belonging to a category) by using the CLIP model in cooperation with a Mask Proposal Network (MPN). We then present the EXcluding the ImPossible Block (EXP-Block), which computes exclusion probabilities (the likelihood of not belonging to a category) through the CLIPN model and a custom-designed Reverse Retrieval Adapter (R2-Adapter). These exclusion probabilities are subsequently used to refine the inclusion probabilities, which are ultimately employed to annotate class-agnostic masks. Moreover, the core component of our EXP-Block is model-agnostic, enabling it to enhance the capabilities of existing frameworks. Experimental results from four benchmark datasets validate the effectiveness of ELSE-Net and underscore the seamless model-agnostic functionality of the EXP-Block.

AAMAS Conference 2025 Conference Paper

Self-Interpretable Reinforcement Learning via Rule Ensembles

  • Yue Yang
  • Fan Yang
  • Yu Bai
  • Hao Wang

Current reinforcement learning (RL) models, often functioning as complex ‘black boxes,’ obscure decision-making processes. This lack of transparency limits their applicability in critical real-world applications where clear reasoning behind algorithmic choices is crucial. To tackle this issue, we suggest moving from neural network or tabular approaches to a rule ensemble model, which improves decision-making clarity and adapts dynamically to environmental interactions. Specifically, our method constructs additive rule ensembles to approximate the Q-value in reinforcement learning using orthogonal gradient boosting (OGB) combined with a post-processing rule replacement technique. This method enables the model to provide inherent explanations through the use of rules. Our study sets a theoretical foundation for rule ensembles within the reinforcement learning framework, emphasizing their capacity to boost interpretability and facilitate the analysis of rule impacts. Experimental results from seven classic environments demonstrate that our proposed rule ensembles match or exceed the performance of representative RL models such as DQN, A2C, and PPO, while also providing self-interpretability and transparency.
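
What makes an additive rule ensemble self-interpretable is easy to see in a minimal sketch (the rules, weights, and CartPole-like state below are invented for illustration, not the paper's learned ensemble): Q(s, a) is simply the sum of the weights of the human-readable rules that fire, so every action choice can be traced back to the rules that produced it.

```python
# State for a toy CartPole-like task: (position, velocity, angle, angular_velocity).
# Each rule is a readable condition on the state paired with a weight.
RULES = {
    "left": [
        (lambda s: s[2] < -0.05, +1.2),  # pole leaning left -> pushing left helps
        (lambda s: s[3] < -0.5,  +0.7),  # pole falling left
        (lambda s: s[0] >  1.0,  -0.4),  # cart already far right
    ],
    "right": [
        (lambda s: s[2] >  0.05, +1.2),  # pole leaning right -> pushing right helps
        (lambda s: s[3] >  0.5,  +0.7),  # pole falling right
        (lambda s: s[0] < -1.0,  -0.4),  # cart already far left
    ],
}

def q_value(state, action):
    """Additive rule ensemble: sum the weights of all rules whose condition fires."""
    return sum(w for cond, w in RULES[action] if cond(state))

def greedy_action(state):
    return max(RULES, key=lambda a: q_value(state, a))

s = (0.0, 0.1, 0.08, 0.6)  # pole tilting right and falling further right
print(greedy_action(s))    # the firing rules themselves explain the choice
```

The paper's contribution is learning such ensembles (via orthogonal gradient boosting with rule replacement) rather than hand-writing them, but the resulting decision procedure has this transparent additive form.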

AAAI Conference 2025 Conference Paper

Text2Data: Low-Resource Data Generation with Textual Control

  • Shiyu Wang
  • Yihao Feng
  • Tian Lan
  • Ning Yu
  • Yu Bai
  • Ran Xu
  • Huan Wang
  • Caiming Xiong

Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines. Recognizing the importance of this interface, the machine learning community is investing considerable effort in generating data that is semantically coherent with textual instructions. While strides have been made in text-to-data generation spanning image editing, audio synthesis, video creation, and beyond, low-resource areas characterized by expensive annotations or complex data structures, such as molecules, motion dynamics, and time series, often lack textual labels. This deficiency impedes supervised learning, thereby constraining the application of advanced generative models for text-to-data tasks. In response to these challenges in the low-resource scenario, we propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model. Subsequently, it undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting. Comprehensive experiments demonstrate that Text2Data is able to achieve enhanced performance regarding controllability across various modalities, including molecules, motions and time series, when compared to existing baselines.

NeurIPS Conference 2025 Conference Paper

Transcending Cost-Quality Tradeoff in Agent Serving via Session-Awareness

  • Yanyu Ren
  • Li Chen
  • Dan Li
  • Xizheng Wang
  • Zhiyuan Wu
  • Yukai Miao
  • Yu Bai

Large Language Model (LLM) agents are capable of task execution across various domains by autonomously interacting with environments and refining LLM responses based on feedback. However, existing model serving systems are not optimized for the unique demands of serving agents. Compared to classic model serving, agent serving has different characteristics: predictable request pattern, increasing quality requirement, and unique prompt formatting. We identify a key problem for agent serving: LLM serving systems lack session-awareness. They neither perform effective KV cache management nor precisely select the cheapest yet competent model in each round. This leads to a cost-quality tradeoff, and we identify an opportunity to surpass it in an agent serving system. To this end, we introduce AgServe for AGile AGent SERVing. AgServe features a session-aware server that boosts KV cache reuse via Estimated-Time-of-Arrival-based eviction and in-place positional embedding calibration, a quality-aware client that performs session-aware model cascading through real-time quality assessment, and a dynamic resource scheduler that maximizes GPU utilization. With AgServe, we allow agents to select and upgrade models during the session lifetime, and to achieve similar quality at much lower costs, effectively transcending the tradeoff. Extensive experiments on real testbeds demonstrate that AgServe (1) achieves response quality comparable to GPT-4o at 16.5\% of the cost, and (2) delivers a 1.8$\times$ improvement in quality relative to the tradeoff curve.
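
The ETA-based eviction idea can be sketched in a few lines (a hypothetical interface of our own, not AgServe's actual API; the class and field names are assumptions): because agent request patterns are predictable, each session carries an estimated time of next arrival, and when the cache is full the session expected back furthest in the future is evicted first.

```python
class SessionKVCache:
    """Toy session-aware KV cache with Estimated-Time-of-Arrival eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.kv = {}   # session_id -> cached KV state (opaque blob here)
        self.eta = {}  # session_id -> estimated time of the session's next request

    def put(self, session_id, kv_state, eta_next):
        self.kv[session_id] = kv_state
        self.eta[session_id] = eta_next
        while len(self.kv) > self.capacity:
            # Evict the session whose next request is expected furthest away.
            victim = max(self.eta, key=self.eta.get)
            del self.kv[victim], self.eta[victim]

    def get(self, session_id):
        return self.kv.get(session_id)

cache = SessionKVCache(capacity=2)
cache.put("agent-A", "kv_A", eta_next=1.0)   # expected back in 1 s
cache.put("agent-B", "kv_B", eta_next=30.0)  # expected back in 30 s
cache.put("agent-C", "kv_C", eta_next=2.0)   # expected back in 2 s -> B is evicted
print(sorted(cache.kv))
```

Contrast this with LRU, which would evict by last access and could drop a session that is about to return.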

TAAS Journal 2025 Journal Article

TT-DSC: Enhancing YOLO for Marine Ecosystem through Efficient Tensor Train-based Depthwise Separable Deep Neural Network

  • Yunduan Lou
  • Pu Sun
  • Yifeng Yu
  • Shangping Ren
  • Yu Bai

The current era of Artificial Intelligence (AI) has witnessed significant and continuous advancements based on the powerful learning capabilities of Deep Neural Networks (DNNs), particularly those featuring convolutional (CONV) layers. In the field of marine ecosystem conservation, these advancements have revolutionized our ability to monitor and protect ocean environments. DNNs, especially those utilizing the YOLO (You Only Look Once) architecture, have been instrumental in tasks such as real-time marine species identification, tracking of marine mammal migrations, detection of coral bleaching events, and monitoring of illegal fishing activities. These AI-powered tools provide unprecedented insights into marine ecosystems, enabling more timely and effective conservation actions. As we aim to further enhance the computational and storage efficiency of these networks, Tensor Train (TT) decomposition has emerged as a notable compression technique due to its high compression ratio and ability to maintain strong performance. However, the CONV layer in TT format still incurs substantial computational costs, stemming from convolution calculations and the additional multiplication operations intrinsic to TT usage. Consequently, reducing these computational costs is critical to improving the effectiveness of DNNs. To advance the computational efficiency of DNNs, this paper introduces a novel separable TT decomposition that offers an efficient TT-format CONV layer using depthwise separable convolution. Remarkably, this method not only reduces computation costs significantly but also maintains a similar capacity for parameter compression and accuracy compared to the conventional TT-format model. Furthermore, our method facilitates distributed learning based on the factorization of CONV layers. By scheduling the smaller-factored weight tensors, we significantly mitigate the GPU memory requirements of the larger model, thereby enhancing the availability and speed of training.
The primary contributions of this paper are twofold: (1) a simultaneous reduction in computational cost and parameter count in TT-based CONV layers, achieved by minimizing TT redundancy and optimizing convolution, leading to up to 7–10× improvements per layer, an overall one-third reduction in parameters, and a 15% reduction in FLOPs at the model level; and (2) a demonstration of how our approach enables effective distributed learning and resource allocation. By merging TT decomposition and depthwise separable convolution, we present TT-DSC, a TT-based depthwise separable convolution approach. This study opens new avenues for improving the efficiency of CONV-layer compression and has significant implications for large-scale deep learning applications.
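
The compression mechanism underlying TT-format layers can be sketched with the standard TT-SVD procedure (our own generic sketch; the paper's contribution is a *separable* variant combined with depthwise separable convolution, which is not shown here): a layer weight is reshaped into a higher-order tensor and factored into small 3-way cores, and capping the TT ranks trades accuracy for far fewer parameters.

```python
import numpy as np

def tt_svd(T, max_rank):
    """Decompose tensor T into Tensor-Train cores via sequential truncated SVDs."""
    dims, cores, r_prev = T.shape, [], 1
    A = T.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(A, full_matrices=False)
        r = min(max_rank, len(S))                       # cap the TT rank
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        A = (np.diag(S[:r]) @ Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(A.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor."""
    G = cores[0]
    for core in cores[1:]:
        G = np.tensordot(G, core, axes=([-1], [0]))
    return G.squeeze(axis=(0, -1))

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4, 4, 4))       # a 256-parameter weight, reshaped 4-way

exact = tt_svd(W, max_rank=256)         # ranks uncapped in effect: lossless
compressed = tt_svd(W, max_rank=2)      # rank-capped: lossy but much smaller
n_params = sum(c.size for c in compressed)
print("compressed params:", n_params, "vs full:", W.size)
```

The extra multiplications needed to apply such factored weights at inference time are exactly the computational overhead the paper's separable TT construction targets.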

AAAI Conference 2024 Conference Paper

Collaborative Consortium of Foundation Models for Open-World Few-Shot Learning

  • Shuai Shao
  • Yu Bai
  • Yan Wang
  • Baodi Liu
  • Bin Liu

Open-World Few-Shot Learning (OFSL) is a crucial research field dedicated to accurately identifying target samples in scenarios where data is limited and labels are unreliable. This research holds significant practical implications and is highly relevant to real-world applications. Recently, the advancements in foundation models like CLIP and DINO have showcased their robust representation capabilities even in resource-constrained settings with scarce data. This realization has brought about a transformative shift in focus, moving away from “building models from scratch” towards “effectively harnessing the potential of foundation models to extract pertinent prior knowledge suitable for OFSL and utilizing it sensibly”. Motivated by this perspective, we introduce the Collaborative Consortium of Foundation Models (CO3), which leverages CLIP, DINO, GPT-3, and DALL-E to collectively address the OFSL problem. CO3 comprises four key blocks: (1) the Label Correction Block (LC-Block) corrects unreliable labels, (2) the Data Augmentation Block (DA-Block) enhances available data, (3) the Feature Extraction Block (FE-Block) extracts multi-modal features, and (4) the Text-guided Fusion Adapter (TeFu-Adapter) integrates multiple features while mitigating the impact of noisy labels through semantic constraints. Only the adapter's parameters are adjustable, while the others remain frozen. Through collaboration among these foundation models, CO3 effectively unlocks their potential and unifies their capabilities to achieve state-of-the-art performance on multiple benchmark datasets. https://github.com/The-Shuai/CO3.

NeurIPS Conference 2023 Conference Paper

Efficient RL with Impaired Observability: Learning to Act with Delayed and Missing State Observations

  • Minshuo Chen
  • Yu Bai
  • H. Vincent Poor
  • Mengdi Wang

In real-world reinforcement learning (RL) systems, various forms of {\it impaired observability} can complicate matters. These situations arise when an agent is unable to observe the most recent state of the system due to latency or lossy channels, yet the agent must still make real-time decisions. This paper introduces a theoretical investigation into efficient RL in control systems where agents must act with delayed and missing state observations. We establish near-optimal regret bounds, of the form $\tilde{\mathcal{O}}(\sqrt{{\rm poly}(H) SAK})$, for RL in both the delayed and missing observation settings. Despite impaired observability posing significant challenges to the policy class and planning, our results demonstrate that learning remains efficient, with the regret bound optimally depending on the state-action size of the original system. Additionally, we provide a characterization of the performance of the optimal policy under impaired observability, comparing it to the optimal value obtained with full observability.

AAMAS Conference 2023 Conference Paper

Learning Solutions in Large Economic Networks using Deep Multi-Agent Reinforcement Learning

  • Michael Curry
  • Alexander Trott
  • Soham Phade
  • Yu Bai
  • Stephan Zheng

Real-world economies can be modeled as a network with many heterogeneous and strategic agents. In this setting, it is very challenging to find optimal mechanisms, e.g., taxes, 1) when taking strategic best responses into account and 2) even when using restrictive assumptions, e.g., that supply always meets demand. Deep multi-agent reinforcement learning (MARL) is a natural framework to learn mechanisms and model strategic best responses, but independent MARL often collapses to trivial solutions (e.g., where nobody works) as joint exploration severely distorts rewards and constraints. Here, we show how to use structured learning curricula and GPU-accelerated simulations to find non-trivial solutions in networks with many heterogeneous agents. We validate our approach in models with 100 worker-consumers, 10 firms, and a social planner who taxes and redistributes. We use empirical best-response analyses across agent types to show that it is difficult for agents to benefit by deviating from the learned solutions. In particular, we find income and corporate taxes that achieve 15% higher social welfare compared to baselines.

ICML Conference 2023 Conference Paper

Offline Learning in Markov Games with General Function Approximation

  • Yuheng Zhang
  • Yu Bai
  • Nan Jiang

We study offline multi-agent reinforcement learning (RL) in Markov games, where the goal is to learn an approximate equilibrium—such as Nash equilibrium and (Coarse) Correlated Equilibrium—from an offline dataset pre-collected from the game. Existing works consider relatively restricted tabular or linear models and handle each equilibrium separately. In this work, we provide the first framework for sample-efficient offline learning in Markov games under general function approximation, handling all three equilibria in a unified manner. By using Bellman-consistent pessimism, we obtain interval estimation for policies’ returns, and use both the upper and the lower bounds to obtain a relaxation on the gap of a candidate policy, which becomes our optimization objective. Our results generalize prior works and provide several additional insights. Importantly, we require a data coverage condition that improves over the recently proposed “unilateral concentrability”. Our condition allows selective coverage of deviation policies that optimally trade-off between their greediness (as approximate best responses) and coverage, and we show scenarios where this leads to significantly better guarantees. As a new connection, we also show how our algorithmic framework can subsume seemingly different solution concepts designed for the special case of two-player zero-sum games.

NeurIPS Conference 2023 Conference Paper

Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection

  • Yu Bai
  • Fan Chen
  • Huan Wang
  • Caiming Xiong
  • Song Mei

Neural sequence models based on the transformer architecture have demonstrated remarkable \emph{in-context learning} (ICL) abilities, where they can perform new tasks when prompted with training and test examples, without any parameter update to the model. This work first provides a comprehensive statistical theory for transformers to perform ICL. Concretely, we show that transformers can implement a broad class of standard machine learning algorithms in context, such as least squares, ridge regression, Lasso, learning generalized linear models, and gradient descent on two-layer neural networks, with near-optimal predictive power on various in-context data distributions. Using an efficient implementation of in-context gradient descent as the underlying mechanism, our transformer constructions admit mild size bounds, and can be learned with polynomially many pretraining sequences. Building on these ``base'' ICL algorithms, intriguingly, we show that transformers can implement more complex ICL procedures involving \emph{in-context algorithm selection}, akin to what a statistician can do in real life---A \emph{single} transformer can adaptively select different base ICL algorithms---or even perform qualitatively different tasks---on different input sequences, without any explicit prompting of the right algorithm or task. We both establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: pre-ICL testing, and post-ICL validation. As an example, we use the post-ICL validation mechanism to construct a transformer that can perform nearly Bayes-optimal ICL on a challenging task---noisy linear models with mixed noise levels. Experimentally, we demonstrate the strong in-context algorithm selection capabilities of standard transformer architectures.
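
The post-ICL validation mechanism can be illustrated outside any transformer (a toy of our own; the ridge candidates and split sizes are assumptions, not the paper's construction): fit several base algorithms on part of the in-context examples, score each on the held-out rest, and deploy the winner — the adaptive selection the paper shows a single transformer can implement internally.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 5, 60
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)   # a low-noise linear in-context task

# Split the in-context examples into a fit portion and a validation portion.
Xtr, ytr, Xva, yva = X[:40], y[:40], X[40:], y[40:]

def fit_ridge(lam):
    """One 'base ICL algorithm': ridge regression at a fixed regularization."""
    return np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(d), Xtr.T @ ytr)

candidates = {lam: fit_ridge(lam) for lam in (1e-3, 1.0, 100.0)}
val_err = {lam: float(np.mean((Xva @ wh - yva) ** 2))
           for lam, wh in candidates.items()}
best = min(val_err, key=val_err.get)        # the algorithm-selection step
print("selected lambda:", best)
```

On a noisier task the same procedure would select a larger regularization, which is what makes the selection adaptive to the in-context data distribution.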

NeurIPS Conference 2023 Conference Paper

What can a Single Attention Layer Learn? A Study Through the Random Features Lens

  • Hengyu Fu
  • Tianyu Guo
  • Yu Bai
  • Song Mei

Attention layers---which map a sequence of inputs to a sequence of outputs---are core building blocks of the Transformer architecture which has achieved significant breakthroughs in modern artificial intelligence. This paper presents a rigorous theoretical study on the learning and generalization of a single multi-head attention layer, with a sequence of key vectors and a separate query vector as input. We consider the random feature setting where the attention layer has a large number of heads, with randomly sampled frozen query and key matrices, and trainable value matrices. We show that such a random-feature attention layer can express a broad class of target functions that are permutation invariant to the key vectors. We further provide quantitative excess risk bounds for learning these target functions from finite samples, using random feature attention with finitely many heads. Our results feature several implications unique to the attention structure compared with existing random features theory for neural networks, such as (1) Advantages in the sample complexity over standard two-layer random-feature networks; (2) Concrete and natural classes of functions that can be learned efficiently by a random-feature attention layer; and (3) The effect of the sampling distribution of the query-key weight matrix (the product of the query and key matrix), where Gaussian random weights with a non-zero mean result in better sample complexities over the zero-mean counterpart for learning certain natural target functions. Experiments on simulated data corroborate our theoretical findings and further illustrate the interplay between the sample size and the complexity of the target function.
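
The random-feature setting above admits a compact numerical sketch (dimensions, target function, and the ridge fit are our illustrative choices, not the paper's constructions): query-key matrices are frozen at random values with a non-zero mean, each head produces an attention-weighted average of the key vectors, and only a linear readout over the concatenated head outputs is trained.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, M, n_samples = 4, 6, 64, 200    # dim, keys per sequence, heads, samples

# Frozen random query-key matrices, one per head (non-zero mean, per the paper's
# observation that this can help for certain targets).
W = rng.normal(1.0, 1.0, size=(M, d, d))

def head_features(X, q):
    """Per-head attention outputs; only the readout on top of these is trained."""
    feats = []
    for m in range(M):
        logits = X @ (W[m] @ q)                      # score each key against the query
        a = np.exp(logits - logits.max()); a /= a.sum()
        feats.append(a @ X)                          # attention-weighted key average
    return np.concatenate(feats)                     # (M*d,) random features

# Target: a permutation-invariant function of the keys, e.g. the mean of <q, x_i>^2.
Xs = rng.normal(size=(n_samples, N, d))
qs = rng.normal(size=(n_samples, d))
y = np.array([np.mean((X @ q) ** 2) for X, q in zip(Xs, qs)])

Phi = np.array([head_features(X, q) for X, q in zip(Xs, qs)])
lam = 1e-2
v = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M * d), Phi.T @ y)  # trainable part

mse = float(np.mean((Phi @ v - y) ** 2))
print("train MSE:", mse)
```

Note the target is permutation invariant in the key vectors, matching the class of functions the layer is shown to express.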

JBHI Journal 2022 Journal Article

A Scalable Graph-Based Framework for Multi-Organ Histology Image Classification

  • Yu Bai
  • Yue Mi
  • Yihan Su
  • Bo Zhang
  • Zheng Zhang
  • Jingyun Wu
  • Haiwen Huang
  • Yongping Xiong

Graph-based approaches are successful for histology image classification tasks but still face many challenges, such as: 1) the lack of nuclei-level labels and the significant variations between histology images make it extremely difficult to extract discriminative high-level nuclei features like nuclei type, texture and micro-environment; 2) graph-based approaches cannot handle the large-scale cell graph nodes typically contained in histology images; and 3) graph neural networks (GNNs) struggle to learn the long-range dependency of cell graphs. To address the above challenges, we propose a scalable graph-based framework for multi-organ histology image classification. We develop a two-step masked nuclei patches supervised training approach to extract discriminative high-level nuclei features for histology images without nuclei-level labels. Additionally, we introduce a nuclei sampling strategy to make our graph-based framework scalable for large-scale cell graphs. Furthermore, we propose the HierArchical Transformer Graph Neural Network (HAT-Net+) for cell graph classifications. HAT-Net+ adopts Transformer to model the long-range dependency of cell graphs and a parameter-free approach to adaptively fuse different hierarchical graph representations of each layer. We achieved state-of-the-art results on four public histology image classification datasets: CRC dataset (100%), Extended CRC dataset (98%), UZH dataset (96.9%) and BACH dataset (88%). Unlike other methods, our approach can be used in various histology image classification tasks, even for images without nuclei-level labels, indicating its potential in cancer diagnosis. The code is available at https://github.com/suyouooooo/HAT-Net.

NeurIPS Conference 2022 Conference Paper

Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent

  • Yu Bai
  • Chi Jin
  • Song Mei
  • Ziang Song
  • Tiancheng Yu

A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs). This approach enables us to directly translate state-of-the-art techniques and analyses in NFGs to learning EFGs, but typically suffers from computational intractability due to the exponential blow-up of the game size introduced by the conversion. In this paper, we address this problem in natural and important setups for the \emph{$\Phi$-Hedge} algorithm---A generic algorithm capable of learning a large class of equilibria for NFGs. We show that $\Phi$-Hedge can be directly used to learn Nash Equilibria (zero-sum settings), Normal-Form Coarse Correlated Equilibria (NFCCE), and Extensive-Form Correlated Equilibria (EFCE) in EFGs. We prove that, in those settings, the \emph{$\Phi$-Hedge} algorithms are equivalent to standard Online Mirror Descent (OMD) algorithms for EFGs with suitable dilated regularizers, and run in polynomial time. This new connection further allows us to design and analyze a new class of OMD algorithms based on modifying its log-partition function. In particular, we design an improved algorithm with balancing techniques that achieves a sharp $\widetilde{\mathcal{O}}(\sqrt{XAT})$ EFCE-regret under bandit-feedback in an EFG with $X$ information sets, $A$ actions, and $T$ episodes. To our best knowledge, this is the first such rate and matches the information-theoretic lower bound.
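
In the normal-form special case, $\Phi$-Hedge reduces to ordinary Hedge over actions, which a toy self-play run makes concrete (our own illustration on an invented 2x2 zero-sum game; the paper's contribution is making this tractable on extensive-form games via OMD): with low-regret play by both sides, the *average* strategies form an approximate Nash equilibrium.

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 1.0]])   # row player's payoff matrix
T = 10000
eta = np.sqrt(np.log(2) / T)               # standard Hedge step size
loss_r, loss_c = np.zeros(2), np.zeros(2)  # cumulative losses per action
avg_p, avg_q = np.zeros(2), np.zeros(2)

for _ in range(T):
    lr = -eta * loss_r
    p = np.exp(lr - lr.max()); p /= p.sum()   # Hedge: exponential weights
    lc = -eta * loss_c
    q = np.exp(lc - lc.max()); q /= q.sum()
    avg_p += p; avg_q += q
    loss_r += -(A @ q)                        # row minimizes negative payoff
    loss_c += A.T @ p                         # column minimizes row's payoff

avg_p /= T; avg_q /= T
# Duality gap of the average strategies; it is 0 at an exact Nash equilibrium
# and is bounded by the sum of the players' average regrets.
gap = float(np.max(A @ avg_q) - np.min(avg_p @ A))
print("average strategies:", avg_p.round(2), avg_q.round(2), "gap:", round(gap, 3))
```

The exponential blow-up the abstract mentions comes from running exactly this scheme on the normal-form conversion of an EFG; the paper's OMD equivalence avoids it.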

NeurIPS Conference 2022 Conference Paper

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials

  • Eshaan Nichani
  • Yu Bai
  • Jason D. Lee

A recent goal in the theory of deep learning is to identify how neural networks can escape the “lazy training,” or Neural Tangent Kernel (NTK), regime, where the network is coupled with its first order Taylor expansion at initialization. While the NTK is minimax optimal for learning dense polynomials (Ghorbani et al., 2021), it cannot learn features, and hence has poor sample complexity for learning many classes of functions including sparse polynomials. Recent works have thus aimed to identify settings where gradient based algorithms provably generalize better than the NTK. One such example is the “QuadNTK” approach of Bai & Lee (2020), which analyzes the second-order term in the Taylor expansion. Bai & Lee (2020) show that the second-order term can learn sparse polynomials efficiently; however, it sacrifices the ability to learn general dense polynomials. In this paper, we analyze how gradient descent on a two-layer neural network can escape the NTK regime by utilizing a spectral characterization of the NTK (Montanari & Zhong, 2020) and building on the QuadNTK approach. We first expand upon the spectral analysis to identify “good” directions in parameter space in which we can move without harming generalization. Next, we show that a wide two-layer neural network can jointly use the NTK and QuadNTK to fit target functions consisting of a dense low-degree term and a sparse high-degree term -- something neither the NTK nor the QuadNTK can do on their own. Finally, we construct a regularizer which encourages the parameter vector to move in the “good” directions, and show that gradient descent on the regularized loss will converge to a global minimizer, which also has low test error. This yields an end-to-end convergence and generalization guarantee with provable sample complexity improvement over both the NTK and QuadNTK on their own.

NeurIPS Conference 2022 Conference Paper

Policy Optimization for Markov Games: Unified Framework and Faster Convergence

  • Runyu Zhang
  • Qinghua Liu
  • Huan Wang
  • Caiming Xiong
  • Na Li
  • Yu Bai

This paper studies policy optimization algorithms for multi-agent reinforcement learning. We begin by proposing an algorithm framework for two-player zero-sum Markov Games in the full-information setting, where each iteration consists of a policy update step at each state using a certain matrix game algorithm, and a value update step with a certain learning rate. This framework unifies many existing and new policy optimization algorithms. We show that the \emph{state-wise average policy} of this algorithm converges to an approximate Nash equilibrium (NE) of the game, as long as the matrix game algorithms achieve low weighted regret at each state, with respect to weights determined by the speed of the value updates. Next, we show that this framework instantiated with the Optimistic Follow-The-Regularized-Leader (OFTRL) algorithm at each state (and smooth value updates) can find an $\mathcal{\widetilde{O}}(T^{-5/6})$ approximate NE in $T$ iterations, and a similar algorithm with slightly modified value update rule achieves a faster $\mathcal{\widetilde{O}}(T^{-1})$ convergence rate. These improve over the current best $\mathcal{\widetilde{O}}(T^{-1/2})$ rate of symmetric policy optimization type algorithms. We also extend this algorithm to multi-player general-sum Markov Games and show an $\mathcal{\widetilde{O}}(T^{-3/4})$ convergence rate to Coarse Correlated Equilibria (CCE). Finally, we provide a numerical example to verify our theory and investigate the importance of smooth value updates, and find that using ''eager'' value updates instead (equivalent to the independent natural policy gradient algorithm) may significantly slow down the convergence, even on a simple game with $H=2$ layers.

NeurIPS Conference 2022 Conference Paper

Sample-Efficient Learning of Correlated Equilibria in Extensive-Form Games

  • Ziang Song
  • Song Mei
  • Yu Bai

Imperfect-Information Extensive-Form Games (IIEFGs) is a prevalent model for real-world games involving imperfect information and sequential plays. The Extensive-Form Correlated Equilibrium (EFCE) has been proposed as a natural solution concept for multi-player general-sum IIEFGs. However, existing algorithms for finding an EFCE require full feedback from the game, and it remains open how to efficiently learn the EFCE in the more challenging bandit feedback setting where the game can only be learned by observations from repeated playing. This paper presents the first sample-efficient algorithm for learning the EFCE from bandit feedback. We begin by proposing $K$-EFCE---a generalized definition that allows players to observe and deviate from the recommended actions for $K$ times. The $K$-EFCE includes the EFCE as a special case at $K=1$, and is an increasingly stricter notion of equilibrium as $K$ increases. We then design an uncoupled no-regret algorithm that finds an $\varepsilon$-approximate $K$-EFCE within $\widetilde{\mathcal{O}}(\max_{i}X_iA_i^{K}/\varepsilon^2)$ iterations in the full feedback setting, where $X_i$ and $A_i$ are the number of information sets and actions for the $i$-th player. Our algorithm works by minimizing a wide-range regret at each information set that takes into account all possible recommendation histories. Finally, we design a sample-based variant of our algorithm that learns an $\varepsilon$-approximate $K$-EFCE within $\widetilde{\mathcal{O}}(\max_{i}X_iA_i^{K+1}/\varepsilon^2)$ episodes of play in the bandit feedback setting. When specialized to $K=1$, this gives the first sample-efficient algorithm for learning EFCE from bandit feedback.

IJCAI Conference 2022 Conference Paper

Stage-wise Stylistic Headline Generation: Style Generation and Summarized Content Insertion

  • Jiaao Zhan
  • Yang Gao
  • Yu Bai
  • Qianhui Liu

A quality headline with a high click-through rate should not only summarize the content of an article but also reflect a style that attracts users. Such demand has drawn rising attention to the task of stylistic headline generation (SHG). An intuitive method is to first generate plain headlines using document-headline parallel data and then transfer them to a target style. However, this inevitably suffers from error propagation. Therefore, to unify the two sub-tasks and explicitly separate style-relevant attributes from summarized content, we propose an end-to-end stage-wise SHG model containing a style generation component and a content insertion component, where the former generates style-relevant intermediate outputs and the latter receives these outputs and inserts the summarized content. The intermediate outputs are observable, making the style generation easy to control. Our system is comprehensively evaluated by both quantitative and qualitative metrics, and it achieves state-of-the-art results in SHG on three different stylistic datasets.

AAAI Conference 2021 Conference Paper

Exploring Explainable Selection to Control Abstractive Summarization

  • Haonan Wang
  • Yang Gao
  • Yu Bai
  • Mirella Lapata
  • Heyan Huang

Like humans, document summarization models can interpret a document’s contents in a number of ways. Unfortunately, the neural models of today are largely black boxes that provide little explanation of how or why they generated a summary in the way they did. Therefore, to begin prying open the black box and to inject a level of control into the substance of the final summary, we developed a novel select-and-generate framework, ESCA, that focuses on explainability. By revealing the latent centrality and interactions between sentences, along with scores for sentence novelty and relevance, users are given a window into the choices a model is making and an opportunity to guide those choices in a more desirable direction. A novel pair-wise matrix captures the sentence interactions, centrality, and attribute scores, and a mask with tunable attribute thresholds allows the user to control which sentences are likely to be included in the extraction. A sentence-deployed attention mechanism in the abstractor ensures the final summary emphasizes the desired content. Additionally, the encoder is adaptable, supporting both Transformer- and BERT-based configurations. In a series of experiments assessed with ROUGE metrics and two human evaluations, ESCA outperformed eight state-of-the-art models on the CNN/DailyMail and NYT50 benchmark datasets.

NeurIPS Conference 2021 Conference Paper

Near-Optimal Offline Reinforcement Learning via Double Variance Reduction

  • Ming Yin
  • Yu Bai
  • Yu-Xiang Wang

We consider the problem of offline reinforcement learning (RL) --- a well-motivated setting of RL that aims at policy optimization using only historical data. Despite its wide applicability, theoretical understanding of offline RL, such as its optimal sample complexity, remains largely open even in basic settings such as \emph{tabular} Markov Decision Processes (MDPs). In this paper, we propose \emph{Off-Policy Double Variance Reduction} (OPDVR), a new variance reduction-based algorithm for offline RL. Our main result shows that OPDVR provably identifies an $\epsilon$-optimal policy with $\widetilde{O}(H^2/d_m\epsilon^2)$ episodes of offline data in the finite-horizon \emph{stationary transition} setting, where $H$ is the horizon length and $d_m$ is the minimal marginal state-action distribution induced by the behavior policy. This improves over the best-known upper bound by a factor of $H$. Moreover, we establish an information-theoretic lower bound of $\Omega(H^2/d_m\epsilon^2)$ which certifies that OPDVR is optimal up to logarithmic factors. Lastly, we show that OPDVR also achieves rate-optimal sample complexity under alternative settings such as the finite-horizon MDPs with non-stationary transitions and the infinite horizon MDPs with discounted rewards.

NeurIPS Conference 2021 Conference Paper

Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning

  • Tengyang Xie
  • Nan Jiang
  • Huan Wang
  • Caiming Xiong
  • Yu Bai

Recent theoretical work studies sample-efficient reinforcement learning (RL) extensively in two settings: learning interactively in the environment (online RL), or learning from an offline dataset (offline RL). However, existing algorithms and theories for learning near-optimal policies in these two settings are rather different and disconnected. Towards bridging this gap, this paper initiates the theoretical study of *policy finetuning*, that is, online RL where the learner has additional access to a "reference policy" $\mu$ close to the optimal policy $\pi_\star$ in a certain sense. We consider the policy finetuning problem in episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and horizon length $H$. We first design a sharp *offline reduction* algorithm---which simply executes $\mu$ and runs offline policy optimization on the collected dataset---that finds an $\varepsilon$ near-optimal policy within $\widetilde{O}(H^3SC^\star/\varepsilon^2)$ episodes, where $C^\star$ is the single-policy concentrability coefficient between $\mu$ and $\pi_\star$. This offline result is the first that matches the sample complexity lower bound in this setting, and resolves a recent open question in offline RL. We then establish an $\Omega(H^3S\min\{C^\star, A\}/\varepsilon^2)$ sample complexity lower bound for *any* policy finetuning algorithm, including those that can adaptively explore the environment. This implies that---perhaps surprisingly---the optimal policy finetuning algorithm is either offline reduction or a purely online RL algorithm that does not use $\mu$. Finally, we design a new hybrid offline/online algorithm for policy finetuning that achieves better sample complexity than both vanilla offline reduction and purely online RL algorithms, in a relaxed setting where $\mu$ only satisfies concentrability partially up to a certain time step. 
Overall, our results offer a quantitative understanding of the benefit of a good reference policy, and take a step towards bridging offline and online RL.
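The offline reduction in the first result has a simple shape: execute the reference policy to collect episodes, then run offline planning on the empirical model. The toy below (its MDP, rewards, uniform reference policy, and episode budget are all invented) sketches that shape; it is not the paper's algorithm or analysis.

```python
import random
random.seed(0)

S, A, H = 2, 2, 3
# Invented tabular MDP: P[(s, a)] is the distribution over next states,
# R[(s, a)] a deterministic reward; action 1 steers toward state 1, which pays off.
P = {(0, 0): [0.9, 0.1], (0, 1): [0.2, 0.8],
     (1, 0): [0.5, 0.5], (1, 1): [0.1, 0.9]}
R = {(0, 0): 0.1, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}

def step(s, a):
    s2 = 0 if random.random() < P[(s, a)][0] else 1
    return s2, R[(s, a)]

# Step 1: execute the reference policy mu (uniform here) and log the data.
counts, rew_sum, trans = {}, {}, {}
for _ in range(5000):
    s = 0
    for _ in range(H):
        a = random.randrange(A)
        s2, r = step(s, a)
        counts[(s, a)] = counts.get((s, a), 0) + 1
        rew_sum[(s, a)] = rew_sum.get((s, a), 0.0) + r
        trans[(s, a, s2)] = trans.get((s, a, s2), 0) + 1
        s = s2

# Step 2: offline planning -- backward value iteration in the empirical model.
V = [0.0] * S
pi = [[0] * S for _ in range(H)]
for h in reversed(range(H)):
    Vn = [0.0] * S
    for s in range(S):
        best, best_a = float("-inf"), 0
        for a in range(A):
            nsa = counts.get((s, a), 0)
            if nsa == 0:
                q = 0.0  # pessimistic default for unvisited pairs
            else:
                q = rew_sum[(s, a)] / nsa + sum(
                    trans.get((s, a, s2), 0) / nsa * V[s2] for s2 in range(S))
            if q > best:
                best, best_a = q, a
        Vn[s], pi[h][s] = best, best_a
    V = Vn

print(pi)  # greedy policy: steer toward state 1 early, then collect its reward
```

The single-policy concentrability coefficient $C^\star$ of the paper measures how well such data covers the optimal policy; a uniform reference policy in this tiny MDP covers everything, which is why plain empirical planning suffices here.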

NeurIPS Conference 2021 Conference Paper

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

  • Yu Bai
  • Chi Jin
  • Huan Wang
  • Caiming Xiong

Real-world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) the agents are inherently asymmetric and partitioned into leaders and followers; (2) the agents have different reward functions, so the game is general-sum. The majority of existing results in this field focus on either symmetric solution concepts (e.g., Nash equilibrium) or zero-sum games. It remains open how to learn the Stackelberg equilibrium---an asymmetric analog of the Nash equilibrium---in general-sum games efficiently from noisy samples. This paper initiates the theoretical study of sample-efficient learning of the Stackelberg equilibrium in the bandit feedback setting, where we only observe noisy samples of the reward. We consider three representative two-player general-sum games: bandit games, bandit-reinforcement learning (bandit-RL) games, and linear bandit games. In all these games, we identify a fundamental gap between the exact value of the Stackelberg equilibrium and its estimated version using finitely many noisy samples, which cannot be closed information-theoretically regardless of the algorithm. We then establish sharp positive results on sample-efficient learning of the Stackelberg equilibrium with value optimal up to the gap identified above, with matching lower bounds in the dependency on the gap, the error tolerance, and the size of the action spaces. Overall, our results unveil unique challenges in learning Stackelberg equilibria under noisy bandit feedback, which we hope will shed light on future research on this topic.
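In the simplest of these settings (bandit games), the estimation-then-commitment idea can be sketched as follows. The payoff tables, noise level, and sample count are invented, and this toy ignores the follower tie-breaking subtleties behind the paper's fundamental gap.

```python
import random
random.seed(0)

# Invented 2x2 general-sum game: uL = leader's mean rewards, uF = follower's,
# indexed [leader action][follower action].
uL = [[0.9, 0.1], [0.4, 0.6]]
uF = [[0.7, 0.3], [0.2, 0.8]]
N, sigma = 2000, 0.5

# Estimate every cell's mean reward from noisy bandit samples.
est_L = [[0.0, 0.0] for _ in range(2)]
est_F = [[0.0, 0.0] for _ in range(2)]
for a in range(2):
    for b in range(2):
        for _ in range(N):
            est_L[a][b] += random.gauss(uL[a][b], sigma) / N
            est_F[a][b] += random.gauss(uF[a][b], sigma) / N

# Follower best-responds to its own estimated rewards given the leader's action;
# the leader commits to the action maximizing its value under that response.
br = [max(range(2), key=lambda b: est_F[a][b]) for a in range(2)]
leader = max(range(2), key=lambda a: est_L[a][br[a]])
print(leader, br[leader])  # with these gaps and sample sizes: action 0, response 0
```

The paper's lower bounds concern exactly the regime this toy sidesteps: when the follower's two rewards are nearly tied, no finite number of noisy samples can reliably identify the best response, creating an unavoidable gap in the learned Stackelberg value.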

NeurIPS Conference 2021 Conference Paper

Understanding the Under-Coverage Bias in Uncertainty Estimation

  • Yu Bai
  • Song Mei
  • Huan Wang
  • Caiming Xiong

Estimating the data uncertainty in regression tasks is often done by learning a quantile function or a prediction interval of the true label conditioned on the input. It is frequently observed that quantile regression---a vanilla algorithm for learning quantiles with asymptotic guarantees---tends to *under-cover* relative to the desired coverage level in practice. While various fixes have been proposed, a more fundamental understanding of why this under-coverage bias happens in the first place remains elusive. In this paper, we present a rigorous theoretical study on the coverage of uncertainty estimation algorithms in learning quantiles. We prove that quantile regression suffers from an inherent under-coverage bias, in a vanilla setting where we learn a realizable linear quantile function and there is more data than parameters. More quantitatively, for $\alpha>0.5$ and small $d/n$, the $\alpha$-quantile learned by quantile regression roughly achieves coverage $\alpha - (\alpha-1/2)\cdot d/n$ regardless of the noise distribution, where $d$ is the input dimension and $n$ is the number of training data points. Our theory reveals that this under-coverage bias stems from a certain high-dimensional parameter estimation error that is not implied by existing theories on quantile regression. Experiments on simulated and real data verify our theory and further illustrate the effect of various factors, such as sample size and model capacity, on the under-coverage bias in more practical setups.
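The under-coverage prediction is easy to probe in simulation. The sketch below fits a linear quantile function by subgradient descent on the pinball loss; the dimensions, sample sizes, step-size schedule, and the shifted-Gaussian noise (1.2816 approximates the standard normal 0.9-quantile) are our own choices for illustration, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, d, n = 0.9, 20, 100

# Realizable linear setting: the noise has alpha-quantile zero, so the true
# alpha-quantile of y given x is exactly x @ w_star (no intercept needed).
w_star = rng.standard_normal(d)
def sample(m):
    X = rng.standard_normal((m, d)) / np.sqrt(d)
    y = X @ w_star + rng.standard_normal(m) - 1.2816
    return X, y

# Fit quantile regression: subgradient descent on the pinball (check) loss,
# whose subgradient in w is -alpha * x on under-predictions, (1-alpha) * x else.
X, y = sample(n)
w = np.zeros(d)
for t in range(1, 20001):
    r = y - X @ w
    g = X.T @ np.where(r > 0, -alpha, 1 - alpha) / n
    w -= 0.5 / np.sqrt(t) * g

# Coverage on fresh data: the abstract's formula suggests roughly
# alpha - (alpha - 1/2) * d / n = 0.82 here, i.e. below the nominal 0.9.
Xt, yt = sample(20000)
coverage = float((yt <= Xt @ w).mean())
print(coverage)
```

Note the under-coverage appears even though the model class is correctly specified and $n > d$; per the abstract, it is an estimation-error effect, not a misspecification effect.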

NeurIPS Conference 2020 Conference Paper

Near-Optimal Reinforcement Learning with Self-Play

  • Yu Bai
  • Chi Jin
  • Tiancheng Yu

This paper considers the problem of designing optimal algorithms for reinforcement learning in two-player zero-sum games. We focus on self-play algorithms which learn the optimal policy by playing against themselves without any direct supervision. In a tabular episodic Markov game with $S$ states, $A$ max-player actions, and $B$ min-player actions, the best existing algorithm for finding an approximate Nash equilibrium requires $\widetilde{O}(S^2AB)$ steps of game playing, when only highlighting the dependency on $(S, A, B)$. In contrast, the best existing lower bound scales as $\Omega(S(A+B))$ and has a significant gap from the upper bound. This paper closes this gap for the first time: we propose an optimistic variant of the Nash Q-learning algorithm with sample complexity $\widetilde{O}(SAB)$, and a new Nash V-learning algorithm with sample complexity $\widetilde{O}(S(A+B))$. The latter result matches the information-theoretic lower bound in all problem-dependent parameters except for a polynomial factor of the length of each episode. In addition, we present a computational hardness result for learning the best responses against a fixed opponent in Markov games---a learning objective different from finding the Nash equilibrium.

NeurIPS Conference 2020 Conference Paper

Towards Understanding Hierarchical Learning: Benefits of Neural Representations

  • Minshuo Chen
  • Yu Bai
  • Jason D. Lee
  • Tuo Zhao
  • Huan Wang
  • Caiming Xiong
  • Richard Socher

Deep neural networks can empirically perform efficient hierarchical learning, in which the layers learn useful representations of the data. However, how they make use of the intermediate representations is not explained by recent theories that relate them to "shallow learners" such as kernels. In this work, we demonstrate that intermediate \emph{neural representations} add more flexibility to neural networks and can be advantageous over raw inputs. We consider a fixed, randomly initialized neural network as a representation function fed into another trainable network. When the trainable network is the quadratic Taylor model of a wide two-layer network, we show that neural representations can achieve improved sample complexity compared with the raw input: for learning a low-rank degree-$p$ polynomial ($p \geq 4$) in $d$ dimensions, neural representations require only $\widetilde{O}(d^{\lceil p/2 \rceil})$ samples, while the best-known sample complexity upper bound for the raw input is $\widetilde{O}(d^{p-1})$. We contrast our result with a lower bound showing that neural representations do not improve over the raw input (in the infinite width limit) when the trainable network is instead a neural tangent kernel. Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
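A scaled-down version of this setup (with an ordinary ridge-regression head standing in for the paper's quadratic Taylor model, and all sizes invented) contrasts a trainable head on the raw input with the same head on a fixed random-ReLU representation, for a low-rank quartic target:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, n_train, n_test = 10, 500, 2000, 2000

# Low-rank degree-4 target: y = (v . x)^4 depends on a single direction v,
# so a linear model on the raw input can fit essentially nothing beyond the mean.
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
def make(n):
    X = rng.standard_normal((n, d))
    return X, (X @ v) ** 4

# Fixed, randomly initialized one-layer network used purely as a representation.
W = rng.standard_normal((d, m)) / np.sqrt(d)
b = rng.standard_normal(m)
feats = lambda X: np.maximum(X @ W + b, 0.0)

def ridge_fit(F, y, lam=1.0):
    Fb = np.hstack([F, np.ones((len(F), 1))])  # append an intercept column
    return np.linalg.solve(Fb.T @ Fb + lam * np.eye(Fb.shape[1]), Fb.T @ y)

def ridge_mse(F, y, w):
    Fb = np.hstack([F, np.ones((len(F), 1))])
    return float(np.mean((Fb @ w - y) ** 2))

Xtr, ytr = make(n_train)
Xte, yte = make(n_test)
mse_raw = ridge_mse(Xte, yte, ridge_fit(Xtr, ytr))                # head on raw input
mse_rep = ridge_mse(feats(Xte), yte, ridge_fit(feats(Xtr), ytr))  # head on representation
print(mse_raw, mse_rep)  # the representation head achieves lower test error
```

This is only a caricature of the theorem (a linear rather than quadratic head, and no sample-complexity scan over $d$), but it shows the qualitative claim: the fixed random representation turns a target outside the raw model's reach into one the trainable head can partially fit.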

NeurIPS Conference 2019 Conference Paper

Provably Efficient Q-Learning with Low Switching Cost

  • Yu Bai
  • Tengyang Xie
  • Nan Jiang
  • Yu-Xiang Wang

We take initial steps in studying PAC-MDP algorithms with limited adaptivity, that is, algorithms that change their exploration policy as infrequently as possible during regret minimization. This is motivated by the difficulty of running fully adaptive algorithms in real-world applications (such as medical domains), and we propose to quantify adaptivity using the notion of \emph{local switching cost}. Our main contribution, Q-Learning with UCB2 exploration, is a model-free algorithm for $H$-step episodic MDPs that achieves sublinear regret and whose local switching cost in $K$ episodes is $O(H^3SA\log K)$; we also provide a lower bound of $\Omega(HSA)$ on the local switching cost for any no-regret algorithm. Our algorithm can be naturally adapted to the concurrent setting \citep{guo2015concurrent}, which yields nontrivial results that improve upon prior work in certain aspects.
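The flavor of limited adaptivity can be shown in a toy two-armed bandit where the played arm is recomputed only at UCB2-style doubling epochs; the arm means, horizon, and confidence bonus below are invented, and this is a caricature of the $\log K$ switching idea rather than the paper's episodic-MDP algorithm.

```python
import math, random
random.seed(0)

means = [0.5, 0.6]      # two invented Bernoulli arms; arm 1 is better
K = 10000               # total number of pulls
count, total = [0, 0], [0.0, 0.0]
committed, switches, t = None, 0, 0

while t < K:
    # Recompute the played arm only at epoch boundaries; committing until the
    # chosen arm's count doubles keeps the number of epochs O(log K).
    def ucb(a):
        if count[a] == 0:
            return float("inf")
        return total[a] / count[a] + math.sqrt(2 * math.log(t + 1) / count[a])
    arm = max(range(2), key=ucb)
    if arm != committed:
        switches += 1
        committed = arm
    for _ in range(max(1, count[arm])):   # play until this arm's count doubles
        if t >= K:
            break
        r = 1.0 if random.random() < means[arm] else 0.0
        count[arm] += 1
        total[arm] += r
        t += 1

print(switches, count)  # switches grow like log K, not like K
```

A fully adaptive UCB learner could in principle switch arms on almost every pull; the doubling commitment bounds the number of epochs (hence switches) by roughly $2\log_2 K$, the one-state analogue of the $O(H^3SA\log K)$ local switching cost in the abstract.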