Author name cluster

Varun Kanade

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

30 papers

2 author rows

FOCS Conference 2025 Conference Paper

How Global Calibration Strengthens Multiaccuracy

Sílvia Casacuberta
Parikshit Gopalan
Varun Kanade
Omer Reingold

Multiaccuracy and multicalibration are multi-group fairness notions for prediction that have found numerous applications in learning and computational complexity [HKRR18]. They can be achieved from a single learning primitive: weak agnostic learning. A line of work starting from [GKR+22] has shown that multicalibration implies a very strong form of learning. Here we investigate the power of multiaccuracy as a learning primitive, both with and without the additional assumption of calibration. We find that multiaccuracy in itself is rather weak, but that the addition of global calibration (this notion is called calibrated multiaccuracy) boosts its power substantially, enough to recover implications that were previously known only assuming the stronger notion of multicalibration. We give evidence that multiaccuracy might not be as powerful as standard weak agnostic learning, by showing that there is no way to post-process a multiaccurate predictor to get a weak learner, even assuming the best hypothesis has correlation 1/2. Rather, we show that it yields a restricted form of weak agnostic learning, which requires some concept in the class to have correlation greater than 1/2 with the labels. However, by also requiring the predictor to be calibrated, we recover not just weak, but strong agnostic learning. A similar picture emerges when we consider the derivation of hardcore measures from predictors satisfying multigroup fairness notions [TTV09], [CDV24]. On the one hand, while multiaccuracy only yields hardcore measures of density half the optimal, we show that (a weighted version of) calibrated multiaccuracy achieves optimal density. Our results yield new insights into the complementary roles played by multiaccuracy and calibration in each setting. They shed light on why multiaccuracy and global calibration, although not particularly powerful by themselves, together yield considerably stronger notions.

Details

NeurIPS Conference 2025 Conference Paper

Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers

Charles London
Varun Kanade

Pause tokens, simple filler symbols such as ". .. ", consistently improve Transformer performance on both language and mathematical tasks, yet their theoretical effect remains unexplained. We provide the first formal separation result, proving that adding pause tokens to constant-depth, logarithmic-width Transformers strictly increases their computational expressivity. With bounded-precision activations, Transformers without pause tokens compute only a strict subset of $\mathsf{AC}^0$ functions, while adding a polynomial number of pause tokens enables expressing the complete class. For logarithmic-precision Transformers, we show that adding pause tokens achieves expressivity equivalent to $\mathsf{TC}^0$, matching known upper bounds. Empirically, we demonstrate that two‑layer causally masked Transformers can learn parity when supplied with pause tokens, a function that they appear unable to learn without them. Our results provide a rigorous theoretical explanation for prior empirical findings, clarify how pause tokens interact with width, depth, and numeric precision, and position them as a distinct mechanism, complementary to chain-of-thought prompting, for enhancing Transformer reasoning.