Antidistillation Sampling

Yash Savani; Asher Trockman; Zhili Feng; Yixuan Xu; Avi Schwarzschild; Alexander Robey; Marc Finzi; Zico Kolter

NeurIPS 2025

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

Frontier models that generate extended reasoning traces inadvertently produce token sequences that can facilitate model distillation. Recognizing this vulnerability, model owners may seek sampling strategies that limit the effectiveness of distillation without compromising model performance. Antidistillation sampling provides exactly this capability. By strategically modifying a model's next-token probability distribution, antidistillation sampling poisons reasoning traces, rendering them significantly less effective for distillation while preserving the model's utility.
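
The abstract describes the approach only at a high level: the next-token distribution is modified before sampling. As a rough illustration of that general idea, and not the authors' actual method, the sketch below subtracts a per-token penalty from the logits before applying softmax. The `penalty` scores, the weight `lam`, and all function names are hypothetical; the abstract does not say how the adjustment is computed.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adjusted_distribution(logits, penalty, lam=1.0, temperature=1.0):
    """Return a next-token distribution with penalized logits.

    penalty[i] is a HYPOTHETICAL score standing in for "how useful token i
    would be to a distilling student"; it is an assumption made for this
    sketch and is not defined in the abstract.
    lam trades off fidelity to the original distribution (lam = 0) against
    the strength of the adjustment.
    """
    adjusted = [l / temperature - lam * p for l, p in zip(logits, penalty)]
    return softmax(adjusted)


def sample_token(probs, rng):
    """Draw one token index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy vocabulary of three tokens with made-up numbers.
logits = [2.0, 1.0, 0.5]
penalty = [1.5, 0.0, 0.0]  # token 0 is assumed to be the most "distillable"

base = softmax(logits)
poisoned = adjusted_distribution(logits, penalty, lam=1.0)

rng = random.Random(0)
token = sample_token(poisoned, rng)
```

With `lam=0` the adjusted distribution reduces to the ordinary softmax, so in this sketch `lam` is the single knob that controls how far sampling departs from the model's original behavior.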

Keywords

No keywords are indexed for this paper.

Context

Venue: Annual Conference on Neural Information Processing Systems
Archive span: 1987-2025
Indexed papers: 30776
Paper id: 1084152708520479700