
NeurIPS 2025

Causal Discovery and Inference through Next-Token Prediction

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

Deep neural networks have been criticized as fundamentally statistical systems that fail to capture causal structure or perform causal reasoning. Here we demonstrate that a GPT-style transformer trained for next-token prediction can simultaneously discover instances of linear Gaussian structural causal models (SCMs) and learn to answer counterfactual queries about those SCMs. First, we show that the network generalizes to counterfactual queries about SCMs for which it has seen interventional data but no examples of counterfactual inference. The network must thus have composed its discovered causal structures with a learned counterfactual inference algorithm. Second, we decode the implicit “mental” SCM from the network's residual stream activations and manipulate it using gradient descent, with predictable effects on the network's output. Our results suggest that statistical prediction may be sufficient to drive the emergence of internal causal models and causal inference capacities in deep neural networks.
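To make the counterfactual queries concrete, here is a minimal illustrative sketch (not code from the paper) of counterfactual inference in a two-variable linear Gaussian SCM, using the standard abduction–action–prediction procedure. The structural coefficient `a` and the variable names are assumptions for illustration only.

```python
# Illustrative toy SCM (not from the paper):
#   X := U_x,            U_x ~ N(0, 1)
#   Y := a * X + U_y,    U_y ~ N(0, 1)
a = 2.0  # assumed structural coefficient, for illustration

def counterfactual_y(x_obs, y_obs, x_cf):
    """Answer "what would Y have been had X been x_cf?" via the
    standard three-step procedure for SCMs:
      1. Abduction:  recover the exogenous noise from the observation.
      2. Action:     intervene, setting X to the counterfactual value x_cf.
      3. Prediction: recompute Y under the same recovered noise.
    """
    u_y = y_obs - a * x_obs   # abduction: U_y is identified exactly here
    return a * x_cf + u_y     # action + prediction

# Factual observation: X = 1, Y = 2.5 (so U_y = 0.5).
# Counterfactual: had X been 3, Y would have been 2*3 + 0.5 = 6.5.
print(counterfactual_y(1.0, 2.5, 3.0))
```

In the linear Gaussian case the exogenous noise is identified exactly from a single observation, which is what makes such SCMs a clean testbed for probing whether a trained network has composed structure discovery with counterfactual inference.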

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Annual Conference on Neural Information Processing Systems
Archive span
1987-2025
Indexed papers
30776
Paper id
554416344254924821