Information-Theoretic Discrete Diffusion

Moongyu Jeon; Sangwoo Shin; Dongjae Jeon; Albert No

NeurIPS 2025

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

We present an information-theoretic framework for discrete diffusion models that yields principled estimators of log-likelihood using score-matching losses. Inspired by the I-MMSE identity for the Gaussian setup, we derive analogous results for the discrete setting. Specifically, we introduce the Information–Minimum Denoising Score Entropy (I-MDSE) relation, which links mutual information between data and its diffused version to the minimum denoising score entropy (DSE) loss. We extend this theory to masked diffusion and establish the Information–Minimum Denoising Cross-Entropy (I-MDCE) relation, connecting cross-entropy losses to mutual information in discrete masked processes. These results provide a time-integral decomposition of the log-likelihood of the data in terms of optimal score-based losses, showing that commonly used losses such as DSE and DCE are not merely variational bounds but tight and principled estimators of log-likelihood. The I-MDCE decomposition further enables practical extensions, including a time-free formula, conditional likelihood estimation in prompt–response tasks, and coupled Monte Carlo estimation of likelihood ratios. Experiments on synthetic and real-world data confirm the accuracy, variance stability, and utility of our estimators. The code is publicly available at https://github.com/Dongjae0324/infodis.