DINGO: Constrained Inference for Diffusion LLMs

Tarun Suresh; Debangshu Banerjee; Shubham Ugare; Sasa Misailovic; Gagandeep Singh

NeurIPS 2025

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

Diffusion LLMs have emerged as a promising alternative to conventional autoregressive LLMs, offering substantial potential for improving runtime efficiency. However, existing diffusion models cannot provably enforce user-specified formal constraints, such as regular expressions, which makes them unreliable for tasks that require structured outputs, such as fixed-schema JSON generation. Unlike autoregressive models, which generate tokens sequentially, diffusion LLMs predict a block of tokens in parallel. This parallelism makes traditional constrained decoding algorithms, which are designed to enforce constraints under sequential token prediction, ineffective at preserving the true output distribution. To address this limitation, we propose DINGO, a dynamic programming-based constrained decoding strategy that is both efficient and provably distribution-preserving. DINGO enables sampling of output strings with the highest probability under the model’s predicted distribution while strictly adhering to any user-specified regular expression. On standard symbolic math and JSON generation benchmarks, DINGO achieves an improvement of up to 68 percentage points over unconstrained inference. The code is available at [**DINGO**](https://github.com/uiuc-focal-lab/DINGO).
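To make the dynamic programming idea concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that each position in a block has an independent token distribution, that every token is a single symbol of a hand-written DFA for the target regular expression, and that the whole block must be accepted by the DFA. The function name `constrained_argmax` and the toy DFA are hypothetical; the paper's algorithm and the linked repository may differ in how they treat tokenization, mask tokens, and partial matches.

```python
import math

def constrained_argmax(probs, delta, start, accepting):
    """Viterbi-style search for the most probable DFA-accepted token sequence.

    probs:     list of dicts (token -> probability), one dict per position;
               positions are assumed independent (an assumption of this sketch).
    delta:     dict mapping (state, token) -> next state; missing keys reject.
    start:     initial DFA state.
    accepting: set of accepting DFA states.
    Returns (tokens, probability), or (None, 0.0) if no valid sequence exists.
    """
    # best[q] = (log-probability, tokens) of the best prefix ending in state q
    best = {start: (0.0, [])}
    for dist in probs:
        nxt = {}
        for q, (logp, toks) in best.items():
            for tok, p in dist.items():
                if p <= 0.0 or (q, tok) not in delta:
                    continue  # zero-probability token or invalid transition
                r = delta[(q, tok)]
                cand = logp + math.log(p)
                if r not in nxt or cand > nxt[r][0]:
                    nxt[r] = (cand, toks + [tok])
        best = nxt
    finals = [v for q, v in best.items() if q in accepting]
    if not finals:
        return None, 0.0
    logp, toks = max(finals, key=lambda v: v[0])
    return toks, math.exp(logp)

# Toy DFA for the pattern digit ("+" digit)* over the tokens "1", "2", "+".
delta = {(0, "1"): 1, (0, "2"): 1, (1, "+"): 0}
probs = [
    {"1": 0.5, "2": 0.1, "+": 0.4},
    {"1": 0.6, "2": 0.1, "+": 0.3},
    {"1": 0.2, "2": 0.7, "+": 0.1},
]
tokens, prob = constrained_argmax(probs, delta, start=0, accepting={1})
print(tokens)  # ['1', '+', '2']
```

In this toy block, choosing the most likely token at each position independently gives `1 1 2`, which the DFA rejects. The dynamic program instead keeps one best prefix per DFA state and returns `1 + 2`, the highest-probability sequence among those the DFA accepts (probability approximately 0.105). The cost is proportional to block length times number of DFA states times vocabulary size.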
