
ICLR 2025

Execution-guided within-prompt search for programming-by-example

Conference Paper · Accept (Poster) · Artificial Intelligence · Machine Learning

Abstract

Large language models (LLMs) can generate code from examples without being limited to a DSL, but they lack search, as sampled programs are independent. In this paper, we use an LLM as a policy that generates lines of code and then join these lines of code to let the LLM implicitly estimate the value of each of these lines in its next iteration. We further guide the policy and value estimation by executing each line and annotating it with its results on the given examples. This allows us to search for programs within a single (expanding) prompt until a sound program is found, by letting the policy reason in both the syntactic (code) and semantic (execution) space. We evaluate within-prompt search on straight-line Python code generation using five benchmarks across different domains (strings, lists, and arbitrary Python programming problems). We show that the model uses the execution results to guide the search and that within-prompt search performs well at low token budgets. We also analyze how the model behaves as a policy and value, show that it can parallelize the search, and that it can implicitly backtrack over earlier generations.
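The search loop described in the abstract — a policy proposes candidate lines, each line is executed on the given examples, the execution results are appended as annotations to a single growing prompt, and the loop stops once a program satisfies all examples — can be sketched roughly as follows. This is a minimal illustration under assumptions, not the paper's implementation: the real system uses an LLM as the policy, whereas `stub_policy`, `run_line`, and `within_prompt_search` are hypothetical names, with a hard-coded candidate list standing in for the model.

```python
# Hedged sketch of execution-guided within-prompt search on one toy
# programming-by-example task: map input string x to output string y.

EXAMPLES = [("hello", "HELLO"), ("abc", "ABC")]  # (input, expected output) pairs

def stub_policy(prompt):
    """Stand-in for the LLM policy: propose candidate next lines of code.
    A real policy would condition on the annotated prompt so far."""
    return ["y = x.lower()", "y = x[::-1]", "y = x.upper()"]

def run_line(line, x):
    """Execute one candidate line on example input x; return its result
    (or an error marker) so it can be written back into the prompt."""
    env = {"x": x}
    try:
        exec(line, {}, env)
        return env.get("y")
    except Exception as e:
        return f"<error: {e}>"

def within_prompt_search(max_iters=5):
    prompt = "# Synthesize: map input x to output y\n"
    for _ in range(max_iters):
        for line in stub_policy(prompt):
            # Annotate the candidate with its execution results on the
            # examples; the next policy call sees these annotations and can
            # implicitly estimate each line's value.
            results = [run_line(line, x) for x, _ in EXAMPLES]
            prompt += f"{line}  # -> {results}\n"
            if all(r == out for r, (_, out) in zip(results, EXAMPLES)):
                return line, prompt  # sound program found: stop the search
    return None, prompt  # budget exhausted

program, trace = within_prompt_search()
print(program)  # -> y = x.upper()
```

Because every failed candidate stays in the prompt together with its execution results, the policy can reason over both the syntactic (code) and semantic (execution) space, which is what lets it implicitly backtrack over earlier generations.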

Keywords

  • programming-by-example
  • program synthesis
  • large language models

Context

Venue
International Conference on Learning Representations