Arrow Research search
Back to NeurIPS

NeurIPS 2025

Image Editing As Programs with Diffusion Models

Conference Paper Main Conference Track Artificial Intelligence ยท Machine Learning

Abstract

While diffusion models have achieved remarkable success in text-to-image generation, they encounter significant challenges with instruction-driven image editing. Our research highlights a key challenge: these models particularly struggle with structurally-inconsistent edits that involve substantial layout changes. To address this gap, we introduce Image Editing As Programs (IEAP), a unified image editing framework built upon the Diffusion Transformer (DiT) architecture. Specifically, IEAP deals with complex instructions by decomposing them into a sequence of programmable atomic operations. Each atomic operation manages a specific type of structurally consistent edit; when sequentially combined, IEAP enables the execution of arbitrary and structurally-inconsistent transformations. This reductionist approach enables IEAP to robustly handle a wide spectrum of edits, encompassing both structurally-consistent and inconsistent changes. Extensive experiments demonstrate that IEAP significantly outperforms state-of-the-art methods on standard benchmarks across various editing scenarios. In these evaluations, our framework delivers superior accuracy and semantic fidelity, particularly for complex, multi-step instructions. Codes are available at https: //github. com/YujiaHu1109/IEAP.

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Annual Conference on Neural Information Processing Systems
Archive span
1987-2025
Indexed papers
30776
Paper id
901840149987250880