NeurIPS 2025 Conference Paper
Training Language Models to Generate Quality Code with Program Analysis Feedback
- Feng Yao
- Zilong Wang
- Liyuan Liu
- Junxia Cui
- Li Zhong
- Xiaohan Fu
- Haohui Mai
- Viswanathan Krishnan
Code generation with large language models (LLMs), often termed vibe coding, is increasingly adopted in production but fails to ensure code quality, particularly in security (e.g., SQL injection vulnerabilities) and maintainability (e.g., missing type annotations). Existing methods, such as supervised fine-tuning and rule-based post-processing, rely on labor-intensive annotations or brittle heuristics, limiting their scalability and effectiveness. We propose REAL (Reinforcement rEwards from Automated anaLysis), a reinforcement learning framework that trains LLMs to generate production-quality code using program analysis–guided feedback. Specifically, REAL integrates two automated signals: (1) static analyzers detecting security and maintainability defects and (2) unit tests ensuring functional correctness. Unlike prior work, our framework is prompt-agnostic and reference-free, enabling scalable supervision without manual intervention. Experiments across multiple datasets and model scales demonstrate that REAL outperforms state-of-the-art methods in simultaneous assessments of functionality and code quality. Our work bridges the gap between rapid prototyping and production-ready code, enabling LLMs to deliver both speed and quality.
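To make the two feedback signals concrete, the following is a minimal illustrative sketch of how a static-analysis score and a unit-test score might be combined into one scalar reward. It is not the paper's implementation: the function names (`quality_score`, `functional_score`, `hybrid_reward`), the toy AST checks standing in for a real static analyzer, and the mixing weight `alpha` are all assumptions introduced here for illustration.

```python
import ast


def quality_score(source: str) -> float:
    """Toy stand-in for a static analyzer (assumption, not the paper's tool).

    Returns the fraction of checks passed, where the checks are:
    - each function has annotations on all arguments and on its return value;
    - each `.execute(...)` call does not build its first argument from an
      f-string or a binary operation such as `+` or `%` (a crude
      SQL-injection heuristic).
    Raises SyntaxError if the source does not parse.
    """
    tree = ast.parse(source)
    checks = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            args = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            annotated = all(a.annotation is not None for a in args)
            checks.append(annotated and node.returns is not None)
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
            and node.args
        ):
            unsafe = isinstance(node.args[0], (ast.JoinedStr, ast.BinOp))
            checks.append(not unsafe)
    return sum(checks) / len(checks) if checks else 1.0


def functional_score(source: str, tests: list[str]) -> float:
    """Fraction of assertion snippets that pass against the candidate code.

    NOTE: exec() on model output is unsafe; a real setup would need a sandbox.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
    except Exception:
        return 0.0
    passed = 0
    for test in tests:
        try:
            exec(test, dict(namespace))  # copy so tests cannot affect each other
            passed += 1
        except Exception:
            pass
    return passed / len(tests) if tests else 0.0


def hybrid_reward(source: str, tests: list[str], alpha: float = 0.5) -> float:
    """Weighted mix of both signals; `alpha` is a hypothetical hyperparameter."""
    try:
        quality = quality_score(source)
    except SyntaxError:
        return 0.0  # unparseable code earns no reward
    return alpha * functional_score(source, tests) + (1 - alpha) * quality


if __name__ == "__main__":
    tests = ["assert add(1, 2) == 3", "assert add(-1, 1) == 0"]
    annotated = "def add(a: int, b: int) -> int:\n    return a + b\n"
    unannotated = "def add(a, b):\n    return a + b\n"
    print(hybrid_reward(annotated, tests))
    print(hybrid_reward(unannotated, tests))
```

Both candidates in the usage example pass the same unit tests, but only the annotated one passes the toy maintainability check, so it receives the higher reward. Neither score needs a reference solution, which is the sense in which such supervision can be reference-free.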