InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct

Yutong Wu; Di Huang; Wenxuan Shi; Wei Wang; Yewen Pu; Lingzhe Gao; Shihao Liu; Ziyuan Nan; Kaizhao Yuan; Rui Zhang; Xishan Zhang; Zidong Du; Qi Guo; Dawei Yin; Xing Hu; Yunji Chen

doi:10.1609/aaai.v39i24.34742

AAAI 2025

Conference Paper AAAI Technical Track on Natural Language Processing III Artificial Intelligence

Abstract

Recent advancements in open-source code large language models (LLMs) have been driven by fine-tuning on data generated by powerful closed-source LLMs, which is expensive to obtain. This paper explores whether a fine-tuned open-source model can generate additional data to augment its own instruction-tuning dataset. We make two observations: (1) A code snippet can serve as the response to different instructions. (2) Instruction-tuned code LLMs perform better at translating code into instructions than the reverse. Based on these observations, we propose Inverse-Instruct, a data augmentation technique that uses a fine-tuned LLM to generate additional instructions for the code responses in its own training dataset. The additional instruction-response pairs are added to the original dataset, and a stronger code LLM can be obtained by fine-tuning on the augmented dataset. We empirically validate Inverse-Instruct on a range of open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (e.g., HumanEval(+), MBPP(+), DS-1000, and MultiPL-E), showing it consistently improves the base models.
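
The augmentation loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `inverse_instruct`, the `code_to_instruction` callable (standing in for the fine-tuned LLM prompted to turn code into an instruction), and the dictionary keys `instruction` and `response` are all assumptions made for illustration, and the duplicate check is an added convenience that the abstract does not mention.

```python
from typing import Callable, Dict, List

Pair = Dict[str, str]  # hypothetical record layout: {"instruction": ..., "response": ...}


def inverse_instruct(
    dataset: List[Pair],
    code_to_instruction: Callable[[str], str],
) -> List[Pair]:
    """Return the original pairs plus one generated pair per code response.

    `code_to_instruction` is a placeholder for the fine-tuned model: it takes
    a code response and returns a new natural-language instruction for it.
    """
    augmented = list(dataset)  # keep every original pair
    seen = {(p["instruction"], p["response"]) for p in dataset}

    for pair in dataset:
        code = pair["response"]
        new_instruction = code_to_instruction(code)
        key = (new_instruction, code)
        # Assumption: skip exact duplicates of pairs already in the dataset.
        if key in seen:
            continue
        seen.add(key)
        augmented.append({"instruction": new_instruction, "response": code})

    return augmented


def stub_model(code: str) -> str:
    """Stand-in for the LLM so the sketch runs without any model."""
    first_line = code.strip().splitlines()[0]
    return f"Write code that begins with: {first_line}"


original = [
    {"instruction": "Add two numbers.", "response": "def add(a, b):\n    return a + b"},
    {"instruction": "Reverse a string.", "response": "def rev(s):\n    return s[::-1]"},
]
augmented = inverse_instruct(original, stub_model)
```

In this sketch the same code response ends up paired with two different instructions, which reflects observation (1); the augmented list would then be used for a further round of fine-tuning.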
