AAAI 2020
TreeGen: A Tree-Based Transformer Architecture for Code Generation
Abstract
A code generation system generates programming language code based on an input natural language description. State-of-the-art approaches rely on neural networks for code generation. However, these code generators suffer from two problems. One is the long-dependency problem, where a code element often depends on another far-away code element. A variable reference, for example, depends on its definition, which may appear many lines earlier. The other problem is structure modeling, as programs contain rich structural information. In this paper, we propose a novel tree-based neural architecture, TreeGen, for code generation. TreeGen uses the attention mechanism of Transformers to alleviate the long-dependency problem, and introduces a novel AST reader (encoder) to incorporate grammar rules and AST structures into the network. We evaluated TreeGen on a Python benchmark, HearthStone, and two semantic parsing benchmarks, ATIS and GEO. TreeGen outperformed the previous state-of-the-art approach by 4.5 percentage points on HearthStone, and achieved the best accuracy among neural network-based approaches on ATIS (89.1%) and GEO (89.6%). We also conducted an ablation test to better understand each component of our model.
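The abstract's central design point is that decoding operates over grammar rules rather than raw tokens: the model builds an AST by predicting one production at a time. The sketch below, using a hypothetical toy grammar, illustrates this general grammar-based decoding view; it is not the authors' implementation, and all names in it are made up for illustration.

```python
# A minimal sketch, not TreeGen's code: it shows how an AST is produced
# by a sequence of grammar-rule applications, each of which a neural
# decoder would predict as one classification step. The toy grammar and
# helper names below are hypothetical.
from dataclasses import dataclass, field

GRAMMAR = {
    # nonterminal -> list of productions (each production is a tuple of symbols)
    "stmt":   [("assign",), ("expr",)],
    "assign": [("NAME", "=", "expr")],
    "expr":   [("NAME",), ("NUM",), ("expr", "+", "expr")],
}

@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)

def expand(root_symbol, rule_sequence):
    """Build an AST by expanding the leftmost open nonterminal with each rule."""
    root = Node(root_symbol)
    frontier = [root]  # open nonterminals, leftmost first
    for nonterminal, production_index in rule_sequence:
        node = frontier.pop(0)
        assert node.symbol == nonterminal, (
            f"rule for {nonterminal} applied to {node.symbol}")
        node.children = [Node(s) for s in GRAMMAR[nonterminal][production_index]]
        # newly opened nonterminals are expanded before the rest (leftmost order)
        frontier = [c for c in node.children if c.symbol in GRAMMAR] + frontier
    return root

# Deriving the shape of "x = 1": stmt -> assign -> NAME = expr -> NAME = NUM.
# Terminal values ("x", "1") would be filled in by separate prediction steps.
ast = expand("stmt", [("stmt", 0), ("assign", 0), ("expr", 1)])
print(ast)
```

Viewed this way, each decoding step chooses among the small set of productions that are legal for the current nonterminal, so the generated output is syntactically well-formed by construction.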
Authors
Zeyu Sun, Qihao Zhu, Yingfei Xiong, Yican Sun, Lili Mou, Lu Zhang
Keywords
No keywords are indexed for this paper.
Context
- Venue: AAAI Conference on Artificial Intelligence
- Archive span: 1980-2026
- Indexed papers: 28718
- Paper id: 1042082770506911006