AAAI 2019
Tied Transformers: Neural Machine Translation with Shared Encoder and Decoder
Abstract
Sharing source and target side vocabularies and word embeddings has been a popular practice in neural machine translation (briefly, NMT) for similar languages (e.g., English to French or German translation). The success of such word-level sharing motivates us to move one step further: we consider model-level sharing and tie the encoder and decoder of an NMT model as a whole. We share the encoder and decoder of Transformer (Vaswani et al. 2017), the state-of-the-art NMT model, and obtain a compact model named Tied Transformer. Experimental results demonstrate that such a simple method works well for both similar and dissimilar language pairs. We empirically verify our framework for both supervised NMT and unsupervised NMT: we achieve a 35.52 BLEU score on IWSLT 2014 German to English translation, 28.98/29.89 BLEU scores on WMT 2014 English to German translation without/with monolingual data, and a 22.05 BLEU score on WMT 2016 unsupervised German to English translation.
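To make the model-level sharing concrete, here is a minimal, illustrative PyTorch sketch, not the authors' implementation. It assumes the self-attention and feed-forward sublayers (with their layer norms) of each layer are reused by both the encoder and the decoder, while cross-attention remains decoder-only; all names (TiedLayer, TiedTransformer, d_ff, etc.) are hypothetical.

```python
import torch
import torch.nn as nn


class TiedLayer(nn.Module):
    """One layer whose self-attention and FFN serve both encoder and decoder."""

    def __init__(self, d_model=512, nhead=8, d_ff=2048, dropout=0.1):
        super().__init__()
        # Shared sublayers: applied on both the source and target side.
        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        # Decoder-only sublayer: attends over the encoder output.
        self.cross_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)
        self.norm3 = nn.LayerNorm(d_model)

    def encode(self, x):
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        return self.norm2(x + self.ffn(x))

    def decode(self, y, memory, tgt_mask):
        h, _ = self.self_attn(y, y, y, attn_mask=tgt_mask)
        y = self.norm1(y + h)
        h, _ = self.cross_attn(y, memory, memory)
        y = self.norm3(y + h)
        return self.norm2(y + self.ffn(y))


class TiedTransformer(nn.Module):
    def __init__(self, vocab_size=32000, num_layers=6, d_model=512):
        super().__init__()
        # Word-level sharing: one embedding table for source and target.
        self.embed = nn.Embedding(vocab_size, d_model)
        # Model-level sharing: one layer stack acts as encoder and decoder.
        self.layers = nn.ModuleList(TiedLayer(d_model) for _ in range(num_layers))
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        src = self.embed(src_ids)
        tgt = self.embed(tgt_ids)
        # Causal mask so each target position only attends to earlier ones.
        L = tgt.size(0)
        tgt_mask = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        for layer in self.layers:
            src = layer.encode(src)
        memory = src
        for layer in self.layers:
            tgt = layer.decode(tgt, memory, tgt_mask)
        return self.proj(tgt)


# Usage: shapes are (seq_len, batch) because nn.MultiheadAttention
# defaults to sequence-first inputs.
model = TiedTransformer()
src = torch.randint(0, 32000, (7, 2))
tgt = torch.randint(0, 32000, (5, 2))
logits = model(src, tgt)  # (5, 2, 32000)
```

Because every shared sublayer is stored once, this roughly halves the encoder/decoder parameter count relative to an untied Transformer of the same depth, which is the compactness the abstract refers to.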
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue: AAAI Conference on Artificial Intelligence
- Archive span: 1980-2026
- Indexed papers: 28718
- Paper id: 848417238008006077