Alternating Language Modeling for Cross-Lingual Pre-Training

Jian Yang; Shuming Ma; Dongdong Zhang; ShuangZhi Wu; Zhoujun Li; Ming Zhou

AAAI 2020

Conference Paper AAAI Technical Track: Natural Language Processing Artificial Intelligence

Abstract

Language model pre-training has achieved success in many natural language processing tasks. Existing methods for cross-lingual pre-training adopt the Translation Language Model to predict masked words from the concatenation of the source sentence and its target equivalent. In this work, we introduce a novel cross-lingual pre-training method, called Alternating Language Modeling (ALM). It code-switches sentences of different languages rather than simply concatenating them, with the aim of capturing the rich cross-lingual context of words and phrases. More specifically, we randomly substitute source phrases with their target translations to create code-switched sentences. Then, we use these code-switched data to train the ALM model to predict words of different languages. We evaluate our pre-trained ALM on the downstream tasks of machine translation and cross-lingual classification. Experiments show that ALM can outperform the previous pre-training methods on three benchmarks.
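
The phrase substitution step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `code_switch`, the `phrase_pairs` format (source span plus target tokens), and the `swap_prob` parameter are all hypothetical, and the sketch assumes phrase-level translations have already been obtained by some external alignment step that the abstract does not specify.

```python
import random


def code_switch(source_tokens, phrase_pairs, swap_prob=0.5, rng=None):
    """Build a code-switched sentence from a source sentence.

    source_tokens: list of source-language tokens.
    phrase_pairs:  list of (start, end, target_tokens) tuples, where
                   source_tokens[start:end] translates to target_tokens
                   (hypothetical format, assumed to be given).
    swap_prob:     probability of replacing each source phrase
                   (hypothetical parameter).
    """
    if rng is None:
        rng = random.Random()
    output = []
    cursor = 0
    for start, end, target_tokens in sorted(phrase_pairs, key=lambda p: p[0]):
        if start < cursor:
            continue  # skip spans that overlap one already processed
        # Copy untouched source tokens that precede this phrase.
        output.extend(source_tokens[cursor:start])
        if rng.random() < swap_prob:
            output.extend(target_tokens)  # substitute the target translation
        else:
            output.extend(source_tokens[start:end])  # keep the source phrase
        cursor = end
    # Copy whatever remains after the last phrase.
    output.extend(source_tokens[cursor:])
    return output


if __name__ == "__main__":
    src = ["we", "love", "machine", "translation"]
    pairs = [(1, 2, ["aimons"]), (2, 4, ["la", "traduction", "automatique"])]
    print(code_switch(src, pairs, swap_prob=0.5, rng=random.Random(0)))
```

Sentences produced this way mix tokens from both languages in a single sequence, which is the kind of input the abstract says ALM is trained on; how words are then selected for prediction is not detailed in the abstract and is left out of the sketch.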
