Arrow Research

Author name cluster

Victor O.K. Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

AAAI Conference 2021 Conference Paper

Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance

  • Guanhua Chen
  • Yun Chen
  • Victor O.K. Li

Lexically constrained neural machine translation (NMT), which leverages pre-specified translations to constrain NMT output, has practical significance in interactive translation and NMT domain adaptation. Previous works either modify the decoding algorithm or train the model on augmented datasets. These methods suffer from either high computational overheads or low copying success rates. In this paper, we investigate ATT-INPUT and ATT-OUTPUT, two alignment-based constrained decoding methods. These two methods revise the target tokens during decoding based on word alignments derived from encoder-decoder attention weights. Our study shows that ATT-INPUT translates better while ATT-OUTPUT is more computationally efficient. Capitalizing on both strengths, we further propose EAM-OUTPUT by introducing an explicit alignment module (EAM) to a pretrained Transformer. It decodes similarly to ATT-OUTPUT, except using alignments derived from the EAM. We leverage the word alignments induced from ATT-INPUT as labels and train the EAM while keeping the parameters of the Transformer frozen. Experiments on WMT16 De-En and WMT16 Ro-En show the effectiveness of our approaches on constrained NMT. In particular, the proposed EAM-OUTPUT method consistently outperforms previous approaches in translation quality, with light computational overhead over the unconstrained baseline.
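As an illustration of the alignment extraction this abstract describes, here is a minimal sketch (function and variable names are hypothetical, not from the paper) of deriving word alignments from an encoder-decoder attention matrix by linking each target token to its most-attended source token:

```python
import numpy as np

def align_from_attention(attn):
    """Return (target_idx, source_idx) alignment links from an attention
    matrix of shape (target_len, source_len): each target token is aligned
    to the source token it attends to most."""
    return [(t, int(np.argmax(row))) for t, row in enumerate(attn)]

attn = np.array([
    [0.7, 0.2, 0.1],   # target token 0 attends mostly to source token 0
    [0.1, 0.8, 0.1],   # target token 1 -> source token 1
    [0.2, 0.1, 0.7],   # target token 2 -> source token 2
])
print(align_from_attention(attn))  # [(0, 0), (1, 1), (2, 2)]
```

Constrained decoding methods like those studied here then use such links to decide which target positions should copy a pre-specified translation.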

AAAI Conference 2021 Conference Paper

Show Me How To Revise: Improving Lexically Constrained Sentence Generation with XLNet

  • Xingwei He
  • Victor O.K. Li

Lexically constrained sentence generation allows the incorporation of prior knowledge, such as lexical constraints, into the output. This technique has been applied to machine translation and dialog response generation. Previous work usually used Markov Chain Monte Carlo (MCMC) sampling to generate lexically constrained sentences, but randomly determined the position to be edited and the action to be taken, resulting in many invalid refinements. To overcome this challenge, we used a classifier to instruct the MCMC-based models where and how to refine the candidate sentences. First, we developed two methods to create synthetic data on which the pre-trained model is fine-tuned to obtain a reliable classifier. Next, we proposed a two-step approach, “Predict and Revise”, for constrained sentence generation. During the predict step, we leveraged the classifier to compute the learned prior for the candidate sentence. During the revise step, we resorted to MCMC sampling to revise the candidate sentence by conducting a sampled action at a sampled position drawn from the learned prior. We compared our proposed models with many strong baselines on two tasks: generating sentences with lexical constraints and text infilling. Experimental results demonstrate that our proposed model performs much better than previous work in terms of sentence fluency and diversity. Our code, pre-trained models, and appendix are available at https://github.com/NLPCode/MCMCXLNet.
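The “Predict and Revise” loop described above can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the stand-in `predict` returns a uniform prior over (position, action) pairs, whereas the paper uses a fine-tuned XLNet classifier to compute a learned prior.

```python
import random

ACTIONS = ("replace", "insert", "delete")

def predict(sentence):
    # Stand-in "learned prior": uniform scores over positions and actions.
    # The real model scores each (position, action) with a classifier.
    return [((i, a), 1.0) for i in range(len(sentence)) for a in ACTIONS]

def revise(sentence, position, action, token="word"):
    # Apply one local edit at the sampled position.
    s = list(sentence)
    if action == "replace":
        s[position] = token
    elif action == "insert":
        s.insert(position, token)
    elif action == "delete" and len(s) > 1:
        del s[position]
    return s

def predict_and_revise(sentence, steps=5, seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        prior = predict(sentence)                 # predict step
        (pos, act), _ = rng.choice(prior)         # sample from the prior
        sentence = revise(sentence, pos, act)     # revise step
    return sentence
```

In the actual method, the sampled revision is also subject to an MCMC accept/reject decision and the replacement token comes from the language model, not a fixed placeholder.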

AAAI Conference 2020 Conference Paper

Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks

  • Yong Wang
  • Longyue Wang
  • Shuming Shi
  • Victor O.K. Li
  • Zhaopeng Tu

The key challenge of multi-domain translation lies in simultaneously encoding, in a unified model, both the general knowledge shared across domains and the particular knowledge distinctive to each domain. Previous work shows that the standard neural machine translation (NMT) model, trained on mixed-domain data, generally captures the general knowledge but misses the domain-specific knowledge. In response to this problem, we augment the NMT model with additional domain transformation networks that transform the general representations into domain-specific representations, which are subsequently fed to the NMT decoder. To guarantee the knowledge transformation, we also propose two complementary supervision signals by leveraging the power of knowledge distillation and adversarial learning. Experimental results on several language pairs, covering both balanced and unbalanced multi-domain translation, demonstrate the effectiveness and universality of the proposed approach. Encouragingly, the proposed unified model achieves results comparable with the fine-tuning approach, which requires multiple models to preserve the particular knowledge. Further analyses reveal that the domain transformation networks successfully capture the domain-specific knowledge as expected.

AAAI Conference 2018 Conference Paper

Neural Machine Translation with Gumbel-Greedy Decoding

  • Jiatao Gu
  • Daniel Jiwoong Im
  • Victor O.K. Li

Previous neural machine translation models used heuristic search algorithms (e.g., beam search) to avoid solving the maximum a posteriori problem over translation sentences at test time. In this paper, we propose Gumbel-Greedy Decoding, which trains a generative network to predict translation under a trained model. We solve this problem using the Gumbel-Softmax reparameterization, which makes our generative network differentiable and trainable through standard stochastic gradient methods. We empirically demonstrate that our proposed model is effective for generating sequences of discrete words.
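A minimal sketch of the Gumbel-Softmax reparameterization this abstract relies on, written with NumPy for clarity (the paper's models are neural networks; this only illustrates the relaxation itself, and the function name is our own):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Differentiable relaxation of sampling from a categorical
    distribution: add Gumbel noise to the logits, then apply a
    temperature-controlled softmax. As tau -> 0 the output approaches
    a one-hot sample; larger tau gives a smoother distribution."""
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise via the inverse-CDF trick.
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    e = np.exp(y - y.max())          # numerically stable softmax
    return e / e.sum()               # soft one-hot over the vocabulary
```

Because the sample is a smooth function of the logits, gradients can flow through it, which is what makes the generative network trainable with standard stochastic gradient methods.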

AAAI Conference 2018 Conference Paper

Search Engine Guided Neural Machine Translation

  • Jiatao Gu
  • Yong Wang
  • Kyunghyun Cho
  • Victor O.K. Li

In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training. The proposed approach consists of two stages. In the first stage (the retrieval stage), an off-the-shelf, black-box search engine is used to retrieve a small subset of sentence pairs from the training set given a source sentence. These pairs are further filtered by a fuzzy matching score based on edit distance. In the second stage (the translation stage), a novel translation model, called search engine guided NMT (SEG-NMT), seamlessly uses both the source sentence and the set of retrieved sentence pairs to perform the translation. Empirical evaluation on three language pairs (En-Fr, En-De, and En-Es) shows that the proposed approach significantly outperforms the baseline, and the improvement is more significant when more relevant sentence pairs are retrieved.
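The fuzzy matching filter in the retrieval stage can be sketched as an edit-distance-based similarity over tokens. The exact normalization used in the paper may differ, so treat this as one plausible choice with hypothetical names:

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

def fuzzy_match_score(query, candidate):
    """Similarity in [0, 1]: 1.0 for identical sequences, lower as the
    edit distance grows relative to the longer sequence."""
    dist = edit_distance(query, candidate)
    return 1.0 - dist / max(len(query), len(candidate), 1)
```

Retrieved pairs whose score falls below a threshold would be discarded before the translation stage.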