Arrow Research search

Author name cluster

Katrin Erk

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers

3

AAAI Conference 2020 Conference Paper

Attending to Entities for Better Text Understanding

  • Pengxiang Cheng
  • Katrin Erk

Recent progress in NLP witnessed the development of largescale pre-trained language models (GPT, BERT, XLNet, etc.) based on Transformer (Vaswani et al. 2017), and in a range of end tasks, such models have achieved state-of-the-art results, approaching human performance. This clearly demonstrates the power of the stacked self-attention architecture when paired with a sufficient number of layers and a large amount of pre-training data. However, on tasks that require complex and long-distance reasoning where surface-level cues are not enough, there is still a large gap between the pre-trained models and human performance. Strubell et al. (2018) recently showed that it is possible to inject knowledge of syntactic structure into a model through supervised self-attention. We conjecture that a similar injection of semantic knowledge, in particular, coreference information, into an existing model would improve performance on such complex problems. On the LAMBADA (Paperno et al. 2016) task, we show that a model trained from scratch with coreference as auxiliary supervision for self-attention outperforms the largest GPT-2 model, setting the new state-of-the-art, while only containing a tiny fraction of parameters compared to GPT-2. We also conduct a thorough analysis of different variants of model architectures and supervision configurations, suggesting future directions on applying similar techniques to other problems.

AAAI Conference 2019 Conference Paper

Implicit Argument Prediction as Reading Comprehension

  • Pengxiang Cheng
  • Katrin Erk

Implicit arguments, which cannot be detected solely through syntactic cues, make it harder to extract predicate-argument tuples. We present a new model for implicit argument prediction that draws on reading comprehension, casting the predicate-argument tuple with the missing argument as a query. We also draw on pointer networks and multi-hop computation. Our model shows good performance on an argument cloze task as well as on a nominal implicit argument prediction task.

TIST Journal 2013 Journal Article

An inference-based model of word meaning in context as a paraphrase distribution

  • Taesun Moon
  • Katrin Erk

Graded models of word meaning in context characterize the meaning of individual usages (occurrences) without reference to dictionary senses. We introduce a novel approach that frames the task of computing word meaning in context as a probabilistic inference problem. The model represents the meaning of a word as a probability distribution over potential paraphrases, inferred using an undirected graphical model. Evaluated on paraphrasing tasks, the model achieves state-of-the-art performance.