Arrow Research search

Author name cluster

Anders Søgaard

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers

14

AAAI Conference 2026 Conference Paper

Realist and Pluralist Conceptions of Intelligence and Their Implications on AI Research

  • Ninell Oldenburg
  • Ruchira Dhar
  • Anders Søgaard

In this paper, we argue that current AI research operates on a spectrum between two different underlying conceptions of intelligence: Intelligence Realism, which holds that intelligence represents a single, universal capacity measurable across all systems, and Intelligence Pluralism, which views intelligence as diverse, context-dependent capacities that cannot be reduced to a single universal measure. Through an analysis of current debates in AI research, we demonstrate how the conceptions remain largely implicit yet fundamentally shape how empirical evidence gets interpreted across a wide range of areas. These underlying views generate fundamentally different research approaches across three areas. Methodologically, they produce different approaches to model selection, benchmark design, and experimental validation. Interpretively, they lead to contradictory readings of the same empirical phenomena, from capability emergence to system limitations. Regarding AI risk, they generate categorically different assessments: realists view superintelligence as the primary risk and search for unified alignment solutions, while pluralists see diverse threats across different domains requiring context-specific solutions. We argue that making explicit these underlying assumptions can contribute to a clearer understanding of disagreements in AI research.

ICML Conference 2023 Conference Paper

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

  • Phillip Rust
  • Anders Søgaard

Language models such as mBERT, XLM-R, and BLOOM aim to achieve multilingual generalization or compression to facilitate transfer to a large number of (potentially unseen) languages. However, these models should ideally also be private, linguistically fair, and transparent, by relating their predictions to training data. Can these requirements be simultaneously satisfied? We show that multilingual compression and linguistic fairness are compatible with differential privacy, but that differential privacy is at odds with training data influence sparsity, an objective for transparency. We further present a series of experiments on two common NLP tasks and evaluate multilingual compression and training data influence sparsity under different privacy guarantees, exploring these trade-offs in more detail. Our results suggest that we need to develop ways to jointly optimize for these objectives in order to find practical trade-offs.

AAAI Conference 2021 Conference Paper

Analogy Training Multilingual Encoders

  • Nicolas Garneau
  • Mareike Hartmann
  • Anders Sandholm
  • Sebastian Ruder
  • Ivan Vulić
  • Anders Søgaard

Language encoders encode words and phrases in ways that capture their local semantic relatedness, but are known to be globally inconsistent. Global inconsistency can seemingly be corrected for, in part, by leveraging signals from knowledge bases, but previous results are partial and limited to monolingual English encoders. We extract a large-scale multilingual, multi-word analogy dataset from Wikidata for diagnosing and correcting for global inconsistencies and implement a four-way Siamese BERT architecture for grounding multilingual BERT (mBERT) in Wikidata through analogy training. We show that analogy training not only improves the global consistency of mBERT, as well as the isomorphism of language-specific subspaces, but also leads to significant gains on downstream tasks such as bilingual dictionary induction and sentence retrieval.

AAAI Conference 2021 Conference Paper

Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards

  • Rahul Aralikatte
  • Mostafa Abdou
  • Heather C Lent
  • Daniel Hershcovich
  • Anders Søgaard

Coreference resolution and semantic role labeling are NLP tasks that capture different aspects of semantics, indicating respectively, which expressions refer to the same entity, and what semantic roles expressions serve in the sentence. However, they are often closely interdependent, and both generally necessitate natural language understanding. Do they form a coherent abstract representation of documents? We present a neural network architecture for joint coreference resolution and semantic role labeling for English, and train graph neural networks to model the coherence of the combined shallow semantic graph. Using the resulting coherence score as a reward for our joint semantic analyzer, we use reinforcement learning to encourage global coherence over the document and between semantic annotations. This leads to improvements on both tasks in multiple datasets from different domains, and across a range of encoders of different expressivity, calling, we believe, for a more holistic approach to semantics in NLP.

AAAI Conference 2020 Conference Paper

Parsing as Pretraining

  • David Vilares
  • Michalina Strzyz
  • Anders Søgaard
  • Carlos Gómez-Rodríguez

Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups such as constituent and dependency parsing. This paper addresses this problem and does full parsing (on English) relying only on pretraining architectures – and no decoding. We first cast constituent and dependency parsing as sequence tagging. We then use a single feed-forward layer to directly map word vectors to labels that encode a linearized tree. This is used to: (i) see how far we can reach on syntax modelling with just pretrained encoders, and (ii) shed some light about the syntax-sensitivity of different word vectors (by freezing the weights of the pretraining network during training). For evaluation, we use bracketing F1score and LAS, and analyze in-depth differences across representations for span lengths and dependency displacements. The overall results surpass existing sequence tagging parsers on the PTB (93. 5%) and end-to-end EN-EWT UD (78. 8%).

AAAI Conference 2020 Conference Paper

Weakly Supervised POS Taggers Perform Poorly on Truly Low-Resource Languages

  • Katharina Kann
  • Ophélie Lacroix
  • Anders Søgaard

Part-of-speech (POS) taggers for low-resource languages which are exclusively based on various forms of weak supervision – e. g. , cross-lingual transfer, type-level supervision, or a combination thereof – have been reported to perform almost as well as supervised ones. However, weakly supervised POS taggers are commonly only evaluated on languages that are very different from truly low-resource languages, and the taggers use sources of information, like high-coverage and almost error-free dictionaries, which are likely not available for resource-poor languages. We train and evaluate state-of-theart weakly supervised POS taggers for a typologically diverse set of 15 truly low-resource languages. On these languages, given a realistic amount of resources, even our best model gets only less than half of the words right. Our results highlight the need for new and different approaches to POS tagging for truly low-resource languages.

AAAI Conference 2020 Conference Paper

What Do You Mean ‘Why?’: Resolving Sluices in Conversations

  • Victor Petrén Bach Hansen
  • Anders Søgaard

In conversation, we often ask one-word questions such as ‘Why? ’ or ‘Who? ’. Such questions are typically easy for humans to answer, but can be hard for computers, because their resolution requires retrieving both the right semantic frames and the right arguments from context. This paper introduces the novel ellipsis resolution task of resolving such one-word questions, referred to as sluices in linguistics. We present a crowd-sourced dataset containing annotations of sluices from over 4, 000 dialogues collected from conversational QA datasets, as well as a series of strong baseline architectures.

JAIR Journal 2019 Journal Article

A Survey of Cross-lingual Word Embedding Models

  • Sebastian Ruder
  • Ivan Vulić
  • Anders Søgaard

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent, modulo optimization strategies, hyper-parameters, and such. We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.

NeurIPS Conference 2019 Conference Paper

Comparing Unsupervised Word Translation Methods Step by Step

  • Mareike Hartmann
  • Yova Kementchedjhieva
  • Anders Søgaard

Cross-lingual word vector space alignment is the task of mapping the vocabularies of two languages into a shared semantic space, which can be used for dictionary induction, unsupervised machine translation, and transfer learning. In the unsupervised regime, an initial seed dictionary is learned in the absence of any known correspondences between words, through {\bf distribution matching}, and the seed dictionary is then used to supervise the induction of the final alignment in what is typically referred to as a (possibly iterative) {\bf refinement} step. We focus on the first step and compare distribution matching techniques in the context of language pairs for which mixed training stability and evaluation scores have been reported. We show that, surprisingly, when looking at this initial step in isolation, vanilla GANs are superior to more recent methods, both in terms of precision and robustness. The improvements reported by more recent methods thus stem from the refinement techniques, and we show that we can obtain state-of-the-art performance combining vanilla GANs with such refinement techniques.

AAAI Conference 2019 Conference Paper

Jointly Learning to Label Sentences and Tokens

  • Marek Rei
  • Anders Søgaard

Learning to construct text representations in end-to-end systems can be difficult, as natural languages are highly compositional and task-specific annotated datasets are often limited in size. Methods for directly supervising language composition can allow us to guide the models based on existing knowledge, regularizing them towards more robust and interpretable representations. In this paper, we investigate how objectives at different granularities can be used to learn better language representations and we propose an architecture for jointly learning to label sentences and tokens. The predictions at each level are combined together using an attention mechanism, with token-level labels also acting as explicit supervision for composing sentence-level representations. Our experiments show that by learning to perform these tasks jointly on multiple levels, the model achieves substantial improvements for both sentence classification and sequence labeling.

AAAI Conference 2019 Conference Paper

Latent Multi-Task Architecture Learning

  • Sebastian Ruder
  • Joachim Bingel
  • Isabelle Augenstein
  • Anders Søgaard

Multi-task learning (MTL) allows deep neural networks to learn from related tasks by sharing parameters with other networks. In practice, however, MTL involves searching an enormous space of possible parameter sharing architectures to find (a) the layers or subspaces that benefit from sharing, (b) the appropriate amount of sharing, and (c) the appropriate relative weights of the different task losses. Recent work has addressed each of the above problems in isolation. In this work we present an approach that learns a latent multi-task architecture that jointly addresses (a)–(c). We present experiments on synthetic data and data from OntoNotes 5. 0, including four different tasks and seven different domains. Our extension consistently outperforms previous approaches to learning latent architectures for multi-task problems and achieves up to 15% average error reductions over common approaches to MTL.

AAAI Conference 2019 Conference Paper

Predicting Concrete and Abstract Entities in Modern Poetry

  • Fiammetta Caccavale
  • Anders Søgaard

One dimension of modernist poetry is introducing entities in surprising contexts, such as wheelbarrow in Bob Dylan’s feel like falling in love with the first woman I meet/ putting her in a wheelbarrow. This paper considers the problem of teaching a neural language model to select poetic entities, based on local context windows. We do so by fine-tuning and evaluating language models on the poetry of American modernists, both on seen and unseen poets, and across a range of experimental designs. We also compare the performance of our poetic language model to human, professional poets. Our main finding is that, perhaps surprisingly, modernist poetry differs most from ordinary language when entities are concrete, like wheelbarrow, and while our fine-tuning strategy successfully adapts to poetic language in general, outperforming professional poets, the biggest error reduction is observed with concrete entities.

AAAI Conference 2018 Conference Paper

Learning to Predict Readability Using Eye-Movement Data From Natives and Learners

  • Ana González-Garduño
  • Anders Søgaard

Readability assessment can improve the quality of assisting technologies aimed at language learners. Eye-tracking data has been used for both inducing and evaluating generalpurpose NLP/AI models, and below we show that unsurprisingly, gaze data from language learners can also improve multi-task readability assessment models. This is unsurprising, since the gaze data records the reading difficulties of the learners. Unfortunately, eye-tracking data from language learners is often much harder to obtain than eye-tracking data from native speakers. We therefore compare the performance of deep learning readability models that use native speaker eye movement data to models using data from language learners. Somewhat surprisingly, we observe no significant drop in performance when replacing learners with natives, making approaches that rely on native speaker gaze information, more scalable. In other words, our finding is that language learner difficulties can be efficiently estimated from native speakers, which suggests that, more generally, readily available gaze data can be used to improve educational NLP/AI models targeted towards language learners.