
AAAI 2019

Sequence to Sequence Learning for Query Expansion

Short Paper · Student Abstract Track · Artificial Intelligence

Abstract

As far as we are aware, the use of sequence-to-sequence algorithms for query expansion has not yet been explored in the information retrieval literature. We try to fill this gap with a custom query expansion system trained and tested on open datasets. One distinctive feature of our engine compared to classic ones is that it does not need the retrieved documents to expand the submitted query. We test our expansions on two different tasks: information retrieval and answer preselection. Our method yields a slight improvement in performance on both tasks. Our main contributions are:

• Starting from open datasets, we build a query expansion training set using sentence-embedding-based keyword extraction.
• We assess the ability of sequence-to-sequence neural networks to capture expansion relations in the word embedding space, and we then begin a quantitative and qualitative analysis of the weights learned by our network.

In the second part, we discuss what a recurrent neural network learns compared with what we know about human language learning.

Related Work

Relevance feedback has been a popular choice for query expansion, starting with the Rocchio algorithm (Salton 1971) in the SMART information retrieval system: using a set of relevant and non-relevant documents, the original query vector is modified. More recently, the introduction of word embeddings (Mikolov et al. 2013) opened new possibilities for query expansion. The distributed representations of the words in a query make it possible to produce expansions without extracting them from the documents. Using the centroid of the introduced words and cosine-similar tokens, Kuzi et al. (2016) proposed a document-independent expansion method.

Sequence to sequence architectures

Sequence to sequence is a neural architecture that is very popular in machine translation, where it has achieved state-of-the-art results. Proposed by Sutskever et al. (2014), it consists of a two-component model that uses recurrent neural networks to link variable-length input sequences to variable-length output sequences. The input sequence is encoded by the first component into a vector representation; the decoder then transforms that vector into the target sequence. At each step, the next token maximizes

p(y_i | y_1, ..., y_{i-1}, x) = g(y_{i-1}, s_i, c)

where s_i is the i-th hidden state of the decoder, c is the final vector output by the encoder, representing the entire input sentence, y_i is the i-th generated token, and g is the function learned by the decoder.

Our Approach

Building the training set

Datasets. We used MultiNLI (Williams, Nangia, and Bowman 2018) and SNLI (Bowman et al. 2015). For both corpora, we eliminate pairs classified as contradiction, as they should not provide relevant expansions. We also selected the duplicate pairs from the Quora question pairs dataset and trained our expansion model on the words that do not appear in the first formulation. Finally, the MSCOCO dataset (Lin et al. 2014) consists of human-annotated captions for over 120K images. Since two captions describe the same image, we can assume that the words appearing in one description and not in the other are a possible expansion of the first annotation.
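To make this construction concrete, the sketch below derives a (query, expansion) training pair from two paired texts by keeping the tokens of the second that are absent from the first, in the spirit of the Quora and MSCOCO processing described above. The tokenizer and stop-word list are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch: build a (query, expansion) training pair from paired texts,
# e.g. two Quora duplicates or two MSCOCO captions of the same image.
# Tokenization and stop-word filtering here are illustrative assumptions.

STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "is", "are", "and"}

def expansion_pair(first: str, second: str):
    """Return (first, expansion): tokens of `second` absent from `first`."""
    def norm(t: str) -> str:
        return t.lower().strip(".,?!")
    first_tokens = {norm(t) for t in first.split()}
    expansion = [norm(t) for t in second.split()
                 if norm(t) not in first_tokens and norm(t) not in STOPWORDS]
    return first, expansion

query, terms = expansion_pair(
    "A man is riding a horse on the beach.",
    "A cowboy gallops along the shore at sunset.",
)
print(terms)  # ['cowboy', 'gallops', 'along', 'shore', 'sunset']
```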
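Similarly, here is a minimal PyTorch sketch of the encoder-decoder formulation given in the sequence-to-sequence section above: the encoder compresses the input x into a single vector c, and the decoder predicts each token from the previous token y_{i-1}, its hidden state s_i, and c. The GRU cells, dimensions, and teacher forcing are our assumptions; this is a generic sketch, not the paper's actual model.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder: p(y_i | y_1..y_{i-1}, x) = g(y_{i-1}, s_i, c)."""

    def __init__(self, vocab_size: int, emb_dim: int = 128, hid_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # Encode the whole input sequence into a single vector c
        # (the final encoder hidden state).
        _, c = self.encoder(self.embed(src))
        # Decode with teacher forcing: each step sees the previous target
        # token y_{i-1}; s_i is the decoder hidden state, initialized with c.
        dec_states, _ = self.decoder(self.embed(tgt), c)
        return self.out(dec_states)  # unnormalized p(y_i | y_<i, x)

model = Seq2Seq(vocab_size=10_000)
src = torch.randint(0, 10_000, (2, 7))  # batch of 2 queries, length 7
tgt = torch.randint(0, 10_000, (2, 5))  # expansions, length 5
logits = model(src, tgt)                # shape (2, 5, 10_000)
```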
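For background on the relevance-feedback work cited above, the Rocchio update shifts the query vector toward the centroid of relevant documents and away from non-relevant ones. The alpha, beta, and gamma defaults below are conventional textbook values, not values from this paper.

```python
import numpy as np

def rocchio(query, relevant, non_relevant,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance feedback (Salton 1971): move the query vector
    toward relevant document vectors and away from non-relevant ones."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q = q + beta * np.mean(np.asarray(relevant, dtype=float), axis=0)
    if len(non_relevant):
        q = q - gamma * np.mean(np.asarray(non_relevant, dtype=float), axis=0)
    return q
```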
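Likewise, the document-independent method of Kuzi et al. (2016) can be sketched as ranking vocabulary terms by cosine similarity to the centroid of the query word embeddings; the normalization details and the cutoff k are illustrative assumptions.

```python
import numpy as np

def centroid_expansions(query_terms, embeddings, k=5):
    """Rank vocabulary terms by cosine similarity to the centroid of the
    query word vectors, in the spirit of Kuzi et al. (2016).
    `embeddings` maps term -> 1-D numpy vector."""
    centroid = np.mean([embeddings[t] for t in query_terms], axis=0)
    centroid = centroid / np.linalg.norm(centroid)  # unit-normalize
    scores = {
        term: float(vec @ centroid) / np.linalg.norm(vec)
        for term, vec in embeddings.items()
        if term not in query_terms
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```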

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue: AAAI Conference on Artificial Intelligence
Archive span: 1980-2026
Indexed papers: 28718
Paper id: 434644558601209732