Author name cluster

Ellen Riloff

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
1 author row

Possible papers

AAAI Conference 2018 Conference Paper

Mars Target Encyclopedia: Rock and Soil Composition Extracted From the Literature

  • Kiri Wagstaff
  • Raymond Francis
  • Thamme Gowda
  • You Lu
  • Ellen Riloff
  • Karanjeet Singh
  • Nina Lanza

We have constructed an information extraction system called the Mars Target Encyclopedia that takes in planetary science publications and extracts scientific knowledge about target compositions. The extracted knowledge is stored in a searchable database that can greatly accelerate the ability of scientists to compare new discoveries with what is already known. To date, we have applied this system to ∼6000 documents and achieved 41–56% precision in the extracted information.

AAAI Conference 2018 Conference Paper

Weakly Supervised Induction of Affective Events by Optimizing Semantic Consistency

  • Haibo Ding
  • Ellen Riloff

To understand narrative text, we must comprehend how people are affected by the events that they experience. For example, readers understand that graduating from college is a positive event (achievement) but being fired from one’s job is a negative event (problem). NLP researchers have developed effective tools for recognizing explicit sentiments, but affective events are more difficult to recognize because the polarity is often implicit and can depend on both a predicate and its arguments. Our research investigates the prevalence of affective events in a personal story corpus, and introduces a weakly supervised method for large scale induction of affective events. We present an iterative learning framework that constructs a graph with nodes representing events and initializes their affective polarities with sentiment analysis tools as weak supervision. The events are then linked based on three types of semantic relations: (1) semantic similarity, (2) semantic opposition, and (3) shared components. The learning algorithm iteratively refines the polarity values by optimizing semantic consistency across all events in the graph. Our model learns over 100,000 affective events and identifies their polarities more accurately than other methods.

AAAI Conference 2016 Conference Paper

Acquiring Knowledge of Affective Events from Blogs Using Label Propagation

  • Haibo Ding
  • Ellen Riloff

Many common events in our daily life affect us in positive and negative ways. For example, going on vacation is typically an enjoyable event, while being rushed to the hospital is an undesirable event. In narrative stories and personal conversations, recognizing that some events have a strong affective polarity is essential to understand the discourse and the emotional states of the affected people. However, current NLP systems mainly depend on sentiment analysis tools, which fail to recognize many events that are implicitly affective based on human knowledge about the event itself and cultural norms. Our goal is to automatically acquire knowledge of stereotypically positive and negative events from personal blogs. Our research creates an event context graph from a large collection of blog posts and uses a sentiment classifier and semi-supervised label propagation algorithm to discover affective events. We explore several graph configurations that propagate affective polarity across edges using local context, discourse proximity, and event-event co-occurrence. We then harvest highly affective events from the graph and evaluate the agreement of the polarities with human judgements.
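The propagation idea in this abstract can be illustrated with a minimal sketch. It is a generic label propagation loop, not the authors' algorithm: the function name `propagate_polarity`, the edge weights, and every event other than the two named in the abstract are assumptions made for illustration, and the seed polarities stand in for the output of a sentiment classifier.

```python
def propagate_polarity(edges, seeds, iterations=20):
    """Generic label propagation over an undirected weighted graph.

    edges: dict mapping (node_a, node_b) -> positive weight (assumed values)
    seeds: dict mapping node -> polarity in [-1.0, 1.0]; seeds stay clamped
    """
    # Build an adjacency list from the undirected edge dictionary.
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, []).append((b, weight))
        neighbors.setdefault(b, []).append((a, weight))

    # Unlabeled nodes start neutral; seed nodes start at their given polarity.
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for node, value in seeds.items():
        scores.setdefault(node, value)

    for _ in range(iterations):
        updated = {}
        for node in scores:
            if node in seeds:
                # Seed polarities are never overwritten.
                updated[node] = seeds[node]
                continue
            links = neighbors.get(node, [])
            total = sum(weight for _, weight in links)
            # Each unlabeled node takes the weighted mean of its neighbors.
            updated[node] = (
                sum(scores[n] * weight for n, weight in links) / total
                if total else 0.0
            )
        scores = updated
    return scores


# Hypothetical event graph; only the two seed events come from the abstract.
edges = {
    ("go on vacation", "relax on the beach"): 1.0,
    ("relax on the beach", "have a picnic"): 1.0,
    ("be rushed to the hospital", "break a leg"): 1.0,
}
seeds = {"go on vacation": 1.0, "be rushed to the hospital": -1.0}
scores = propagate_polarity(edges, seeds)
for event in sorted(scores):
    print(f"{event}: {scores[event]:+.2f}")
```

Clamping the seeds keeps the classifier's decisions fixed while polarity spreads to events that carry no explicit sentiment words, which is the gap the abstract describes.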

AAAI Conference 2012 Conference Paper

Modeling Textual Cohesion for Event Extraction

  • Ruihong Huang
  • Ellen Riloff

Event extraction systems typically locate the role fillers for an event by analyzing sentences in isolation and identifying each role filler independently of the others. We argue that more accurate event extraction requires a view of the larger context to decide whether an entity is related to a relevant event. We propose a bottom-up approach to event extraction that initially identifies candidate role fillers independently and then uses that information as well as discourse properties to model textual cohesion. The novel component of the architecture is a sequentially structured sentence classifier that identifies event-related story contexts. The sentence classifier uses lexical associations and discourse relations across sentences, as well as domain-specific distributions of candidate role fillers within and across sentences. This approach yields state-of-the-art performance on the MUC-4 data set, achieving substantially higher precision than previous systems.

AAAI Conference 2005 Conference Paper

Exploiting Subjectivity Classification to Improve Information Extraction

  • Ellen Riloff

Information extraction (IE) systems are prone to false hits for a variety of reasons and we observed that many of these false hits occur in sentences that contain subjective language (e.g., opinions, emotions, and sentiments). Motivated by these observations, we explore the idea of using subjectivity analysis to improve the precision of information extraction systems. In this paper, we describe an IE system that uses a subjective sentence classifier to filter its extractions. We experimented with several different strategies for using the subjectivity classifications, including an aggressive strategy that discards all extractions found in subjective sentences and more complex strategies that selectively discard extractions. We evaluated the performance of these different approaches on the MUC-4 terrorism data set. We found that indiscriminately filtering extractions from subjective sentences was overly aggressive, but more selective filtering strategies improved IE precision with minimal recall loss.
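As a rough illustration of the aggressive strategy only, the sketch below drops every extraction that comes from a sentence flagged as subjective. The cue-word classifier, the sample sentences, and the `correct` labels are all hypothetical stand-ins; the abstract does not describe the classifier or the selective strategies in enough detail to reproduce them.

```python
# Hypothetical cue words; a real subjectivity classifier would be learned.
SUBJECTIVE_CUES = {"think", "believe", "feel", "outrageous"}


def is_subjective(sentence):
    """Toy stand-in for a subjective sentence classifier."""
    tokens = [token.strip(".,!?").lower() for token in sentence.split()]
    return any(token in SUBJECTIVE_CUES for token in tokens)


def filter_aggressively(extractions):
    """Discard every extraction found in a subjective sentence."""
    return [e for e in extractions if not is_subjective(e["sentence"])]


def precision(extractions):
    """Fraction of extractions marked correct (0.0 for an empty list)."""
    if not extractions:
        return 0.0
    return sum(e["correct"] for e in extractions) / len(extractions)


# Invented examples; "correct" marks whether the filler is a true event role.
extractions = [
    {"sentence": "The bomb destroyed the embassy.",
     "filler": "the embassy", "correct": True},
    {"sentence": "I think the economy was destroyed by greed.",
     "filler": "the economy", "correct": False},
    {"sentence": "Guerrillas attacked the village.",
     "filler": "the village", "correct": True},
]

kept = filter_aggressively(extractions)
print(f"precision before: {precision(extractions):.2f}")
print(f"precision after:  {precision(kept):.2f}")
```

On real data this blanket rule would also discard correct extractions that happen to sit in subjective sentences, which is the recall cost the abstract reports for indiscriminate filtering.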

AAAI Conference 1999 Conference Paper

Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping

  • Ellen Riloff (University of Utah)
  • Rosie Jones (Carnegie Mellon University)

Information extraction systems usually require two dictionaries: a semantic lexicon and a dictionary of extraction patterns for the domain. We present a multi-level bootstrapping algorithm that generates both the semantic lexicon and extraction patterns simultaneously. As input, our technique requires only unannotated training texts and a handful of seed words for a category. We use a mutual bootstrapping technique to alternately select the best extraction pattern for the category and bootstrap its extractions into the semantic lexicon, which is the basis for selecting the next extraction pattern. To make this approach more robust, we add a second level of bootstrapping (meta-bootstrapping) that retains only the most reliable lexicon entries produced by mutual bootstrapping and then restarts the process. We evaluated this multi-level bootstrapping technique on a collection of corporate web pages and a corpus of terrorism news articles. The algorithm produced high-quality dictionaries for several semantic categories.
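The alternation between pattern selection and lexicon growth can be sketched as follows. This covers only the inner mutual bootstrapping loop, under assumptions: the scoring formula is chosen for illustration and is not taken from the abstract, the patterns and extractions are invented, and the outer meta-bootstrapping level is not implemented.

```python
import math


def mutual_bootstrap(pattern_extractions, seeds, rounds=2):
    """Alternately pick the best pattern and add its extractions to the lexicon.

    pattern_extractions: dict mapping pattern -> set of extracted phrases
    seeds: iterable of seed words for the category
    """
    lexicon = set(seeds)
    chosen = []
    for _ in range(rounds):
        best, best_score = None, 0.0
        for pattern in sorted(pattern_extractions):
            if pattern in chosen:
                continue
            extracted = pattern_extractions[pattern]
            hits = len(extracted & lexicon)
            if hits == 0:
                continue
            # Illustrative score (an assumption): reward patterns whose
            # extractions are mostly known lexicon entries, and many of them.
            score = (hits / len(extracted)) * math.log2(hits + 1)
            if score > best_score:
                best, best_score = pattern, score
        if best is None:
            break
        chosen.append(best)
        # The winning pattern's extractions feed the next selection round.
        lexicon |= pattern_extractions[best]
    return chosen, lexicon


# Hypothetical patterns for a "location" category.
pattern_extractions = {
    "offices in <x>": {"london", "paris", "tokyo"},
    "headquartered in <x>": {"tokyo", "berlin"},
    "thanks to <x>": {"luck", "paris", "everyone", "our team"},
}
chosen, lexicon = mutual_bootstrap(pattern_extractions, {"london", "paris"})
print(chosen)
print(sorted(lexicon))
```

Running a third round on this toy data would select "thanks to <x>" and pull non-locations into the lexicon; that kind of drift is what the second, meta-bootstrapping level described in the abstract is meant to limit.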

AAAI Conference 1996 Conference Paper

Automatically Generating Extraction Patterns from Untagged Text

  • Ellen Riloff

Many corpus-based natural language processing systems rely on text corpora that have been manually annotated with syntactic or semantic tags. In particular, all previous dictionary construction systems for information extraction have used an annotated training corpus or some form of annotated input. We have developed a system called AutoSlog-TS that creates dictionaries of extraction patterns using only untagged text. AutoSlog-TS is based on the AutoSlog system, which generated extraction patterns using annotated text and a set of heuristic rules. By adapting AutoSlog and combining it with statistical techniques, we eliminated its dependency on tagged text. In experiments with the MUC-4 terrorism domain, AutoSlog-TS created a dictionary of extraction patterns that performed comparably to a dictionary created by AutoSlog, using only preclassified texts as input.

AAAI Conference 1993 Conference Paper

Automatically Constructing a Dictionary for Information Extraction Tasks

  • Ellen Riloff

Knowledge-based natural language processing systems have achieved good success with certain tasks but they are often criticized because they depend on a domain-specific dictionary that requires a great deal of manual knowledge engineering. This knowledge engineering bottleneck makes knowledge-based NLP systems impractical for real-world applications because they cannot be easily scaled up or ported to new domains. In response to this problem, we developed a system called AutoSlog that automatically builds a domain-specific dictionary of concepts for extracting information from text. Using AutoSlog, we constructed a dictionary for the domain of terrorist event descriptions in only 5 person-hours. We then compared the AutoSlog dictionary with a hand-crafted dictionary that was built by two highly skilled graduate students and required approximately 1500 person-hours of effort. We evaluated the two dictionaries using two blind test sets of 100 texts each. Overall, the AutoSlog dictionary achieved 98% of the performance of the hand-crafted dictionary. On the first test set, the AutoSlog dictionary obtained 96.3% of the performance of the hand-crafted dictionary. On the second test set, the overall scores were virtually indistinguishable with the AutoSlog dictionary achieving 99.7% of the performance of the hand-crafted dictionary.

AAAI Conference 1992 Conference Paper

Classifying Texts Using Relevancy Signatures

  • Ellen Riloff

Text processing for complex domains such as terrorism is complicated by the difficulty of being able to reliably distinguish relevant and irrelevant texts. We have discovered a simple and effective filter, the Relevancy Signatures Algorithm, and demonstrated its performance in the domain of terrorist event descriptions. The Relevancy Signatures Algorithm is based on the natural language processing technique of selective concept extraction, and relies on text representations that reflect predictable patterns of linguistic context. This paper describes text classification experiments conducted in the domain of terrorism using the MUC-3 text corpus. A customized dictionary of about 6,000 words provides the lexical knowledge base needed to discriminate relevant texts, and the CIRCUS sentence analyzer generates relevancy signatures as an effortless side-effect of its normal sentence analysis. Although we suspect that the training base available to us from the MUC-3 corpus may not be large enough to provide optimal training, we were nevertheless able to attain relevancy discriminations for significant levels of recall (ranging from 11% to 47%) with 100% precision in half of our test runs.