Arrow Research search

Author name cluster

Donghui Feng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers

2

AAAI Conference 2006 Conference Paper

Mining and Re-ranking for Answering Biographical Queries on the Web

  • Donghui Feng

The rapid growth of the Web has made itself a huge and valuable knowledge base. Among them, biographical information is of great interest to society. However, there has not been an efficient and complete approach to automated biography creation by querying the web. This paper describes an automatic web-based question answering system for biographical queries. Ad-hoc improvements on pattern learning approaches are proposed for mining biographical knowledge. Using bootstrapping, our approach learns surface text patterns from the web, and applies the learned patterns to extract relevant information. To reduce human labeling cost, we propose a new IDF-inspired reranking approach and compare it with pattern’s precisionbased re-ranking approach. A comparative study of the two re-ranking models is conducted. The tested system produces promising results for answering biographical queries.

AAAI Conference 2006 Conference Paper

Towards Modeling Threaded Discussions using Induced Ontology Knowledge

  • Donghui Feng
  • Erin Shaw

Online discussion boards are a popular form of web-based computer-mediated communication, especially in the areas of distributed education and customer support. Automatic analysis for discussion understanding would enable better information assessment and assistance. This paper describes an extensive study of the relationship between individual messages and full discussion threads. We present a new approach to classifying discussions using a Rocchio-style classifier with little cost for data labeling. In place of a labeled data set, we employ a coarse domain ontology that is automatically induced from a canonical text in a novel way and use it to build discussion topic profiles. We describe a new classify-by-dominance strategy for classifying discussion threads and demonstrate that in the presence of noise it can perform better than the standard classify-as-a-whole approach with an error rate reduction of 16.8%. This analysis of human conversation via online discussions provides a basis for the development of future information extraction and question answering techniques.