Arrow Research search

Author name cluster

Hung Ngo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers

3

Highlights Conference 2017 Conference Abstract

Shannon-type inequalities, submodular width, and disjunctive datalog

  • Hung Ngo

This talk overviews recent results on bounding the output size and on efficiently evaluating a disjunctive datalog rule, given input statistics such as relation sizes, functional dependencies, and degree bounds. These are the kind of statistics prevalent in database query evaluation, and our results apply to aggregate queries as well. The disjunctive datalog query evaluation problem is fairly general, as both conjunctive query evaluation and the basic constraint satisfaction problem are special cases. These new combinatorial and algorithmic results are built up on a fundamental connection between query evaluation and Shannon-type inequalities. It was observed in different contexts over the past 40 years that information-theoretic inequalities can be used to bound combinatorial quantities. First, one can derive (sometimes tight) output size bounds of conjunctive queries and disjunctive datalog rules using Shannon-type inequalities. This talk discusses these bounds and techniques. Second, we shall show how one can turn a proof of an information-inequality into an efficient algorithm to evaluate such queries. The algorithm’s runtime is bounded by a generalized version of the submodular width of the query (a recent notion developed by Daniel Marx), which is optimal modulo complexity-theoretic assumptions. The talk is based on joint works with Mahmoud Abo Khamis and Dan Suciu.

NeurIPS Conference 2014 Conference Paper

Parallel Feature Selection Inspired by Group Testing

  • Yingbo Zhou
  • Utkarsh Porwal
  • Ce Zhang
  • Hung Ngo
  • XuanLong Nguyen
  • Christopher Ré
  • Venu Govindaraju

This paper presents a parallel feature selection method for classification that scales up to very high dimensions and large data sizes. Our original method is inspired by group testing theory, under which the feature selection procedure consists of a collection of randomized tests to be performed in parallel. Each test corresponds to a subset of features, for which a scoring function may be applied to measure the relevance of the features in a classification task. We develop a general theory providing sufficient conditions under which true features are guaranteed to be correctly identified. Superior performance of our method is demonstrated on a challenging relation extraction task from a very large data set that have both redundant features and sample size in the order of millions. We present comprehensive comparisons with state-of-the-art feature selection methods on a range of data sets, for which our method exhibits competitive performance in terms of running time and accuracy. Moreover, it also yields substantial speedup when used as a pre-processing step for most other existing methods.

IJCAI Conference 2013 Conference Paper

Upper Confidence Weighted Learning for Efficient Exploration in Multiclass Prediction with Binary Feedback

  • Hung Ngo
  • Matthew Luciw
  • Ngo Anh Vien
  • Jürgen Schmidhuber

We introduce a novel algorithm called Upper Confidence Weighted Learning (UCWL) for online multiclass learning from binary feedback. UCWL combines the Upper Confidence Bound (UCB) framework with the Soft Confidence Weighted (SCW) online learning scheme. UCWL achieves state of the art performance (especially on noisy and nonseparable data) with low computational costs. Estimated confidence intervals are used for informed exploration, which enables faster learning than the uninformed exploration case or the case where exploration is not used. The targeted application setting is human-robot interaction (HRI), in which a robot is learning to classify its observations while a human teaches it by providing only binary feedback (e. g. , right/wrong). Results in an HRI experiment, and with two benchmark datasets, show UCWL outperforms other algorithms in the online binary feedback setting, and surprisingly even sometimes beats state-of-the-art algorithms that get full feedback, while UCWL gets only binary feedback on the same data.