Highlights Conference 2017 Conference Abstract
Shannon-type inequalities, submodular width, and disjunctive datalog
- Hung Ngo
This talk overviews recent results on bounding the output size and on efficiently evaluating a disjunctive datalog rule, given input statistics such as relation sizes, functional dependencies, and degree bounds. These are the kind of statistics prevalent in database query evaluation, and our results apply to aggregate queries as well. The disjunctive datalog query evaluation problem is fairly general, as both conjunctive query evaluation and the basic constraint satisfaction problem are special cases. These new combinatorial and algorithmic results are built up on a fundamental connection between query evaluation and Shannon-type inequalities. It was observed in different contexts over the past 40 years that information-theoretic inequalities can be used to bound combinatorial quantities. First, one can derive (sometimes tight) output size bounds of conjunctive queries and disjunctive datalog rules using Shannon-type inequalities. This talk discusses these bounds and techniques. Second, we shall show how one can turn a proof of an information-inequality into an efficient algorithm to evaluate such queries. The algorithm’s runtime is bounded by a generalized version of the submodular width of the query (a recent notion developed by Daniel Marx), which is optimal modulo complexity-theoretic assumptions. The talk is based on joint works with Mahmoud Abo Khamis and Dan Suciu.