Arrow Research search

Author name cluster

Jesse Davis

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

44 papers
2 author rows

Possible papers

44

TMLR Journal 2026 Journal Article

Dealing with Uncertainty in Contextual Anomaly Detection

  • Luca Bindini
  • Lorenzo Perini
  • Stefano Nistri
  • Jesse Davis
  • Paolo Frasconi

Contextual anomaly detection (CAD) aims to identify anomalies in a target (behavioral) variable conditioned on a set of contextual variables that influence the normalcy of the target variable but are not themselves indicators of anomaly. In this work, we propose a novel framework for CAD, normalcy score (NS), that explicitly models both the aleatoric and epistemic uncertainties. Built on heteroscedastic Gaussian process regression, our method regards the Z-score as a random variable, providing confidence intervals that reflect the reliability of the anomaly assessment. Through experiments on benchmark datasets and a real-world application in cardiology, we demonstrate that NS outperforms state-of-the-art CAD methods in both detection accuracy and interpretability. Moreover, confidence intervals enable an adaptive, uncertainty-driven decision-making process, which may be very important in domains such as healthcare.
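The paper's method builds on heteroscedastic Gaussian process regression; as a rough illustration of the core idea only (a Z-score for the target conditioned on context), the sketch below substitutes a simple nearest-neighbour estimate of the conditional mean and standard deviation. The kNN substitution and all names are our own illustrative assumptions, not the authors' method.

```python
import numpy as np

def contextual_z_scores(context, target, k=20):
    """Score each point by a Z-score of its target value, conditioned on
    the k nearest neighbours in context space (a crude stand-in for the
    paper's heteroscedastic GP regression)."""
    context = np.asarray(context, dtype=float)
    target = np.asarray(target, dtype=float)
    scores = np.empty(len(target))
    for i in range(len(target)):
        # squared Euclidean distance from point i's context to all others
        d = np.sum((context - context[i]) ** 2, axis=1)
        d[i] = np.inf                       # exclude the point itself
        nn = np.argsort(d)[:k]              # k nearest context neighbours
        mu, sigma = target[nn].mean(), target[nn].std() + 1e-9
        scores[i] = (target[i] - mu) / sigma
    return scores

# toy example: the target normally tracks its context; one point deviates
rng = np.random.default_rng(0)
ctx = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * ctx[:, 0] + rng.normal(0, 0.5, size=200)
y[0] = 2.0 * ctx[0, 0] + 8.0                # injected contextual anomaly
z = contextual_z_scores(ctx, y)
print(int(np.argmax(np.abs(z))))            # the injected anomaly stands out
```

Note that the injected point is unremarkable marginally; it is only anomalous relative to its context, which is exactly the setting CAD targets.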

ICML Conference 2025 Conference Paper

Compressing tree ensembles through Level-wise Optimization and Pruning

  • Laurens Devos
  • Timo Martens
  • Deniz Can Oruc
  • Wannes Meert
  • Hendrik Blockeel
  • Jesse Davis

Tree ensembles (e.g., gradient boosting decision trees) are often used in practice because they offer excellent predictive performance while still being easy and efficient to learn. In some contexts, it is important to additionally optimize their size: this is specifically the case when models need to have verifiable properties (verification of fairness, robustness, etc. is often exponential in the ensemble’s size), or when models run on battery-powered devices (smaller ensembles consume less energy, increasing battery autonomy). For this reason, compression of tree ensembles is worth studying. This paper presents LOP, a method for compressing a given tree ensemble by pruning or entirely removing trees in it, while updating leaf predictions in such a way that predictive accuracy is mostly unaffected. Empirically, LOP achieves compression factors that are often 10 to 100 times better than those of competing methods.

ECAI Conference 2025 Conference Paper

Policy Safety Testing in Non-Deterministic Planning: Fuzzing, Test Oracles, Fault Analysis

  • Chaahat Jain
  • Daniel Sherbakov
  • Marcel Vinzent
  • Marcel Steinmetz
  • Jesse Davis
  • Jörg Hoffmann 0001

Recent work has introduced methodology for testing learned action policies in AI Planning, aiming to effectively identify bug states where policy behavior is sub-optimal. While this work focused on cost-optimality in classical planning, here we apply the core ideas to safety testing in planning with initial-state and action-outcome non-determinism. We cover the entire testing pipeline, introducing fuzzing algorithms to find unsafe policy runs, as well as test oracles to identify bugs where such unsafe behavior could be avoided. Going beyond the previous framework, we introduce a final step to the pipeline, identifying faults which we define to be specific policy decisions – state/action pairs – transitioning from a safe state (where a safe policy exists) to an unsafe state (where no such policy exists). We adapt a range of known algorithms for these purposes, including also approximate ones bounding the number of times we are allowed to diverge from the learned policy. We run comprehensive experiments evaluating each part of our pipeline. Key takeaways are that safety testing can be quite cheap, in contrast to cost-optimality testing; and that variants of Tarjan’s algorithm tend to be highly effective for this purpose.

ECAI Conference 2024 Conference Paper

Combining Active Learning and Learning to Reject for Anomaly Detection

  • Luca Stradiotti
  • Lorenzo Perini
  • Jesse Davis

Anomaly detection attempts to identify instances in the data that do not conform to the expected behavior. Because it is often difficult to label instances, the problem is tackled in an unsupervised way by employing data-driven heuristics to identify anomalies. However, the heuristics are imperfect, which can degrade a detector’s performance. One way to mitigate this problem is using Active Learning to collect labels that help correct cases where the employed heuristics are incorrect. Alternatively, one can allow the detector to abstain (i.e., say “I do not know”) whenever it is likely to make mispredictions at test time, which is called Learning to Reject (LtR). However, while both have been studied in the context of anomaly detection, they have not been considered in conjunction. Although they both need labels to accomplish their task, integrating these two ideas is challenging for two reasons. First, their label selection strategies are intertwined but they acquire different types of labels. Second, it is unclear how to best divide the limited budget between labeling instances that help AL and those that help LtR. In this paper, we introduce SADAL, the first semi-supervised detector that allocates the label budget between AL and LtR by relying on a reward-based selection function. Experiments on 25 datasets show that our approach outperforms several baselines.

NeurIPS Conference 2024 Conference Paper

Faster Repeated Evasion Attacks in Tree Ensembles

  • Lorenzo Cascioli
  • Laurens Devos
  • Ondrej Kuzelka
  • Jesse Davis

Tree ensembles are one of the most widely used model classes. However, these models are susceptible to adversarial examples, i.e., slightly perturbed examples that elicit a misprediction. There has been significant research on designing approaches to construct such examples for tree ensembles. But this is a computationally challenging problem that often must be solved a large number of times (e.g., for all examples in a training set). This is compounded by the fact that current approaches attempt to find such examples from scratch. In contrast, we exploit the fact that multiple similar problems are being solved. Specifically, our approach exploits the insight that adversarial examples for tree ensembles tend to perturb a consistent but relatively small set of features. We show that we can quickly identify this set of features and use this knowledge to speed up constructing adversarial examples.

AAAI Conference 2024 Conference Paper

Robustness Verification of Multi-Class Tree Ensembles

  • Laurens Devos
  • Lorenzo Cascioli
  • Jesse Davis

Tree ensembles are one of the most widely used model classes. However, these models are susceptible to adversarial examples, which are slightly perturbed examples that elicit a misprediction. There has been significant research on designing approaches to verify the robustness of tree ensembles to such attacks. However, existing verification algorithms for tree ensembles are only able to analyze binary classifiers and hence address multiclass problems by reducing them to binary ones using a one-versus-other strategy. In this paper, we show that naively applying this strategy can yield incorrect results in certain situations. We address this shortcoming by proposing a novel approximate heuristic approach to verification for multiclass tree ensembles. Our approach is based on a novel generalization of the verification task, which we show admits other relevant verification queries.

ECAI Conference 2024 Conference Paper

Safety Verification of Tree-Ensemble Policies via Predicate Abstraction

  • Chaahat Jain
  • Lorenzo Cascioli
  • Laurens Devos
  • Marcel Vinzent
  • Marcel Steinmetz
  • Jesse Davis
  • Jörg Hoffmann 0001

Learned action policies are gaining traction in AI, but come without safety guarantees. Recent work devised a method for safety verification of neural policies via predicate abstraction. Here we extend this approach to policies represented by tree ensembles, through replacing the underlying SMT queries with queries that can be dispatched by Veritas, a reasoning tool dedicated to tree ensembles. The query language supported by Veritas is limited, and we show how to encode the richer constraints we need into additional trees and decision variables. We run experiments on benchmarks previously used to evaluate neural policy verification, and we design new benchmarks based on a logistics application at Airbus as well as on a real-world robotics domain. We find that (1) verification with Veritas vastly outperforms verification with Z3 and Gurobi; (2) tree-ensemble policies are much faster to verify than neural policies, while being competitive in policy quality; (3) our techniques are highly complementary to, and often outperform, an encoding of tree-ensemble policy verification into nuXmv.

JAIR Journal 2023 Journal Article

A Markov Framework for Learning and Reasoning About Strategies in Professional Soccer

  • Maaike Van Roy
  • Pieter Robberechts
  • Wen-Chi Yang
  • Luc De Raedt
  • Jesse Davis

Strategy optimization is a fundamental element of dynamic and complex team sports such as soccer, American football, and basketball. As the amount of data that is collected from matches in these sports has increased, so has the demand for data-driven decision-making support. If alternative strategies need to be balanced, a data-driven approach can uncover insights that are not available from qualitative analysis. This could tremendously aid teams in their match preparations. In this work, we propose a novel Markov model-based framework for soccer that allows reasoning about the specific strategies teams use in order to gain insights into the efficiency of each strategy. The framework consists of two components: (1) a learning component, which entails modeling a team’s offensive behavior by learning a Markov decision process (MDP) from event data that is collected from the team’s matches, and (2) a reasoning component, which involves a novel application of probabilistic model checking to reason about the efficacy of the learned strategies of each team. In this paper, we provide an overview of this framework and illustrate it on several use cases using real-world event data from three leagues. Our results show that the framework can be used to reason about the shot decision-making of teams and to optimise the defensive strategies used when playing against a particular team. The general ideas presented in this framework can easily be extended to other sports.

JAIR Journal 2023 Journal Article

Lifted Reasoning for Combinatorial Counting

  • Pietro Totis
  • Jesse Davis
  • Luc De Raedt
  • Angelika Kimmig

Combinatorics math problems are often used as a benchmark to test human cognitive and logical problem-solving skills. These problems are concerned with counting the number of solutions that exist in a specific scenario that is sketched in natural language. Humans are adept at solving such problems as they can identify commonly occurring structures in the questions for which a closed-form formula exists for computing the answer. These formulas exploit the exchangeability of objects and symmetries to avoid a brute-force enumeration of all possible solutions. Unfortunately, current AI approaches are still unable to solve combinatorial problems in this way. This paper aims to fill this gap by developing novel AI techniques for representing and solving such problems. It makes the following five contributions. First, we identify a class of combinatorics math problems which traditional lifted counting techniques fail to model or solve efficiently. Second, we propose a novel declarative language for this class of problems. Third, we propose novel lifted solving algorithms bridging probabilistic inference techniques and constraint programming. Fourth, we implement them in a lifted solver that efficiently solves the class of problems under investigation. Finally, we evaluate our contributions on a dataset of real-world combinatorics math problems and on synthetic benchmarks.

NeurIPS Conference 2023 Conference Paper

Unsupervised Anomaly Detection with Rejection

  • Lorenzo Perini
  • Jesse Davis

Anomaly detection aims at detecting unexpected behaviours in the data. Because anomaly detection is usually an unsupervised task, traditional anomaly detectors learn a decision boundary by employing heuristics based on intuitions, which are hard to verify in practice. This introduces some uncertainty, especially close to the decision boundary, that may reduce the user trust in the detector's predictions. A way to combat this is by allowing the detector to reject predictions with high uncertainty (Learning to Reject). This requires employing a confidence metric that captures the distance to the decision boundary and setting a rejection threshold to reject low-confidence predictions. However, selecting a proper metric and setting the rejection threshold without labels are challenging tasks. In this paper, we solve these challenges by setting a constant rejection threshold on the stability metric computed by ExCeeD. Our insight relies on a theoretical analysis of such a metric. Moreover, setting a constant threshold results in strong guarantees: we estimate the test rejection rate, and derive a theoretical upper bound for both the rejection rate and the expected prediction cost. Experimentally, we show that our method outperforms some metric-based methods.

AAAI Conference 2022 Conference Paper

Transferring the Contamination Factor between Anomaly Detection Domains by Shape Similarity

  • Lorenzo Perini
  • Vincent Vercruyssen
  • Jesse Davis

Anomaly detection attempts to find examples in a dataset that do not conform to the expected behavior. Algorithms for this task assign an anomaly score to each example representing its degree of anomalousness. Setting a threshold on the anomaly scores enables converting these scores into a discrete prediction for each example. Setting an appropriate threshold is challenging in practice since anomaly detection is often treated as an unsupervised problem. A common approach is to set the threshold based on the dataset’s contamination factor, i.e., the proportion of anomalous examples in the data. While the contamination factor may be known based on domain knowledge, it is often necessary to estimate it by labeling data. However, many anomaly detection problems involve monitoring multiple related, yet slightly different entities (e.g., a fleet of machines). Then, estimating the contamination factor for each dataset separately by labeling data would be extremely time-consuming. Therefore, this paper introduces a method for transferring the known contamination factor from one dataset (the source domain) to a related dataset where it is unknown (the target domain). Our approach does not require labeled target data and is based on modeling the shape of the distribution of the anomaly scores in both domains. We theoretically analyze how our method behaves when the (biased) target domain anomaly score distribution converges to its true one. Empirically, our method outperforms several baselines on real-world datasets.
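The thresholding step the abstract describes, converting continuous anomaly scores into discrete predictions given a contamination factor, can be sketched in a few lines. This is a minimal illustration of the standard quantile threshold only, not the paper's transfer method; the function name and data are our own.

```python
import numpy as np

def threshold_by_contamination(scores, contamination):
    """Flag the top `contamination` fraction of anomaly scores as anomalous."""
    scores = np.asarray(scores, dtype=float)
    # the (1 - contamination) quantile of the scores is the decision threshold
    t = np.quantile(scores, 1.0 - contamination)
    return (scores >= t).astype(int), t

scores = np.array([0.1, 0.2, 0.15, 0.9, 0.3, 0.25, 0.8, 0.05, 0.12, 0.22])
preds, t = threshold_by_contamination(scores, contamination=0.2)
print(preds.sum())  # 2 of the 10 examples are flagged as anomalies
```

The paper's contribution is estimating the right `contamination` value for a target domain where it is unknown, by matching the shape of the score distributions across domains.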

AAAI Conference 2022 Conference Paper

Unifying Knowledge Base Completion with PU Learning to Mitigate the Observation Bias

  • Jonas Schouterden
  • Jessa Bekker
  • Jesse Davis
  • Hendrik Blockeel

Methods for Knowledge Base Completion (KBC) reason about a knowledge base (KB) in order to derive new facts that should be included in the KB. This is challenging for two reasons. First, KBs only contain positive examples. This complicates model evaluation, which needs both positive and negative examples. Second, the facts that were selected to be included in the knowledge base are most likely not an i.i.d. sample of the true facts, due to the way knowledge bases are constructed. In this paper, we focus on rule-based approaches, which traditionally address the first challenge by making assumptions that enable identifying negative examples, which in turn makes it possible to compute a rule’s confidence or precision. However, they largely ignore the second challenge, which means that their estimates of a rule’s confidence can be biased. This paper approaches rule-based KBC through the lens of PU learning, which can cope with both challenges. We make three contributions. (1) We provide a unifying view that formalizes the relationship between multiple existing confidence measures based on (i) what assumption they make about, and (ii) how their accuracy depends on, the selection mechanism. (2) We introduce two new confidence measures that can mitigate known biases by using propensity scores that quantify how likely a fact is to be included in the KB. (3) We show through theoretical and empirical analysis that taking the bias into account improves the confidence estimates, even when the propensity scores are not known exactly.

ICML Conference 2021 Conference Paper

Versatile Verification of Tree Ensembles

  • Laurens Devos
  • Wannes Meert
  • Jesse Davis

Machine learned models often must abide by certain requirements (e.g., fairness or legal constraints). This has spurred interest in developing approaches that can provably verify whether a model satisfies certain properties. This paper introduces a generic algorithm called Veritas that enables tackling multiple different verification tasks for tree ensemble models like random forests (RFs) and gradient boosted decision trees (GBDTs). This generality contrasts with previous work, which has focused exclusively on either adversarial example generation or robustness checking. Veritas formulates the verification task as a generic optimization problem and introduces a novel search space representation. Veritas offers two key advantages. First, it provides anytime lower and upper bounds when the optimization problem cannot be solved exactly. In contrast, many existing methods have focused on exact solutions and are thus limited by the verification problem being NP-complete. Second, Veritas produces full (bounded suboptimal) solutions that can be used to generate concrete examples. We experimentally show that our method produces state-of-the-art robustness estimates, especially when executed with strict time constraints. This is exceedingly important when checking robustness over large datasets. Additionally, we show that Veritas enables tackling more real-world verification scenarios.

IJCAI Conference 2020 Conference Paper

Class Prior Estimation in Active Positive and Unlabeled Learning

  • Lorenzo Perini
  • Vincent Vercruyssen
  • Jesse Davis

Estimating the proportion of positive examples (i.e., the class prior) from positive and unlabeled (PU) data is an important task that facilitates learning a classifier from such data. In this paper, we explore how to tackle this problem when the observed labels were acquired via active learning. This introduces the challenge that the observed labels were not selected completely at random, which is the primary assumption underpinning existing approaches to estimating the class prior from PU data. We analyze this new setting and design an algorithm that is able to estimate the class prior for a given active learning strategy. Empirically, we show that our approach accurately recovers the true class prior on a benchmark of anomaly detection datasets and that it does so more accurately than existing methods.

ECAI Conference 2020 Conference Paper

STRiKE: Rule-Driven Relational Learning Using Stratified k-Entailment

  • Martin Svatos
  • Steven Schockaert
  • Jesse Davis
  • Ondrej Kuzelka

Relational learning for knowledge base completion has been receiving considerable attention. Intuitively, rule-based strategies are clearly appealing, given their transparency and their ability to capture complex relational dependencies. In practice, however, pure rule-based strategies are currently not competitive with state-of-the-art methods, which is a reflection of the fact that (i) learning high-quality rules is challenging, and (ii) classical entailment is too brittle to cope with the noisy nature of the learned rules and the given knowledge base. In this paper, we introduce STRiKE, a new approach for relational learning in knowledge bases which addresses these concerns. Our contribution is three-fold. First, we introduce a new method for learning stratified rule bases from relational data. Second, to use these rules in a noise-tolerant way, we propose a strategy which extends k-entailment, a recently introduced cautious entailment relation, to stratified rule bases. Finally, we introduce an efficient algorithm for reasoning based on k-entailment.

IJCAI Conference 2020 Conference Paper

VAEP: An Objective Approach to Valuing On-the-Ball Actions in Soccer (Extended Abstract)

  • Tom Decroos
  • Lotte Bransen
  • Jan Van Haaren
  • Jesse Davis

Despite the fact that objectively assessing the impact of the individual actions performed by soccer players during games is a crucial task, most traditional metrics have substantial shortcomings. First, many metrics only consider rare actions like shots and goals which account for less than 2% of all on-the-ball actions. Second, they fail to account for the context in which the actions occurred. This work summarizes several important contributions. First, we describe a language for representing individual player actions on the pitch. This language unifies several existing formats which greatly simplifies automated analysis and this language is becoming widely used in the soccer analytics community. Second, we describe our framework for valuing any type of player action based on its impact on the game outcome while accounting for the context in which the action happened. This framework enables giving a broad overview of a player's performance, including quantifying a player's total offensive and defensive contributions to their team. Third, we provide illustrative use cases that highlight the working and benefits of our framework.

UAI Conference 2019 Conference Paper

Markov Logic Networks for Knowledge Base Completion: A Theoretical Analysis Under the MCAR Assumption

  • Ondrej Kuzelka
  • Jesse Davis

We study the following question. We are given a knowledge base in which some facts are missing. We learn the weights of a Markov logic network using maximum likelihood estimation on this knowledge base and then use the learned Markov logic network to predict the missing facts. Assuming that the facts are missing independently and with the same probability, can we say that this approach is consistent in some precise sense? This is a non-trivial question because we are learning from only one training example. In this paper we show that the answer to this question is positive.

AAAI Conference 2018 Conference Paper

Estimating the Class Prior in Positive and Unlabeled Data Through Decision Tree Induction

  • Jessa Bekker
  • Jesse Davis

For tasks such as medical diagnosis and knowledge base completion, a classifier may only have access to positive and unlabeled examples, where the unlabeled data consists of both positive and negative examples. One way that enables learning from this type of data is knowing the true class prior. In this paper, we propose a simple yet effective method for estimating the class prior, by estimating the probability that a positive example is selected to be labeled. Our key insight is that subdomains of the data give a lower bound on this probability. This lower bound gets closer to the real probability as the ratio of labeled examples increases. Finding such subsets can naturally be done via top-down decision tree induction. Experiments show that our method produces estimates as accurate as those of state-of-the-art methods, while being an order of magnitude faster.
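The abstract's key insight can be sketched concretely. Under the SCAR assumption (positives are labeled completely at random with frequency c), the labeled fraction inside any subdomain lower-bounds c, and the bound tightens in nearly-pure positive subdomains; the class prior is then P(labeled)/c. The sketch below is our own simplification using fixed equal-width bins, whereas the paper finds subdomains via top-down decision tree induction.

```python
import numpy as np

def estimate_class_prior(x, s, n_bins=10, min_size=500):
    """Estimate the class prior from PU data: take the largest labelled
    fraction over subdomains as an estimate of c = P(labelled | positive),
    then return P(labelled) / c."""
    x, s = np.asarray(x, dtype=float), np.asarray(s, dtype=bool)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    c_hat = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x <= hi)
        if mask.sum() >= min_size:               # skip small, noisy bins
            c_hat = max(c_hat, s[mask].mean())   # lower bound on c
    return s.mean() / c_hat

# synthetic PU data: true prior 0.5, true label frequency c = 0.4
rng = np.random.default_rng(1)
n = 20000
y = rng.random(n) < 0.5                          # latent positive labels
x = np.where(y, rng.normal(3, 1, n), rng.normal(-3, 1, n))
s = y & (rng.random(n) < 0.4)                    # SCAR: only positives get labelled
print(round(estimate_class_prior(x, s), 2))      # roughly the true prior of 0.5
```

Here the bins far to the right are almost purely positive, so their labeled fraction approaches c, which is why the maximum over subdomains is a usable estimate rather than just a bound.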

UAI Conference 2018 Conference Paper

PAC-Reasoning in Relational Domains

  • Ondrej Kuzelka
  • Yuyi Wang 0001
  • Jesse Davis
  • Steven Schockaert

We consider the problem of predicting plausible missing facts in relational data, given a set of imperfect logical rules. In particular, our aim is to provide bounds on the (expected) number of incorrect inferences that are made in this way. Since for classical inference it is in general impossible to bound this number in a non-trivial way, we consider two inference relations that weaken, but remain close in spirit to classical inference.

AAAI Conference 2018 Conference Paper

Relational Marginal Problems: Theory and Estimation

  • Ondřej Kuželka
  • Yuyi Wang
  • Jesse Davis
  • Steven Schockaert

In the propositional setting, the marginal problem is to find a (maximum-entropy) distribution that has some given marginals. We study this problem in a relational setting and make the following contributions. First, we compare two different notions of relational marginals. Second, we show a duality between the resulting relational marginal problems and the maximum likelihood estimation of the parameters of relational models, which generalizes a well-known duality from the propositional setting. Third, by exploiting the relational marginal formulation, we present a statistically sound method to learn the parameters of relational models that will be applied in settings where the number of constants differs between the training and test data. Furthermore, based on a relational generalization of marginal polytopes, we characterize cases where the standard estimators based on a feature’s number of true groundings need to be adjusted, and we quantitatively characterize the consequences of these adjustments. Fourth, we prove bounds on expected errors of the estimated parameters, which allows us to lower-bound, among other things, the effective sample size of relational training data.

IJCAI Conference 2017 Conference Paper

Induction of Interpretable Possibilistic Logic Theories from Relational Data

  • Ondrej Kuzelka
  • Jesse Davis
  • Steven Schockaert

The field of statistical relational learning (SRL) is concerned with learning probabilistic models from relational data. Learned SRL models are typically represented using some kind of weighted logical formulas, which makes them considerably more interpretable than those obtained by e.g. neural networks. In practice, however, these models are often still difficult to interpret correctly, as they can contain many formulas that interact in non-trivial ways and weights do not always have an intuitive meaning. To address this, we propose a new SRL method which uses possibilistic logic to encode relational models. Learned models are then essentially stratified classical theories, which explicitly encode what can be derived with a given level of certainty. Compared to Markov Logic Networks (MLNs), our method is faster and produces considerably more interpretable models.

AAAI Conference 2017 Conference Paper

Predicting Soccer Highlights from Spatio-Temporal Match Event Streams

  • Tom Decroos
  • Vladimir Dzyuba
  • Jan Van Haaren
  • Jesse Davis

Sports broadcasters are continuously seeking to make their live coverages of soccer matches more attractive. A recent innovation is the “highlight channel,” which shows the most interesting events from multiple matches played at the same time. However, switching between matches at the right time is challenging in fast-paced sports like soccer, where interesting situations often evolve as quickly as they disappear again. This paper presents the POGBA algorithm for automatically predicting highlights in soccer matches, which is an important task that has not yet been addressed. POGBA leverages spatio-temporal event streams collected during matches to predict the probability that a particular game state will lead to a goal. An empirical evaluation on a real-world dataset shows that POGBA outperforms the baseline algorithms in terms of both precision and recall.

IJCAI Conference 2017 Conference Paper

Solving Probability Problems in Natural Language

  • Anton Dries
  • Angelika Kimmig
  • Jesse Davis
  • Vaishak Belle
  • Luc De Raedt

The ability to solve probability word problems, such as those found in introductory discrete mathematics textbooks, is an important cognitive and intellectual skill. In this paper, we develop a two-step, end-to-end, fully automated approach for solving such questions that is able to automatically provide answers to exercises about probability formulated in natural language. In the first step, a question formulated in natural language is analysed and transformed into a high-level model specified in a declarative language. In the second step, a solution to the high-level model is computed using a probabilistic programming system. On a dataset of 2160 probability problems, our solver is able to correctly answer 97.5% of the questions given a correct model. On the end-to-end evaluation, we are able to answer 12.5% of the questions (or 31.1% if we exclude examples not supported by design).

IJCAI Conference 2016 Conference Paper

Dynamic Early Stopping for Naive Bayes

  • Aäron Verachtert
  • Hendrik Blockeel
  • Jesse Davis

Energy efficiency is a concern for any software running on mobile devices. As such software employs machine-learned models to make predictions, this motivates research on efficiently executable models. In this paper, we propose a variant of the widely used Naive Bayes (NB) learner that yields a more efficient predictive model. In contrast to standard NB, where the learned model inspects all features to come to a decision, or NB with feature selection, where the model uses a fixed subset of the features, our model dynamically determines, on a case-by-case basis, when to stop inspecting features. We show that our approach is often much more efficient than the current state of the art, without loss of accuracy.

ECAI Conference 2016 Conference Paper

Interpretable Encoding of Densities Using Possibilistic Logic

  • Ondrej Kuzelka
  • Jesse Davis
  • Steven Schockaert

Probability density estimation from data is a widely studied problem. Often, the primary goal is to faithfully mimic the underlying empirical density. Having an interpretable model that allows insight into why certain predictions were made is often of secondary importance. Using logic-based formalisms, such as Markov logic, can help with interpretability, but even in Markov logic it can be difficult to gain insight into a model's behavior due to interactions between the logical formulas used to specify the model. This paper explores an alternative approach to representing densities that makes use of possibilistic logic. Concretely, we propose a novel way to transform a learned density tree into a possibilistic logic theory. An advantage of our transformation is that it permits performing both MAP and, surprisingly, marginal inference, with the converted possibilistic logic theory. At the same time, we still retain the benefits conferred by using possibilistic logic, such as the ability to compact the theory and the interpretability of the model.

IJCAI Conference 2016 Conference Paper

Learning Possibilistic Logic Theories from Default Rules

  • Ondřej Kuželka
  • Jesse Davis
  • Steven Schockaert

We introduce a setting for learning possibilistic logic theories from defaults of the form "if alpha then typically beta". We first analyse this problem from the point of view of machine learning theory, determining the VC dimension of possibilistic stratifications as well as the complexity of the associated learning problems, after which we present a heuristic learning algorithm that can easily scale to thousands of defaults. An important property of our approach is that it is inherently able to handle noisy and conflicting sets of defaults. Among others, this allows us to learn possibilistic logic theories from crowdsourced data and to approximate propositional Markov logic networks using heuristic MAP solvers. We present experimental results that demonstrate the effectiveness of this approach.

ECAI Conference 2016 Conference Paper

Learning the Structure of Dynamic Hybrid Relational Models

  • Davide Nitti 0001
  • Irma Ravkic
  • Jesse Davis
  • Luc De Raedt

Typical approaches to relational MDPs consider only discrete variables or else discretize the continuous variables prior to inference or learning. In contrast, we consider hybrid relational MDPs, which are represented as probabilistic programs and specify the probability density function of the continuous variables. Our key contribution is that we introduce a technique for learning their structure (and parameters) from data. The learned models contain rich relational descriptions as well as mathematical equations. We demonstrate the utility of our approach by learning a model that accurately predicts the effects of robot-arm actions. The learned model is then used for planning tasks.

UAI Conference 2015 Conference Paper

Encoding Markov logic networks in Possibilistic Logic

  • Ondrej Kuzelka
  • Jesse Davis
  • Steven Schockaert

Markov logic uses weighted formulas to compactly encode a probability distribution over possible worlds. Despite the use of logical formulas, Markov logic networks (MLNs) can be difficult to interpret, due to the often counter-intuitive meaning of their weights. To address this issue, we propose a method to construct a possibilistic logic theory that exactly captures what can be derived from a given MLN using maximum a posteriori (MAP) inference. Unfortunately, the size of this theory is exponential in general. We therefore also propose two methods which can derive compact theories that still capture MAP inference, but only for specific types of evidence. These theories can be used, among other purposes, to make explicit the hidden assumptions underlying an MLN or to explain the predictions it makes.

AAAI Conference 2015 Conference Paper

TODTLER: Two-Order-Deep Transfer Learning

  • Jan Van Haaren
  • Andrey Kolobov
  • Jesse Davis

The traditional way of obtaining models from data, inductive learning, has proved itself both in theory and in many practical applications. However, in domains where data is difficult or expensive to obtain, e.g., medicine, deep transfer learning is a more promising technique. It circumvents the model acquisition difficulties caused by scarce data in a target domain by carrying over structural properties of a model learned in a source domain where training data is ample. Nonetheless, the lack of a principled view of transfer learning so far has limited its adoption. In this paper, we address this issue by regarding transfer learning as a process that biases learning in a target domain in favor of patterns useful in a source domain. Specifically, we consider a first-order logic model of the data as an instantiation of a set of second-order templates. Hence, the usefulness of a model is partly determined by the learner's prior distribution over these template sets. The main insight of our work is that transferring knowledge amounts to acquiring a posterior over the second-order template sets by learning in the source domain and using this posterior when learning in the target setting. Our experimental evaluation demonstrates that our approach outperforms existing transfer learning techniques in terms of accuracy and runtime.

NeurIPS Conference 2015 Conference Paper

Tractable Learning for Complex Probability Queries

  • Jessa Bekker
  • Jesse Davis
  • Arthur Choi
  • Adnan Darwiche
  • Guy Van den Broeck

Tractable learning aims to learn probabilistic models where inference is guaranteed to be efficient. However, the particular class of queries that is tractable depends on the model and underlying representation. Usually this class is MPE or conditional probabilities Pr(x|y) for joint assignments x, y. We propose a tractable learner that guarantees efficient inference for a broader class of queries. It simultaneously learns a Markov network and its tractable circuit representation, in order to guarantee and measure tractability. Our approach differs from earlier work by using Sentential Decision Diagrams (SDD) as the tractable language instead of Arithmetic Circuits (AC). SDDs have desirable properties, which more general representations such as ACs lack, that enable basic primitives for Boolean circuit compilation. This allows us to support a broader class of complex probability queries, including counting, threshold, and parity, in polytime.

IJCAI Conference 2015 Conference Paper

Unsupervised Learning of an IS-A Taxonomy from a Limited Domain-Specific Corpus

  • Daniele Alfarone
  • Jesse Davis

Taxonomies hierarchically organize concepts in a domain. Building and maintaining them by hand is a tedious and time-consuming task. This paper proposes a novel, unsupervised algorithm for automatically learning an IS-A taxonomy from scratch by analyzing a given text corpus. Our approach is designed to deal with infrequently occurring concepts, so it can effectively induce taxonomies even from small corpora. Algorithmically, the approach makes two important contributions. First, it performs inference based on clustering and distributional semantics, which can capture links among concepts never mentioned together. Second, it uses a novel graph-based algorithm to detect and remove incorrect is-a relations from a taxonomy. An empirical evaluation on five corpora demonstrates the utility of our proposed approach.

JMLR Journal 2014 Journal Article

Improving Markov Network Structure Learning Using Decision Trees

  • Daniel Lowd
  • Jesse Davis

Most existing algorithms for learning Markov network structure either are limited to learning interactions among few variables or are very slow, due to the large space of possible structures. In this paper, we propose three new methods for using decision trees to learn Markov network structures. The advantage of using decision trees is that they are very fast to learn and can represent complex interactions among many variables. The first method, DTSL, learns a decision tree to predict each variable and converts each tree into a set of conjunctive features that define the Markov network structure. The second, DT-BLM, builds on DTSL by using it to initialize a search-based Markov network learning algorithm recently proposed by Davis and Domingos (2010). The third, DT+L1, combines the features learned by DTSL with those learned by an L1-regularized logistic regression method (L1) proposed by Ravikumar et al. (2009). In an extensive empirical evaluation on 20 data sets, DTSL is comparable to L1 and significantly faster and more accurate than two other baselines. DT-BLM is slower than DTSL, but obtains slightly higher accuracy. DT+L1 combines the strengths of DTSL and L1 to perform significantly better than either of them with only a modest increase in training time.
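The tree-to-feature step the abstract describes for DTSL can be sketched in a few lines: each root-to-leaf path of the tree predicting a variable becomes a conjunctive feature over the tested variables. This is a schematic sketch only; the nested-tuple tree encoding and the function name `paths_to_features` are invented for illustration and are not the paper's actual representation.

```python
# Toy tree encoding (hypothetical, for illustration only):
#   internal node = (variable, subtree_if_False, subtree_if_True)
#   leaf = None (leaf predictions are irrelevant for structure extraction)

def paths_to_features(node, target, path=()):
    """Each root-to-leaf path of the tree predicting `target` becomes a
    conjunctive feature over the tested variables, conjoined with the
    target variable itself."""
    if node is None:  # reached a leaf: emit the accumulated conjunction
        return [path + (target,)]
    var, on_false, on_true = node
    return (paths_to_features(on_false, target, path + (f"!{var}",))
            + paths_to_features(on_true, target, path + (var,)))

# A tiny tree predicting C: first test A; if A is false, test B.
tree = ("A", ("B", None, None), None)
for feat in paths_to_features(tree, "C"):
    print(" & ".join(feat))
# prints: "!A & !B & C", "!A & B & C", "A & C"
```

Running this for every variable's tree and pooling the resulting conjunctions yields a candidate feature set for the Markov network, whose weights are then learned separately.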

NeurIPS Conference 2013 Conference Paper

First-order Decomposition Trees

  • Nima Taghipour
  • Jesse Davis
  • Hendrik Blockeel

Lifting attempts to speed up probabilistic inference by exploiting symmetries in the model. Exact lifted inference methods, like their propositional counterparts, work by recursively decomposing the model and the problem. In the propositional case, there exist formal structures, such as decomposition trees (dtrees), that represent such a decomposition and allow us to determine the complexity of inference a priori. However, there is currently no equivalent structure nor analogous complexity results for lifted inference. In this paper, we introduce FO-dtrees, which upgrade propositional dtrees to the first-order level. We show how these trees can characterize a lifted inference solution for a probabilistic logical model (in terms of a sequence of lifted operations), and provide a theoretical analysis of the complexity of lifted inference in terms of the novel notion of lifted width for the tree.

AAAI Conference 2012 Conference Paper

Conditioning in First-Order Knowledge Compilation and Lifted Probabilistic Inference

  • Guy Van den Broeck
  • Jesse Davis

Knowledge compilation is a powerful technique for compactly representing and efficiently reasoning about logical knowledge bases. It has been successfully applied to numerous problems in artificial intelligence, such as probabilistic inference and conformant planning. Conditioning, which updates a knowledge base with observed truth values for some propositions, is one of the fundamental operations employed for reasoning. In the propositional setting, conditioning can be efficiently applied in all cases. Recently, people have explored compilation for first-order knowledge bases. The majority of this work has centered around using first-order d-DNNF circuits as the target compilation language. However, conditioning has not been studied in this setting. This paper explores how to condition a first-order d-DNNF circuit. We show that it is possible to efficiently condition these circuits on unary relations. However, we prove that conditioning on higher arity relations is #P-hard. We study the implications of these findings on the application of performing lifted inference for first-order probabilistic models. This leads to a better understanding of which types of queries lifted inference can address.

AAAI Conference 2012 Conference Paper

Markov Network Structure Learning: A Randomized Feature Generation Approach

  • Jan Van Haaren
  • Jesse Davis

The structure of a Markov network is typically learned in one of two ways. The first approach is to treat this task as a global search problem. However, these algorithms are slow as they require running the expensive operation of weight (i.e., parameter) learning many times. The second approach involves learning a set of local models and then combining them into a global model. However, it can be computationally expensive to learn the local models for datasets that contain a large number of variables and/or examples. This paper pursues a third approach that views Markov network structure learning as a feature generation problem. The algorithm combines a data-driven, specific-to-general search strategy with randomization to quickly generate a large set of candidate features that all have support in the data. It uses weight learning, with L1 regularization, to select a subset of generated features to include in the model. On a large empirical study, we find that our algorithm is as accurate as other state-of-the-art methods while exhibiting a much faster run time.

IJCAI Conference 2011 Conference Paper

Lifted Probabilistic Inference by First-Order Knowledge Compilation

  • Guy Van den Broeck
  • Nima Taghipour
  • Wannes Meert
  • Jesse Davis
  • Luc De Raedt

Probabilistic logical languages provide powerful formalisms for knowledge representation and learning. Yet performing inference in these languages is extremely costly, especially if it is done at the propositional level. Lifted inference algorithms, which avoid repeated computation by treating indistinguishable groups of objects as one, help mitigate this cost. Seeking inspiration from logical inference, where lifted inference (e.g., resolution) is commonly performed, we develop a model-theoretic approach to probabilistic lifted inference. Our algorithm compiles a first-order probabilistic theory into a first-order deterministic decomposable negation normal form (d-DNNF) circuit. Compilation offers the advantage that inference is polynomial in the size of the circuit. Furthermore, by borrowing techniques from the knowledge compilation literature, our algorithm effectively exploits the logical structure (e.g., context-specific independencies) within the first-order model, which allows more computation to be done at the lifted level. An empirical comparison demonstrates the utility of the proposed approach.

IJCAI Conference 2007 Conference Paper

Change of Representation for Statistical Relational Learning

  • Jesse Davis
  • Irene Ong
  • Jan Struyf
  • Elizabeth Burnside
  • David Page
  • Vítor Santos Costa

Statistical relational learning (SRL) algorithms learn statistical models from relational data, such as that stored in a relational database. We previously introduced view learning for SRL, in which the view of a relational database can be automatically modified, yielding more accurate statistical models. This paper presents SAYU-VISTA, an algorithm which advances beyond the initial view learning approach in three ways. First, it learns views that introduce new relational tables, rather than merely new fields for an existing table of the database. Second, new tables or new fields are not limited to being approximations to some target concept; instead, the new approach performs a type of predicate invention. The new approach avoids the classical pitfall of predicate invention, namely learning many useless predicates, by keeping only new fields or tables (i.e., new predicates) that immediately improve the performance of the statistical model. Third, retained fields or tables can then be used in the definitions of further new fields or tables. We evaluate the new view learning approach on three relational classification tasks.

ICML Conference 2006 Conference Paper

The relationship between Precision-Recall and ROC curves

  • Jesse Davis
  • Mark H. Goadrich

Receiver Operating Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space. A corollary is the notion of an achievable PR curve, which has properties much like the convex hull in ROC space; we show an efficient algorithm for computing this curve. Finally, we also note that differences between the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, algorithms that optimize the area under the ROC curve are not guaranteed to optimize the area under the PR curve.
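The interpolation caveat in this abstract can be checked numerically: drawing a straight line between two PR points overestimates precision, because the achievable curve comes from interpolating the underlying TP/FP counts (which is what a straight line in ROC space corresponds to) and then recomputing precision. A minimal sketch, where the confusion counts `POS`, `A`, and `B` are hypothetical numbers chosen for illustration:

```python
def pr_point(tp, fp, pos):
    # precision and recall from raw confusion counts
    return tp / (tp + fp), tp / pos

def pr_interpolate(a, b, pos, x):
    """Achievable PR interpolation: interpolate the raw (TP, FP) counts
    between operating points a and b, then recompute precision/recall."""
    tp = a[0] + x * (b[0] - a[0])
    fp = a[1] + x * (b[1] - a[1])
    return pr_point(tp, fp, pos)

POS = 10       # total positives in the (hypothetical) data set
A = (5, 5)     # TP=5,  FP=5  -> precision 0.5,  recall 0.5
B = (10, 30)   # TP=10, FP=30 -> precision 0.25, recall 1.0

prec_true, rec = pr_interpolate(A, B, POS, 0.5)  # precision 0.3 at recall 0.75
prec_linear = 0.5 * (0.5 + 0.25)                 # naive straight line: 0.375
print(rec, prec_true, prec_linear)
```

The achievable precision at recall 0.75 is 0.3, while the straight line between the two PR points promises 0.375, so linear interpolation in PR space is optimistic.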

IJCAI Conference 2005 Conference Paper

View Learning for Statistical Relational Learning: With an Application to Mammography

  • Jesse Davis
  • Elizabeth Burnside
  • Inês Dutra
  • David Page
  • Raghu Ramakrishnan
  • Vítor Santos Costa
  • Jude Shavlik

Statistical relational learning (SRL) constructs probabilistic models from relational databases. A key capability of SRL is the learning of arcs (in the Bayes net sense) connecting entries in different rows of a relational table, or in different tables. Nevertheless, SRL approaches currently are constrained to use the existing database schema. For many database applications, users find it profitable to define alternative “views” of the database, in effect defining new fields or tables. Such new fields or tables can also be highly useful in learning. We provide SRL with the capability of learning new views.