Author name cluster

Hendrik Blockeel

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
2 author rows

Possible papers

ICML Conference 2025 Conference Paper

Compressing tree ensembles through Level-wise Optimization and Pruning

  • Laurens Devos
  • Timo Martens
  • Deniz Can Oruc
  • Wannes Meert
  • Hendrik Blockeel
  • Jesse Davis

Tree ensembles (e.g., gradient boosting decision trees) are often used in practice because they offer excellent predictive performance while still being easy and efficient to learn. In some contexts, it is important to additionally optimize their size: this is specifically the case when models need to have verifiable properties (verification of fairness, robustness, etc. is often exponential in the ensemble’s size), or when models run on battery-powered devices (smaller ensembles consume less energy, increasing battery autonomy). For this reason, compression of tree ensembles is worth studying. This paper presents LOP, a method for compressing a given tree ensemble by pruning or entirely removing trees in it, while updating leaf predictions in such a way that predictive accuracy is mostly unaffected. Empirically, LOP achieves compression factors that are often 10 to 100 times better than those of competing methods.

AAAI Conference 2024 Conference Paper

DeepSaDe: Learning Neural Networks That Guarantee Domain Constraint Satisfaction

  • Kshitij Goyal
  • Sebastijan Dumancic
  • Hendrik Blockeel

As machine learning models, specifically neural networks, are becoming increasingly popular, there are concerns regarding their trustworthiness, especially in safety-critical applications, e.g., the actions of an autonomous vehicle must be safe. There are approaches that can train neural networks where such domain requirements are enforced as constraints, but they either cannot guarantee that the constraint will be satisfied by all possible predictions (even on unseen data) or they are limited in the type of constraints that can be enforced. In this paper, we present an approach to train neural networks which can enforce a wide variety of constraints and guarantee that the constraint is satisfied by all possible predictions. The approach builds on earlier work where learning linear models is formulated as a constraint satisfaction problem (CSP). To make this idea applicable to neural networks, two crucial new elements are added: constraint propagation over the network layers, and weight updates based on a mix of gradient descent and CSP solving. Evaluation on various machine learning tasks demonstrates that our approach is flexible enough to enforce a wide variety of domain constraints and is able to guarantee them in neural networks.

AAAI Conference 2022 Conference Paper

Unifying Knowledge Base Completion with PU Learning to Mitigate the Observation Bias

  • Jonas Schouterden
  • Jessa Bekker
  • Jesse Davis
  • Hendrik Blockeel

Methods for Knowledge Base Completion (KBC) reason about a knowledge base (KB) in order to derive new facts that should be included in the KB. This is challenging for two reasons. First, KBs only contain positive examples. This complicates model evaluation, which needs both positive and negative examples. Second, those facts that were selected to be included in the knowledge base are most likely not an i.i.d. sample of the true facts, due to the way knowledge bases are constructed. In this paper, we focus on rule-based approaches, which traditionally address the first challenge by making assumptions that enable identifying negative examples, which in turn makes it possible to compute a rule’s confidence or precision. However, they largely ignore the second challenge, which means that their estimates of a rule’s confidence can be biased. This paper approaches rule-based KBC through the lens of PU learning, which can cope with both challenges. We make three contributions. (1) We provide a unifying view that formalizes the relationship between multiple existing confidence measures based on (i) what assumption they make about the selection mechanism and (ii) how their accuracy depends on it. (2) We introduce two new confidence measures that can mitigate known biases by using propensity scores that quantify how likely a fact is to be included in the KB. (3) We show through theoretical and empirical analysis that taking the bias into account improves the confidence estimates, even when the propensity scores are not known exactly.

IJCAI Conference 2019 Conference Paper

Learning Relational Representations with Auto-encoding Logic Programs

  • Sebastijan Dumancic
  • Tias Guns
  • Wannes Meert
  • Hendrik Blockeel

Deep learning methods capable of handling relational data have proliferated over the past years. In contrast to traditional relational learning methods that leverage first-order logic for representing such data, these methods aim at re-representing symbolic relational data in Euclidean space. They offer better scalability, but can only approximate rich relational structures and are less flexible in terms of reasoning. This paper introduces a novel framework for relational representation learning that combines the best of both worlds. This framework, inspired by the auto-encoding principle, uses first-order logic as a data representation language, and the mapping between the original and latent representation is done by means of logic programs instead of neural networks. We show how learning can be cast as a constraint optimisation problem for which existing solvers can be used. The use of logic as a representation language makes the proposed framework more accurate (as the representation is exact, rather than approximate), more flexible, and more interpretable than deep learning methods. We experimentally show that these latent representations are indeed beneficial in relational learning tasks.

AAAI Conference 2018 Conference Paper

MERCS: Multi-Directional Ensembles of Regression and Classification Trees

  • Elia Van Wolputte
  • Evgeniya Korneva
  • Hendrik Blockeel

Learning a function fX→Y that predicts Y from X is the archetypal Machine Learning (ML) problem. Typically, both sets of attributes (X, Y) have to be known before a model can be trained. When this is not the case, or when functions fX→Y are needed for varying X and Y, this may introduce significant overhead (separate learning runs for each function). In this paper, we explore the possibility of omitting the specification of X and Y at training time altogether, by learning a multi-directional, or versatile model, which will allow prediction of any Y from any X. Specifically, we introduce a decision tree-based paradigm that generalizes the well-known Random Forests approach to allow for multidirectionality. The result of these efforts is a novel method called MERCS: Multi-directional Ensembles of Regression and Classification treeS. Experiments show the viability of the approach.

IJCAI Conference 2017 Conference Paper

Clustering-Based Relational Unsupervised Representation Learning with an Explicit Distributed Representation

  • Sebastijan Dumancic
  • Hendrik Blockeel

The goal of unsupervised representation learning is to extract a new representation of data, such that solving many different tasks becomes easier. Existing methods typically focus on vectorized data and offer little support for relational data, which additionally describes relationships among instances. In this work we introduce an approach for relational unsupervised representation learning. Viewing a relational dataset as a hypergraph, new features are obtained by clustering vertices and hyperedges. To find a representation suited for many relational learning tasks, a wide range of similarities between relational objects is considered, e.g., feature and structural similarities. We experimentally evaluate the proposed approach and show that models learned on such latent representations perform better, have lower complexity, and outperform the existing approaches on classification tasks.

IJCAI Conference 2017 Conference Paper

COBRA: A Fast and Simple Method for Active Clustering with Pairwise Constraints

  • Toon Van Craenendonck
  • Sebastijan Dumancic
  • Hendrik Blockeel

Clustering is inherently ill-posed: there often exist multiple valid clusterings of a single dataset, and without any additional information a clustering system has no way of knowing which clustering it should produce. This motivates the use of constraints in clustering, as they allow users to communicate their interests to the clustering system. Active constraint-based clustering algorithms select the most useful constraints to query, aiming to produce a good clustering using as few constraints as possible. We propose COBRA, an active method that first over-clusters the data by running K-means with a $K$ that is intended to be too large, and subsequently merges the resulting small clusters into larger ones based on pairwise constraints. In its merging step, COBRA is able to keep the number of pairwise queries low by maximally exploiting constraint transitivity and entailment. We experimentally show that COBRA outperforms the state of the art in terms of clustering quality and runtime, without requiring the number of clusters in advance.

ECAI Conference 2016 Conference Paper

An Efficient and Expressive Similarity Measure for Relational Clustering Using Neighbourhood Trees

  • Sebastijan Dumancic
  • Hendrik Blockeel

Clustering is an underspecified task: there are no universal criteria for what makes a good clustering. This is especially true for relational data, where similarity can be based on the features of individuals, the relationships between them, or a mix of both. Existing methods for relational clustering have strong and often implicit biases in this respect. In this paper, we introduce a novel similarity measure for relational data. It is the first measure to incorporate a wide variety of types of similarity, including similarity of attributes, similarity of relational context, and proximity in a hypergraph. We experimentally evaluate how using this similarity affects the quality of clustering on very different types of datasets. The experiments demonstrate that (a) using this similarity in standard clustering methods consistently gives good results, whereas other measures work well only on datasets that match their bias; and (b) on most datasets, the novel similarity outperforms even the best among the existing ones.

IJCAI Conference 2016 Conference Paper

Dynamic Early Stopping for Naive Bayes

  • Aäron Verachtert
  • Hendrik Blockeel
  • Jesse Davis

Energy efficiency is a concern for any software running on mobile devices. As such software employs machine-learned models to make predictions, this motivates research on efficiently executable models. In this paper, we propose a variant of the widely used Naive Bayes (NB) learner that yields a more efficient predictive model. In contrast to standard NB, where the learned model inspects all features to come to a decision, or NB with feature selection, where the model uses a fixed subset of the features, our model dynamically determines, on a case-by-case basis, when to stop inspecting features. We show that our approach is often much more efficient than the current state of the art, without loss of accuracy.

ECAI Conference 2016 Conference Paper

Relational Grounded Language Learning

  • Leonor Becerra-Bonache
  • Hendrik Blockeel
  • María Galván
  • François Jacquenet

In the past, research on learning language models mainly used syntactic information during the learning process but in recent years, researchers began to also use semantic information. This paper presents such an approach where the input of our learning algorithm is a dataset of pairs made up of sentences and the contexts in which they are produced. The system we present is based on inductive logic programming techniques that aim to learn a mapping between n-grams and a semantic representation of their associated meaning. Experiments have shown that we can learn such a mapping that made it possible later to generate relevant descriptions of images or learn the meaning of words without any linguistic resource.

NeurIPS Conference 2013 Conference Paper

First-order Decomposition Trees

  • Nima Taghipour
  • Jesse Davis
  • Hendrik Blockeel

Lifting attempts to speed up probabilistic inference by exploiting symmetries in the model. Exact lifted inference methods, like their propositional counterparts, work by recursively decomposing the model and the problem. In the propositional case, there exist formal structures, such as decomposition trees (dtrees), that represent such a decomposition and allow us to determine the complexity of inference a priori. However, there is currently no equivalent structure nor analogous complexity results for lifted inference. In this paper, we introduce FO-dtrees, which upgrade propositional dtrees to the first-order level. We show how these trees can characterize a lifted inference solution for a probabilistic logical model (in terms of a sequence of lifted operations), and make a theoretical analysis of the complexity of lifted inference in terms of the novel notion of lifted width for the tree.

IJCAI Conference 2007 Conference Paper

  • Tom Croonenborghs
  • Jan Ramon
  • Hendrik Blockeel
  • Maurice Bruynooghe

In recent years, there has been a growing interest in using rich representations such as relational languages for reinforcement learning. However, while expressive languages have many advantages in terms of generalization and reasoning, extending existing approaches to such a relational setting is a non-trivial problem. In this paper, we present a first step towards the online learning and exploitation of relational models. We propose a representation for the transition and reward function that can be learned online and present a method that exploits these models by augmenting Relational Reinforcement Learning algorithms with planning techniques. The benefits and robustness of our approach are evaluated experimentally.

JMLR Journal 2003 Journal Article

Query Transformations for Improving the Efficiency of ILP Systems

  • Vítor Santos Costa
  • Ashwin Srinivasan
  • Rui Camacho
  • Hendrik Blockeel
  • Bart Demoen
  • Gerda Janssens
  • Jan Struyf
  • Henk Vandecasteele

Relatively simple transformations can speed up the execution of queries for data mining considerably. While some ILP systems use such transformations, relatively little is known about them or how they relate to each other. This paper describes a number of such transformations. Not all of them are novel, but there have been no studies comparing their efficacy. The main contributions of the paper are: (a) it clarifies the relationship between the transformations; (b) it contains an empirical study of what can be gained by applying the transformations; and (c) it provides some guidance on the kinds of problems that are likely to benefit from the transformations.

JMLR Journal 2002 Journal Article

Efficient Algorithms for Decision Tree Cross-validation

  • Hendrik Blockeel
  • Jan Struyf

Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating the cross-validation with the normal decision tree induction process. We discuss how existing decision tree algorithms can be adapted to this aim, and provide an analysis of the speedups these adaptations may yield. We identify a number of parameters that influence the obtainable speedups, and validate and refine our analysis with experiments on a variety of data sets with two different implementations. Besides cross-validation, we also briefly explore the usefulness of these techniques for bagging. We conclude with some guidelines concerning when these optimizations should be considered.

AIJ Journal 1998 Journal Article

Top-down induction of first-order logical decision trees

  • Hendrik Blockeel
  • Luc De Raedt

A first-order framework for top-down induction of logical decision trees is introduced. The expressivity of these trees is shown to be larger than that of the flat logic programs which are typically induced by classical ILP systems, and equal to that of first-order decision lists. These results are related to predicate invention and mixed variable quantification. Finally, an implementation of this framework, the TILDE system, is presented and empirically evaluated.