Arrow Research search

Author name cluster

Pablo Barcelo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

5

TMLR Journal 2025 Journal Article

Link Prediction with Relational Hypergraphs

  • Xingyue Huang
  • Miguel Romero Orth
  • Pablo Barcelo
  • Michael M. Bronstein
  • Ismail Ilkan Ceylan

Link prediction with knowledge graphs has been thoroughly studied in graph machine learning, leading to a rich landscape of graph neural network architectures with successful applications. Nonetheless, it remains challenging to transfer the success of these architectures to inductive link prediction with relational hypergraphs, where the task is over $k$-ary relations, substantially harder than link prediction on knowledge graphs with binary relations only. In this paper, we propose a framework for link prediction with relational hypergraphs, empowering applications of graph neural networks on fully relational structures. Theoretically, we conduct a thorough analysis of the expressive power of the resulting model architectures via corresponding relational Weisfeiler-Leman algorithms and also via logical expressiveness. Empirically, we validate the power of the proposed model architectures on various relational hypergraph benchmarks. The resulting model architectures substantially outperform every baseline for inductive link prediction and also lead to competitive results for transductive link prediction.

JAIR Journal 2025 Journal Article

On Computing Probabilistic Explanations for Decision Trees

  • Marcelo Arenas
  • Pablo Barcelo
  • Alexander Kozachinskiy
  • Miguel Romero
  • Bernardo Subercaseaux

Formal XAI (explainable AI) is a growing area that focuses on computing explanations with mathematical guarantees for the decisions made by ML models. Inside formal XAI, one of the most studied cases is that of explaining the choices taken by decision trees, as they are traditionally deemed as one of the most interpretable classes of models. Recent work has focused on studying the computation of sufficient reasons, a kind of explanation in which given a decision tree T and an instance x, one explains the decision T (x) by providing a subset y of the features of x such that for any other instance z compatible with y, it holds that T (z) = T (x), intuitively meaning that the features in y are already enough to fully justify the classification of x by T. It has been argued, however, that sufficient reasons constitute a restrictive notion of explanation. For such a reason, the community has started to study their probabilistic counterpart, in which one requires that the probability of T (z) = T (x) must be at least some value δ ∈ (0, 1], where z is a random instance that is compatible with y. Our paper settles the computational complexity of δ-sufficient-reasons over decision trees, showing that both (1) finding δ-sufficient-reasons that are minimal in size, and (2) finding δ-sufficient-reasons that are minimal inclusion-wise, are computationally intractable. By doing this, we answer two open problems originally raised by Izza et al. (2021), and extend the hardness of explanations for Boolean circuits presented by Wäldchen et al. (2021) to the more restricted case of decision trees. Furthermore, we present sharp non-approximability results under a widely believed complexity hypothesis. On the positive side, we identify structural restrictions of decision trees that make the problem tractable.

JMLR Journal 2023 Journal Article

On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-Approximability Results

  • Marcelo Arenas
  • Pablo Barcelo
  • Leopoldo Bertossi
  • Mikael Monet

Scores based on Shapley values are widely used for providing explanations to classification results over machine learning models. A prime example of this is the influential~ Shap-score, a version of the Shapley value that can help explain the result of a learned model on a specific entity by assigning a score to every feature. While in general computing Shapley values is a computationally intractable problem, we prove a strong positive result stating that the Shap-score can be computed in polynomial time over deterministic and decomposable Boolean circuits under the so-called product distributions on entities. Such circuits are studied in the field of Knowledge Compilation and generalize a wide range of Boolean circuits and binary decision diagrams classes, including binary decision trees, Ordered Binary Decision Diagrams (OBDDs) and Free Binary Decision Diagrams (FBDDs). Our positive result extends even beyond binary classifiers, as it continues to hold if each feature is associated with a finite domain of possible values. We also establish the computational limits of the notion of Shap-score by observing that, under a mild condition, computing it over a class of Boolean models is always polynomially as hard as the model counting problem for that class. This implies that both determinism and decomposability are essential properties for the circuits that we consider, as removing one or the other renders the problem of computing the Shap-score intractable (namely, $\#P$-hard). It also implies that computing Shap-scores is $\#P$-hard even over the class of propositional formulas in DNF. Based on this negative result, we look for the existence of fully-polynomial randomized approximation schemes (FPRAS) for computing Shap-scores over such class. In stark contrast to the model counting problem for DNF formulas, which admits an FPRAS, we prove that no such FPRAS exists (under widely believed complexity assumptions) for the computation of Shap-scores. Surprisingly, this negative result holds even for the class of monotone formulas in DNF. These techniques can be further extended to prove another strong negative result: Under widely believed complexity assumptions, there is no polynomial-time algorithm that checks, given a monotone DNF formula $\varphi$ and features $x,y$, whether the Shap-score of $x$ in $\varphi$ is smaller than the Shap-score of $y$ in $\varphi$. [abs] [ pdf ][ bib ] &copy JMLR 2023. ( edit, beta )

KR Conference 2016 Conference Paper

Bisimulations on data graphs

  • Sergio Abriola
  • Pablo Barcelo
  • Diego Figueira
  • Santiago Figueira

Bisimulation provides structural conditions to characterize indistinguishability between nodes on graph-like structures from an external observer. It is a fundamental notion used in many areas. However, many applications use graphs where nodes have data, and where observers can test for equality or inequality of data values (e. g., asking the attribute ‘name’ of a node to be different from that of all its neighbors). The present work constitutes a first investigation of “data aware” bisimulations on data graphs. We study the problem of computing such bisimulations, based on the observational indistinguishability for XPath —a language that extends modal logic with tests for data equality. We show that in general the problem is PS PACE-complete, but identify several restrictions that yield better complexity bounds (C O -NP, PT IME) by controlling suitable parameters of the problem; namely, the amount of non-locality allowed, and the class of models considered (graph, DAG, tree). In particular, this analysis yields a hierarchy of tractable fragments.

Highlights Conference 2013 Conference Abstract

Semantic acyclicity on graph databases

  • Pablo Barcelo
  • Gaëlle Fontaine
  • Miguel Romero
  • Moshe Y. Vardi

Unions of acyclic conjunctive queries (CQs) can be evaluated in linear time, as opposed to arbitrary CQs, for which the evaluation problem is NP-complete. It is known that evaluation of semantically acyclic unions of CQs - i. e. , unions of CQs that are equivalent to a union of acyclic ones - is tractable. We study the notion of semantic acyclicity in the context of graph databases and unions of conjunctive regular path queries (UCRPQs). We prove that checking whether a UCRPQ is semantically acyclic is decidable in 2ExpSpace and is ExpSpace-hard. We show that evaluation of semantically acyclic UCRPQs is fixed-parameter tractable.