Arrow Research

Author name cluster

David Tena Cucala

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
1 author record

Possible papers (7)

AIJ Journal 2026 Journal Article

From monotonic graph neural networks to Datalog and back: Expressive power and practical applications

  • David Tena Cucala
  • Bernardo Cuenca Grau
  • Boris Motik
  • Egor V. Kostylev

Many tasks over knowledge graphs, such as link prediction, can be conceptualised as a problem of learning a transformation of sets of relational facts. Machine learning models such as graph neural networks (GNNs) can be used to realise this transformation, allowing the transformation to be learned from examples. However, it is often difficult to verify formally the properties of such a transformation, or understand why it derives a specific fact. Alternatively, such a transformation can be realised using a set of rules expressed in a knowledge representation language such as Datalog. Formal properties of such a transformation can be verified using symbolic means, and each derived fact can be justified by a rule; however, writing and curating the rules is costly and requires expertise in both the application domain and the formal language. To bridge the gap between these two approaches, in this paper we study the relationship between transformations realised by monotonic max-sum GNNs, a subclass of GNNs with nonnegative weights and max and sum aggregation functions, and transformations realised by Datalog rules. First, we provide an algorithm that can verify whether a given Datalog rule is sound for a network, in the sense that the GNN always derives all consequences of the rule on any input dataset. Second, we provide an algorithm that allows us to justify any fact derived by a GNN by computing a rule that is sound for the GNN and that derives the fact. Third, we study the expressive power of monotonic max-sum GNNs and show that, for each such GNN, one can compute a Datalog program where applying the GNN to any dataset produces the same facts as a single round of application of the program’s rules to the dataset; we also sharpen our result to the subclass of monotonic max GNNs, which use only the max aggregation function, and identify a corresponding class of Datalog programs. Finally, we carry out a practical evaluation and show that monotonic max-sum GNNs can be successfully trained in practice on common knowledge graph tasks, and that extracting rules from max-sum GNNs is practically feasible.
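
As a rough picture of the model class (a minimal sketch with invented shapes and names, not the paper's construction): clamping all parameters to be nonnegative and aggregating neighbour features with sum and max keeps every layer monotone, so enlarging the input dataset can only add derived facts, never remove them.

    import numpy as np

    def max_sum_layer(h, edges, W_self, W_sum, W_max, b):
        # One layer of a simplified monotonic max-sum GNN: nonnegative
        # parameters plus sum and max aggregation over in-neighbours.
        n, d = h.shape
        agg_sum = np.zeros((n, d))
        agg_max = np.zeros((n, d))
        for s, t in edges:
            agg_sum[t] += h[s]
            agg_max[t] = np.maximum(agg_max[t], h[s])
        z = h @ W_self + agg_sum @ W_sum + agg_max @ W_max + b
        return np.minimum(np.maximum(z, 0.0), 1.0)  # clipped ReLU

    rng = np.random.default_rng(0)
    d = 4
    # Nonnegativity of the weights is what makes the layer monotonic.
    W_self, W_sum, W_max = (np.abs(rng.normal(size=(d, d))) for _ in range(3))
    h = np.abs(rng.normal(size=(3, d)))           # 3 nodes, d features
    print(max_sum_layer(h, [(0, 1), (2, 1)], W_self, W_sum, W_max, np.zeros(d)))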

AAAI Conference 2026 Conference Paper

Towards a Common Framework for Autoformalization

  • Agnieszka Mensfelt
  • David Tena Cucala
  • Santiago Franco
  • Angeliki Koutsoukou-Argyraki
  • Vince Trencsenyi
  • Kostas Stathis

Autoformalization has emerged as a term for the automation of formalization, originally in the context of formalizing mathematics with interactive theorem provers (proof assistants). Its rapid development has been driven by progress in deep learning, especially large language models (LLMs). More recently, usage of the term has expanded beyond mathematics to describe tasks that involve translating natural language input into verifiable logical representations. At the same time, a growing body of research explores using LLMs to translate informal language into formal representations for reasoning, planning, and knowledge representation, but without explicitly referring to this process as autoformalization. As a result, despite addressing similar tasks, these research areas have developed largely independently, limiting opportunities for shared methodologies, benchmarks, and theoretical frameworks that could accelerate progress. Our goal is to review instances, explicit or implicit, of what can be considered autoformalization, and to propose a unified framework that encourages cross-pollination between different fields and advances the development of next-generation AI systems.
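
As a concrete picture of the task (an illustrative input-output pair, not an example from the paper): an autoformalization system maps an informal statement to a machine-checkable one, here in Lean 4 with Mathlib, and the proof assistant then verifies the output.

    -- Informal input: "The sum of two even integers is even."
    -- One possible formal output:
    import Mathlib

    theorem even_add_even (m n : ℤ) (hm : Even m) (hn : Even n) :
        Even (m + n) := by
      obtain ⟨a, ha⟩ := hm
      obtain ⟨b, hb⟩ := hn
      exact ⟨a + b, by rw [ha, hb]; ring⟩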

AAAI Conference 2024 Conference Paper

Recurrent Graph Neural Networks and Their Connections to Bisimulation and Logic

  • Maximilian Pflueger
  • David Tena Cucala
  • Egor V. Kostylev

The success of Graph Neural Networks (GNNs) in practice has motivated extensive research on their theoretical properties. This includes recent results that characterise node classifiers expressible by GNNs in terms of first-order logic. Most of the analysis, however, has focused on GNNs with a fixed number of message-passing iterations (i.e., layers), which cannot realise many simple classifiers such as reachability of a node with a given label. In this paper, we start to fill this gap and study the foundations of GNNs that can perform more than a fixed number of message-passing iterations. We first formalise two generalisations of the basic GNNs: recurrent GNNs (RecGNNs), which repeatedly apply message-passing iterations until the node classifications become stable, and graph-size GNNs (GSGNNs), which exploit a built-in function of the input graph size to decide the number of message-passing iterations. We then formally prove that GNN classifiers are strictly less expressive than RecGNN ones, and RecGNN classifiers are strictly less expressive than GSGNN ones. To obtain this result, we identify novel semantic characterisations of the three formalisms in terms of suitable variants of bisimulation, which we believe have their own value for our understanding of GNNs. Finally, we prove syntactic logical characterisations of RecGNNs and GSGNNs analogous to the logical characterisation of plain GNNs, connecting the two formalisms to monadic monotone fixpoint logic, a generalisation of first-order logic that supports recursion.
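
The stabilisation condition that defines RecGNNs can be pictured with a short sketch (all names below are illustrative, not from the paper): the same message-passing layer is reapplied until the thresholded node classifications stop changing, which lets a one-feature "marked" label propagate along a path of any length.

    import numpy as np

    def reachability_layer(h, edges):
        # One monotone message-passing step: a node becomes marked if it
        # already is, or if it has a marked in-neighbour.
        out = h.copy()
        for s, t in edges:
            out[t] = max(out[t], h[s])
        return out

    def recgnn_classify(h, edges, layer, max_iters=1000):
        # Recurrent GNN: reapply one shared layer until the node
        # classifications become stable.
        labels = h > 0.5
        for _ in range(max_iters):
            h = layer(h, edges)
            new_labels = h > 0.5
            if np.array_equal(new_labels, labels):
                return new_labels        # stable classification reached
            labels = new_labels
        return labels                    # no fixpoint within the budget

    # Nodes 0 -> 1 -> 2, with only node 0 initially marked: a GNN with a
    # fixed number of layers k propagates the mark only k hops, whereas
    # the recurrent loop reaches the end of a path of any length.
    print(recgnn_classify(np.array([1.0, 0.0, 0.0]), [(0, 1), (1, 2)],
                          reachability_layer))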

KR Conference 2023 Conference Paper

On the Correspondence Between Monotonic Max-Sum GNNs and Datalog

  • David Tena Cucala
  • Bernardo Cuenca Grau
  • Boris Motik
  • Egor V. Kostylev

Although there has been significant interest in applying machine learning techniques to structured data, the expressivity (i.e., a description of what can be learned) of such techniques is still poorly understood. In this paper, we study data transformations based on graph neural networks (GNNs). First, we note that the choice of how a dataset is encoded into a numeric form processable by a GNN can obscure the characterisation of a model's expressivity, and we argue that a canonical encoding provides an appropriate basis. Second, we study the expressivity of monotonic max-sum GNNs, which cover a subclass of GNNs with max and sum aggregation functions. We show that, for each such GNN, one can compute a Datalog program such that applying the GNN to any dataset produces the same facts as a single round of application of the program's rules to the dataset. Monotonic max-sum GNNs can sum an unbounded number of feature vectors, which can result in arbitrarily large feature values, whereas rule application requires only a bounded number of constants. Hence, our result shows that the unbounded summation of monotonic max-sum GNNs does not increase their expressive power. Third, we sharpen our result to the subclass of monotonic max GNNs, which use only the max aggregation function, and identify a corresponding class of Datalog programs.
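
The phrase "a single round of application of the program's rules" is Datalog's immediate-consequence operator; a toy sketch (with invented predicate names, not rules extracted by the paper's method) is shown below.

    def match(atom, fact, subst):
        # Extend substitution `subst` so that `atom` equals `fact`.
        # Atoms are (predicate, argument-tuple); variables are
        # capitalised strings, constants are lowercase.
        pred, args = atom
        fpred, fargs = fact
        if pred != fpred or len(args) != len(fargs):
            return None
        s = dict(subst)
        for a, f in zip(args, fargs):
            if a[0].isupper():             # variable: bind or check
                if s.setdefault(a, f) != f:
                    return None
            elif a != f:                   # constant mismatch
                return None
        return s

    def apply_rules_once(rules, facts):
        # One round of rule application: add every instantiated head
        # whose body atoms all match against the given facts.
        derived = set(facts)
        for head, body in rules:
            substs = [{}]
            for atom in body:
                substs = [s2 for s in substs for f in facts
                          if (s2 := match(atom, f, s)) is not None]
            for s in substs:
                pred, args = head
                derived.add((pred, tuple(s.get(a, a) for a in args)))
        return derived

    # alpha(X) <- beta(X), edge(X, Y), gamma(Y).
    rules = [(("alpha", ("X",)),
              [("beta", ("X",)), ("edge", ("X", "Y")), ("gamma", ("Y",))])]
    facts = {("beta", ("a",)), ("edge", ("a", "b")), ("gamma", ("b",))}
    print(sorted(apply_rules_once(rules, facts)))  # derives alpha(a)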

AIJ Journal 2021 Journal Article

Pay-as-you-go consequence-based reasoning for the description logic SROIQ

  • David Tena Cucala
  • Bernardo Cuenca Grau
  • Ian Horrocks

Consequence-based (CB) reasoners combine ideas from resolution and (hyper)tableau calculi for solving key reasoning problems in Description Logics (DLs), such as ontology classification. Existing CB reasoners, however, are only capable of handling DLs without nominals (such as ALCHIQ), or DLs without disjunction (such as Horn-ALCHOIQ). In this paper, we present a consequence-based calculus for concept subsumption and classification in the DL ALCHOIQ+, which extends ALC with role hierarchies, inverse roles, number restrictions, and nominals; to the best of our knowledge, ours is the first CB calculus for an NExpTime-complete DL. By using standard transformations, our calculus extends to SROIQ, which covers all of OWL 2 DL except for datatypes. A key feature of our calculus is its pay-as-you-go behaviour: our calculus is worst-case optimal for all the well-known proper fragments of ALCHOIQ+. Furthermore, our calculus can be applied to DL reasoning problems other than subsumption and ontology classification, such as instance retrieval and realisation. We have implemented our calculus as an extension of Sequoia, a CB reasoner which previously supported ontology classification in SRIQ. We have performed an empirical evaluation of our implementation, which shows that Sequoia offers competitive performance. Although there still remains plenty of room for further optimisation, the calculus presented in this paper and its implementation provide an important addition to the repertoire of reasoning techniques and practical systems for expressive DLs.
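
For readers who do not work with DLs, one toy axiom per constructor family (illustrative examples in standard DL syntax, not axioms from the paper) may help:

    \mathsf{Person} \sqsubseteq \mathsf{Minor} \sqcup \mathsf{Adult}
        % disjunction (ALC)
    \mathsf{hasMother} \sqsubseteq \mathsf{hasParent}
        % role hierarchy (H)
    \mathsf{hasChild} \equiv \mathsf{hasParent}^{-}
        % inverse roles (I)
    \mathsf{Person} \sqsubseteq\ {\leq}2\,\mathsf{hasParent}.\mathsf{Person}
        % number restriction (Q)
    \mathsf{Planet} \sqcap \exists\mathsf{orbits}.\{\mathit{sun}\} \sqsubseteq \mathsf{SolarPlanet}
        % nominal (O)

Classification asks for all entailed subsumptions between named concepts; pay-as-you-go means the calculus only pays the worst-case cost of the constructor combinations an ontology actually uses.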

IJCAI Conference 2018 Conference Paper

Consequence-based Reasoning for Description Logics with Disjunction, Inverse Roles, Number Restrictions, and Nominals

  • David Tena Cucala
  • Bernardo Cuenca Grau
  • Ian Horrocks

We present a consequence-based calculus for concept subsumption and classification in the description logic ALCHOIQ, which extends ALC with role hierarchies, inverse roles, number restrictions, and nominals. By using standard transformations, our calculus extends to SROIQ, which covers all of OWL 2 DL except for datatypes. A key feature of our calculus is its pay-as-you-go behaviour: unlike existing algorithms, our calculus is worst-case optimal for all the well-known proper fragments of ALCHOIQ, albeit not for the full logic.

JAIR Journal 2018 Journal Article

Consequence-Based Reasoning for Description Logics with Disjunctions and Number Restrictions

  • Andrew Bate
  • Boris Motik
  • Bernardo Cuenca Grau
  • David Tena Cucala
  • František Simančík
  • Ian Horrocks

Classification of description logic (DL) ontologies is a key computational problem in modern data management applications, so considerable effort has been devoted to the development and optimisation of practical reasoning calculi. Consequence-based calculi combine ideas from hypertableau and resolution in a way that has proved very effective in practice. However, existing consequence-based calculi can handle either Horn DLs (which do not support disjunction) or DLs without number restrictions. In this paper, we overcome this important limitation and present the first consequence-based calculus for deciding concept subsumption in the DL ALCHIQ+. Our calculus runs in exponential time assuming unary coding of numbers, and on ELH ontologies it runs in polynomial time. The extension to disjunctions and number restrictions is technically involved: we capture the relevant consequences using first-order clauses, and our inference rules adapt paramodulation techniques from first-order theorem proving. By using a well-known preprocessing step, the calculus can also decide concept subsumptions in SRIQ, a rich DL that covers all features of OWL 2 DL apart from nominals and datatypes. We have implemented our calculus in a new reasoner called Sequoia. We present the architecture of our reasoner and discuss several novel and important implementation techniques such as clause indexing and redundancy elimination. Finally, we present the results of an extensive performance evaluation, which revealed Sequoia to be competitive with existing reasoners. Thus, the calculus and the techniques we present in this paper provide an important addition to the repertoire of practical implementation techniques for description logic reasoning.
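
The clause-level view can be made concrete with a textbook clausification (standard, not the paper's exact clausal form): the axiom $A \sqsubseteq B \sqcup \exists R.C$ Skolemises, with a fresh function $f$, to

    \neg A(x) \lor B(x) \lor R(x, f(x))
    \neg A(x) \lor B(x) \lor C(f(x))

while an at-most restriction such as $A \sqsubseteq\ {\leq}1\,R$ yields an equality literal, which is where paramodulation-style equality reasoning enters:

    \neg A(x) \lor \neg R(x, y_1) \lor \neg R(x, y_2) \lor y_1 \approx y_2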