Arrow Research search

Author name cluster

Gianluca Cima

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
1 author row

Possible papers

19

AAAI Conference 2026 Conference Paper

Expressive Recursive Answers for Ontological Knowledge Bases

  • Luca Andolfi
  • Gianluca Cima
  • Marco Console
  • Maurizio Lenzerini

A fundamental use of knowledge bases (KBs) is query answering, i.e., retrieving the information entailed by the KB in response to a user query. When both the KB and the query are specified as logical formulae, the standard form of answer provided to users is the set of all certain answers (CAs): tuples of constants that satisfy the formula defining the query in every model of the logical theory defining the KB. Despite their wide adoption, CAs are known to be just a lossy representation of the information that a KB and a query provide. While several alternative answer languages have been proposed in the literature, no general consensus has emerged on the most suitable approach to query answering over ontological KBs, as each language comes with its own limitations. To address some of these issues, we introduce Regularly Recurrent Answers (RRAs), a novel answer language for queries over ontological KBs based on regular expressions. RRAs support the representation of infinite sets of tuples of constants via a simple (and arguably well understood) generation mechanism. We show that RRAs can capture a fundamental fragment of the certain information entailed by union of conjunctive queries and DL-Lite KBs, making them a strong candidate for informative query answering settings. Our contribution includes the formal definition of RRAs, a proof of their informativeness, and a study of the computational complexity of query answering problem using RRAs.

AAAI Conference 2026 Conference Paper

Foundations of Formal Reasoning over Knowledge Bases Combining Symbolic and Sub-Symbolic Knowledge

  • Gianluca Cima
  • Marco Console
  • Laura Papi

More and more organizations are relying on Machine Learning (ML) models to support internal decision-making processes. To better support such processes, it would be highly beneficial to contextualize the inductively acquired knowledge encoded in these models and enable formal reasoning over it. Despite significant progress in Neural-Symbolic AI, this specific challenge remains largely under-explored. We propose a framework that allows to integrate the knowledge induced by ML classifiers with the knowledge specified by logic-based formalisms. The framework is based on the novel notion of Hybrid Knowledge Base (HKB), consisting of two components: an ontology and a set of ML binary classifiers. As usual, the ontology provides an intensional representation of the modeled domain through logic-based axioms, while the binary classifiers implicitly encode the extensional knowledge. Specifically, a HKB associates to each concept and role mentioned in the ontology a classifier based on a set of features deemed to be relevant for the application domain, thereby virtually populating the concepts and roles with the instances and pairs of instances from the feature space. Besides the definition of the new framework, as a more technical contribution we show how to reason in this framework by studying query answering over HKBs. In particular, we investigate the computational complexity of query answering in a rich language over HKBs in which the ontology is specified in (the Description Logic counterpart of) RDFS, while the binary classifiers are represented by Multi-Layer Perceptrons.

KR Conference 2025 System Paper

Advances in Logic-Based Entity Resolution: Enhancing ASPEN with Local Merges and Optimality Criteria

  • Zhiliang Xiang
  • Meghyn Bienvenu
  • Gianluca Cima
  • Víctor Gutiérrez-Basulto
  • Yazmín Ibáñez-García

In this paper, we present ASPEN+, which extends an existing ASP-based system, ASPEN, for collective entity resolution with two important functionalities: support for local merges and new optimality criteria for preferred solutions. Indeed, ASPEN only supports so-called global merges of entity-referring constants (e. g. author ids), in which all occurrences of matched constants are treated as equivalent and merged accordingly. However, it has been argued that when resolving data values, local merges are often more appropriate, as e. g. some instances of ‘J. Lee’ may refer to ‘Joy Lee’, while others should be matched with ‘Jake Lee’. In addition to allowing such local merges, ASPEN+ offers new optimality criteria for selecting solutions, such as minimizing rule violations or maximising the number of rules supporting a merge. Our main contributions are thus (1) the formalisation and computational analysis of various notions of optimal solution, and (2) an extensive experimental evaluation on real-world datasets, demonstrating the effect of local merges and the new optimality criteria on both accuracy and runtime.

AAAI Conference 2025 Conference Paper

Answering Conjunctive Queries with Safe Negation and Inequalities over RDFS Knowledge Bases

  • Gianluca Cima
  • Marco Console
  • Roberto Maria Delfino
  • Maurizio Lenzerini
  • Antonella Poggi

Expressing negative conditions is a crucial feature of query languages for knowledge bases (KBs). Answering such queries over ontological KBs, however, is a very challenging task that becomes undecidable even for lightweight Description Logic (DL) ontologies. Such negative results hold even for Conjunctive Queries (CQs) equipped with basic forms of negative conditions such as the so-called safe negation or inequality atoms. One ontology language that is seemingly unaffected by these results is (the DL counterpart of) RDFS even if equipped with disjointness axioms. Answering CQs with inequalities over such ontologies is known to be Pi^p_2-complete, if the number of inequality atoms is unbounded, and NP-complete if we limit this number to one. Notably, these results leave open the cases of CQs with a fixed number greater than two of inequality atoms. Additionally, such a thorough analysis is missing for CQs with safe negation. In this paper, we embark in a refined analysis of the combined complexity of answering CQs with inequality atoms and safe negation over RDFS ontologies augmented with disjointness axioms. Firstly, we provide a unified Pi^p_2 query answering algorithm for the general problem. Secondly, we confirm the generally held conjecture according to which answering CQs with two inequality atoms over such ontologies is already Pi^p_2-hard. This result closes an important gap in the current literature and has an impact on the widely influential problem of query containment. Lastly, for CQs with safe negation, we prove a behavior similar to that of CQs with inequality atoms. Specifically, we show that answering CQs with at most one negated atom can be done in NP, while allowing at most two negated atoms is sufficient to obtain Pi^p_2-hardness.

IJCAI Conference 2025 Conference Paper

Assessing the Exposure to Public Knowledge in Policy-Protected Description Logic Ontologies

  • Gianluca Cima
  • Domenico Lembo
  • Lorenzo Marconi
  • Riccardo Rosati
  • Domenico Fabio Savo

We propose a general framework for assessing the exposure of sensitive knowledge in policy-protected knowledge bases (KBs), where knowledge is represented as logical theories and data protection policies are defined declaratively using epistemic dependencies. The framework models scenarios in which confidential parts of the KB may be publicly known due to security breaches. We study two fundamental decision problems: determining whether the exposed knowledge violates the data protection policy (leakage), and whether there exists a secure view of the KB that complies with the policy. We analyze the computational complexity (specifically, data complexity) of these problems, focusing on the DL-Lite_R and EL_\bot Description Logics. Our findings show that, for DL-Lite_R with restricted forms of policy, both the problems can be efficiently solved through query rewriting methods. For EL_\bot, we establish conditions for tractable computational bounds. Our results highlight the potential of this framework for practical applications in confidentiality-preserving knowledge management.

KR Conference 2024 System Paper

ASPEN: ASP-Based System for Collective Entity Resolution

  • Zhiliang Xiang
  • Meghyn Bienvenu
  • Gianluca Cima
  • Víctor Gutiérrez Basulto
  • Yazmín Ibáñez García

In this paper, we present ASPEN, an answer set programming (ASP) implementation of a recently proposed declarative framework for collective entity resolution (ER). While an ASP encoding had been previously suggested, several practical issues had been neglected, most notably, the question of how to efficiently compute the (externally defined) similarity facts that are used in rule bodies. This leads us to propose new variants of the encodings (including Datalog approximations) and show how to employ different functionalities of ASP solvers to compute (maximal) solutions, and (approximations of) the sets of possible and certain merges. A comprehensive experimental evaluation of ASPEN on real-world datasets shows that the approach is promising, achieving high accuracy in real-life ER scenarios. Our experiments also yield useful insights into the relative merits of different types of (approximate) ER solutions, the impact of recursion, and factors influencing performance.

IJCAI Conference 2024 Conference Paper

Enhancing Controlled Query Evaluation through Epistemic Policies

  • Gianluca Cima
  • Domenico Lembo
  • Lorenzo Marconi
  • Riccardo Rosati
  • Domenico Fabio Savo

In this paper, we propose the use of epistemic dependencies to express data protection policies in Controlled Query Evaluation (CQE), which is a form of confidentiality-preserving query answering over ontologies and databases. The resulting policy language goes significantly beyond those proposed in the literature on CQE so far, allowing for very rich and practically interesting forms of data protection rules. We show the expressive abilities of our framework and study the data complexity of CQE for (unions of) conjunctive queries when ontologies are specified in the Description Logic DL-LiteR. Interestingly, while we show that the problem is in general intractable, we prove tractability for the case of acyclic epistemic dependencies by providing a suitable query rewriting algorithm. The latter result paves the way towards the implementation and practical application of this new approach to CQE.

AAAI Conference 2024 Conference Paper

What Does a Query Answer Tell You? Informativeness of Query Answers for Knowledge Bases

  • Luca Andolfi
  • Gianluca Cima
  • Marco Console
  • Maurizio Lenzerini

Query answering for Knowledge Bases (KBs) amounts to extracting information from the various models of a KB, and presenting the user with an object that represents such information. In the vast majority of cases, this object consists of those tuples of constants that satisfy the query expression either in every model (certain answers) or in some model (possible answers). However, similarly to the case of incomplete databases, both these forms of answers are a lossy representation of all the knowledge inferable from the query and the queried KB. In this paper, we illustrate a formal framework to characterize the information that query answers for KBs are able to represent. As a first application of the framework, we study the informativeness of current query answering approaches, including the recently introduced partial answers. We then define a novel notion of answers, allowing repetition of variables across answer tuples. We show that these answers are capable of representing a meaningful form of information, and we also study their data complexity properties.

KR Conference 2023 Conference Paper

Combining Global and Local Merges in Logic-based Entity Resolution

  • Meghyn Bienvenu
  • Gianluca Cima
  • Víctor Gutiérrez-Basulto
  • Yazmín Ibáñez-García

In the recently proposed LACE framework for collective entity resolution, logical rules and constraints are used to identify pairs of entity references (e. g. author or paper ids) that denote the same entity. This identification is global: all occurrences of those entity references (possibly across multiple database tuples) are deemed equal and can be merged. By contrast, a local form of merge is often more natural when identifying pairs of data values, e. g. some occurrences of 'J. Smith' may be equated with 'Joe Smith', while others should merge with 'Jane Smith'. This motivates us to extend LACE with local merges of values and explore the computational properties of the resulting formalism.

AAAI Conference 2023 Conference Paper

Epistemic Disjunctive Datalog for Querying Knowledge Bases

  • Gianluca Cima
  • Marco Console
  • Maurizio Lenzerini
  • Antonella Poggi

The Datalog query language can express several powerful recursive properties, often crucial in real-world scenarios. While answering such queries is feasible over relational databases, the picture changes dramatically when data is enriched with intensional knowledge. It is indeed well-known that answering Datalog queries is undecidable already over lightweight knowledge bases (KBs) of the DL-Lite family. To overcome this issue, we propose a new query language based on Disjunctive Datalog rules combined with a modal epistemic operator. Rules in this language interact with the queried KB exclusively via the epistemic operator, thus extracting only the information true in every model of the KB. This form of interaction is crucial for not falling into undecidability. The contribution provided by this paper is threefold. First, we illustrate the syntax and the semantics of the novel query language. Second, we study the expressive power of different fragments of our new language and compare it with Disjunctive Datalog and its variants. Third, we outline the precise data complexity of answering queries in our new language over KBs expressed in various well-known formalisms.

IJCAI Conference 2023 Conference Paper

REPLACE: A Logical Framework for Combining Collective Entity Resolution and Repairing

  • Meghyn Bienvenu
  • Gianluca Cima
  • Víctor Gutiérrez-Basulto

This paper considers the problem of querying dirty databases, which may contain both erroneous facts and multiple names for the same entity. While both of these data quality issues have been widely studied in isolation, our contribution is a holistic framework for jointly deduplicating and repairing data. Our REPLACE framework follows a declarative approach, utilizing logical rules to specify under which conditions a pair of entity references can or must be merged and logical constraints to specify consistency requirements. The semantics defines a space of solutions, each consisting of a set of merges to perform and a set of facts to delete, which can be further refined by applying optimality criteria. As there may be multiple optimal solutions, we use classical notions of possible and certain query answers to reason over the alternative solutions, and introduce a novel notion of most informative answer to obtain a more compact presentation of query results. We perform a detailed analysis of the data complexity of the central reasoning tasks of recognizing optimal solutions and (most informative) possible and certain answers, for each of the three notions of optimal solution and for both general and restricted specifications.

AAAI Conference 2022 Conference Paper

Monotone Abstractions in Ontology-Based Data Management

  • Gianluca Cima
  • Marco Console
  • Maurizio Lenzerini
  • Antonella Poggi

In Ontology-Based Data Management (OBDM), an abstraction of a source query q is a query over the ontology capturing the semantics of q in terms of the concepts and the relations available in the ontology. Since a perfect characterization of a source query may not exist, the notions of best sound and complete approximations of an abstraction have been introduced and studied in the typical OBDM context, i. e. , in the case where the ontology is expressed in DL-Lite, and source queries are expressed as unions of conjunctive queries (UCQs). Interestingly, if we restrict our attention to abstractions expressed as UCQs, even best approximations of abstractions are not guaranteed to exist. Thus, a natural question to ask is whether such limitations affect even larger classes of queries. In this paper, we answer this fundamental question for an essential class of queries, namely the class of monotone queries. We define a monotone query language based on disjunctive Datalog enriched with an epistemic operator, and show that its expressive power suffices for expressing the best approximations of monotone abstractions of UCQs.

AAAI Conference 2020 Conference Paper

Answering Conjunctive Queries with Inequalities in DL-Lite ℛ

  • Gianluca Cima
  • Maurizio Lenzerini
  • Antonella Poggi

In the context of the Description Logic DL-Lite = R, i. e. , DL-LiteR without UNA and with inequality axioms, we address the problem of adding to unions of conjunctive queries (UCQs) one of the simplest forms of negation, namely, inequality. It is well known that answering conjunctive queries with unrestricted inequalities over DL-LiteR ontologies is in general undecidable. Therefore, we explore two strategies for recovering decidability, and, hopefully, tractability. Firstly, we weaken the ontology language, and consider the variant of DL-Lite = R corresponding to RDFS enriched with both inequality and disjointness axioms. Secondly, we weaken the query language, by preventing inequalities to be applied to existentially quantified variables, thus obtaining the class of queries named UCQ =, b s. We prove that in the two cases, query answering is decidable, and we provide tight complexity bounds for the problem, both for data and combined complexity. Notably, the results show that answering UCQ =, b s over DL-Lite = R ontologies is still in AC0 in data complexity.

IJCAI Conference 2020 Conference Paper

Controlled Query Evaluation in Description Logics Through Instance Indistinguishability

  • Gianluca Cima
  • Domenico Lembo
  • Riccardo Rosati
  • Domenico Fabio Savo

We study privacy-preserving query answering in Description Logics (DLs). Specifically, we consider the approach of controlled query evaluation (CQE) based on the notion of instance indistinguishability. We derive data complexity results for query answering over DL-LiteR ontologies, through a comparison with an alternative, existing confidentiality-preserving approach to CQE. Finally, we identify a semantically well-founded notion of approximated query answering for CQE, and prove that, for DL-LiteR ontologies, this form of CQE is tractable with respect to data complexity and is first-order rewritable, i. e. , it is always reducible to the evaluation of a first-order query over the data instance.

KR Conference 2020 Conference Paper

Non-Monotonic Ontology-based Abstractions of Data Services

  • Gianluca Cima
  • Maurizio Lenzerini
  • Antonella Poggi

In Ontology-Based Data Access (OBDA), a domain ontology is linked to the data sources of an organization in order to query, integrate and manage data through the concepts and relations of the domain of interest, thus abstracting from the technical details of the data layer implementation. While the great majority of contributions in OBDA in the last decade have been concerned with the issue of computing the answers of queries expressed over the ontology, recent papers address a different problem, namely the one of providing suitable abstractions of data services, i. e. , characterizing or explaining the semantics of queries over the sources in terms of queries over the domain ontology. Current works on this subject are based on expressing abstractions in terms of unions of conjunctive queries (UCQs) over the ontology. In this paper we advocate the use of a non-monotonic language for this task. As a first contribution, we present a simple extension of UCQs with non-monotonic features, and show that non-monotonicity provides more expressive power in characterizing the semantics of data services. A second contribution is to prove that, similarly to the case of monotonic abstractions, depending on the expressive power of the languages used to specify the various components of the OBDA system, there are cases where neither perfect nor approximated abstractions exist for a given data service. As a third contribution, we single out interesting special cases where the existence of abstractions is guaranteed, and we present algorithms for computing such abstractions in these cases.

IJCAI Conference 2019 Conference Paper

Semantic Characterization of Data Services through Ontologies

  • Gianluca Cima
  • Maurizio Lenzerini
  • Antonella Poggi

We study the problem of associating formal semantic descriptions to data services. We base our proposal on the Ontology-based Data Access paradigm, where a domain ontology is used to provide a semantic layer mapped to the data sources of an organization. The basic idea is to explain the semantics of a data service in terms of a query over the ontology. We illustrate a formal framework for this problem, based on the notion of source-to-ontology (s-to-o) rewriting, which comes in three variants, called sound, complete and perfect, respectively. We present a thorough complexity analysis of two computational problems, namely verification (checking whether a query is an s-to-o rewriting of a given data service), and computation (computing an s-to-o rewriting of a data service).