Author name cluster

Heiner Stuckenschmidt

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

34 papers
2 author rows

Possible papers

JAIR Journal 2025 Journal Article

Causal Graphs and Fairness in Machine Learning: Addressing Practical Challenges in Causal Fairness Evaluation

  • Lea Cohausz
  • Jakob Kappenberger
  • Heiner Stuckenschmidt

Background: With the discussion of fairness in Machine Learning (ML) gaining traction in recent years, the idea of viewing fairness through the causal lens has become prominent. The main idea behind this is that by looking at the causal structure underlying the data used for an ML model, we can see and evaluate more concisely which influences of the sensitive variables on the target variable are problematic and how they are problematic. Doing so allows not only a nuanced view of fairness and an informed choice of fairness measures but also more targeted approaches (such as path-specific bias mitigation) to handle fairness issues. Objectives: Mainly, two important points have hindered the practical use of the causal lens and causality-based bias mitigation. First, a classification of different graphical structures with different fairness implications involving a sensitive variable and a target variable is still missing, as is a discussion of how different contexts can shape our evaluation of fairness. Second, the construction of such graphical models is nontrivial and error-prone. However, recent work showed that combining background knowledge and data-driven network structure learning may lead to more accurate graphs. In this work, we attempt to address and tackle these two practical shortcomings. Methods: Our first contribution is a classification and discussion of causal structures with different fairness implications and how contexts shape our assessment. Our second contribution is an advancement in learning more accurate graphs by adapting structure learning algorithms, and a detailed evaluation of graph correctness and subsequent fairness implications. Results: We show that when including background knowledge naturally available in fairness settings, graph learning becomes more accurate, which also has positive implications for accurate fairness assessments. Conclusions: Our work may pave the way for a broader adoption of causal ML fairness by providing concrete suggestions about the implications of causal structures and contexts, and learning more accurate graphs. We also address current limitations and highlight the need for stakeholder inclusion.

ICLR Conference 2025 Conference Paper

Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization

  • Sascha Marton
  • Tim Grams
  • Florian Vogt
  • Stefan Lüdtke
  • Christian Bartelt
  • Heiner Stuckenschmidt

Reinforcement learning (RL) has seen significant success across various domains, but its adoption is often limited by the black-box nature of neural network policies, making them difficult to interpret. In contrast, symbolic policies allow representing decision-making strategies in a compact and interpretable way. However, learning symbolic policies directly within on-policy methods remains challenging. In this paper, we introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL. SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions while maintaining a high level of interpretability. We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches in terms of performance and interpretability. Unlike existing methods, it enables gradient-based, end-to-end learning of interpretable, axis-aligned decision trees within standard on-policy RL algorithms. Therefore, SYMPOL can become the foundation for a new class of interpretable RL based on decision trees. Our implementation is available under: https://github.com/s-marton/sympol

NeurIPS Conference 2024 Conference Paper

A Data-Centric Perspective on Evaluating Machine Learning Models for Tabular Data

  • Andrej Tschalzev
  • Sascha Marton
  • Stefan Lüdtke
  • Christian Bartelt
  • Heiner Stuckenschmidt

Tabular data is prevalent in real-world machine learning applications, and new models for supervised learning of tabular data are frequently proposed. Comparative studies assessing performance differences typically have model-centered evaluation setups with overly standardized data preprocessing. This limits the external validity of these studies, as in real-world modeling pipelines, models are typically applied after dataset-specific preprocessing and feature engineering. We address this gap by proposing a data-centric evaluation framework. We select 10 relevant datasets from Kaggle competitions and implement expert-level preprocessing pipelines for each dataset. We conduct experiments with different preprocessing pipelines and hyperparameter optimization (HPO) regimes to quantify the impact of model selection, HPO, feature engineering, and test-time adaptation. Our main findings reveal: 1) After dataset-specific feature engineering, model rankings change considerably, performance differences decrease, and the importance of model selection reduces. 2) Recent models, despite their measurable progress, still significantly benefit from manual feature engineering. This holds true for both tree-based models and neural networks. 3) While tabular data is typically considered static, samples are often collected over time, and adapting to distribution shifts can be important even in supposedly static data. These insights suggest that research efforts should be directed toward a data-centric perspective, acknowledging that tabular data requires feature engineering and often exhibits temporal characteristics.

IJCAI Conference 2024 Conference Paper

Enabling Mixed Effects Neural Networks for Diverse, Clustered Data Using Monte Carlo Methods

  • Andrej Tschalzev
  • Paul Nitschke
  • Lukas Kirchdorfer
  • Stefan Lüdtke
  • Christian Bartelt
  • Heiner Stuckenschmidt

Neural networks often assume independence among input data samples, disregarding correlations arising from inherent clustering patterns in real-world datasets (e.g., due to different sites or repeated measurements). Recently, mixed effects neural networks (MENNs) which separate cluster-specific 'random effects' from cluster-invariant 'fixed effects' have been proposed to improve generalization and interpretability for clustered data. However, existing methods only allow for approximate quantification of cluster effects and are limited to regression and binary targets with only one clustering feature. We present MC-GMENN, a novel approach employing Monte Carlo techniques to train Generalized Mixed Effects Neural Networks. We empirically demonstrate that MC-GMENN outperforms existing mixed effects deep learning models in terms of generalization performance, time complexity, and quantification of inter-cluster variance. Additionally, MC-GMENN is applicable to a wide range of datasets, including multi-class classification tasks with multiple high-cardinality categorical features. For these datasets, we show that MC-GMENN outperforms conventional encoding and embedding methods, simultaneously offering a principled methodology for interpreting the effects of clustering patterns.

ECAI Conference 2024 Conference Paper

Fact Probability Vector Based Goal Recognition

  • Nils Wilken
  • Lea Cohausz
  • Christian Bartelt
  • Heiner Stuckenschmidt

We present a new approach to goal recognition that involves comparing observed facts with their expected probabilities. These probabilities depend on a specified goal g and initial state s0. Our method maps these probabilities and observed facts into a real vector space to compute heuristic values for potential goals. These heuristic values estimate the likelihood of a given goal being the true objective of the observed agent. As obtaining exact expected probabilities for observed facts in an observation sequence is often practically infeasible, we propose and empirically validate a method for approximating these probabilities. Our empirical results show that the proposed approach offers improved goal recognition precision compared to state-of-the-art techniques while reducing computational complexity.

AAAI Conference 2024 Conference Paper

GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent

  • Sascha Marton
  • Stefan Lüdtke
  • Christian Bartelt
  • Heiner Stuckenschmidt

Decision Trees (DTs) are commonly used for many machine learning tasks due to their high degree of interpretability. However, learning a DT from data is a difficult optimization problem, as it is non-convex and non-differentiable. Therefore, common approaches learn DTs using a greedy growth algorithm that minimizes the impurity locally at each internal node. Unfortunately, this greedy procedure can lead to inaccurate trees. In this paper, we present a novel approach for learning hard, axis-aligned DTs with gradient descent. The proposed method uses backpropagation with a straight-through operator on a dense DT representation, to jointly optimize all tree parameters. Our approach outperforms existing methods on binary classification benchmarks and achieves competitive results for multi-class tasks. The implementation is available under: https://github.com/s-marton/GradTree

ICLR Conference 2024 Conference Paper

GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data

  • Sascha Marton
  • Stefan Lüdtke
  • Christian Bartelt
  • Heiner Stuckenschmidt

Despite the success of deep learning for text and image data, tree-based ensemble models are still state-of-the-art for machine learning with heterogeneous tabular data. However, there is a significant need for tabular-specific gradient-based methods due to their high flexibility. In this paper, we propose GRANDE, GRAdieNt-Based Decision Tree Ensembles, a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent. GRANDE is based on a dense representation of tree ensembles, which allows backpropagation with a straight-through operator to be used to jointly optimize all model parameters. Our method combines axis-aligned splits, which are a useful inductive bias for tabular data, with the flexibility of gradient-based optimization. Furthermore, we introduce an advanced instance-wise weighting that facilitates learning representations for both simple and complex relations within a single model. We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets. The method is available under: https://github.com/s-marton/GRANDE

IJCAI Conference 2024 Conference Paper

History Repeats Itself: A Baseline for Temporal Knowledge Graph Forecasting

  • Julia Gastinger
  • Christian Meilicke
  • Federico Errica
  • Timo Sztyler
  • Anett Schülke
  • Heiner Stuckenschmidt

Temporal Knowledge Graph (TKG) Forecasting aims at predicting links in Knowledge Graphs for future timesteps based on a history of Knowledge Graphs. To this day, standardized evaluation protocols and rigorous comparison across TKG models are available, but the importance of simple baselines is often neglected in the evaluation, which prevents researchers from discerning actual and fictitious progress. We propose to close this gap by designing an intuitive baseline for TKG Forecasting based on predicting recurring facts. Compared to most TKG models, it requires little hyperparameter tuning and no iterative training. Further, it can help to identify failure modes in existing approaches. The empirical findings are quite unexpected: compared to 11 methods on five datasets, our baseline ranks first or third in three of them, painting a radically different picture of the predictive quality of the state of the art.

IJCAI Conference 2024 Conference Paper

PyClause - Simple and Efficient Rule Handling for Knowledge Graphs

  • Patrick Betz
  • Luis Galárraga
  • Simon Ott
  • Christian Meilicke
  • Fabian Suchanek
  • Heiner Stuckenschmidt

Rule mining finds patterns in structured data such as knowledge graphs. Rules can predict facts, help correct errors, and yield explainable insights about the data. However, existing rule mining implementations focus exclusively on mining rules -- and not on their application. The PyClause library offers a rich toolkit for the application of the mined rules: from explaining facts to predicting links, scoring rules, and deducing query results. The library is easy to use and can handle substantial data loads.

AAMAS Conference 2024 Conference Paper

RAISE the Bar: Restriction of Action Spaces for Improved Social Welfare and Equity in Traffic Management

  • Michael Oesterle
  • Tim Grams
  • Christian Bartelt
  • Heiner Stuckenschmidt

Restriction-based governance has recently been proposed as an alternative to reward shaping for achieving system-level goals in competitive multi-agent systems. In this work, we apply these two approaches to the domain of traffic management, specifically investigating their efficacy and fairness. Our results show that edge restrictions in congested traffic networks are superior to dynamic pricing with regard to equity (i.e., equal treatment of agents) while achieving comparable travel-time improvements. We argue that the former metric, as an adequate proxy for fairness, can be crucial for the quality and acceptance of a governance scheme, particularly when human agents are affected.

NeurIPS Conference 2024 Conference Paper

TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs

  • Julia Gastinger
  • Shenyang Huang
  • Mikhail Galkin
  • Erfan Loghmani
  • Ali Parviz
  • Farimah Poursafaei
  • Jacob Danovitch
  • Emanuele Rossi

Multi-relational temporal graphs are powerful tools for modeling real-world data, capturing the evolving and interconnected nature of entities over time. Recently, many novel models have been proposed for ML on such graphs, intensifying the need for robust evaluation and standardized benchmark datasets. However, the availability of such resources remains scarce and evaluation faces added complexity due to reproducibility issues in experimental protocols. To address these challenges, we introduce Temporal Graph Benchmark 2.0 (TGB 2.0), a novel benchmarking framework tailored for evaluating methods for predicting future links on Temporal Knowledge Graphs and Temporal Heterogeneous Graphs with a focus on large-scale datasets, extending the Temporal Graph Benchmark. TGB 2.0 facilitates comprehensive evaluations by presenting eight novel datasets spanning five domains with up to 53 million edges. TGB 2.0 datasets are significantly larger than existing datasets in terms of number of nodes, edges, or timestamps. In addition, TGB 2.0 provides a reproducible and realistic evaluation pipeline for multi-relational temporal graphs. Through extensive experimentation, we observe that 1) leveraging edge-type information is crucial to obtain high performance, 2) simple heuristic baselines are often competitive with more complex methods, 3) most methods fail to run on our largest datasets, highlighting the need for research on more scalable methods.

AAAI Conference 2023 Conference Paper

Online Random Feature Forests for Learning in Varying Feature Spaces

  • Christian Schreckenberger
  • Yi He
  • Stefan Lüdtke
  • Christian Bartelt
  • Heiner Stuckenschmidt

In this paper, we propose a new online learning algorithm tailored for data streams described by varying feature spaces (VFS), wherein new features constantly emerge and old features may stop being observed over various time spans. Our proposed algorithm, named Online Random Feature Forests for Feature space Variabilities (ORF3V), provides a strategy to respect such feature dynamics by generating, updating, pruning, and re-weighting online an ensemble of what we call feature forests, which are generated and updated based on a compressed and storage-efficient representation for each observed feature. We benchmark our algorithm on 12 datasets, including one novel real-world dataset of government COVID-19 responses collected through a crowd-sensing program in Spain. The empirical results substantiate the viability and effectiveness of our ORF3V algorithm and its superior accuracy over state-of-the-art rival models.

IJCAI Conference 2023 Conference Paper

Towards Utilitarian Online Learning -- A Review of Online Algorithms in Open Feature Space

  • Yi He
  • Christian Schreckenberger
  • Heiner Stuckenschmidt
  • Xindong Wu

Human intelligence comes from the capability to describe and make sense of the world surrounding us, often in a lifelong manner. Online Learning (OL) allows a model to simulate this capability, which involves processing data in sequence, making predictions, and learning from predictive errors. However, traditional OL assumes a fixed set of features to describe data, which can be restrictive. In reality, new features may emerge and old features may vanish or become obsolete, leading to an open feature space. This dynamism can be caused by more advanced or outdated technology for sensing the world, or it can be a natural process of evolution. This paper reviews recent breakthroughs that strived to enable OL in open feature spaces, referred to as Utilitarian Online Learning (UOL). We taxonomize existing UOL models into three categories, analyze their pros and cons, and discuss their application scenarios. We also benchmark the performance of representative UOL models, highlighting open problems, challenges, and potential future directions of this emerging topic.

IJCAI Conference 2022 Conference Paper

Adversarial Explanations for Knowledge Graph Embeddings

  • Patrick Betz
  • Christian Meilicke
  • Heiner Stuckenschmidt

We propose a novel black-box approach for performing adversarial attacks against knowledge graph embedding models. An adversarial attack is a small perturbation of the data at training time to cause model failure at test time. We make use of an efficient rule learning approach and use abductive reasoning to identify triples which are logical explanations for a particular prediction. The proposed attack is then based on the simple idea to suppress or modify one of the triples in the most confident explanation. Although our attack scheme is model independent and only needs access to the training data, we report results on par with state-of-the-art white-box attack methods that additionally require full access to the model architecture, the learned embeddings, and the loss functions. This is a surprising result which indicates that knowledge graph embedding models can partly be explained post hoc with the help of symbolic methods.

IJCAI Conference 2022 Conference Paper

Exchangeability-Aware Sum-Product Networks

  • Stefan Lüdtke
  • Christian Bartelt
  • Heiner Stuckenschmidt

Sum-Product Networks (SPNs) are expressive probabilistic models that provide exact, tractable inference. They achieve this efficiency by making use of local independence. On the other hand, mixtures of exchangeable variable models (MEVMs) are a class of tractable probabilistic models that make use of exchangeability of discrete random variables to render inference tractable. Exchangeability, which arises naturally in relational domains, has not been considered for efficient representation and inference in SPNs yet. The contribution of this paper is a novel probabilistic model which we call Exchangeability-Aware Sum-Product Networks (XSPNs). It contains both SPNs and MEVMs as special cases, and combines the ability of SPNs to efficiently learn deep probabilistic models with the ability of MEVMs to efficiently handle exchangeable random variables. We introduce a structure learning algorithm for XSPNs and empirically show that they can be more accurate than conventional SPNs when the data contains repeated, interchangeable parts.

NeSy Conference 2021 Conference Paper

Backpropagating through Markov Logic Networks

  • Patrick Betz
  • Mathias Niepert
  • Pasquale Minervini
  • Heiner Stuckenschmidt

We integrate Markov Logic networks with deep learning architectures operating on high-dimensional and noisy feature inputs. Instead of relaxing the discrete components into smooth functions, we propose an approach that allows us to backpropagate through standard statistical relational learning components using perturbation-based differentiation. The resulting hybrid models are shown to outperform models solely relying on deep learning based function fitting. We find that using noise perturbations is required to allow the proposed hybrid models to robustly learn from the training data.

EUMAS Conference 2021 Conference Paper

Governing Black-Box Agents in Competitive Multi-Agent Systems

  • Michael Pernpeintner
  • Christian Bartelt
  • Heiner Stuckenschmidt

Competitive Multi-Agent Systems (MAS) are inherently hard to control due to agent autonomy and strategic behavior, which is particularly problematic when there are system-level objectives to be achieved or specific environmental states to be avoided. Existing solutions for this task mostly assume specific knowledge about agent preferences, utilities and strategies, neglecting the fact that actions are not always directly linked to genuine agent preferences, but can also reflect anticipated competitor behavior, be a concession to a superior adversary or simply be intended to mislead other agents. This assumption both reduces applicability to real-world systems and opens room for manipulation. We therefore propose a new governance approach for competitive MAS which relies exclusively on publicly observable actions and transitions, and uses the acquired knowledge to purposefully restrict action spaces, thereby achieving the system’s objectives while preserving a high level of autonomy for the agents.

IJCAI Conference 2019 Conference Paper

Anytime Bottom-Up Rule Learning for Knowledge Graph Completion

  • Christian Meilicke
  • Melisachew Wudage Chekol
  • Daniel Ruffinelli
  • Heiner Stuckenschmidt

We propose an anytime bottom-up technique for learning logical rules from large knowledge graphs. We apply the learned rules to predict candidates in the context of knowledge graph completion. Our approach outperforms other rule-based approaches and it is competitive with current state of the art, which is based on latent representations. Besides, our approach is significantly faster, requires less computational resources, and yields an explanation in terms of the rules that propose a candidate.

IJCAI Conference 2019 Conference Paper

PRoFET: Predicting the Risk of Firms from Event Transcripts

  • Christoph Kilian Theil
  • Samuel Broscheit
  • Heiner Stuckenschmidt

Financial risk, defined as the chance to deviate from return expectations, is most commonly measured with volatility. Due to its value for investment decision making, volatility prediction is probably among the most important tasks in finance and risk management. Although evidence exists that enriching purely financial models with natural language information can improve predictions of volatility, this task is still comparably underexplored. We introduce PRoFET, the first neural model for volatility prediction jointly exploiting both semantic language representations and a comprehensive set of financial features. As language data, we use transcripts from quarterly recurring events, so-called "earnings calls"; in these calls, the performance of publicly traded companies is summarized and prognosticated by their management. We show that our proposed architecture, which models verbal context with an attention mechanism, significantly outperforms the previous state-of-the-art and other strong baselines. Finally, we visualize this attention mechanism on the token-level, thus aiding interpretability and providing a use case of PRoFET as a tool for investment decision support.

TIME Conference 2019 Conference Paper

Time-Aware Probabilistic Knowledge Graphs

  • Melisachew Wudage Chekol
  • Heiner Stuckenschmidt

The emergence of open information extraction as a tool for constructing and expanding knowledge graphs has aided the growth of temporal data, for instance, YAGO, NELL and Wikidata. While YAGO and Wikidata maintain the valid time of facts, NELL records the time point at which a fact is retrieved from some Web corpora. Collectively, these knowledge graphs (KGs) store facts extracted from Wikipedia and other sources. Due to the imprecise nature of the extraction tools that are used to build and expand KGs, such as NELL, the facts in the KG are weighted (a confidence value representing the correctness of a fact). Additionally, NELL can be considered a transaction time KG because every fact is associated with an extraction date. On the other hand, YAGO and Wikidata use the valid time model because they maintain facts together with their validity time (temporal scope). In this paper, we propose a bitemporal model (that combines transaction and valid time models) for maintaining and querying bitemporal probabilistic knowledge graphs. We study coalescing and scalability of marginal and MAP inference. Moreover, we show that the complexity of reasoning tasks in atemporal probabilistic KGs carries over to the bitemporal setting. Finally, we report our evaluation results of the proposed model.

IJCAI Conference 2017 Conference Paper

Automatic Assessment of Absolute Sentence Complexity

  • Sanja Stajner
  • Simone Paolo Ponzetto
  • Heiner Stuckenschmidt

Lexically and syntactically simpler sentences result in shorter reading time and better understanding in many people. However, no reliable systems for automatic assessment of absolute sentence complexity have been proposed so far. Instead, the assessment is usually done manually, requiring expert human annotators. To address this problem, we first define the sentence complexity assessment as a five-level classification task, and build a ‘gold standard’ dataset. Next, we propose robust systems for sentence complexity assessment, using a novel set of features based on leveraging lexical properties of freely available corpora, and investigate the impact of the feature type and corpus size on the classification performance.

AAAI Conference 2017 Conference Paper

Marrying Uncertainty and Time in Knowledge Graphs

  • Melisachew Chekol
  • Giuseppe Pirrò
  • Joerg Schoenfisch
  • Heiner Stuckenschmidt

The management of uncertainty is crucial when harvesting structured content from unstructured and noisy sources. Knowledge Graphs (KGs) are a prominent example. KGs maintain both numerical and non-numerical facts, with the support of an underlying schema. These facts are usually accompanied by a confidence score that indicates how likely it is for them to hold. Despite their popularity, most existing KGs focus on static data, thus impeding the availability of timewise knowledge. What is missing is a comprehensive solution for the management of uncertain and temporal data in KGs. The goal of this paper is to fill this gap. We rely on two main ingredients. The first is a numerical extension of Markov Logic Networks (MLNs) that provides the necessary underpinning to formalize the syntax and semantics of uncertain temporal KGs. The second is a set of Datalog constraints with inequalities that extend the underlying schema of the KGs and help to detect inconsistencies. From a theoretical point of view, we discuss the complexity of two important classes of queries for uncertain temporal KGs: maximum a-posteriori and conditional probability inference. Due to the hardness of these problems and the fact that MLN solvers do not scale well, we also explore the usage of Probabilistic Soft Logics (PSL) as a practical tool to support our reasoning tasks. We report on an experimental evaluation comparing the MLN and PSL approaches.

FLAP Journal 2017 Journal Article

Multi-Attribute Decision Making with Weighted Description Logics

  • Erman Acar
  • Manuel Fink
  • Christian Meilicke
  • Camilo Thorne
  • Heiner Stuckenschmidt

We introduce a decision-theoretic framework based on Description Logics (DLs), which can be used to encode and solve single stage multi-attribute decision problems. In particular, we consider the background knowledge as a DL knowledge base where each attribute is represented by a concept, weighted by a utility value which is asserted by the user. This yields a compact representation of preferences over attributes. Moreover, we represent choices as knowledge base individuals, and induce a ranking via the aggregation of attributes that they satisfy. We discuss the benefits of the approach from a decision theory point of view. Furthermore, we introduce an implementation of the framework as a Protégé plugin called uDecide. The plugin takes as input an ontology as background knowledge, and returns the choices consistent with the user’s (the knowledge base) preferences. We describe a use case with data from DBpedia. We also provide empirical results for its performance in the size of the ontology using the reasoner Konclude.

IJCAI Conference 2016 Conference Paper

Group Decision Making via Probabilistic Belief Merging

  • Nico Potyka
  • Erman Acar
  • Matthias Thimm
  • Heiner Stuckenschmidt

We propose a probabilistic-logical framework for group decision-making. Its main characteristic is that we derive group preferences from agents' beliefs and utilities rather than from their individual preferences as done in social choice approaches. This can be more appropriate when the individual preferences hide too much of the individuals' opinions that determined their preferences. We introduce three preference relations and investigate the relationships between the group preferences and individual and subgroup preferences.

ECAI Conference 2016 Conference Paper

Markov Logic Networks with Numerical Constraints

  • Melisachew Wudage Chekol
  • Jakob Huber
  • Christian Meilicke
  • Heiner Stuckenschmidt

Markov logic networks (MLNs) have proven to be useful tools for reasoning about uncertainty in complex knowledge bases. In this paper, we extend MLNs with numerical constraints and present an efficient implementation in terms of a cutting plane method. This extension is useful for reasoning over uncertain temporal data. To show the applicability of this extension, we enrich log-linear description logics (DLs) with concrete domains (datatypes), thereby allowing reasoning over weighted DLs with datatypes. Moreover, we use the resulting formalism to reason about temporal assertions in DBpedia, thus illustrating its practical use.

ECAI Conference 2016 Conference Paper

Schema-Based Debugging of Federated Data Sources

  • Andreas Nolle
  • Christian Meilicke
  • Melisachew Wudage Chekol
  • German Nemirovski
  • Heiner Stuckenschmidt

Information explosion leads to continuous growth of data distributed over different data sources. However, the increasing number of data sources increases the risk of inconsistency. In such a federative setting, description logics can be applied to define a central schema that serves as a conceptual view comprising and extending the semantics of each data source. Consequently, each data source is treated as a single knowledge base that is integrated in a federated knowledge base. Following this idea, we propose an approach for automated debugging of federated knowledge bases that targets the identification and repair of inconsistency. We report on experiments with a large distributed dataset from the domain of library science.

AAAI Conference 2013 Conference Paper

RockIt: Exploiting Parallelism and Symmetry for MAP Inference in Statistical Relational Models

  • Jan Noessner
  • Mathias Niepert
  • Heiner Stuckenschmidt

ROCKIT is a maximum a-posteriori (MAP) query engine for statistical relational models. MAP inference in graphical models is an optimization problem which can be compiled to integer linear programs (ILPs). We describe several advances in translating MAP queries to ILP instances and present the novel meta-algorithm cutting plane aggregation (CPA). CPA exploits local context-specific symmetries and bundles up sets of linear constraints. The resulting counting constraints lead to more compact ILPs and make the symmetry of the ground model more explicit to state-of-the-art ILP solvers. Moreover, ROCKIT parallelizes most parts of the MAP inference pipeline taking advantage of ubiquitous shared-memory multi-core architectures. We report on extensive experiments with Markov logic network (MLN) benchmarks showing that ROCKIT outperforms the state-of-the-art systems ALCHEMY, MARKOV THEBEAST, and TUFFY both in terms of efficiency and quality of results.

IJCAI Conference 2011 Conference Paper

Log-Linear Description Logics

  • Mathias Niepert
  • Jan Noessner
  • Heiner Stuckenschmidt

Log-linear description logics are a family of probabilistic logics integrating various concepts and methods from the areas of knowledge representation and reasoning and statistical relational AI. We define the syntax and semantics of log-linear description logics, describe a convenient representation as sets of first-order formulas, and discuss computational and algorithmic aspects of probabilistic queries in the language. The paper concludes with an experimental evaluation of an implementation of a log-linear DL reasoner.

AAAI Conference 2010 Conference Paper

A Probabilistic-Logical Framework for Ontology Matching

  • Mathias Niepert
  • Christian Meilicke
  • Heiner Stuckenschmidt

Ontology matching is the problem of determining correspondences between concepts, properties, and individuals of different heterogeneous ontologies. With this paper we present a novel probabilistic-logical framework for ontology matching based on Markov logic. We define the syntax and semantics and provide a formalization of the ontology matching problem within the framework. The approach has several advantages over existing methods such as ease of experimentation, incoherence mitigation during the alignment process, and the incorporation of a-priori confidence values. We show empirically that the approach is efficient and more accurate than existing matchers on an established ontology alignment benchmark dataset.

AAAI Conference 2007 Conference Paper

Partial Matchmaking using Approximate Subsumption

  • Heiner Stuckenschmidt

Description Logics, and in particular the Web Ontology Language OWL, have been proposed as an appropriate basis for computing matches between structured objects for the sake of information integration and service discovery. A drawback of the direct use of subsumption as a matching criterion is the inability to compute partial matches and qualify the degree of mismatch. In this paper, we describe a method for overcoming these problems that is based on approximate logical reasoning. In particular, we approximate the subsumption relation by defining the notion of subsumption with respect to a certain subset of the concept and relation names. We present the formal semantics of this relation, describe a sound and complete algorithm for computing approximate subsumption, and discuss its application to matching tasks.

KER Journal 2003 Journal Article

Foreword: ontologies for distributed systems

  • Heiner Stuckenschmidt

The benefits of using ontologies have been recognised in many areas such as knowledge and content management, electronic commerce and recently the emerging field of the Semantic Web. These new applications can be seen as a great success of research in ontologies. On the other hand, moving into real applications comes with new challenges that need to be addressed on a principled level rather than for specific applications. This special issue will be devoted to less well-explored topics that have come into focus recently as a response to the new problems we face when trying to use ontologies in heterogeneous distributed environments. These environments include the use of ontologies in peer-to-peer and pervasive computing systems.

IJCAI Conference 2003 Conference Paper

Integrity and Change in Modular Ontologies

  • Heiner Stuckenschmidt
  • Michel Klein

The benefits of modular representations are well known from many areas of computer science. In this paper, we concentrate on the benefits of modular ontologies with respect to local containment of terminological reasoning. We define an architecture for modular ontologies that supports local reasoning by compiling implied subsumption relations. We further address the problem of guaranteeing the integrity of a modular ontology in the presence of local changes. We propose a strategy for analyzing changes and guiding the process of updating compiled information.