NAI 2026
A Neurosymbolic Approach to Counterfactual Fairness
Abstract
Integrating fairness into machine learning models has been an important consideration for the last decade. Here, neurosymbolic models offer a valuable opportunity, as they allow the specification of symbolic, logical constraints that are often guaranteed to be satisfied. However, research on neurosymbolic approaches to algorithmic fairness is still at an early stage. In this work, we bridge this gap by integrating counterfactual fairness into the neurosymbolic framework of logic tensor networks (LTN). We use LTN to express accuracy and counterfactual fairness constraints in first-order logic and employ them to achieve desirable levels of both performance and fairness at training time. Our approach is agnostic to the underlying causal model and data generation technique; for this reason, it can be easily integrated into existing pipelines that generate and extract counterfactual examples. We show, through concrete examples on three benchmark datasets, that logical reasoning about counterfactual fairness has important advantages, among them its intrinsic interpretability and its flexibility in handling subgroup fairness. Our experiments show that a neurosymbolic, LTN-based approach attains better levels of counterfactual fairness than three recent counterfactual fairness methodologies.
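The abstract describes turning a counterfactual fairness constraint, roughly ∀x: f(x) ≈ f(x_cf), into a differentiable training objective via fuzzy first-order semantics. The paper's exact LTN formulation is not reproduced here; the following is only an illustrative sketch under assumed choices that are common in LTN-style systems (an exponential smooth-equality predicate and a pMeanError universal quantifier; the names `smooth_eq`, `forall`, and the sample predictions are hypothetical):

```python
import numpy as np

def smooth_eq(a, b, alpha=5.0):
    # Fuzzy truth value of "a equals b": exp(-alpha * |a - b|), in (0, 1].
    # This smooth-equality predicate is an assumed choice for illustration.
    return np.exp(-alpha * np.abs(a - b))

def forall(truths, p=2):
    # pMeanError-style aggregator for universal quantification:
    # 1 - (mean((1 - t)^p))^(1/p); equals 1 iff every truth value is 1.
    truths = np.asarray(truths, dtype=float)
    return 1.0 - np.mean((1.0 - truths) ** p) ** (1.0 / p)

# Hypothetical model outputs on factual inputs and on their counterfactuals
# (e.g. the same individuals with the sensitive attribute flipped).
preds = np.array([0.9, 0.2, 0.7])
preds_cf = np.array([0.9, 0.3, 0.4])

# Satisfaction of the constraint  forall x: f(x) ~ f(x_cf).
sat = forall(smooth_eq(preds, preds_cf))
loss = 1.0 - sat  # minimized jointly with a task-accuracy constraint
```

In an actual LTN pipeline both the accuracy and fairness formulas would be aggregated this way and maximized by gradient descent; the sketch only shows how a logical constraint becomes a scalar satisfaction degree.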
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue: Neurosymbolic Artificial Intelligence
- Archive span: 2024-2026
- Indexed papers: 43
- Paper id: 477175672234548921