Arrow Research search

Author name cluster

Omer Reingold

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

49 papers
2 author rows

Possible papers

49

JBHI Journal 2026 Journal Article

A Post-Processing Fairness Mitigation Method for Medical Prediction Models

  • Noga Stern
  • Inbal Livni Navon
  • Omer Reingold
  • Eitan Bachmat
  • Noam Barda
  • Noa Dagan

A common challenge in medicine is the allocation of limited resources. Prediction models can assist in making resource allocation decisions, but they may also exhibit unfairness. We introduce a post-processing algorithm tailored to this scenario and compare its performance to modified pre-processing and in-processing algorithms. We developed predictors for in-hospital mortality, utilizing ICU data from two datasets. We then applied a post-processing algorithm we developed to enforce equal opportunity while adhering to a limited resources constraint and compared it to modified existing pre-processing and in-processing algorithms. The results were evaluated in terms of fairness and predictive performance, and the usability of the three algorithms was compared. All algorithms showed substantial improvement in the “equal opportunity” metric, presenting an average decrease of 52% in the span of sensitivity values across subgroups, and none of the algorithms consistently outperformed the others. In some cases enforcing fairness reduced predictive performance, with an average decrease of 4% in sensitivity and 3% in positive predictive value. However, there were usability differences between the algorithms: the pre-processing and post-processing algorithms preserve the numerical risk predictions, and only the post-processing algorithm does not require re-training to change the size of the intervention group. Our results demonstrate that all three methods enhance model fairness, with no single approach consistently outperforming the others. All three methods also achieve similar overall predictive performance, which in some cases is reduced compared to the base model. However, our post-processing algorithm offers practical advantages in usability compared to the alternatives.
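
The post-processing step described here lends itself to a compact sketch: given risk scores, subgroup labels, and a fixed intervention budget, allocate slots so that sensitivity (true-positive rate) stays as even as possible across subgroups. The following is a minimal illustration in Python, not the authors' implementation; the greedy allocation rule and all names are assumptions:

    import numpy as np

    def equal_opportunity_select(scores, groups, y_true, budget):
        """Greedy budgeted selection aiming at equal opportunity (sketch).

        Each intervention slot goes to the subgroup whose true-positive rate
        among already-selected members is currently lowest, and within that
        subgroup to its highest-scoring unselected member. `y_true` stands
        in for historical outcome labels."""
        selected = np.zeros(len(scores), dtype=bool)
        group_ids = np.unique(groups)
        # members of each group, highest score first
        queues = {g: list(np.where(groups == g)[0][np.argsort(-scores[groups == g])])
                  for g in group_ids}
        positives = {g: max(1, int(y_true[groups == g].sum())) for g in group_ids}
        for _ in range(budget):
            # current sensitivity per group among selected individuals
            tpr = {g: y_true[selected & (groups == g)].sum() / positives[g]
                   for g in group_ids if queues[g]}
            if not tpr:
                break
            g = min(tpr, key=tpr.get)          # most under-served group
            selected[queues[g].pop(0)] = True  # its best remaining candidate
        return selected

    # toy usage
    rng = np.random.default_rng(0)
    scores = rng.random(1000)
    groups = rng.integers(0, 2, 1000)
    y_true = (rng.random(1000) < scores).astype(int)
    mask = equal_opportunity_select(scores, groups, y_true, budget=100)
    for g in (0, 1):
        sel = mask & (groups == g)
        print(g, y_true[sel].sum() / y_true[groups == g].sum())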

TMLR Journal 2025 Journal Article

Fairness with respect to Stereotype Predictors: Impossibilities and Best Practices

  • Inbal Rachel Livni Navon
  • Omer Reingold
  • Judy Hanwen Shen

As AI systems increasingly influence decision-making from consumer recommendations to educational opportunities, their accountability becomes paramount. This need for oversight has driven extensive research into algorithmic fairness, a body of work that has examined both allocative and representational harms. However, numerous works examining representational harms such as stereotypes encompass many different concepts measured by different criteria, yielding many, potentially conflicting, characterizations of harm. The abundance of measurement approaches makes the mitigation of stereotypes in downstream machine learning models highly challenging. Our work introduces and unifies a broad class of auditors through the framework of stereotype predictors. We map notions of fairness with respect to these predictors to existing notions of group fairness. We give guidance, with theoretical foundations, for selecting one or a set of stereotype predictors and provide algorithms for achieving fairness with respect to stereotype predictors under various fairness notions. We demonstrate the effectiveness of our algorithms with different stereotype predictors in two empirical case studies.

FOCS Conference 2025 Conference Paper

How Global Calibration Strengthens Multiaccuracy

  • Sílvia Casacuberta
  • Parikshit Gopalan
  • Varun Kanade
  • Omer Reingold

Multiaccuracy and multicalibration are multi-group fairness notions for prediction that have found numerous applications in learning and computational complexity [HKRR18]. They can be achieved from a single learning primitive: weak agnostic learning. A line of work starting from [GKR+22] has shown that multicalibration implies a very strong form of learning. Here we investigate the power of multiaccuracy as a learning primitive, both with and without the additional assumption of calibration. We find that multiaccuracy in itself is rather weak, but that the addition of global calibration (this notion is called calibrated multiaccuracy) boosts its power substantially, enough to recover implications that were previously known only assuming the stronger notion of multicalibration. We give evidence that multiaccuracy might not be as powerful as standard weak agnostic learning, by showing that there is no way to post-process a multiaccurate predictor to get a weak learner, even assuming the best hypothesis has correlation 1/2. Rather, we show that it yields a restricted form of weak agnostic learning, which requires some concept in the class to have correlation greater than 1/2 with the labels. However, by also requiring the predictor to be calibrated, we recover not just weak, but strong agnostic learning. A similar picture emerges when we consider the derivation of hardcore measures from predictors satisfying multigroup fairness notions [TTV09], [CDV24]. On the one hand, while multiaccuracy only yields hardcore measures of density half the optimal, we show that (a weighted version of) calibrated multiaccuracy achieves optimal density. Our results yield new insights into the complementary roles played by multiaccuracy and calibration in each setting. They shed light on why multiaccuracy and global calibration, although not particularly powerful by themselves, together yield considerably stronger notions.
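
For reference, the two properties being combined can be audited directly from data: multiaccuracy asks that the residual y − p(x) be nearly uncorrelated with every test c in the class C, and global calibration asks that E[y | p(x) ≈ v] ≈ v on every level set. A small numpy sketch of both checks (the test matrix and binning scheme are illustrative assumptions):

    import numpy as np

    def multiaccuracy_violation(p, y, tests):
        """Largest average correlation between the residual y - p and any
        auditor; `tests` holds the values c(x) of each c in C as columns."""
        residual = y - p
        return np.max(np.abs(tests.T @ residual) / len(y))

    def calibration_error(p, y, bins=10):
        """Average |E[y | level set] - E[p | level set]| over bins of p."""
        edges = np.linspace(0, 1, bins + 1)
        edges[-1] += 1e-9                      # include p == 1 in the top bin
        err = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (p >= lo) & (p < hi)
            if mask.any():
                err += mask.mean() * abs(y[mask].mean() - p[mask].mean())
        return err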

ICML Conference 2025 Conference Paper

Representative Language Generation

  • Charlotte Peale
  • Vinod Raman
  • Omer Reingold

We introduce "representative generation," extending the theoretical framework for generation proposed by Kleinberg et al. (2024) and formalized by Li et al. (2024), to additionally address diversity and bias concerns in generative models. Our notion requires outputs of a generative model to proportionally represent groups of interest from the training data. We characterize representative uniform and non-uniform generation, introducing the "group closure dimension" as a key combinatorial quantity. For representative generation in the limit, we analyze both information-theoretic and computational aspects, demonstrating feasibility for countably infinite hypothesis classes and collections of groups under certain conditions, but proving a negative result for computability using only membership queries. This contrasts with Kleinberg et al.'s (2024) positive results for standard generation in the limit. Our findings provide a rigorous foundation for developing more diverse and representative generative models.
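
Operationally, the proportionality requirement says that the generator's outputs should, in aggregate, reflect the group frequencies of the training data. The toy wrapper below enforces this by rejection; it is a heuristic sketch only and makes no claim to match the paper's formal model (all names are hypothetical):

    import random
    from collections import Counter

    def representative_sample(generate, group_of, target_freqs, n, tol=0.02,
                              max_tries=10**6):
        """Rejection-style wrapper (heuristic sketch, not the paper's
        algorithm): draw from `generate`, but skip an output whenever its
        group would become over-represented relative to the training
        frequencies `target_freqs` by more than `tol`."""
        out, counts = [], Counter()
        for _ in range(max_tries):
            if len(out) == n:
                break
            x = generate()
            g = group_of(x)
            if (counts[g] + 1) / (len(out) + 1) <= target_freqs[g] + tol:
                out.append(x)
                counts[g] += 1
        return out

    # toy usage: a biased generator corrected toward 50/50 group frequencies
    gen = lambda: "a" if random.random() < 0.8 else "b"
    sample = representative_sample(gen, lambda x: x, {"a": 0.5, "b": 0.5}, 1000)
    print(Counter(sample))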

AAAI Conference 2024 Conference Paper

Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance

  • Omer Reingold
  • Judy Hanwen Shen
  • Aditi Talati

While modern explanation methods have been shown to be inconsistent and contradictory, the explainability of black-box models nevertheless remains desirable. When the role of explanations extends from understanding models to aiding decision making, the semantics of explanations is not always fully understood: to what extent do explanations "explain" a decision and to what extent do they merely advocate for a decision? Can we help humans gain insights from explanations accompanying correct predictions and not over-rely on incorrect predictions advocated for by explanations? With this perspective in mind, we introduce the notion of dissenting explanations: conflicting predictions with accompanying explanations. We first explore the advantage of dissenting explanations in the setting of model multiplicity, where multiple models with similar performance may have different predictions. Through a human study on the task of identifying deceptive reviews, we demonstrate that dissenting explanations reduce overreliance on model predictions, without reducing overall accuracy. Motivated by the utility of dissenting explanations, we present both global and local methods for their generation.
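
The model-multiplicity setting is easy to reproduce: train two models of comparable accuracy and harvest the test points where they disagree; each such point yields a prediction together with a dissenting counterpart. A minimal sketch with scikit-learn, where the model choices and synthetic data are illustrative assumptions:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    m1 = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    m2 = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    print("accuracies:", m1.score(X_te, y_te), m2.score(X_te, y_te))

    # Points where comparable-accuracy models disagree: each model's
    # prediction can be paired with the other's as a dissenting explanation.
    disagree = np.where(m1.predict(X_te) != m2.predict(X_te))[0]
    print(f"{len(disagree)} of {len(X_te)} test points admit dissent")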

ICML Conference 2023 Conference Paper

Omnipredictors for Constrained Optimization

  • Lunjia Hu
  • Inbal Livni Navon
  • Omer Reingold
  • Chutong Yang

The notion of omnipredictors (Gopalan, Kalai, Reingold, Sharan and Wieder, ITCS 2022) suggested a new paradigm for loss minimization. Rather than learning a predictor based on a known loss function, omnipredictors can easily be post-processed to minimize any one of a rich family of loss functions compared with the loss of hypotheses in a class $\mathcal C$. It has been shown that such omnipredictors exist and are implied (for all convex and Lipschitz loss functions) by the notion of multicalibration from the algorithmic fairness literature. In this paper, we introduce omnipredictors for constrained optimization and study their complexity and implications. The notion that we introduce allows the learner to be unaware of the loss function that will be later assigned, as well as of the constraints that will be later imposed, as long as the subpopulations that are used to define these constraints are known. We show how to obtain omnipredictors for constrained optimization problems, relying on appropriate variants of multicalibration. We also investigate the implications of this notion when the constraints used are so-called group fairness notions.
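
The post-processing at the heart of omniprediction is simple to state: given a predicted probability p for a binary outcome and a loss ℓ(action, outcome) revealed only later, act by minimizing expected loss under p. A numpy sketch of that single loss-dependent step (the losses and action grid are illustrative):

    import numpy as np

    def post_process(p, loss, actions):
        """Best response to predicted probability p for an arbitrary loss.

        For each example, picks argmin_a p*loss(a,1) + (1-p)*loss(a,0).
        This is the only step that depends on the loss, which is why one
        predictor can later serve many losses."""
        table = np.array([[pi * loss(a, 1) + (1 - pi) * loss(a, 0)
                           for a in actions] for pi in p])
        return np.asarray(actions)[table.argmin(axis=1)]

    # the same predictions post-processed for two different losses
    p = np.array([0.1, 0.4, 0.9])
    grid = np.linspace(0, 1, 101)
    squared = lambda a, y: (a - y) ** 2      # optimal action: a = p
    absolute = lambda a, y: abs(a - y)       # optimal action: round(p)
    print(post_process(p, squared, grid))    # ~[0.1, 0.4, 0.9]
    print(post_process(p, absolute, grid))   # ~[0.0, 0.0, 1.0]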

NeurIPS Conference 2023 Conference Paper

Swap Agnostic Learning, or Characterizing Omniprediction via Multicalibration

  • Parikshit Gopalan
  • Michael Kim
  • Omer Reingold

We introduce and study the notion of Swap Agnostic Learning. The problem can be phrased as a game between a *predictor* and an *adversary*: first, the predictor selects a hypothesis $h$; then, the adversary plays in response, and for each level set of the predictor, selects a loss-minimizing hypothesis $c_v \in \mathcal{C}$; the predictor wins if $h$ competes with the adaptive adversary's loss. Despite the strength of the adversary, our main result demonstrates the feasibility of Swap Agnostic Learning for any convex loss. Somewhat surprisingly, the result follows by proving an *equivalence* between Swap Agnostic Learning and swap variants of the recent notions of Omniprediction (ITCS'22) and Multicalibration (ICML'18). Beyond this equivalence, we establish further connections to the literature on Outcome Indistinguishability (STOC'20, ITCS'23), revealing a unified notion of OI that captures all existing notions of omniprediction and multicalibration.

FOCS Conference 2019 Conference Paper

Learning from Outcomes: Evidence-Based Rankings

  • Cynthia Dwork
  • Michael P. Kim
  • Omer Reingold
  • Guy N. Rothblum
  • Gal Yona

Many selection procedures involve ordering candidates according to their qualifications. For example, a university might order applicants according to a perceived probability of graduation within four years, and then select the top 1000 applicants. In this work, we address the problem of ranking members of a population according to their "probability" of success, based on a training set of historical binary outcome data (e.g., graduated in four years or not). We show how to obtain rankings that satisfy a number of desirable accuracy and fairness criteria, despite the coarseness of the training data. As the task of ranking is global (the rank of every individual depends not only on their own qualifications, but also on every other individual's qualifications), ranking is more subtle and vulnerable to manipulation than standard prediction tasks. Towards mitigating unfair discrimination caused by inaccuracies in rankings, we develop two parallel definitions of evidence-based rankings. The first definition relies on a semantic notion of domination-compatibility: if the training data suggest that members of a set S are more qualified (on average) than the members of T, then a ranking that favors T over S (i.e., where T dominates S) is blatantly inconsistent with the evidence, and likely to be discriminatory. The definition asks for domination-compatibility, not just for a pair of sets, but rather for every pair of sets from a rich collection C of subpopulations. The second definition aims at precluding even more general forms of discrimination; this notion of evidence-consistency requires that the ranking must be justified on the basis of consistency with the expectations for every set in the collection C. Somewhat surprisingly, while evidence-consistency is a strictly stronger notion than domination-compatibility when the collection C is predefined, the two notions are equivalent when the collection C may depend on the ranking in question.

NeurIPS Conference 2018 Conference Paper

Fairness Through Computationally-Bounded Awareness

  • Michael Kim
  • Omer Reingold
  • Guy Rothblum

We study the problem of fair classification within the versatile framework of Dwork et al. [ITCS '12], which assumes the existence of a metric that measures similarity between pairs of individuals. Unlike earlier work, we do not assume that the entire metric is known to the learning algorithm; instead, the learner can query this arbitrary metric a bounded number of times. We propose a new notion of fairness called metric multifairness and show how to achieve this notion in our setting. Metric multifairness is parameterized by a similarity metric d on pairs of individuals to classify and a rich collection C of (possibly overlapping) "comparison sets" over pairs of individuals. At a high level, metric multifairness guarantees that similar subpopulations are treated similarly, as long as these subpopulations are identified within the class C.

STOC Conference 2018 Conference Paper

Improved pseudorandomness for unordered branching programs through local monotonicity

  • Eshan Chattopadhyay
  • Pooya Hatami
  • Omer Reingold
  • Avishay Tal

We present an explicit pseudorandom generator with seed length Õ((log n)^(w+1)) for read-once, oblivious, width-w branching programs that can read their input bits in any order. This improves upon the work of Impagliazzo, Meka and Zuckerman (FOCS'12), which required seed length n^(1/2+o(1)). A central ingredient in our work is the following bound that we prove on the Fourier spectrum of branching programs: for any width-w read-once, oblivious branching program B: {0,1}^n → {0,1} and any k ∈ {1,…,n}, the level-k Fourier mass satisfies ∑_{S ⊆ [n], |S| = k} |B̂(S)| ≤ O(log n)^(wk). This settles a conjecture posed by Reingold, Steinke and Vadhan (RANDOM'13). Our analysis crucially uses a notion of local monotonicity on the edge labeling of the branching program. We carry out critical parts of our proof under the assumption of local monotonicity and show how to deduce our results for unrestricted branching programs.
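
On small instances the Fourier quantity in the bound can be checked by brute force: evaluate B on all of {0,1}^n and sum |B̂(S)| over sets of size k. A toy check in Python; the width-3 mod-3 counter program is an arbitrary example, not one from the paper:

    import itertools
    import numpy as np

    def fourier_level_mass(f, n, k):
        """Sum of |f_hat(S)| over all |S| = k, by full enumeration."""
        xs = list(itertools.product((0, 1), repeat=n))
        vals = np.array([f(x) for x in xs], dtype=float)
        total = 0.0
        for S in itertools.combinations(range(n), k):
            chi = np.array([(-1) ** sum(x[i] for i in S) for x in xs])
            total += abs(np.mean(vals * chi))
        return total

    # a toy width-3, read-once, oblivious program: a counter mod 3
    def mod3_program(x):
        state = 0
        for bit in x:            # each input bit is read once, in order
            state = (state + bit) % 3
        return int(state == 0)

    for k in (1, 2, 3):
        print(k, fourier_level_mass(mod3_program, 8, k))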

ICML Conference 2018 Conference Paper

Multicalibration: Calibration for the (Computationally-Identifiable) Masses

  • Úrsula Hébert-Johnson
  • Michael P. Kim
  • Omer Reingold
  • Guy N. Rothblum

We develop and study multicalibration as a new measure of fairness in machine learning that aims to mitigate inadvertent or malicious discrimination that is introduced at training time (even from ground truth data). Multicalibration guarantees meaningful (calibrated) predictions for every subpopulation that can be identified within a specified class of computations. The specified class can be quite rich; in particular, it can contain many overlapping subgroups of a protected group. We demonstrate that in many settings this strong notion of protection from discrimination is provably attainable and aligned with the goal of obtaining accurate predictions. Along the way, we present algorithms for learning a multicalibrated predictor, study the computational complexity of this task, and illustrate tight connections to the agnostic learning model.
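
Algorithms in this line of work typically follow a "find a violated subgroup, patch it" loop: while predictions on some level set within some subgroup are miscalibrated, shift them toward the subgroup's empirical mean. A condensed sketch of such a loop (binning, tolerance, and iteration cap are illustrative, not the paper's exact parameters):

    import numpy as np

    def multicalibrate(p, y, subgroups, alpha=0.01, bins=10, iters=1000):
        """Iteratively patch predictions until every (subgroup, level set)
        cell is alpha-calibrated. `subgroups` is a list of boolean masks,
        one per set in the class C of computationally identifiable groups."""
        p = p.copy()
        for _ in range(iters):
            updated = False
            for S in subgroups:
                for b in range(bins):
                    cell = S & (np.floor(p * bins).clip(0, bins - 1) == b)
                    if cell.sum() == 0:
                        continue
                    gap = y[cell].mean() - p[cell].mean()
                    if abs(gap) > alpha:
                        p[cell] = np.clip(p[cell] + gap, 0, 1)  # patch cell
                        updated = True
            if not updated:   # every cell within tolerance: multicalibrated
                break
        return p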

FOCS Conference 2017 Conference Paper

Derandomization Beyond Connectivity: Undirected Laplacian Systems in Nearly Logarithmic Space

  • Jack Murtagh
  • Omer Reingold
  • Aaron Sidford
  • Salil P. Vadhan

We give a deterministic Õ(log n)-space algorithm for approximately solving linear systems given by Laplacians of undirected graphs, and consequently also approximating hitting times, commute times, and escape probabilities for undirected graphs. Previously, such systems were known to be solvable by randomized algorithms using O(log n) space (Doron, Le Gall, and Ta-Shma, 2017) and hence by deterministic algorithms using O(log 3/2 n) space (Saks and Zhou, FOCS 1995 and JCSS 1999). Our algorithm combines ideas from time-efficient Laplacian solvers (Spielman and Teng, STOC `04; Peng and Spielman, STOC `14) with ideas used to show that UNDIRECTED S-T CONNECTIVITY is in deterministic logspace (Reingold, STOC `05 and JACM `08; Rozenman and Vadhan, RANDOM `05).

ECAI Conference 2016 Conference Paper

Adaptive Condorcet-Based Stopping Rules Can Be Efficient

  • Omer Reingold
  • Nina Narodytska

A crowdsourcing project usually comprises many unit tasks known as Human Intelligence Tasks (HITs). As answers to each HIT vary between workers, each HIT is often contracted to more than one worker to obtain a sufficiently reliable and consistent answer. When implementing a project, an important design decision is how to formulate HITs and how to aggregate workers' answers. These decisions have a strong impact on the quality of the results and the cost of the elicitation process. One way to design an efficient elicitation procedure is to use adaptive stopping rules, which allow terminating the elicitation process as soon as a high-quality result is guaranteed.
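
A stopping rule of this kind can be as simple as: keep requesting answers for a HIT until the leading answer's margin over the runner-up is large enough, then stop paying for votes. The margin threshold below is an illustrative stand-in for the paper's Condorcet-based criteria:

    import random
    from collections import Counter

    def elicit_until_settled(ask_worker, margin=3, max_votes=25):
        """Request answers one at a time; stop once the top answer leads
        the runner-up by `margin` votes (or the budget runs out)."""
        votes = Counter()
        for _ in range(max_votes):
            votes[ask_worker()] += 1
            ranked = votes.most_common(2)
            lead = ranked[0][1] - (ranked[1][1] if len(ranked) > 1 else 0)
            if lead >= margin:
                break
        return votes.most_common(1)[0][0], sum(votes.values())

    # usage with a simulated noisy worker pool
    truth = "A"
    worker = lambda: truth if random.random() < 0.7 else random.choice("BC")
    answer, cost = elicit_until_settled(worker)
    print(answer, "after", cost, "votes")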

NeurIPS Conference 2015 Conference Paper

Generalization in Adaptive Data Analysis and Holdout Reuse

  • Cynthia Dwork
  • Vitaly Feldman
  • Moritz Hardt
  • Toni Pitassi
  • Omer Reingold
  • Aaron Roth

Overfitting is the bane of data analysts, even when data are plentiful. Formal approaches to understanding this problem focus on statistical inference and generalization of individual analysis procedures. Yet the practice of data analysis is an inherently interactive and adaptive process: new analyses and hypotheses are proposed after seeing the results of previous ones, parameters are tuned on the basis of obtained results, and datasets are shared and reused. An investigation of this gap has recently been initiated by the authors in (Dwork et al., 2014), where we focused on the problem of estimating expectations of adaptively chosen functions. In this paper, we give a simple and practical method for reusing a holdout (or testing) set to validate the accuracy of hypotheses produced by a learning algorithm operating on a training set. Reusing a holdout set adaptively multiple times can easily lead to overfitting to the holdout set itself. We give an algorithm that enables the validation of a large number of adaptively chosen hypotheses, while provably avoiding overfitting. We illustrate the advantages of our algorithm over the standard use of the holdout set via a simple synthetic experiment. We also formalize and address the general problem of data reuse in adaptive data analysis. We show how the differential-privacy based approach in (Dwork et al., 2014) is applicable much more broadly to adaptive data analysis. We then show that a simple approach based on description length can also be used to give guarantees of statistical validity in adaptive settings. Finally, we demonstrate that these incomparable approaches can be unified via the notion of approximate max-information that we introduce. This, in particular, allows the preservation of statistical validity guarantees even when an analyst adaptively composes algorithms which have guarantees based on either of the two approaches.
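
The holdout-reuse mechanism the paper describes (Thresholdout) admits a compact rendering: answer adaptive queries from the training set whenever training and holdout estimates agree to within a noisy threshold, and spend holdout information only when they diverge. A sketch following that description; the parameter values are illustrative:

    import numpy as np

    class Thresholdout:
        """Reusable holdout for adaptively chosen statistics phi: x -> [0,1]."""

        def __init__(self, train, holdout, threshold=0.04, sigma=0.01):
            self.train, self.holdout = train, holdout
            self.T, self.sigma = threshold, sigma
            self.rng = np.random.default_rng(0)

        def query(self, phi):
            a = np.mean([phi(x) for x in self.train])
            b = np.mean([phi(x) for x in self.holdout])
            # touch the holdout only when the two estimates visibly diverge
            if abs(a - b) > self.T + self.rng.laplace(0, self.sigma):
                return b + self.rng.laplace(0, self.sigma)
            return a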

STOC Conference 2015 Conference Paper

Preserving Statistical Validity in Adaptive Data Analysis

  • Cynthia Dwork
  • Vitaly Feldman
  • Moritz Hardt
  • Toniann Pitassi
  • Omer Reingold
  • Aaron Roth 0001

A great deal of effort has been devoted to reducing the risk of spurious scientific discoveries, from the use of sophisticated validation techniques, to deep statistical methods for controlling the false discovery rate in multiple hypothesis testing. However, there is a fundamental disconnect between the theoretical results and the practice of data analysis: the theory of statistical inference assumes a fixed collection of hypotheses to be tested, or learning algorithms to be applied, selected non-adaptively before the data are gathered, whereas in practice data is shared and reused with hypotheses and new analyses being generated on the basis of data exploration and the outcomes of previous analyses. In this work we initiate a principled study of how to guarantee the validity of statistical inference in adaptive data analysis. As an instance of this problem, we propose and investigate the question of estimating the expectations of m adaptively chosen functions on an unknown distribution given n random samples. We show that, surprisingly, there is a way to estimate an exponential in n number of expectations accurately even if the functions are chosen adaptively. This gives an exponential improvement over standard empirical estimators that are limited to a linear number of estimates. Our result follows from a general technique that counter-intuitively involves actively perturbing and coordinating the estimates, using techniques developed for privacy preservation. We give additional applications of this technique to our question.

FOCS Conference 2012 Conference Paper

Better Pseudorandom Generators from Milder Pseudorandom Restrictions

  • Parikshit Gopalan
  • Raghu Meka
  • Omer Reingold
  • Luca Trevisan 0001
  • Salil P. Vadhan

We present an iterative approach to constructing pseudorandom generators, based on the repeated application of mild pseudorandom restrictions. We use this template to construct pseudorandom generators for combinatorial rectangles and read-once CNFs and a hitting set generator for width-3 branching programs, all of which achieve near-optimal seed length even in the low-error regime: we get seed length Õ(log(n/ε)) for error ε. Previously, only constructions with seed length O(log^(3/2) n) or O(log^2 n) were known for these classes with error ε = 1/poly(n). The (pseudo)random restrictions we use are milder than those typically used for proving circuit lower bounds in that we only set a constant fraction of the bits at a time. While such restrictions do not simplify the functions drastically, we show that they can be derandomized using small-bias spaces.

FOCS Conference 2011 Conference Paper

Balls and Bins: Smaller Hash Families and Faster Evaluation

  • L. Elisa Celis
  • Omer Reingold
  • Gil Segev 0001
  • Udi Wieder

A fundamental fact in the analysis of randomized algorithms is that when n balls are hashed into n bins independently and uniformly at random, with high probability each bin contains at most O(log n / log log n) balls. In various applications, however, the assumption that a truly random hash function is available is not always valid, and explicit functions are required. In this paper we study the size of families (or, equivalently, the description length of their functions) that guarantee a maximal load of O(log n / log log n) with high probability, as well as the evaluation time of their functions. Whereas such functions must be described using Ω(log n) bits, the best upper bound was formerly O(log^2 n / log log n) bits, which is attained by O(log n / log log n)-wise independent functions. Traditional constructions of the latter offer an evaluation time of O(log n / log log n), which according to Siegel's lower bound [FOCS '89] can be reduced only at the cost of significantly increasing the description length. We construct two families that guarantee a maximal load of O(log n / log log n) with high probability. Our constructions are based on two different approaches, and exhibit different trade-offs between the description length and the evaluation time. The first construction shows that O(log n / log log n)-wise independence can in fact be replaced by "gradually increasing independence", resulting in functions that are described using O(log n log log n) bits and evaluated in time O(log n log log n). The second construction is based on derandomization techniques for space-bounded computations combined with a tailored construction of a pseudorandom generator, resulting in functions that are described using O(log^(3/2) n) bits and evaluated in time O(√(log n)). The latter can be compared to Siegel's lower bound stating that O(log n / log log n)-wise independent functions that are evaluated in time O(√(log n)) must be described using 2^(Ω(√(log n))) bits.
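
The O(log n / log log n) benchmark is easy to observe empirically by hashing n balls into n bins with a truly random function and recording the fullest bin; this is the baseline the constructions compete with. A quick simulation (sizes are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    for n in (10**3, 10**5, 10**6):
        # throw n balls into n bins uniformly at random; count per-bin loads
        loads = np.bincount(rng.integers(0, n, size=n), minlength=n)
        bound = np.log(n) / np.log(np.log(n))
        print(f"n={n}: max load {loads.max()}, log n / log log n = {bound:.1f}")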

STOC Conference 2010 Conference Paper

Efficiency improvements in constructing pseudorandom generators from one-way functions

  • Iftach Haitner
  • Omer Reingold
  • Salil P. Vadhan

We give a new construction of pseudorandom generators from any one-way function. The construction achieves better parameters and is simpler than that given in the seminal work of Håstad, Impagliazzo, Levin, and Luby [SICOMP '99]. The key to our construction is a new notion of "next-block pseudoentropy", which is inspired by the notion of "inaccessible entropy" recently introduced in [Haitner, Reingold, Vadhan, Wee, STOC '09]. An additional advantage over previous constructions is that our pseudorandom generators are parallelizable and invoke the one-way function in a non-adaptive manner. Using [Applebaum, Ishai, Kushilevitz, SICOMP '06], this implies the existence of pseudorandom generators in NC^0 based on the existence of one-way functions in NC^1.

FOCS Conference 2010 Conference Paper

The Limits of Two-Party Differential Privacy

  • Andrew McGregor 0001
  • Ilya Mironov
  • Toniann Pitassi
  • Omer Reingold
  • Kunal Talwar
  • Salil P. Vadhan

We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). In addition, those bounds demonstrate a dramatic gap between the accuracy that can be obtained by differentially private data analysis versus the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially private protocol is strongly related to the ability to approximate it by a low communication protocol. (The connection goes in both directions).

STOC Conference 2009 Conference Paper

Inaccessible entropy

  • Iftach Haitner
  • Omer Reingold
  • Salil P. Vadhan
  • Hoeteck Wee

We put forth a new computational notion of entropy, which measures the (in)feasibility of sampling high-entropy strings that are consistent with a given protocol. Specifically, we say that the i'th round of a protocol (A,B) has *accessible entropy* at most k if no polynomial-time strategy A* can generate messages for A such that its message in the i'th round has entropy greater than k when conditioned both on prior messages of the protocol and on prior coin tosses of A*. We say that the protocol has *inaccessible entropy* if the total accessible entropy (summed over the rounds) is noticeably smaller than the real entropy of A's messages, conditioned only on prior messages (but not the coin tosses of A). As applications of this notion, we (1) give a much simpler and more efficient construction of statistically hiding commitment schemes from arbitrary one-way functions, and (2) prove that constant-round statistically hiding commitments are necessary for constructing constant-round zero-knowledge proof systems for NP that remain secure under parallel composition (assuming the existence of one-way functions).

FOCS Conference 2008 Conference Paper

Dense Subsets of Pseudorandom Sets

  • Omer Reingold
  • Luca Trevisan 0001
  • Madhur Tulsiani
  • Salil P. Vadhan

A theorem of Green, Tao, and Ziegler can be stated (roughly) as follows: if R is a pseudorandom set, and D is a dense subset of R, then D may be modeled by a set M that is dense in the entire domain such that D and M are indistinguishable. (The precise statement refers to "measures" or distributions rather than sets.) The proof of this theorem is very general, and it applies to notions of pseudorandomness and indistinguishability defined in terms of any family of distinguishers with some mild closure properties. The proof proceeds via iterative partitioning and an energy increment argument, in the spirit of the proof of the weak Szemerédi regularity lemma. The "reduction" involved in the proof has exponential complexity in the distinguishing probability. We present a new proof inspired by Nisan's proof of Impagliazzo's hardcore set theorem. The reduction in our proof has polynomial complexity in the distinguishing probability and provides a new characterization of the notion of "pseudoentropy" of a distribution. A proof similar to ours has also been independently discovered by Gowers [2]. We also follow the connection between the two theorems and obtain a new proof of Impagliazzo's hardcore set theorem via iterative partitioning and energy increment. While our reduction has exponential complexity in some parameters, it has the advantage that the hardcore set is efficiently recognizable.

FOCS Conference 2007 Conference Paper

Finding Collisions in Interactive Protocols - A Tight Lower Bound on the Round Complexity of Statistically-Hiding Commitments

  • Iftach Haitner
  • Jonathan J. Hoch
  • Omer Reingold
  • Gil Segev 0001

We study the round complexity of various cryptographic protocols. Our main result is a tight lower bound on the round complexity of any fully-black-box construction of a statistically-hiding commitment scheme from one-way permutations, and even from trapdoor permutations. This lower bound matches the round complexity of the statistically-hiding commitment scheme due to Naor, Ostrovsky, Venkatesan and Yung (CRYPTO '92). As a corollary, we derive similar tight lower bounds for several other cryptographic protocols, such as single-server private information retrieval, interactive hashing, and oblivious transfer that guarantees statistical security for one of the parties. Our techniques extend the collision-finding oracle due to Simon (EUROCRYPT '98) to the setting of interactive protocols (our extension also implies an alternative proof for the main property of the original oracle). In addition, we substantially extend the reconstruction paradigm of Gennaro and Trevisan (FOCS '00). In both cases, our extensions are quite delicate and may be found useful in proving additional black-box separation results.

STOC Conference 2006 Conference Paper

Pseudorandom walks on regular digraphs and the RL vs. L problem

  • Omer Reingold
  • Luca Trevisan 0001
  • Salil P. Vadhan

We revisit the general RL vs. L question, obtaining the following results. (1) Generalizing Reingold's techniques to directed graphs, we present a deterministic, log-space algorithm that, given a regular directed graph G (or, more generally, a digraph with Eulerian connected components) and two vertices s and t, finds a path between s and t if one exists. (2) If we restrict ourselves to directed graphs that are regular and consistently labelled, then we are able to produce pseudorandom walks for such graphs in logarithmic space (this result has already found an independent application). (3) We prove that if (2) could be generalized to all regular directed graphs (including ones that are not consistently labelled) then L = RL. We do so by exhibiting a new complete promise problem for RL, and showing that such a problem can be solved in deterministic logarithmic space given a log-space pseudorandom walk generator for regular directed graphs.

STOC Conference 2005 Conference Paper

Undirected ST-connectivity in log-space

  • Omer Reingold

We present a deterministic, log-space algorithm that solves st-connectivity in undirected graphs. The previous bound on the space complexity of undirected st-connectivity was O(log^(4/3) n), obtained by Armoni, Ta-Shma, Wigderson and Zhou [9]. As undirected st-connectivity is complete for the class of problems solvable by symmetric, non-deterministic, log-space computations (the class SL), this algorithm implies that SL = L (where L is the class of problems solvable by deterministic log-space computations). Independently of our work (and using different techniques), Trifonov [45] has presented an O(log n log log n)-space, deterministic algorithm for undirected st-connectivity. Our algorithm also implies a way to construct in log-space a fixed sequence of directions that guides a deterministic walk through all of the vertices of any connected graph. Specifically, we give log-space constructible universal-traversal sequences for graphs with restricted labelling and log-space constructible universal-exploration sequences for general graphs.

FOCS Conference 2004 Conference Paper

Assignment Testers: Towards a Combinatorial Proof of the PCP-Theorem

  • Irit Dinur
  • Omer Reingold

In this work, we look back into the proof of the PCP theorem, with the goal of finding new proofs that are "more combinatorial" and arguably simpler. For that, we introduce the notion of an assignment tester, which is a strengthening of the standard PCP verifier, in the following sense. Given a statement and an alleged proof for it, while the PCP verifier checks correctness of the statement, the assignment-tester checks correctness of the statement and the proof. This notion enables composition that is truly modular, i.e., one can compose two assignment-testers without any assumptions on how they are constructed. A related notion was independently introduced in (Ben-Sasson et al., STOC '04). We provide a toolkit of (non-trivial) generic transformations on assignment testers. These transformations may be interesting in their own right, and allow us to present the following two main results: 1. The first is a new proof of the PCP theorem. This proof relies on a rather weak assignment tester given as a "black box". From this, we construct combinatorially the full PCP. An important component of this proof is a new combinatorial aggregation technique (i.e., a new transformation that allows the verifier to read fewer, though possibly longer, "pieces" of the proof). An implementation of the black-box tester can be obtained from the algebraic proof techniques that already appear in L. Babai et al., 1991 and U. Feige et al., 1991. Obtaining a combinatorial implementation of this tester would give a purely combinatorial proof for the PCP theorem, which we view as an interesting open problem. 2. Our second construction is a "standalone" combinatorial construction showing NP ⊆ PCP (S. Arora et al., 1998). This implies, for example, that approximating max-SAT is quasi-NP-hard. This construction relies on a transformation that makes an assignment tester "oblivious": so that the proof locations read are independent of the statement that is being proven. This eliminates, in a rather surprising manner, the need for aggregation in a crucial point in the proof.

STOC Conference 2004 Conference Paper

Completeness in two-party secure computation: a computational view

  • Danny Harnik
  • Moni Naor
  • Omer Reingold
  • Alon Rosen

A Secure Function Evaluation (SFE) of a two-variable function f(·,·) is a protocol that allows two parties with inputs x and y to evaluate f(x,y) in a manner where neither party learns "more than is necessary". A rich body of work deals with the study of completeness for secure two-party computation. A function f is complete for SFE if a protocol for securely evaluating f allows the secure evaluation of all (efficiently computable) functions. The questions investigated are which functions are complete for SFE, which functions have SFE protocols unconditionally, and whether there are functions that are neither complete nor have efficient SFE protocols. The previous study of these questions was mainly conducted from an information-theoretic point of view and provided strong answers in the form of combinatorial properties. However, we show that there are major differences between the information-theoretic and computational settings. In particular, we show functions that are considered as having SFE unconditionally by the combinatorial criteria but are actually complete in the computational setting. We initiate the fully computational study of these fundamental questions. Somewhat surprisingly, we manage to provide an almost full characterization of the complete functions in this model as well. More precisely, we present a computational criterion (called computational row non-transitivity) for a function f to be complete for the asymmetric case. Furthermore, we show a matching criterion called computational row transitivity for f to have a simple SFE (based on no additional assumptions). This criterion is close to the negation of computational row non-transitivity, and thus we essentially characterize all "nice" functions as either complete or having SFE unconditionally.

FOCS Conference 2001 Conference Paper

On the Impossibility of Basing Trapdoor Functions on Trapdoor Predicates

  • Yael Gertner
  • Tal Malkin
  • Omer Reingold

We prove that, somewhat surprisingly, there is no black-box reduction of (poly-to-one) trapdoor functions to trapdoor predicates (equivalently, to public-key encryption schemes). Our proof follows the methodology that was introduced by R. Impagliazzo and S. Rudich (1989), although we use a new, weaker model of separation.

FOCS Conference 2000 Conference Paper

Entropy Waves, the Zig-Zag Graph Product, and New Constant-Degree Expanders and Extractors

  • Omer Reingold
  • Salil P. Vadhan
  • Avi Wigderson

The main contribution is a new type of graph product, which we call the zig-zag product. Taking a product of a large graph with a small graph, the resulting graph inherits (roughly) its size from the large one, its degree from the small one, and its expansion properties from both. Iteration yields simple explicit constructions of constant-degree expanders of every size, starting from one constant-size expander. Crucial to our intuition (and simple analysis) of the properties of this graph product is the view of expanders as functions which act as "entropy wave" propagators: they transform probability distributions in which entropy is concentrated in one area to distributions where that concentration is dissipated. In these terms, the graph product affords the constructive interference of two such waves. A variant of this product can be applied to extractors, giving the first explicit extractors whose seed length depends (poly)logarithmically on only the entropy deficiency of the source (rather than its length) and that extract almost all the entropy of high min-entropy sources. These high min-entropy extractors have several interesting applications, including the first constant-degree explicit expanders which beat the "eigenvalue bound".
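
The zig-zag product has a crisp algorithmic form in terms of rotation maps, where Rot_G(v, i) returns the i-th neighbor of v together with the label under which that edge returns to v. The sketch below transcribes the zig-step-zag definition; the toy input graphs (K5 and the 4-cycle) are assumptions for illustration:

    def zigzag(rot_G, rot_H):
        """Zig-zag product of a D-regular G (rotation map rot_G) with a
        d-regular H on D vertices (rotation map rot_H). Product vertices
        are pairs (v, a); edge labels are pairs (i, j) in [d] x [d]."""
        def rot(vertex, label):
            (v, a), (i, j) = vertex, label
            a1, i1 = rot_H(a, i)   # "zig": small step inside the cloud of v
            w, b1 = rot_G(v, a1)   # the big step along an edge of G
            b, j1 = rot_H(b1, j)   # "zag": small step inside the cloud of w
            return (w, b), (j1, i1)
        return rot

    # toy inputs: G = K5 (4-regular), H = the 4-cycle C4 (2-regular)
    rot_K5 = lambda v, i: ((v + 1 + i) % 5, (v - ((v + 1 + i) % 5) - 1) % 5)
    rot_C4 = lambda a, i: ((a + 1) % 4, 1) if i == 0 else ((a - 1) % 4, 0)

    rot_Z = zigzag(rot_K5, rot_C4)
    start = ((0, 0), (0, 1))
    step = rot_Z(*start)
    assert rot_Z(*step) == start   # a rotation map is an involution
    print(step)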

FOCS Conference 2000 Conference Paper

Extracting Randomness via Repeated Condensing

  • Omer Reingold
  • Ronen Shaltiel
  • Avi Wigderson

On an input probability distribution with some (min-)entropy, an extractor outputs a distribution with a (near) maximum entropy rate (namely the uniform distribution). A natural weakening of this concept is a condenser, whose output distribution has a higher entropy rate than the input distribution (without losing much of the initial entropy). We construct efficient explicit condensers. The condenser constructions combine (variants or more efficient versions of) ideas from several works, including the block extraction scheme of Nisan and Zuckerman (1996), the observation made by Srinivasan and Zuckerman (1994) and Nisan and Ta-Shma (1999) that a failure of the block extraction scheme is also useful, the recursive "win-win" case analysis of Impagliazzo et al. (1999, 2000), and the error correction of random sources used by Trevisan (1999). As a natural byproduct (via repeated iterating of condensers), we obtain new extractor constructions. The new extractors give significant qualitative improvements over previous ones for sources of arbitrary min-entropy; they are nearly optimal simultaneously in the two main parameters: seed length and output length. Specifically, our extractors can make either of these two parameters optimal (up to a constant factor), at only a poly-logarithmic loss in the other. Previous constructions require polynomial loss in both cases for general sources. We also give a simple reduction converting "standard" extractors (which are good for an average seed) to "strong" ones (which are good for most seeds), with essentially the same parameters.

FOCS Conference 2000 Conference Paper

The Relationship between Public Key Encryption and Oblivious Transfer

  • Yael Gertner
  • Sampath Kannan
  • Tal Malkin
  • Omer Reingold
  • Mahesh Viswanathan 0001

In this paper we study the relationships among some of the most fundamental primitives and protocols in cryptography: public-key encryption (i. e. trapdoor predicates), oblivious transfer (which is equivalent to general secure multi-party computation), key agreement and trapdoor permutations. Our main results show that public-key encryption and oblivious transfer are incomparable under black-box reductions. These separations are tightly matched by our positive results where a restricted (strong) version of one primitive does imply the other primitive. We also show separations between oblivious transfer and key agreement. Finally, we conclude that neither oblivious transfer nor trapdoor predicates imply trapdoor permutations. Our techniques for showing negative results follow the oracle separations of R. Impagliazzo and S. Rudich (1989).

FOCS Conference 1999 Conference Paper

Error Reduction for Extractors

  • Ran Raz
  • Omer Reingold
  • Salil P. Vadhan

An extractor is a function which extracts (almost) truly random bits from a weak random source, using a small number of additional random bits as a catalyst. We present a general method to reduce the error of any extractor. Our method works particularly well in the case that the original extractor extracts up to a constant fraction of the source min-entropy and achieves a polynomially small error. In that case, we are able to reduce the error to (almost) any ε, using only O(log(1/ε)) additional truly random bits (while keeping the other parameters of the original extractor more or less the same). In other cases (e.g., when the original extractor extracts all the min-entropy or achieves only a constant error), our method is not optimal but it is still quite efficient and leads to improved constructions of extractors. Using our method, we are able to improve almost all known extractors in the case where the error required is relatively small (e.g., less than a polynomially small error). In particular, we apply our method to the new extractors of L. Trevisan (1999) and R. Raz et al. (1999) to obtain improved constructions in almost all cases. Specifically, we obtain extractors that work for sources of any min-entropy on strings of length n which (a) extract any 1/n^γ fraction of the min-entropy using O(log n + log(1/ε)) truly random bits (for any γ > 0), (b) extract any constant fraction of the min-entropy using O(log^2 n + log(1/ε)) truly random bits, and (c) extract all the min-entropy using O(log^3 n + log n · log(1/ε)) truly random bits.

FOCS Conference 1999 Conference Paper

Magic Functions

  • Cynthia Dwork
  • Moni Naor
  • Omer Reingold
  • Larry J. Stockmeyer

In this paper we show that three apparently unrelated problems are in fact very closely related. We sketch these problems at a high level. The selective decommitment problem first arose in a slightly different form, selective decryption, in the context of Byzantine agreement, no later than 1985. Instead of seeing encryptions of plaintexts, the adversary is given commitments to the plaintexts. This problem is poorly understood even for strong-receiver commitments, which leak no information about the plaintext values information-theoretically. The second problem is in complexity theory: what can be proved in (a possibly weakened form of) zero-knowledge in a 3-round argument (an interactive proof in which the prover is polynomial-time bounded)? The third problem is cryptographic, and addresses a methodology suggested by Fiat and Shamir (1987) to construct a (non-interactive) signature scheme from any 3-round (not necessarily zero-knowledge) public-coin identification scheme.

FOCS Conference 1997 Conference Paper

Number-theoretic Constructions of Efficient Pseudo-random Functions

  • Moni Naor
  • Omer Reingold

We describe efficient constructions for various cryptographic primitives (both in private-key and in public-key cryptography). We show these constructions to be at least as secure as the decisional version of the Diffie-Hellman assumption or as the assumption that factoring is hard. Our major result is a new construction of pseudo-random functions such that computing their value at any given point involves two multiple products. This is much more efficient than previous proposals. Furthermore, these functions have the advantage of being in TC^0 (the class of functions computable by constant-depth circuits consisting of a polynomial number of threshold gates), which has several interesting applications. The simple algebraic structure of the functions implies additional features. In particular, we show a zero-knowledge proof for statements of the form "y = f_s(x)" and "y ≠ f_s(x)" given a commitment to a key s of a pseudo-random function f_s.
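
The pseudo-random function behind this construction is commonly presented as f_a(x) = g^(a_0 · ∏_{x_i=1} a_i) in a group where g has prime order q: evaluating it is one product of key elements followed by a single exponentiation. A toy instantiation (the parameters are far too small to be secure; illustration only):

    import random

    p, q, g = 23, 11, 2        # toy group: g has order q in Z_p* (q | p-1)

    def keygen(n, rng=random.Random(0)):
        """Key is n + 1 nonzero exponents a_0, ..., a_n modulo q."""
        return [rng.randrange(1, q) for _ in range(n + 1)]

    def nr_prf(key, x_bits):
        """f_a(x) = g^(a_0 * prod_{x_i = 1} a_i mod q) mod p."""
        e = key[0]
        for a_i, bit in zip(key[1:], x_bits):
            if bit:                # multiply in a_i only where x_i = 1
                e = e * a_i % q    # exponent arithmetic mod the group order
        return pow(g, e, p)

    key = keygen(8)
    print([nr_prf(key, [int(b) for b in f"{x:08b}"]) for x in range(4)])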

FOCS Conference 1995 Conference Paper

Synthesizers and Their Application to the Parallel Construction of Pseudo-Random Functions

  • Moni Naor
  • Omer Reingold

We present a new cryptographic primitive called a pseudo-random synthesizer and show how to use it in order to get a parallel construction of a pseudo-random function. We show an NC^1 implementation of pseudo-random synthesizers based on the RSA or the Diffie-Hellman assumptions. This yields the first parallel (NC^2) pseudo-random function and the only alternative to the original construction of Goldreich, Goldwasser and Micali (GGM). The security of our constructions is similar to the security of the underlying assumptions. We discuss the connection with problems in computational learning theory.