Author name cluster

Uri Shalit

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

31 papers

2 author rows

ICML Conference 2025 Conference Paper

Heterogeneous Treatment Effect in Time-to-Event Outcomes: Harnessing Censored Data with Recursively Imputed Trees

Tomer Meir
Uri Shalit
Malka Gorfine

Tailoring treatments to individual needs is a central goal in fields such as medicine. A key step toward this goal is estimating Heterogeneous Treatment Effects (HTE)—the way treatments impact different subgroups. While crucial, HTE estimation is challenging with survival data, where time until an event (e. g. , death) is key. Existing methods often assume complete observation, an assumption violated in survival data due to right-censoring, leading to bias and inefficiency. Cui et al. (2023) proposed a doubly-robust method for HTE estimation in survival data under no hidden confounders, combining a causal survival forest with an augmented inverse-censoring weighting estimator. However, we find it struggles under heavy censoring, which is common in rare-outcome problems such as Amyotrophic lateral sclerosis (ALS). Moreover, most current methods cannot handle instrumental variables, which are a crucial tool in the causal inference arsenal. We introduce Multiple Imputation for Survival Treatment Response (MISTR), a novel, general, and non-parametric method for estimating HTE in survival data. MISTR uses recursively imputed survival trees to handle censoring without directly modeling the censoring mechanism. Through extensive simulations and analysis of two real-world datasets—the AIDS Clinical Trials Group Protocol 175 and the Illinois unemployment dataset we show that MISTR outperforms prior methods under heavy censoring in the no-hidden-confounders setting, and extends to the instrumental variable setting. To our knowledge, MISTR is the first non-parametric approach for HTE estimation with unobserved confounders via instrumental variables.

Details

ICML Conference 2025 Conference Paper

Set Valued Predictions For Robust Domain Generalization

Ron Tsibulsky
Daniel Nevo
Uri Shalit

Despite the impressive advancements in modern machine learning, achieving robustness in Domain Generalization (DG) tasks remains a significant challenge. In DG, models are expected to perform well on samples from unseen test distributions (also called domains), by learning from multiple related training distributions. Most existing approaches to this problem rely on single-valued predictions, which inherently limit their robustness. We argue that set-valued predictors could be leveraged to enhance robustness across unseen domains, while also taking into account that these sets should be as small as possible. We introduce a theoretical framework defining successful set prediction in the DG setting, focusing on meeting a predefined performance criterion across as many domains as possible, and provide theoretical insights into the conditions under which such domain generalization is achievable. We further propose a practical optimization method compatible with modern learning architectures, that balances robust performance on unseen domains with small prediction set sizes. We evaluate our approach on several real-world datasets from the WILDS benchmark, demonstrating its potential as a promising direction for robust domain generalization.

Details

NeurIPS Conference 2024 Conference Paper

When to Act and When to Ask: Policy Learning With Deferral Under Hidden Confounding

Marah Ghoummaid
Uri Shalit

We consider the task of learning how to act in collaboration with a human expert based on observational data. The task is motivated by high-stake scenarios such as healthcare and welfare where algorithmic action recommendations are made to a human expert, opening the option of deferring making a recommendation in cases where the human might act better on their own. This task is especially challenging when dealing with observational data, as using such data runs the risk of hidden confounders whose existence can lead to biased and harmful policies. However, unlike standard policy learning, the presence of a human expert can mitigate some of these risks. We build on the work of Mozannar and Sontag (2020) on consistent surrogate loss for learning with the option of deferral to an expert, where they solve a cost-sensitive supervised classification problem. Since we are solving a causal problem, where labels don’t exist, we use a causal model to learn costs which are robust to a bounded degree of hidden confounding. We prove that our approach can take advantage of the strengths of both the model and the expert to obtain a better policy than either. We demonstrate our results by conducting experiments on synthetic and semi-synthetic data and show the advantages of our method compared to baselines.

PDF Details DOI

ICML Conference 2023 Conference Paper

B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding

Miruna Oprescu
Jacob Dorn
Marah Ghoummaid
Andrew Jesson
Nathan Kallus
Uri Shalit

Estimating heterogeneous treatment effects from observational data is a crucial task across many fields, helping policy and decision-makers take better actions. There has been recent progress on robust and efficient methods for estimating the conditional average treatment effect (CATE) function, but these methods often do not take into account the risk of hidden confounding, which could arbitrarily and unknowingly bias any causal estimate based on observational data. We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on the level of hidden confounding. We derive the B-Learner by adapting recent results for sharp and valid bounds of the average treatment effect (Dorn et al. , 2021) into the framework given by Kallus & Oprescu (2023) for robust and model-agnostic learning of conditional distributional treatment effects. The B-Learner can use any function estimator such as random forests and deep neural networks, and we prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods. Semi-synthetic experimental comparisons validate the theoretical findings, and we use real-world data demonstrate how the method might be used in practice.

Details

ICLR Conference 2023 Conference Paper

Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds

Yoav Wald
Gal Yona
Uri Shalit
Yair Carmon

Learned classifiers should often possess certain invariance properties meant to encourage fairness, robustness, or out-of-distribution generalization. However, multiple recent works empirically demonstrate that common invariance-inducing regularizers are ineffective in the over-parameterized regime, in which classifiers perfectly fit (i.e. interpolate) the training data. This suggests that the phenomenon of ``benign overfitting," in which models generalize well despite interpolating, might not favorably extend to settings in which robustness or fairness are desirable. In this work, we provide a theoretical justification for these observations. We prove that---even in the simplest of settings---any interpolating learning rule (with an arbitrarily small margin) will not satisfy these invariance properties. We then propose and analyze an algorithm that---in the same setting---successfully learns a non-interpolating classifier that is provably invariant. We validate our theoretical observations on simulated data and the Waterbirds dataset.

Details

JMLR Journal 2022 Journal Article

Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects

Fredrik D. Johansson
Uri Shalit
Nathan Kallus
David Sontag

Practitioners in diverse fields such as healthcare, economics and education are eager to apply machine learning to improve decision making. The cost and impracticality of performing experiments and a recent monumental increase in electronic record keeping has brought attention to the problem of evaluating decisions based on non-experimental observational data. This is the setting of this work. In particular, we study estimation of individual-level potential outcomes and causal effects---such as a single patient's response to alternative medication---from recorded contexts, decisions and outcomes. We give generalization bounds on the error in estimated outcomes based on distributional distance measures between re-weighted samples of groups receiving different treatments. We provide conditions under which our bounds are tight and show how they relate to results for unsupervised domain adaptation. Led by our theoretical results, we devise algorithms which learn representations and weighting functions that minimize our bounds by regularizing the representation's induced treatment group distance, and encourage sharing of information between treatment groups. Finally, an experimental evaluation on real and synthetic data shows the value of our proposed representation architecture and regularization scheme. [abs] [ pdf ][ bib ] &copy JMLR 2022. ( edit, beta )