
NeurIPS 2025

Multi-Expert Distributionally Robust Optimization for Out-of-Distribution Generalization

Conference Paper Main Conference Track Artificial Intelligence · Machine Learning

Abstract

Distribution shifts between training and test data undermine the reliability of deep neural networks, challenging real-world applications across domains and subpopulations. While distributionally robust optimization (DRO) methods like GroupDRO aim to improve robustness by optimizing worst-case performance over predefined groups, their use of a single global classifier can be restrictive when facing substantial inter-environment variability. We propose Multi-Expert Distributionally Robust Optimization (MEDRO), a novel extension of GroupDRO designed to address such complex shifts. MEDRO employs a shared feature extractor with $m$ environment-specific expert classifier heads, and introduces a min-max objective over all $m^{2}$ expert-environment pairings, explicitly modeling cross-environment risks. This expanded uncertainty set captures fine-grained distributional variations that a single classifier might overlook. Empirical evaluations on a range of standard distribution shift benchmarks demonstrate that MEDRO often achieves robust predictive performance compared to existing methods. Furthermore, MEDRO offers practical inference strategies, such as ensembling or gating mechanisms, for the typical scenario in which environment labels are unavailable at test time. Our findings suggest that MEDRO is a promising step toward resilient and generalizable machine learning under real-world distribution shifts.
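The abstract's min-max objective over all $m^{2}$ expert-environment pairings can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it assumes linear expert heads, squared loss, and a GroupDRO-style exponentiated-gradient update over pair weights; all names and hyperparameters (`eta`, the synthetic data) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, n = 3, 5, 40  # environments, feature dim, samples per environment

# Per-environment data standing in for shared-feature-extractor outputs.
envs_X = [rng.normal(size=(n, d)) for _ in range(m)]
envs_y = [X @ rng.normal(size=d) + 0.1 * rng.normal(size=n) for X in envs_X]

# m environment-specific expert heads (linear, hypothetical initialization).
experts = [rng.normal(scale=0.1, size=d) for _ in range(m)]

def pair_losses(experts, envs_X, envs_y):
    """Mean squared error of expert i evaluated on environment j, an m x m grid."""
    L = np.empty((len(experts), len(envs_X)))
    for i, w in enumerate(experts):
        for j, (X, y) in enumerate(zip(envs_X, envs_y)):
            L[i, j] = np.mean((X @ w - y) ** 2)
    return L

# GroupDRO-style weights over all m^2 pairings: each step up-weights the
# currently worst expert-environment pair, a standard surrogate for the
# min-max (worst-case) objective described in the abstract.
q = np.full((m, m), 1.0 / m**2)  # uniform initial pair weights
eta = 0.5                        # weight step size (assumed)
L = pair_losses(experts, envs_X, envs_y)
q = q * np.exp(eta * L)
q = q / q.sum()

worst_pair = np.unravel_index(np.argmax(L), L.shape)
robust_loss = float((q * L).sum())  # weighted surrogate of worst-case risk
```

In a full training loop, `robust_loss` would be differentiated with respect to the shared extractor and the expert heads, alternating with the multiplicative update on `q`; at test time, when environment labels are absent, predictions from the `m` heads could be ensembled or selected by a gating network, as the abstract suggests.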

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Annual Conference on Neural Information Processing Systems
Archive span
1987-2025
Indexed papers
30776
Paper id
1147124820598133053