Peer Nowack Papers

NeurIPS Conference 2025 Conference Paper

Causal Climate Emulation with Bayesian Filtering

Sebastian H. M. Hickman
Ilija Trajković
Julia Kaltenborn
Francis Pelletier
Alex Archibald
Yaniv Gurwicz
Peer Nowack
David Rolnick

Traditional models of climate change use complex systems of coupled equations to simulate physical processes across the Earth system. These simulations are highly computationally expensive, limiting our predictions of climate change and analyses of its causes and effects. Machine learning has the potential to quickly emulate data from climate models, but current approaches are not able to incorporate physically-based causal relationships. Here, we develop an interpretable climate model emulator based on causal representation learning. We derive a novel approach including a Bayesian filter for stable long-term autoregressive emulation. We demonstrate that our emulator learns accurate climate dynamics, and we show the importance of each one of its components on a realistic synthetic dataset and data from two widely deployed climate models.

PDF Details

NeurIPS Conference 2023 Conference Paper

ClimateSet: A Large-Scale Climate Model Dataset for Machine Learning

Julia Kaltenborn
Charlotte Lange
Venkatesh Ramesh
Philippe Brouillard
Yaniv Gurwicz
Chandni Nagda
Jakob Runge
Peer Nowack

Climate models have been key for assessing the impact of climate change and simulating future climate scenarios. The machine learning (ML) community has taken an increased interest in supporting climate scientists’ efforts on various tasks such as climate model emulation, downscaling, and prediction tasks. Many of those tasks have been addressed on datasets created with single climate models. However, both the climate science and ML communities have suggested that to address those tasks at scale, we need large, consistent, and ML-ready climate model datasets. Here, we introduce ClimateSet, a dataset containing the inputs and outputs of 36 climate models from the Input4MIPs and CMIP6 archives. In addition, we provide a modular dataset pipeline for retrieving and preprocessing additional climate models and scenarios. We showcase the potential of our dataset by using it as a benchmark for ML-based climate model emulation. We gain new insights about the performance and generalization capabilities of the different ML models by analyzing their performance across different climate models. Furthermore, the dataset can be used to train an ML emulator on several climate models instead of just one. Such a “super emulator” can quickly project new climate change scenarios, complementing existing scenarios already provided to policymakers. We believe ClimateSet will create the basis needed for the ML community to tackle climate-related tasks at scale.

PDF Details

Possible papers

Causal Climate Emulation with Bayesian Filtering

ClimateSet: A Large-Scale Climate Model Dataset for Machine Learning