Embarrassingly Parallel GFlowNets

Tiago da Silva; Luiz Max Carvalho; Amauri H. Souza Jr.; Samuel Kaski; Diego Mesquita

Back to ICML

ICML 2024

Embarrassingly Parallel GFlowNets

Conference Paper Accept (Poster) Artificial Intelligence · Machine Learning

Details

Abstract

GFlowNets are a promising alternative to MCMC sampling for discrete compositional random variables. Training GFlowNets requires repeated evaluations of the unnormalized target distribution, or reward function. However, for large-scale posterior sampling, this may be prohibitive since it incurs traversing the data several times. Moreover, if the data are distributed across clients, employing standard GFlowNets leads to intensive client-server communication. To alleviate both these issues, we propose embarrassingly parallel GFlowNet (EP-GFlowNet). EP-GFlowNet is a provably correct divide-and-conquer method to sample from product distributions of the form $R(\cdot) \propto R_1(\cdot). .. R_N(\cdot)$ — e. g. , in parallel or federated Bayes, where each $R_n$ is a local posterior defined on a data partition. First, in parallel, we train a local GFlowNet targeting each $R_n$ and send the resulting models to the server. Then, the server learns a global GFlowNet by enforcing our newly proposed aggregating balance condition, requiring a single communication step. Importantly, EP-GFlowNets can also be applied to multi-objective optimization and model reuse. Our experiments illustrate the effectiveness of EP-GFlowNets on multiple tasks, including parallel Bayesian phylogenetics, multi-objective multiset and sequence generation, and federated Bayesian structure learning.

Embarrassingly Parallel GFlowNets

Abstract

Authors

Keywords

Context