A Reduction-based Framework for Sequential Decision Making with Delayed Feedback

Yunchang Yang; Han Zhong; Tianhao Wu; Bin Liu; Liwei Wang; Simon S. Du

NeurIPS 2023

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

We study stochastic delayed feedback in general single-agent and multi-agent sequential decision making, which includes bandits, single-agent Markov decision processes (MDPs), and Markov games (MGs). We propose a novel reduction-based framework, which turns any multi-batched algorithm for sequential decision making with instantaneous feedback into a sample-efficient algorithm that can handle stochastic delays in sequential decision making. By plugging different multi-batched algorithms into our framework, we provide several examples demonstrating that our framework not only matches or improves existing results for bandits, tabular MDPs, and tabular MGs, but also provides the first line of studies on delays in sequential decision making with function approximation. In summary, we provide a complete set of sharp results for single-agent and multi-agent sequential decision making with delayed feedback.
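To make the idea of running a multi-batched algorithm under delayed feedback concrete, here is a minimal toy sketch. It is not the paper's algorithm: it assumes a Bernoulli multi-armed bandit, a simple batched successive-elimination rule, and the convention that each batch keeps playing its fixed round-robin policy until enough delayed rewards have actually arrived. The names `run_batched_elimination_with_delays`, `batch_sizes`, and `delay_sampler` are hypothetical and introduced only for this illustration.

```python
import heapq
import math
import random


def run_batched_elimination_with_delays(means, batch_sizes, delay_sampler, seed=0):
    """Toy batched successive elimination on Bernoulli arms with delayed rewards.

    means         -- true success probability of each arm (assumed, for simulation)
    batch_sizes   -- observed rewards required per active arm in each batch
    delay_sampler -- hypothetical callable rng -> non-negative integer delay
    Returns (surviving arms, number of rounds each batch lasted).
    """
    rng = random.Random(seed)
    active = list(range(len(means)))
    t = 0  # global round counter
    rounds_per_batch = []
    for m in batch_sizes:
        pending = []  # min-heap of (arrival_round, arm, reward)
        sums = {a: 0.0 for a in active}
        counts = {a: 0 for a in active}
        start, i = t, 0
        # The policy is fixed within a batch: round-robin over active arms.
        # Keep playing until every active arm has m rewards that have arrived.
        while min(counts.values()) < m:
            arm = active[i % len(active)]
            i += 1
            t += 1
            reward = 1.0 if rng.random() < means[arm] else 0.0
            heapq.heappush(pending, (t + delay_sampler(rng), arm, reward))
            # Collect every reward whose arrival round has been reached.
            while pending and pending[0][0] <= t:
                _, a, r = heapq.heappop(pending)
                if counts[a] < m:
                    sums[a] += r
                    counts[a] += 1
        rounds_per_batch.append(t - start)
        # The policy is updated only at the batch boundary, using arrived data.
        # Rewards still pending at this point are discarded in this sketch.
        radius = math.sqrt(2.0 * math.log(max(t, 2)) / m)
        est = {a: sums[a] / m for a in active}
        best = max(est.values())
        active = [a for a in active if est[a] >= best - 2.0 * radius]
    return active, rounds_per_batch


if __name__ == "__main__":
    no_delay = run_batched_elimination_with_delays([0.9, 0.1], [50, 200], lambda rng: 0)
    delayed = run_batched_elimination_with_delays([0.9, 0.1], [50, 200], lambda rng: 5)
    print(no_delay, delayed)
```

In this toy setup, a constant delay of 5 rounds stretches the first batch from 100 to 105 rounds, so the extra cost is paid once per batch rather than once per round. That is the intuition for why algorithms with few policy updates are a natural starting point for handling delays; the precise guarantees are those stated in the paper, not anything this sketch establishes.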

Context

Venue: Annual Conference on Neural Information Processing Systems
Paper id: 954611905060675817