Rating-Based Reinforcement Learning

Devin White; Mingkang Wu; Ellen Novoseller; Vernon J. Lawhern; Nicholas Waytowich; Yongcan Cao

doi:10.1609/aaai.v38i9.28886

AAAI 2024

Conference Paper AAAI Technical Track on Humans and AI Artificial Intelligence

Abstract

This paper develops a novel rating-based reinforcement learning approach that uses human ratings as the source of human guidance in reinforcement learning. Unlike existing preference-based and ranking-based reinforcement learning paradigms, which rely on humans' relative preferences over sample pairs, the proposed approach relies on human evaluation of individual trajectories, with no relative comparison between sample pairs. It builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies, using both synthetic ratings and real human ratings, to evaluate the effectiveness and benefits of the approach.
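
The abstract does not spell out the prediction model or the loss, so the following is only a generic illustration of what learning from per-trajectory ratings with a multi-class loss can look like, not the paper's formulation. Everything in it is an assumption made for the example: the function names `predict_rating_probs` and `multiclass_cross_entropy`, the evenly spaced class centers, the `sharpness` parameter, and the toy data.

```python
import numpy as np

def predict_rating_probs(returns, num_classes, sharpness=10.0):
    """Map scalar trajectory returns to a distribution over rating classes.

    Hypothetical model (an assumption, not taken from the paper): normalize
    returns to [0, 1] within the batch, place one class center per rating
    evenly in [0, 1], and take a softmax over negative squared distances.
    """
    returns = np.asarray(returns, dtype=float)
    span = returns.max() - returns.min()
    if span > 0:
        normalized = (returns - returns.min()) / span
    else:
        normalized = np.full_like(returns, 0.5)
    centers = np.linspace(0.0, 1.0, num_classes)
    logits = -sharpness * (normalized[:, None] - centers[None, :]) ** 2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def multiclass_cross_entropy(probs, ratings):
    """Mean negative log-likelihood of integer ratings in 0..K-1."""
    ratings = np.asarray(ratings, dtype=int)
    picked = probs[np.arange(len(ratings)), ratings]
    return float(-np.mean(np.log(picked + 1e-12)))

# Toy data: predicted returns for four trajectories, each rated individually.
returns = [0.0, 1.0, 2.0, 3.0]
probs = predict_rating_probs(returns, num_classes=4)

# Ratings that agree with the return ordering versus ratings that reverse it.
loss_consistent = multiclass_cross_entropy(probs, [0, 1, 2, 3])
loss_inconsistent = multiclass_cross_entropy(probs, [3, 2, 1, 0])
```

In this sketch each trajectory carries its own label, so no pairwise comparison is needed. Ratings that agree with the ordering of the predicted returns give a lower loss than ratings that reverse it, which is the signal a learned reward model would be trained on.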

Keywords

HAI: Human-in-the-loop Machine Learning
HAI: Learning Human Values and Preferences

Context

Venue: AAAI Conference on Artificial Intelligence
Archive span: 1980-2026
Indexed papers: 28718
Paper id: 957407415633277252