RePO: Understanding Preference Learning Through ReLU-Based Optimization

Junkang Wu; Kexin Huang; Xue Wang; Jinyang Gao; Bolin Ding; Jiancan Wu; Xiangnan He; Xiang Wang

NeurIPS 2025

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

Preference learning has become a common approach in various recent methods for aligning large language models with human values. These methods optimize the preference margin between chosen and rejected responses, subject to certain constraints for avoiding over-optimization. In this paper, we report surprising empirical findings that simple ReLU activation can learn meaningful alignments even using none of the following: (i) sigmoid-based gradient constraints, (ii) explicit regularization terms. Our experiments show that over-optimization does exist, but a threshold parameter $\gamma$ plays an essential role in preventing it by dynamically filtering training examples. We further provide theoretical analysis demonstrating that ReLU-based Preference Optimization (RePO) corresponds to the convex envelope of the 0-1 loss, establishing its fundamental soundness. Our "RePO" method achieves competitive or superior results compared to established preference optimization approaches. We hope this simple baseline will motivate researchers to rethink the fundamental mechanisms behind preference optimization for language model alignment.
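The filtering role of $\gamma$ can be illustrated with a minimal sketch. The abstract does not give the loss formula, so the hinge-style form `max(0, gamma - margin)` below is an assumption, as are the names `repo_loss` and `active_mask` and the toy margin values. Under that assumption, any example whose margin already reaches $\gamma$ contributes zero loss and zero gradient, which is one way a threshold could drop examples from training dynamically.

```python
# Illustrative sketch only: the hinge-style form ReLU(gamma - margin) is an
# assumption, not a formula quoted from the abstract.

def repo_loss(margins, gamma):
    """Mean of ReLU(gamma - margin) over a batch of preference margins.

    Each margin is assumed to be a scalar score for (chosen - rejected).
    """
    per_example = [max(0.0, gamma - m) for m in margins]
    return sum(per_example) / len(per_example)


def active_mask(margins, gamma):
    """True where an example still has a nonzero loss (margin below gamma)."""
    return [m < gamma for m in margins]


# Hypothetical toy batch of margins and a hypothetical threshold.
margins = [-0.5, 0.2, 0.9, 1.4]
gamma = 1.0

loss = repo_loss(margins, gamma)
mask = active_mask(margins, gamma)
print(loss, mask)
```

In this toy batch the last example already exceeds the threshold, so it is masked out and only the first three examples would drive an update.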
