
NeurIPS 2025

Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization

Conference Paper Main Conference Track Artificial Intelligence · Machine Learning

Abstract

In multi-objective optimization, a single decision vector must balance the trade-offs across many objectives. Pareto-optimal solutions are those achieving optimal trade-offs, where improving any objective comes at a cost to another. Since many different decisions can be Pareto optimal, this raises the question of which solution to pick and how. We formulate this problem as one of optimizing a preference function over the set of Pareto-optimal solutions, or Pareto-constrained optimization for short. This problem poses significant challenges: not only is the constraint set defined implicitly, but it is also generally non-convex and non-smooth, even when the objectives are strongly convex. We propose an equivalent formulation of the problem where the constraint set is the simplex, leading to clearer notions of optimality and stationarity that improve upon existing definitions in the literature. We give an algorithm with a last-iterate convergence rate of $O(K^{-1/2})$ to stationarity when the preference function is Lipschitz smooth and the objective functions are strongly convex and Lipschitz smooth. Motivated by applications like Reinforcement Learning from Human Feedback (RLHF), we also extend this algorithm to the case where access to the preference function is only available through dueling feedback.
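To make the simplex reformulation concrete, here is a minimal illustrative sketch (not the paper's algorithm). It assumes two strongly convex quadratic objectives, so the weighted-sum minimizer has a closed form and the Pareto set is the segment between the two minimizers; a hypothetical preference function is then optimized over the simplex weight by projected gradient descent, with the gradient through the Pareto map estimated by finite differences.

```python
import numpy as np

# Hypothetical bi-objective example: two strongly convex quadratics whose
# Pareto set is the line segment between their minimizers a and b.
a, b = np.array([0.0, 0.0]), np.array([2.0, 1.0])

def f1(x): return np.sum((x - a) ** 2)
def f2(x): return np.sum((x - b) ** 2)

def pareto_point(w):
    # Minimizer of w*f1 + (1-w)*f2; closed form since both are quadratics.
    return w * a + (1 - w) * b

# Illustrative preference (an assumption, not from the paper):
# prefer a balanced trade-off where f1 and f2 are equal.
def preference(x):
    return (f1(x) - f2(x)) ** 2

# Projected gradient descent on the simplex weight w in [0, 1], with a
# central finite-difference gradient through the Pareto map.
w, step, eps = 0.9, 0.002, 1e-5
for _ in range(200):
    g = (preference(pareto_point(w + eps))
         - preference(pareto_point(w - eps))) / (2 * eps)
    w = min(1.0, max(0.0, w - step * g))  # projection onto [0, 1]

x_star = pareto_point(w)
print(w, f1(x_star) - f2(x_star))  # w converges to the balanced weight 0.5
```

For this symmetric preference the iterates converge to the weight w = 0.5, whose Pareto point equalizes the two objectives. The paper's setting replaces the closed-form Pareto map with implicitly defined Pareto-optimal solutions, which is what makes the constraint set non-convex and non-smooth in general.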

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Annual Conference on Neural Information Processing Systems
Archive span
1987-2025
Indexed papers
30776
Paper id
878033519777120356