Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning
- URL: http://arxiv.org/abs/2402.07182v3
- Date: Thu, 06 Feb 2025 07:55:20 GMT
- Title: Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning
- Authors: Willem Röpke, Mathieu Reymond, Patrick Mannion, Diederik M. Roijers, Ann Nowé, Roxana Rădulescu
- Abstract summary: We introduce Iterated Pareto Referent Optimisation (IPRO).
IPRO decomposes finding the Pareto front into a sequence of constrained single-objective problems.
By leveraging problem-specific single-objective solvers, our approach holds promise for applications beyond multi-objective reinforcement learning.
- Score: 5.897578963773195
- License:
- Abstract: An important challenge in multi-objective reinforcement learning is obtaining a Pareto front of policies to attain optimal performance under different preferences. We introduce Iterated Pareto Referent Optimisation (IPRO), which decomposes finding the Pareto front into a sequence of constrained single-objective problems. This enables us to guarantee convergence while providing an upper bound on the distance to undiscovered Pareto optimal solutions at each step. We evaluate IPRO using utility-based metrics and its hypervolume and find that it matches or outperforms methods that require additional assumptions. By leveraging problem-specific single-objective solvers, our approach also holds promise for applications beyond multi-objective reinforcement learning, such as planning and pathfinding.
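To make the decomposition concrete, below is a minimal bi-objective sketch of the divide-and-conquer idea described in the abstract. It is an illustrative reading rather than the authors' algorithm, and all names are hypothetical: the toy set of achievable objective vectors, the scalarisation inside `solve_constrained`, and the box-splitting rule are assumptions made for the example.

```python
from math import hypot

# Toy set of achievable objective vectors (stand-in for the values of MORL policies).
ACHIEVABLE = [(0.0, 1.0), (0.3, 0.95), (0.6, 0.8), (0.8, 0.6), (0.95, 0.3), (1.0, 0.0),
              (0.4, 0.4), (0.2, 0.7)]  # the last two are dominated

def solve_constrained(referent):
    """Stand-in for a problem-specific single-objective solver: among achievable
    points that strictly dominate the referent, return one maximising the summed
    improvement over the referent (an arbitrary scalarisation for this sketch)."""
    feasible = [p for p in ACHIEVABLE if p[0] > referent[0] and p[1] > referent[1]]
    if not feasible:
        return None
    return max(feasible, key=lambda p: (p[0] - referent[0]) + (p[1] - referent[1]))

def divide_and_conquer_front(lower, upper, tolerance=1e-3):
    """Split objective space into boxes, solve one constrained problem per box,
    and stop once every unexplored box is smaller than `tolerance`, which bounds
    how far any undiscovered Pareto-optimal point can be from the returned set."""
    front, boxes = [], [(lower, upper)]
    while boxes:
        lo, hi = boxes.pop()
        if hypot(hi[0] - lo[0], hi[1] - lo[1]) <= tolerance:
            continue  # too small to hide a solution farther than `tolerance` away
        point = solve_constrained(lo)  # conquer: one constrained single-objective problem
        if point is None:
            continue  # nothing in this region strictly dominates the referent
        front.append(point)
        # Divide: the regions that may still contain undiscovered Pareto-optimal
        # points are the two sub-boxes flanking the newly found point.
        boxes.append(((point[0], lo[1]), (hi[0], point[1])))
        boxes.append(((lo[0], point[1]), (point[0], hi[1])))
    return front

print(divide_and_conquer_front(lower=(-0.1, -0.1), upper=(1.1, 1.1)))
```

On this toy instance the loop recovers all six non-dominated vectors; the shrinking boxes play the role of the upper bound on the distance to undiscovered Pareto-optimal solutions mentioned in the abstract.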
Related papers
- Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment [74.25832963097658]
Multi-Objective Alignment (MOA) aims to align responses with multiple human preference objectives.
We find that DPO-based MOA approaches suffer from widespread preference conflicts in the data.
arXiv Detail & Related papers (2025-02-20T08:27:00Z) - Deep Pareto Reinforcement Learning for Multi-Objective Recommender Systems [60.91599969408029]
Optimizing multiple objectives simultaneously is an important task for recommendation platforms.
Existing multi-objective recommender systems do not systematically consider such dynamic relationships.
arXiv Detail & Related papers (2024-07-04T02:19:49Z) - UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-objective Reinforcement Learning (MORL) agents are tasked with optimising decision-making behaviours.
We focus on the case of linear utility functions parameterised by weight vectors w.
We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
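As a concrete illustration of UCB-style selection over candidate weight vectors, consider the hedged sketch below. It is not the paper's specific procedure: the candidate grid, the stand-in utility evaluation, and the exploration constant are assumptions made for the example.

```python
import math
import random

# Hypothetical grid of candidate weight vectors w for two objectives (entries sum to 1).
CANDIDATE_WEIGHTS = [(round(0.1 * i, 1), round(1.0 - 0.1 * i, 1)) for i in range(11)]

def ucb_pick(counts, mean_utilities, total_pulls, c=2.0):
    """Select the weight vector with the highest UCB score: empirical mean utility
    plus an exploration bonus that shrinks the more often the vector was tried."""
    def score(i):
        if counts[i] == 0:
            return float("inf")  # make sure every candidate is tried at least once
        return mean_utilities[i] + c * math.sqrt(math.log(total_pulls) / counts[i])
    return max(range(len(CANDIDATE_WEIGHTS)), key=score)

def evaluate(w):
    # Stand-in for training/evaluating a policy under the linear utility w . return.
    return 0.7 * w[0] + 0.4 * w[1] + random.gauss(0.0, 0.05)

counts = [0] * len(CANDIDATE_WEIGHTS)
means = [0.0] * len(CANDIDATE_WEIGHTS)
for t in range(1, 201):
    i = ucb_pick(counts, means, total_pulls=t)
    u = evaluate(CANDIDATE_WEIGHTS[i])
    counts[i] += 1
    means[i] += (u - means[i]) / counts[i]  # incremental mean update
best = max(range(len(CANDIDATE_WEIGHTS)), key=lambda i: means[i])
print("most promising weight vector:", CANDIDATE_WEIGHTS[best])
```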
arXiv Detail & Related papers (2024-05-01T09:34:42Z) - UMOEA/D: A Multiobjective Evolutionary Algorithm for Uniform Pareto Objectives based on Decomposition [19.13435817442015]
Multiobjective optimization (MOO) is prevalent in numerous applications.
Previous methods commonly utilize the set of Pareto objectives (particles on the PF) to represent the entire PF.
We suggest in this paper constructing uniformly distributed objectives on the PF, so as to alleviate the limited diversity found in previous MOO approaches.
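Decomposition-based algorithms in the MOEA/D family typically spread search effort over a set of weight (reference) vectors; a standard way to generate them evenly is the simplex-lattice (Das-Dennis) construction sketched below. This shows only that generic ingredient, not UMOEA/D's procedure for making the resulting objectives uniform on the PF itself.

```python
from itertools import combinations

def simplex_lattice_weights(n_objectives: int, divisions: int):
    """Evenly spaced weight vectors on the probability simplex (Das-Dennis style),
    a common way to distribute reference directions in decomposition-based MOO."""
    weights = []
    # Stars-and-bars: choosing (n-1) bar positions among (divisions + n - 1) slots
    # enumerates every composition of `divisions` into n non-negative parts.
    for bars in combinations(range(divisions + n_objectives - 1), n_objectives - 1):
        parts, prev = [], -1
        for b in bars:
            parts.append(b - prev - 1)
            prev = b
        parts.append(divisions + n_objectives - 2 - prev)
        weights.append(tuple(p / divisions for p in parts))
    return weights

# Example: the 15 weight vectors for 3 objectives with 4 divisions.
for w in simplex_lattice_weights(3, 4):
    print(w)
```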
arXiv Detail & Related papers (2024-02-14T08:09:46Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z) - Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of non-convex problems.
We show that for weakly convex objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z) - Scalable Uni-directional Pareto Optimality for Multi-Task Learning with Constraints [4.4044968357361745]
We propose a scalable solver for Multi-Objective Optimization (MOO) problems, including support for optimization under constraints.
An important application of this is to estimate high-dimensional runtime for neural classification tasks.
arXiv Detail & Related papers (2021-10-28T21:35:59Z) - Multi-Objective Learning to Predict Pareto Fronts Using Hypervolume Maximization [0.0]
Real-world problems are often multi-objective with decision-makers unable to specify a priori which trade-off between the conflicting objectives is preferable.
We propose a novel learning approach to estimate the Pareto front by maximizing the dominated hypervolume (HV) of the average loss vectors corresponding to a set of learners.
In our approach, the set of learners are trained multi-objectively with a dynamic loss function, wherein each learner's losses are weighted by their HV maximizing gradients.
Experiments on three different multi-objective tasks show that the outputs of the set of learners are indeed well-spread on the Pareto front.
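For reference, the dominated hypervolume in the bi-objective case can be computed with a short sweep over the sorted loss vectors. The sketch below is a minimal illustration of the HV indicator for loss (minimisation) vectors and an assumed reference point; it does not show how the paper turns HV-maximising gradients into per-learner loss weights.

```python
def dominated_hypervolume_2d(points, reference):
    """Dominated hypervolume of bi-objective loss vectors (lower is better) with
    respect to an upper reference point, computed by a sweep over sorted points."""
    # Keep only points that improve on the reference, sorted by the first loss.
    pts = sorted(p for p in points if p[0] < reference[0] and p[1] < reference[1])
    hv, prev_f2 = 0.0, reference[1]
    for f1, f2 in pts:
        if f2 < prev_f2:  # non-dominated so far: add its rectangular slice
            hv += (reference[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

# Example: three average loss vectors against a reference point of (1, 1).
print(dominated_hypervolume_2d([(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)], (1.0, 1.0)))  # ~0.35
```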
arXiv Detail & Related papers (2021-02-08T20:41:21Z)