Divide and Conquer: Provably Unveiling the Pareto Front with
Multi-Objective Reinforcement Learning
- URL: http://arxiv.org/abs/2402.07182v1
- Date: Sun, 11 Feb 2024 12:35:13 GMT
- Title: Divide and Conquer: Provably Unveiling the Pareto Front with
Multi-Objective Reinforcement Learning
- Authors: Willem Röpke, Mathieu Reymond, Patrick Mannion, Diederik M. Roijers,
Ann Nowé, Roxana Rădulescu
- Abstract summary: We introduce IPRO, a principled algorithm that decomposes the task of finding the Pareto front into a sequence of single-objective problems.
Empirical evaluations demonstrate that IPRO matches or outperforms methods that require additional domain knowledge.
By leveraging problem-specific single-objective solvers, our approach holds promise for applications beyond multi-objective reinforcement learning.
- Score: 2.5115843173830252
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: A significant challenge in multi-objective reinforcement learning is
obtaining a Pareto front of policies that attain optimal performance under
different preferences. We introduce Iterated Pareto Referent Optimisation
(IPRO), a principled algorithm that decomposes the task of finding the Pareto
front into a sequence of single-objective problems for which various solution
methods exist. This enables us to establish convergence guarantees while
providing an upper bound on the distance to undiscovered Pareto optimal
solutions at each step. Empirical evaluations demonstrate that IPRO matches or
outperforms methods that require additional domain knowledge. By leveraging
problem-specific single-objective solvers, our approach also holds promise for
applications beyond multi-objective reinforcement learning, such as in
pathfinding and optimisation.
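To make the decomposition idea concrete, below is a minimal Python sketch of an IPRO-style loop for a bi-objective maximisation problem. The function names (ipro_style_search, solve_towards, toy_oracle), the box-splitting rule, and the tolerance check are illustrative assumptions distilled from the abstract, not the authors' exact algorithm.

    import numpy as np

    def ipro_style_search(solve_towards, lower, upper, tol=1e-2, max_iter=100):
        """Sketch of an IPRO-style divide-and-conquer loop for a bi-objective
        maximisation problem. solve_towards(referent) is a hypothetical
        single-objective oracle returning an achievable point that strictly
        improves on the referent, or None if no such point exists."""
        pareto = []
        # Boxes of objective space that may still hide Pareto-optimal points.
        boxes = [(np.asarray(lower, float), np.asarray(upper, float))]
        for _ in range(max_iter):
            if not boxes:
                break  # front fully unveiled
            lo, hi = boxes.pop()
            if np.linalg.norm(hi - lo) < tol:
                continue  # remaining small boxes bound the distance to missed optima
            p = solve_towards(lo)  # one single-objective subproblem
            if p is None:
                continue  # nothing in this box dominates the referent
            pareto.append(p)
            # Split around p: only these two corners can still hide optima.
            boxes.append((np.array([p[0], lo[1]]), np.array([hi[0], p[1]])))
            boxes.append((np.array([lo[0], p[1]]), np.array([p[0], hi[1]])))
        return pareto

    def toy_oracle(referent):
        # Achievable set {(x, 1 - x^2) : 0 <= x <= 1}; return the point with
        # the largest worst-case improvement over the referent, if positive.
        xs = np.linspace(0, 1, 2001)
        pts = np.stack([xs, 1 - xs ** 2], axis=1)
        gain = (pts - referent).min(axis=1)
        return pts[gain.argmax()] if gain.max() > 0 else None

    print(len(ipro_style_search(toy_oracle, (0, 0), (1, 1))), "points found")

With an exact oracle, every box either yields a new Pareto point or is discarded, which is what makes a per-step bound on the distance to undiscovered solutions possible.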
Related papers
- How to Find the Exact Pareto Front for Multi-Objective MDPs? [28.70863169250383]
Multi-objective Markov Decision Processes (MDPs) are receiving increasing attention, as real-world decision-making problems often involve conflicting objectives that cannot be addressed by a single-objective MDP.
The Pareto front identifies the set of policies that cannot be dominated, providing a foundation for finding optimal solutions that can efficiently adapt to various preferences.
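Since the summary leans on the notion of non-dominated policies, here is a small self-contained helper showing Pareto dominance and front filtering over objective vectors (maximisation assumed; the names are illustrative only):

    def dominates(a, b):
        """True if objective vector a Pareto-dominates b (maximisation)."""
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    def pareto_front(points):
        """Keep only the non-dominated objective vectors."""
        return [p for p in points if not any(dominates(q, p) for q in points)]

    print(pareto_front([(1, 5), (2, 4), (2, 3), (0, 6)]))
    # -> [(1, 5), (2, 4), (0, 6)]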
arXiv Detail & Related papers (2024-10-21T01:03:54Z)
- UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-Objective Reinforcement Learning (MORL), agents are tasked with optimising decision-making behaviours across multiple objectives.
We focus on the case of linear utility functions parameterised by weight vectors w.
We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
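As a rough illustration of the selection rule only (not the paper's full method), the sketch below treats a finite pool of candidate weight vectors as bandit arms and picks the next one via UCB1; the reward function is a synthetic stand-in for training and evaluating a policy.

    import math, random

    def ucb_pick(counts, means, t, c=2.0):
        """UCB1 over candidate weight vectors: favour high observed utility
        plus an exploration bonus for rarely tried vectors."""
        def score(i):
            if counts[i] == 0:
                return float("inf")  # try every candidate once
            return means[i] + c * math.sqrt(math.log(t) / counts[i])
        return max(range(len(counts)), key=score)

    candidates = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]
    counts, means = [0] * 3, [0.0] * 3
    for t in range(1, 300):
        i = ucb_pick(counts, means, t)
        # Synthetic utility signal standing in for an evaluation run.
        reward = -abs(candidates[i][0] - 0.7) + random.gauss(0, 0.05)
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]
    print("most promising weights:", candidates[max(range(3), key=means.__getitem__)])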
arXiv Detail & Related papers (2024-05-01T09:34:42Z)
- UMOEA/D: A Multiobjective Evolutionary Algorithm for Uniform Pareto Objectives based on Decomposition [19.13435817442015]
Multiobjective optimization (MOO) is prevalent in numerous applications.
Previous methods commonly utilize the set of Pareto objectives (particles on the PF) to represent the entire PF.
We suggest in this paper constructing uniformly distributed objectives on the PF, so as to alleviate the limited diversity found in previous MOO approaches.
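For context, decomposition-based methods typically start from a uniform set of weight vectors on the simplex; the Das-Dennis construction below is that classical baseline. Note the paper's contribution is to enforce uniformity of the Pareto objectives themselves, which uniform weights alone do not guarantee.

    from itertools import combinations

    def simplex_lattice(m, h):
        """All weight vectors with m non-negative components summing to 1 on
        a lattice of granularity h (the classical Das-Dennis construction
        used by decomposition-based solvers such as MOEA/D)."""
        vectors = []
        for dividers in combinations(range(h + m - 1), m - 1):
            parts, prev = [], -1
            for d in dividers + (h + m - 1,):  # stars-and-bars decoding
                parts.append(d - prev - 1)
                prev = d
            vectors.append(tuple(p / h for p in parts))
        return vectors

    print(len(simplex_lattice(3, 4)))  # 15 weight vectors for 3 objectives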
arXiv Detail & Related papers (2024-02-14T08:09:46Z)
- Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems, including Max- and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and the Traveling Salesman Problem.
As a byproduct of our analysis, we introduce a novel regularization process over vanilla gradient descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
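The summary does not spell out the regularization process, so the sketch below uses entropy regularization, a standard way to regularize vanilla policy gradient against premature collapse, purely as a stand-in; the paper's regularizer is its own construction.

    import numpy as np

    def entropy_regularized_pg_step(theta, rewards, lr=0.1, beta=0.05):
        """One exact policy-gradient ascent step for a softmax distribution
        over discrete solutions, maximising E[reward] + beta * entropy.
        The entropy bonus keeps probability mass spread out, a common
        remedy for vanishing gradients and bad stationary points."""
        probs = np.exp(theta - theta.max())
        probs /= probs.sum()
        baseline = probs @ rewards
        grad_reward = probs * (rewards - baseline)
        grad_entropy = -probs * (np.log(probs) - probs @ np.log(probs))
        return theta + lr * (grad_reward + beta * grad_entropy)

    theta = np.zeros(4)
    rewards = np.array([1.0, 0.2, 0.5, 0.9])  # toy objective value per solution
    for _ in range(2000):
        theta = entropy_regularized_pg_step(theta, rewards)
    print(np.round(np.exp(theta) / np.exp(theta).sum(), 3))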
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
- A Scale-Independent Multi-Objective Reinforcement Learning with Convergence Analysis [0.6091702876917281]
Many sequential decision-making problems require optimising multiple objectives that may conflict with each other.
We develop a single-agent, scale-independent, multi-objective reinforcement learning algorithm built on Advantage Actor-Critic (A2C).
A convergence analysis of the devised multi-objective algorithm then provides a convergence-in-mean guarantee.
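One plausible reading of "scale-independent", shown here as an assumption rather than the authors' construction: standardise each objective's advantage stream before combining, so reward magnitudes do not dictate the trade-off in the A2C update.

    import numpy as np

    def combine_advantages(advantages, weights, eps=1e-8):
        """Scale-free combination of per-objective advantages for an
        A2C-style update: each objective is standardised by its batch
        standard deviation before the preference weighting is applied.
        (An illustration of the scale-independence idea only.)

        advantages: (batch, n_objectives) array
        weights:    (n_objectives,) preference weights
        """
        adv = np.asarray(advantages, float)
        return (adv / (adv.std(axis=0) + eps)) @ np.asarray(weights, float)

    batch = np.array([[10.0, 0.01], [-5.0, 0.03], [2.0, -0.02]])  # mismatched scales
    print(combine_advantages(batch, [0.5, 0.5]))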
arXiv Detail & Related papers (2023-02-08T16:38:55Z)
- Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
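"Ensembling in weight space" means that convex combinations of trained parameters are themselves evaluated as models; sweeping the mixing coefficient traces a manifold whose points realise different task trade-offs. The two-model linear interpolation below is a deliberately minimal illustration of that idea, not the paper's full method.

    import numpy as np

    def interpolate_weights(params_a, params_b, alpha):
        """Weight-space ensemble: a convex combination of two models'
        parameter dictionaries yields a new model of the same shape."""
        return {k: (1 - alpha) * params_a[k] + alpha * params_b[k] for k in params_a}

    # Toy linear models y = w @ x, each trained for a different task.
    params_task_a = {"w": np.array([1.0, 0.0])}
    params_task_b = {"w": np.array([0.0, 1.0])}
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(alpha, interpolate_weights(params_task_a, params_task_b, alpha)["w"])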
arXiv Detail & Related papers (2022-10-18T11:20:54Z)
- Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of non-convex problems.
We show that for weakly convex objectives and under mild conditions, the method converges globally.
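To make the learned object concrete: the proximal operator of f at x is argmin_y f(y) + ||y - x||^2 / (2*lambda). The brute-force 1-D version below shows why iterating it discovers multiple optima from different starts; the paper replaces this grid search with a learned network.

    import numpy as np

    def prox(f, x, lam=0.5, grid=np.linspace(-3, 3, 6001)):
        """Numerical proximal operator: argmin_y f(y) + (y - x)^2 / (2*lam),
        found by brute force on a grid (illustration only)."""
        return grid[np.argmin(f(grid) + (grid - x) ** 2 / (2 * lam))]

    f = lambda y: (y ** 2 - 1) ** 2      # non-convex: two minima at y = +/-1
    for x0 in (-2.0, 0.1, 2.0):          # proximal-point iteration, 3 starts
        y = x0
        for _ in range(50):
            y = prox(f, y)
        print(f"start {x0:+.1f} -> optimum {y:+.3f}")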
arXiv Detail & Related papers (2022-01-28T05:53:28Z)
- Instance-Dependent Confidence and Early Stopping for Reinforcement Learning [99.57168572237421]
Various algorithms for reinforcement learning (RL) exhibit dramatic variation in their convergence rates as a function of problem structure.
This research provides guarantees that explain ex post the performance differences observed.
A natural next step is to convert these theoretical guarantees into guidelines that are useful in practice.
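One way such guarantees become practical guidelines is as stopping rules: sample until a confidence half-width around the estimate drops below the target accuracy. The sketch below uses a generic Hoeffding bound for returns in [0, 1] as a stand-in; the paper's point is that instance-dependent bounds can stop far earlier than such worst-case ones.

    import math, random

    def estimate_until_confident(sample_return, eps=0.05, delta=0.05, max_n=10**6):
        """Keep averaging sampled returns until the Hoeffding half-width
        sqrt(log(2/delta) / (2n)) falls below eps (returns assumed in [0, 1]).
        A worst-case stand-in for instance-dependent stopping rules."""
        total, n = 0.0, 0
        while n < max_n:
            total += sample_return()
            n += 1
            if math.sqrt(math.log(2 / delta) / (2 * n)) < eps:
                break
        return total / n, n

    mean, n = estimate_until_confident(lambda: 0.3 + 0.4 * random.random())
    print(f"estimate {mean:.3f} after {n} samples")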
arXiv Detail & Related papers (2022-01-21T04:25:35Z)
- Scalable Uni-directional Pareto Optimality for Multi-Task Learning with Constraints [4.4044968357361745]
We propose a scalable solver for Multi-Objective Optimization (MOO) problems, including support for optimization under constraints.
An important application of this is to estimate high-dimensional runtime for neural classification tasks.
arXiv Detail & Related papers (2021-10-28T21:35:59Z)
- Pareto Multi-Task Learning [53.90732663046125]
Multi-task learning is a powerful method for solving multiple correlated tasks simultaneously.
It is often impossible to find one single solution to optimize all the tasks, since different tasks might conflict with each other.
Recently, a novel method was proposed to find a single Pareto optimal solution with a good trade-off among different tasks by casting multi-task learning as multiobjective optimization.
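The simplest concrete instance of that casting is linear scalarization: descend a preference-weighted sum of per-task gradients, which lands on one trade-off point. Pareto MTL itself goes further and solves one such problem per preference region; the toy below is only the single-trade-off building block.

    import numpy as np

    def scalarized_mtl_step(theta, task_grads, weights, lr=0.1):
        """One multi-task step via linear scalarization: descend the
        preference-weighted sum of the task gradients."""
        return theta - lr * sum(w * g for w, g in zip(weights, task_grads))

    # Two conflicting quadratic tasks with optima at 0 and 1.
    theta = np.array([0.5])
    for _ in range(200):
        grads = [2 * (theta - 0.0), 2 * (theta - 1.0)]
        theta = scalarized_mtl_step(theta, grads, weights=(0.3, 0.7))
    print(theta)  # converges to the weighted optimum 0.7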
arXiv Detail & Related papers (2019-12-30T08:58:40Z)