The Max-Min Formulation of Multi-Objective Reinforcement Learning: From Theory to a Model-Free Algorithm
- URL: http://arxiv.org/abs/2406.07826v1
- Date: Wed, 12 Jun 2024 02:47:54 GMT
- Title: The Max-Min Formulation of Multi-Objective Reinforcement Learning: From Theory to a Model-Free Algorithm
- Authors: Giseung Park, Woohyeon Byeon, Seongmin Kim, Elad Havakuk, Amir Leshem, Youngchul Sung,
- Abstract summary: We consider multi-objective reinforcement learning, which arises in many real-world problems with multiple optimization goals.
We develop a relevant theory and a practical model-free algorithm under the max-min framework.
- Score: 21.36281978932632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider multi-objective reinforcement learning, which arises in many real-world problems with multiple optimization goals. We approach the problem with a max-min framework focusing on fairness among the multiple goals and develop a relevant theory and a practical model-free algorithm under the max-min framework. The developed theory provides a theoretical advance in multi-objective reinforcement learning, and the proposed algorithm demonstrates a notable performance improvement over existing baseline methods.
Related papers
- Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data, and intelligence.
Recent trends demonstrate the potential homogeneity of these two fields.
We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z) - Towards Principled Task Grouping for Multi-Task Learning [14.3385939018772]
We present a novel approach to task grouping in Multitask Learning (MTL)
Our approach offers a more theoretically grounded method that does not rely on restrictive assumptions for constructing transfer gains.
arXiv Detail & Related papers (2024-02-23T13:51:20Z) - Hierarchical Optimization-Derived Learning [58.69200830655009]
We establish a new framework, named Hierarchical ODL (HODL), to simultaneously investigate the intrinsic behaviors of optimization-derived model construction and its corresponding learning process.
This is the first theoretical guarantee for these two coupled ODL components: optimization and learning.
arXiv Detail & Related papers (2023-02-11T03:35:13Z) - Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new algorithm for a policy gradient in TMDPs by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z) - MODRL/D-EL: Multiobjective Deep Reinforcement Learning with Evolutionary
Learning for Multiobjective Optimization [10.614594804236893]
This paper proposes a multiobjective deep reinforcement learning with evolutionary learning algorithm for a typical complex problem called the multiobjective vehicle routing problem with time windows.
The experimental results on MO-VRPTW instances demonstrate the superiority of the proposed algorithm over other learning-based and iterative-based approaches.
arXiv Detail & Related papers (2021-07-16T15:22:20Z) - Decentralized Personalized Federated Learning for Min-Max Problems [79.61785798152529]
This paper is the first to study PFL for saddle point problems encompassing a broader range of optimization problems.
We propose new algorithms to address this problem and provide a theoretical analysis of the smooth (strongly) convex-(strongly) concave saddle point problems.
Numerical experiments for bilinear problems and neural networks with adversarial noise demonstrate the effectiveness of the proposed methods.
arXiv Detail & Related papers (2021-06-14T10:36:25Z) - Discriminator Augmented Model-Based Reinforcement Learning [47.094522301093775]
It is common in practice for the learned model to be inaccurate, impairing planning and leading to poor performance.
This paper aims to improve planning with an importance sampling framework that accounts for discrepancy between the true and learned dynamics.
arXiv Detail & Related papers (2021-03-24T06:01:55Z) - Investigating Bi-Level Optimization for Learning and Vision from a
Unified Perspective: A Survey and Beyond [114.39616146985001]
In machine learning and computer vision fields, despite the different motivations and mechanisms, a lot of complex problems contain a series of closely related subproblms.
In this paper, we first uniformly express these complex learning and vision problems from the perspective of Bi-Level Optimization (BLO)
Then we construct a value-function-based single-level reformulation and establish a unified algorithmic framework to understand and formulate mainstream gradient-based BLO methodologies.
arXiv Detail & Related papers (2021-01-27T16:20:23Z) - Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single policy MORL, which learns an optimal policy given the preference of objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelop value (EVI) which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z) - Improving Few-Shot Learning through Multi-task Representation Learning
Theory [14.8429503385929]
We consider the framework of multi-task representation (MTR) learning where the goal is to use source tasks to learn a representation that reduces the sample complexity of solving a target task.
We show that recent advances in MTR theory can provide novel insights for popular meta-learning algorithms when analyzed within this framework.
This is the first contribution that puts the most recent learning bounds of MTR theory into practice for the task of few-shot classification.
arXiv Detail & Related papers (2020-10-05T13:24:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.