Related papers: A Scale-Independent Multi-Objective Reinforcement Learning with Convergence Analysis

A Scale-Independent Multi-Objective Reinforcement Learning with Convergence Analysis

URL: http://arxiv.org/abs/2302.04179v1
Date: Wed, 8 Feb 2023 16:38:55 GMT
Title: A Scale-Independent Multi-Objective Reinforcement Learning with Convergence Analysis
Authors: Mohsen Amidzadeh
Abstract summary: Many sequential decision-making problems need optimization of different objectives which possibly conflict with each other. We develop a single-agent scale-independent multi-objective reinforcement learning on the basis of the Advantage Actor-Critic (A2C) algorithm. A convergence analysis is then done for the devised multi-objective algorithm providing a convergence-in-mean guarantee.
Score: 0.6091702876917281
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Many sequential decision-making problems need optimization of different objectives which possibly conflict with each other. The conventional way to deal with a multi-task problem is to establish a scalar objective function based on a linear combination of different objectives. However, for the case of having conflicting objectives with different scales, this method needs a trial-and-error approach to properly find proper weights for the combination. As such, in most cases, this approach cannot guarantee an optimal Pareto solution. In this paper, we develop a single-agent scale-independent multi-objective reinforcement learning on the basis of the Advantage Actor-Critic (A2C) algorithm. A convergence analysis is then done for the devised multi-objective algorithm providing a convergence-in-mean guarantee. We then perform some experiments over a multi-task problem to evaluate the performance of the proposed algorithm. Simulation results show the superiority of developed multi-objective A2C approach against the single-objective algorithm.

Related papers

A Comparison-Relationship-Surrogate Evolutionary Algorithm for Multi-Objective Optimization [0.0]
We propose a new evolutionary algorithm "CRSEA" which uses the comparison-relationship model. We find that CRSEA finds better converged solutions than the tested SAEAs on many medium-scale, biobjective problems.
arXiv Detail & Related papers (2025-04-28T01:39:38Z)
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost. We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion. By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
arXiv Detail & Related papers (2024-06-14T07:16:18Z)
UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-objective Reinforcement Learning (MORL) agents are tasked with optimising decision-making behaviours. We focus on the case of linear utility functions parameterised by weight vectors w. We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
arXiv Detail & Related papers (2024-05-01T09:34:42Z)
Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for the general-sum Markov Games (MGs) under the general function approximation. We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs. We show that our algorithm provides comparable sublinear regret to the existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z)
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Stochastic Approach [38.76462300149459]
We develop a Multi-objective Correction (MoCo) method for multi-objective gradient optimization. The unique feature of our method is that it can guarantee convergence without increasing the non fairness gradient.
arXiv Detail & Related papers (2022-10-23T05:54:26Z)
A Study of Scalarisation Techniques for Multi-Objective QUBO Solving [0.0]
Quantum and quantum-inspired optimisation algorithms have shown promising performance when applied to academic benchmarks as well as real-world problems. However, QUBO solvers are single objective solvers. To make them more efficient at solving problems with multiple objectives, a decision on how to convert such multi-objective problems to single-objective problems need to be made.
arXiv Detail & Related papers (2022-10-20T14:54:37Z)
Pareto Set Learning for Neural Multi-objective Combinatorial Optimization [6.091096843566857]
Multiobjective optimization (MOCO) problems can be found in many real-world applications. We develop a learning-based approach to approximate the whole Pareto set for a given MOCO problem without further search procedure. Our proposed method significantly outperforms some other methods on the multiobjective traveling salesman problem, multiconditioned vehicle routing problem and multi knapsack problem in terms of solution quality, speed, and model efficiency.
arXiv Detail & Related papers (2022-03-29T09:26:22Z)
A survey on multi-objective hyperparameter optimization algorithms for Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms. We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both. We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z)
Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single policy MORL, which learns an optimal policy given the preference of objectives. Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process. We propose a new algorithm called model-based envelop value (EVI) which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
Momentum-based Gradient Methods in Multi-Objective Recommendation [30.894950420437926]
We create a multi-objective model-agnostic Adamize method for solving single-objective problems. We evaluate the benefits of Multi-objective Adamize on two multi-objective recommender systems and for three different objective combinations, both correlated or conflicting.
arXiv Detail & Related papers (2020-09-10T07:12:21Z)
Follow the bisector: a simple method for multi-objective optimization [65.83318707752385]
We consider optimization problems, where multiple differentiable losses have to be minimized. The presented method computes descent direction in every iteration to guarantee equal relative decrease of objective functions.
arXiv Detail & Related papers (2020-07-14T09:50:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.