Discovering Diverse Solutions in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2103.07084v1
- Date: Fri, 12 Mar 2021 04:54:31 GMT
- Title: Discovering Diverse Solutions in Deep Reinforcement Learning
- Authors: Takayuki Osa, Voot Tangkaratt and Masashi Sugiyama
- Abstract summary: Reinforcement learning algorithms are typically limited to learning a single solution of a specified task.
We propose an RL method that can learn infinitely many solutions by training a policy conditioned on a continuous or discrete low-dimensional latent variable.
- Score: 84.45686627019408
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Reinforcement learning (RL) algorithms are typically limited to learning a
single solution of a specified task, even though there often exist diverse
solutions to a given task. Compared with learning a single solution, learning a
set of diverse solutions is beneficial because diverse solutions enable robust
few-shot adaptation and allow the user to select a preferred solution. Although
previous studies have shown that diverse behaviors can be modeled with a
policy conditioned on latent variables, an approach for modeling an infinite
set of diverse solutions with continuous latent variables has not been
investigated. In this study, we propose an RL method that can learn infinitely
many solutions by training a policy conditioned on a continuous or discrete
low-dimensional latent variable. Through continuous control tasks, we
demonstrate that our method can learn diverse solutions in a data-efficient
manner and that the solutions can be used for few-shot adaptation to solve
unseen tasks.
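The core idea, a single policy network whose behaviour is steered by a low-dimensional latent input, can be illustrated with a minimal sketch. The network shapes, random weights, and deterministic policy below are illustrative assumptions, not the paper's actual architecture or training procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, LATENT_DIM, ACTION_DIM, HIDDEN = 4, 2, 2, 32

# Hypothetical two-layer policy network computing action = pi([state; z]).
W1 = rng.normal(0, 0.1, (STATE_DIM + LATENT_DIM, HIDDEN))
W2 = rng.normal(0, 0.1, (HIDDEN, ACTION_DIM))

def policy(state, z):
    """Deterministic latent-conditioned policy pi(s, z)."""
    x = np.concatenate([state, z])
    return np.tanh(np.tanh(x @ W1) @ W2)

state = rng.normal(size=STATE_DIM)

# Sampling different continuous latents yields different behaviours from
# the same state: one network, a continuum of candidate solutions.
z_a = rng.uniform(-1, 1, LATENT_DIM)
z_b = rng.uniform(-1, 1, LATENT_DIM)
```

At adaptation time, few-shot search then reduces to picking the latent `z` whose induced behaviour best fits the unseen task, rather than retraining the policy weights.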
Related papers
- Discovering Multiple Solutions from a Single Task in Offline Reinforcement Learning [51.00472376469131]
We propose an algorithm that learns multiple solutions from a single task in offline reinforcement learning.
Our experimental results show that the proposed algorithm learns multiple qualitatively and quantitatively distinctive solutions in offline RL.
arXiv Detail & Related papers (2024-06-10T03:25:49Z) - UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-objective Reinforcement Learning (MORL), agents are tasked with optimising decision-making behaviours that trade off between multiple, possibly conflicting, objectives.
We focus on the case of linear utility functions parameterised by weight vectors w.
We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
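The Upper Confidence Bound idea over candidate weight vectors can be sketched minimally. The candidate set and the deterministic utility function below are toy assumptions, not the paper's setup:

```python
import numpy as np

# Hypothetical candidate weight vectors w for a two-objective problem.
candidates = np.array([[1.0, 0.0], [0.7, 0.3], [0.5, 0.5],
                       [0.3, 0.7], [0.0, 1.0]])
counts = np.zeros(len(candidates))
totals = np.zeros(len(candidates))

def utility(w):
    # Stand-in for the scalarised return obtained when training with w;
    # here the second objective is simply more rewarding than the first.
    return float(w @ np.array([0.4, 0.9]))

def ucb_pick(t, c=0.2):
    # Pull each arm once, then trade off mean utility vs. exploration bonus.
    if np.any(counts == 0):
        return int(np.argmin(counts))
    return int(np.argmax(totals / counts + c * np.sqrt(np.log(t) / counts)))

for t in range(1, 201):
    i = ucb_pick(t)
    counts[i] += 1
    totals[i] += utility(candidates[i])

# The search concentrates its budget on the most promising weight vector.
best = candidates[int(np.argmax(counts))]
```

The exploration bonus shrinks as an arm accumulates pulls, so unpromising weight vectors are sampled only often enough to rule them out.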
arXiv Detail & Related papers (2024-05-01T09:34:42Z) - PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial
Optimization [4.764047597837088]
We introduce PolyNet, an approach for improving exploration of the solution space by learning complementary solution strategies.
In contrast to other works, PolyNet uses only a single decoder and a training schema that does not enforce diverse solution generation.
arXiv Detail & Related papers (2024-02-21T16:38:14Z) - Continuous Tensor Relaxation for Finding Diverse Solutions in Combinatorial Optimization Problems [0.6906005491572401]
This study introduces Continuous Tensor Relaxation Annealing (CTRA) for unsupervised-learning (UL)-based CO solvers.
CTRA is a computationally efficient framework for finding diverse solutions in a single training run.
Numerical experiments show that CTRA enables UL-based solvers to find these diverse solutions much faster than repeatedly running existing solvers.
arXiv Detail & Related papers (2024-02-03T15:31:05Z) - Pareto Set Learning for Neural Multi-objective Combinatorial
Optimization [6.091096843566857]
Multiobjective combinatorial optimization (MOCO) problems can be found in many real-world applications.
We develop a learning-based approach to approximate the whole Pareto set for a given MOCO problem without further search procedure.
Our proposed method significantly outperforms some other methods on the multiobjective traveling salesman problem, multiobjective vehicle routing problem, and multiobjective knapsack problem in terms of solution quality, speed, and model efficiency.
arXiv Detail & Related papers (2022-03-29T09:26:22Z) - Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of problems.
We show that for weakly-convex objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z) - GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal
Black Box Constraint Satisfaction [69.94831587339539]
We present a modified Cross-Entropy Method (CEM) that uses a masked auto-regressive neural network for modeling uniform distributions over the solution space.
Our algorithm is able to express complicated solution spaces, thus allowing it to track a variety of different solution regions.
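For contrast, here is a minimal sketch of the standard Gaussian CEM that GACEM modifies. The toy objective is an assumption; GACEM's change is to replace the unimodal Gaussian sampler with a masked autoregressive network so that several solution regions can be tracked at once:

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x):
    # Toy black-box objective: a tall mode at x = +2 and a shorter
    # one at x = -2.
    return np.exp(-(x - 2.0) ** 2) + 0.5 * np.exp(-(x + 2.0) ** 2)

# Vanilla CEM: refit a single Gaussian to the elite samples each round.
# Being unimodal, it collapses onto one mode; a multi-modal sampler
# (as in GACEM) could keep both solution regions alive instead.
mu, sigma = 0.0, 5.0
for _ in range(30):
    samples = rng.normal(mu, sigma, 200)
    elites = samples[np.argsort(score(samples))[-20:]]  # top 10%
    mu, sigma = elites.mean(), elites.std() + 1e-3
```

Each round narrows the sampling distribution around the best-scoring samples, which is exactly why a Gaussian model forgets all but one solution region.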
arXiv Detail & Related papers (2020-02-17T20:21:20Z) - Pareto Multi-Task Learning [53.90732663046125]
Multi-task learning is a powerful method for solving multiple correlated tasks simultaneously.
It is often impossible to find one single solution to optimize all the tasks, since different tasks might conflict with each other.
Recently, a novel method was proposed to find a single Pareto optimal solution with a good trade-off among different tasks by casting multi-task learning as multiobjective optimization.
arXiv Detail & Related papers (2019-12-30T08:58:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.