Solution and Fitness Evolution (SAFE): Coevolving Solutions and Their
Objective Functions
- URL: http://arxiv.org/abs/2206.12707v1
- Date: Sat, 25 Jun 2022 18:41:00 GMT
- Title: Solution and Fitness Evolution (SAFE): Coevolving Solutions and Their
Objective Functions
- Authors: Moshe Sipper, Jason H. Moore, Ryan J. Urbanowicz
- Abstract summary: An effective objective function to evaluate strategies may not be a simple function of the distance to the objective.
We present Solution And Fitness Evolution (SAFE), a commensalistic coevolutionary algorithm.
- Score: 4.149117182410553
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We recently highlighted a fundamental problem recognized to confound
algorithmic optimization, namely, \textit{conflating} the objective with the
objective function. Even when the former is well defined, the latter may not be
obvious, e.g., in learning a strategy to navigate a maze to find a goal
(objective), an effective objective function to \textit{evaluate} strategies
may not be a simple function of the distance to the objective. We proposed to
automate the means by which a good objective function may be discovered -- a
proposal reified herein. We present \textbf{S}olution \textbf{A}nd
\textbf{F}itness \textbf{E}volution (\textbf{SAFE}), a \textit{commensalistic}
coevolutionary algorithm that maintains two coevolving populations: a
population of candidate solutions and a population of candidate objective
functions. As proof of principle of this concept, we show that SAFE
successfully evolves not only solutions within a robotic maze domain, but also
the objective functions needed to measure solution quality during evolution.
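The coevolutionary loop described above can be pictured with a small, heavily simplified sketch. The snippet below is not the authors' implementation: the 2-D search space, the weight-pair encoding of candidate objective functions, the blend of a distance signal with a novelty signal, and the novelty-based scoring of the objective-function population are all illustrative assumptions chosen to show the commensalistic structure (solutions are ranked by the best score any candidate objective assigns them, while candidate objectives are ranked independently of solution success).

```python
import math
import random

# Toy sketch of SAFE-style commensalistic coevolution (illustrative assumptions):
# - solutions are 2-D points searching for a hidden target,
# - each candidate objective function is a weight pair (w_dist, w_nov) that
#   blends a distance-to-target signal with a novelty signal,
# - a solution's fitness is the BEST score any candidate objective assigns it,
# - candidate objectives are ranked only by genotypic novelty (commensalism:
#   their success is decoupled from how well the solutions do).

TARGET = (0.8, -0.3)                 # hypothetical hidden objective
POP, GENS, SIGMA = 30, 100, 0.1

def distance_signal(sol):
    return -math.dist(sol, TARGET)   # higher is better

def novelty_signal(sol, archive):
    if not archive:
        return 0.0
    return sum(sorted(math.dist(sol, a) for a in archive)[:5]) / 5

def sol_fitness(sol, objectives, archive):
    d, n = distance_signal(sol), novelty_signal(sol, archive)
    return max(w_d * d + w_n * n for w_d, w_n in objectives)

def obj_fitness(obj, objectives):
    return sum(sorted(math.dist(obj, other) for other in objectives if other is not obj)[:5])

def mutate(vec):
    return tuple(x + random.gauss(0, SIGMA) for x in vec)

solutions = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(POP)]
objectives = [(random.random(), random.random()) for _ in range(POP)]
archive = []

for _ in range(GENS):
    archive.extend(random.sample(solutions, 3))
    # Truncation selection + Gaussian mutation, applied to both populations.
    solutions = sorted(solutions, key=lambda s: sol_fitness(s, objectives, archive),
                       reverse=True)[:POP // 2]
    solutions += [mutate(s) for s in solutions]
    objectives = sorted(objectives, key=lambda o: obj_fitness(o, objectives),
                        reverse=True)[:POP // 2]
    objectives += [mutate(o) for o in objectives]

best = max(solutions, key=lambda s: sol_fitness(s, objectives, archive))
print(f"best solution: {best}, distance to target: {math.dist(best, TARGET):.3f}")
```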
Related papers
- Measuring Goal-Directedness [13.871986295154782]
We define maximum entropy goal-directedness (MEG), a formal measure of goal-directedness in causal models and Markov decision processes.
MEG is based on an adaptation of the maximum causal entropy framework used in inverse reinforcement learning.
arXiv Detail & Related papers (2024-12-06T03:48:47Z) - Discrete Factorial Representations as an Abstraction for Goal
Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals with expressive structure to be specified.
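As a rough illustration of what a discretizing bottleneck can look like in this setting, the sketch below vector-quantizes a continuous goal embedding against a small codebook via nearest-neighbour lookup; the codebook size, embedding dimension, and random codebook are placeholders, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, not taken from the paper.
EMBED_DIM, CODEBOOK_SIZE = 8, 16
codebook = rng.normal(size=(CODEBOOK_SIZE, EMBED_DIM))   # learned in practice

def discretize_goal(goal_embedding: np.ndarray) -> tuple[int, np.ndarray]:
    """Snap a continuous goal embedding to its nearest codebook vector."""
    dists = np.linalg.norm(codebook - goal_embedding, axis=1)
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

goal = rng.normal(size=EMBED_DIM)           # e.g. the output of a goal encoder
code_idx, quantized_goal = discretize_goal(goal)
print(f"goal mapped to discrete code {code_idx}")
# The policy would then condition on `quantized_goal` (or the code index)
# instead of the raw continuous embedding.
```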
arXiv Detail & Related papers (2022-11-01T03:31:43Z) - Unsupervised Learning for Combinatorial Optimization with Principled
Objective Relaxation [19.582494782591386]
This work proposes an unsupervised learning framework for combinatorial optimization (CO) problems.
Our key contribution is the observation that if the relaxed objective satisfies entry-wise concavity, a low optimization loss guarantees the quality of the final integral solutions.
In particular, this observation can guide the design of objective models in applications where the objective is not given explicitly and must be modeled in advance.
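The entry-wise concavity observation can be made concrete with a toy selection problem: when the relaxed loss is concave in each coordinate separately, rounding each relaxed variable to whichever endpoint of [0, 1] gives the lower loss can never increase the loss, so the integral solution inherits the quality of the relaxed one. The loss below (a linear value term plus a linear per-item cost) is an arbitrary illustration, not an objective from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 10, 0.3
value = rng.uniform(0, 1, size=n)       # value of selecting each item

def relaxed_loss(p: np.ndarray) -> float:
    # Linear in every coordinate, hence trivially entry-wise concave:
    # negative collected value plus a fixed cost per (fractionally) selected item.
    return float(-value @ p + lam * p.sum())

# Crude projected gradient descent over the box [0, 1]^n, standing in for a
# learned relaxation model.
p = np.full(n, 0.5)
for _ in range(200):
    grad = -value + lam                  # gradient of the loss above
    p = np.clip(p - 0.05 * grad, 0.0, 1.0)

# Sequential rounding: because the loss is concave in each entry, its minimum
# over p_i in [0, 1] is attained at an endpoint, so this never increases the loss.
x = p.copy()
for i in range(n):
    lo, hi = x.copy(), x.copy()
    lo[i], hi[i] = 0.0, 1.0
    x = lo if relaxed_loss(lo) <= relaxed_loss(hi) else hi

print("relaxed loss:", round(relaxed_loss(p), 3))
print("rounded loss:", round(relaxed_loss(x), 3), "selection:", x.astype(int))
```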
arXiv Detail & Related papers (2022-07-13T06:44:17Z) - Solution and Fitness Evolution (SAFE): A Study of Multiobjective
Problems [4.149117182410553]
We have recently presented SAFE, a commensalistic coevolutionary algorithm that maintains two coevolving populations.
We show that SAFE was successful at evolving solutions within a robotic maze domain.
Though preliminary, the results suggest that SAFE, and the concept of coevolving solutions and objective functions, can identify a similar set of optimal multiobjective solutions.
arXiv Detail & Related papers (2022-06-25T18:42:05Z) - Adaptive Multi-Goal Exploration [118.40427257364729]
We show how AdaGoal can be used to tackle the objective of learning an $\epsilon$-optimal goal-conditioned policy.
AdaGoal is anchored in the high-level algorithmic structure of existing methods for goal-conditioned deep reinforcement learning.
arXiv Detail & Related papers (2021-11-23T17:59:50Z) - C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose solving distant goal-reaching tasks by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
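A toy sketch of the waypoint-planning half of such an EM-style loop is given below: states from a hypothetical replay buffer are connected when they lie within an assumed reachability radius, and a shortest-path search returns intermediate waypoints between the start and a distant goal. The grid of states and the radius are placeholders, and the M-step (training the goal-conditioned policy on those waypoints) is not shown.

```python
import heapq
import math

# Hypothetical "replay buffer" of visited 2-D states on a small grid.
states = [(x, y) for x in range(5) for y in range(5)]
EDGE_RADIUS = 1.5    # assumed reachability threshold between nearby states

def neighbors(s):
    return [t for t in states if t != s and math.dist(s, t) <= EDGE_RADIUS]

def plan_waypoints(start, goal):
    """E-step sketch: Dijkstra over the visited-state graph -> waypoint list."""
    dist, prev = {start: 0.0}, {}
    frontier = [(0.0, start)]
    while frontier:
        d, s = heapq.heappop(frontier)
        if s == goal:
            break
        if d > dist.get(s, math.inf):
            continue
        for t in neighbors(s):
            nd = d + math.dist(s, t)
            if nd < dist.get(t, math.inf):
                dist[t], prev[t] = nd, s
                heapq.heappush(frontier, (nd, t))
    path, s = [goal], goal
    while s != start:
        s = prev[s]
        path.append(s)
    return list(reversed(path))

waypoints = plan_waypoints((0, 0), (4, 4))
print("waypoints toward the distant goal:", waypoints)
# M-step (not shown): train a goal-conditioned policy to reach each waypoint in turn.
```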
arXiv Detail & Related papers (2021-10-22T22:05:31Z) - Adversarial Intrinsic Motivation for Reinforcement Learning [60.322878138199364]
We investigate whether the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution can be utilized effectively for reinforcement learning tasks.
Our approach, termed Adversarial Intrinsic Motivation (AIM), estimates this Wasserstein-1 distance through its dual objective and uses it to compute a supplemental reward function.
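On a small discrete state space the dual estimate can be pictured directly: maximize the expected potential under the target distribution minus its expectation under the policy's visitation, subject to a 1-Lipschitz constraint, and use the potential difference along a transition as the supplemental reward. The sketch below solves this dual exactly with a linear program on a toy chain; the LP is an illustrative stand-in for the adversarially trained critic used in the paper.

```python
import numpy as np
from scipy.optimize import linprog

N = 10                                         # toy chain of states 0..9
target = np.zeros(N); target[-1] = 1.0         # goal distribution
visitation = np.ones(N) / N                    # assumed policy state visitation

# Kantorovich-Rubinstein dual: maximize E_target[f] - E_visitation[f] over
# potentials f that are 1-Lipschitz w.r.t. the chain metric
# (|f[i+1] - f[i]| <= 1).  linprog minimizes, so negate the objective.
c = -(target - visitation)
rows, rhs = [], []
for i in range(N - 1):
    up = np.zeros(N); up[i + 1], up[i] = 1.0, -1.0
    rows += [up, -up]                          # both directions of the difference
    rhs += [1.0, 1.0]
res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs), bounds=(-100, 100))
f = res.x

print(f"estimated W1(target, visitation) ~ {-res.fun:.2f}")   # ~4.5 on this chain

def supplemental_reward(s: int, s_next: int) -> float:
    # AIM-style potential-difference bonus: positive when moving toward the goal.
    return float(f[s_next] - f[s])

print("reward for moving 3 -> 4:", supplemental_reward(3, 4))
```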
arXiv Detail & Related papers (2021-05-27T17:51:34Z) - Outcome-Driven Reinforcement Learning via Variational Inference [95.82770132618862]
We discuss a new perspective on reinforcement learning, recasting it as the problem of inferring actions that achieve desired outcomes, rather than a problem of maximizing rewards.
To solve the resulting outcome-directed inference problem, we establish a novel variational inference formulation that allows us to derive a well-shaped reward function.
We empirically demonstrate that this method eliminates the need to design reward functions and leads to effective goal-directed behaviors.
arXiv Detail & Related papers (2021-04-20T18:16:21Z) - CACTUS: Detecting and Resolving Conflicts in Objective Functions [16.784454432715712]
In multi-objective optimization, conflicting objectives and constraints are a major area of concern.
In this paper, we extend this line of work by prototyping a technique to visualize multi-objective objective functions.
We show that our technique helps users interactively specify meaningful objective functions by resolving potential conflicts for a classification task.
arXiv Detail & Related papers (2021-03-13T22:38:47Z) - Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
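A stripped-down picture of such a curriculum, under the assumption that goal difficulty is proxied by disagreement across an ensemble of value estimators, is sketched below; the ensemble here is a set of synthetic, randomly perturbed value tables rather than learned networks, and goals are sampled in proportion to the ensemble's standard deviation.

```python
import numpy as np

rng = np.random.default_rng(2)

N_GOALS, ENSEMBLE = 20, 5
# Stand-in for an ensemble of learned value functions evaluated at each
# candidate goal: they agree on easy and on hopeless goals, and disagree
# most around the current frontier of competence.
base_value = np.linspace(1.0, 0.0, N_GOALS)                  # easy -> hard goals
frontier_noise = np.exp(-((np.arange(N_GOALS) - 10) ** 2) / 8.0)
ensemble_values = np.stack([
    base_value + 0.3 * frontier_noise * rng.standard_normal(N_GOALS)
    for _ in range(ENSEMBLE)
])

# Curriculum signal: standard deviation across the ensemble, per goal.
disagreement = ensemble_values.std(axis=0)
probs = disagreement / disagreement.sum()

# Sample the next training goals in proportion to value disagreement.
next_goals = rng.choice(N_GOALS, size=5, replace=False, p=probs)
print("goals selected for the next round of training:", sorted(next_goals))
```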
arXiv Detail & Related papers (2020-06-17T03:58:25Z) - sKPNSGA-II: Knee point based MOEA with self-adaptive angle for Mission
Planning Problems [2.191505742658975]
Some problems have many objectives, which leads to a large number of non-dominated solutions.
This paper presents a new algorithm that has been designed to obtain the most significant solutions.
This new algorithm has been applied to a real-world application: the Unmanned Air Vehicle (UAV) Mission Planning Problem.
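For a two-objective front, a knee point can be illustrated as the non-dominated solution farthest from the straight line joining the two extreme solutions of the front. The sketch below applies that textbook definition to a synthetic front; it is not the paper's self-adaptive-angle variant.

```python
import numpy as np

# Synthetic non-dominated front for two minimization objectives.
f1 = np.linspace(0.0, 1.0, 50)
f2 = 1.0 - np.sqrt(f1)             # convex front, so a clear knee exists
front = np.column_stack([f1, f2])

# Line through the two extreme solutions of the front.
a, b = front[0], front[-1]
line = b - a
line /= np.linalg.norm(line)

# Perpendicular distance of every solution to that line; the knee is the farthest.
rel = front - a
dist = np.abs(rel[:, 0] * line[1] - rel[:, 1] * line[0])    # 2-D cross product
knee = front[np.argmax(dist)]
print("knee point (f1, f2):", knee)
```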
arXiv Detail & Related papers (2020-02-20T17:07:08Z)