Maximum Mutation Reinforcement Learning for Scalable Control
- URL: http://arxiv.org/abs/2007.13690v7
- Date: Sat, 16 Jan 2021 23:51:53 GMT
- Title: Maximum Mutation Reinforcement Learning for Scalable Control
- Authors: Karush Suri, Xiao Qi Shi, Konstantinos N. Plataniotis, Yuri A.
Lawryshyn
- Abstract summary: Reinforcement Learning (RL) has demonstrated data efficiency and optimal control over large state spaces at the cost of scalable performance.
We present the Evolution-based Soft Actor-Critic (ESAC), a scalable RL algorithm.
- Score: 25.935468948833073
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advances in Reinforcement Learning (RL) have demonstrated data efficiency and
optimal control over large state spaces at the cost of scalable performance.
Genetic methods, on the other hand, provide scalability but exhibit sensitivity to the
hyperparameters of their evolutionary operations. However, a
combination of the two methods has recently demonstrated success in scaling RL
agents to high-dimensional action spaces. Parallel to recent developments, we
present the Evolution-based Soft Actor-Critic (ESAC), a scalable RL algorithm.
We abstract exploration from exploitation by combining Evolution Strategies
(ES) with Soft Actor-Critic (SAC). Through this lens, we enable dominant skill
transfer between offspring by making use of soft winner selections and genetic
crossovers in hindsight and simultaneously improve hyperparameter sensitivity
in evolutions using the novel Automatic Mutation Tuning (AMT). AMT gradually
replaces the entropy framework of SAC, allowing the population to succeed at the
task while acting as randomly as possible, without making use of
backpropagation updates. In a study of challenging locomotion tasks consisting
of high-dimensional action spaces and sparse rewards, ESAC demonstrates
improved performance and sample efficiency in comparison to the Maximum Entropy
framework. Additionally, ESAC presents efficacious use of hardware resources
and algorithm overhead. A complete implementation of ESAC can be found at
karush17.github.io/esac-web/.
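The abstract describes three mechanisms: ES-style population search, soft winner selection with hindsight crossover, and Automatic Mutation Tuning in place of backprop-driven entropy updates. A minimal sketch of that loop is below; it is not the authors' implementation. The toy quadratic objective, the softmax-weighted recombination, and the simplified multiplicative AMT rule (widen the mutation scale on stagnation, shrink it on improvement) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(theta):
    # Toy objective standing in for episode return: optimum at theta = 1.
    return -np.sum((theta - 1.0) ** 2)

def soft_winner_weights(scores, temperature=1.0):
    # Soft winner selection: softmax over fitness instead of hard truncation.
    z = (scores - scores.max()) / temperature
    w = np.exp(z)
    return w / w.sum()

def es_step(population, sigma):
    scores = np.array([fitness(p) for p in population])
    w = soft_winner_weights(scores)
    # Crossover in hindsight (simplified): recombine the whole population
    # weighted by the soft selection probabilities.
    parent = (w[:, None] * population).sum(axis=0)
    # Mutate offspring around the recombined parent; no backprop anywhere.
    offspring = parent + sigma * rng.standard_normal(population.shape)
    return offspring, scores.max()

def train(dim=5, pop_size=32, steps=100, sigma=0.5, amt_up=1.02, amt_down=0.9):
    population = rng.standard_normal((pop_size, dim))
    best = -np.inf
    for _ in range(steps):
        population, gen_best = es_step(population, sigma)
        # AMT-like rule (assumed, simplified): widen mutation when progress
        # stalls, shrink it while the population is still improving.
        sigma = sigma * (amt_up if gen_best <= best else amt_down)
        best = max(best, gen_best)
    return best
```

On the toy task the adaptive mutation scale plays the role the abstract assigns to AMT: it keeps the population "acting as randomly as possible" when fitness plateaus, without gradient updates.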
Related papers
- MARS: Unleashing the Power of Variance Reduction for Training Large Models [56.47014540413659]
Adaptive gradient algorithms like Adam and its variants have been central to the development of this type of training.
We propose a framework that reconciles preconditioned gradient optimization methods with variance reduction via a scaled momentum technique.
arXiv Detail & Related papers (2024-11-15T18:57:39Z)
- Active learning for energy-based antibody optimization and enhanced screening [0.0]
We propose an active learning workflow that efficiently trains a deep learning model to learn energy functions for specific targets.
In a case study targeting HER2-binding Trastuzumab mutants, our approach significantly improved the screening performance over random selection.
arXiv Detail & Related papers (2024-09-17T08:01:58Z)
- Trackable Agent-based Evolution Models at Wafer Scale [0.0]
We focus on the problem of extracting phylogenetic information from agent-based evolution on the 850,000 processor Cerebras Wafer Scale Engine (WSE).
We present an asynchronous island-based genetic algorithm (GA) framework for WSE hardware.
We validate phylogenetic reconstructions from these trials and demonstrate their suitability for inference of underlying evolutionary conditions.
arXiv Detail & Related papers (2024-04-16T19:24:14Z)
- Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z)
- Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality [65.67315418971688]
Nearest Orthogonal Gradient (NOG) and Optimal Learning Rate (OLR) are proposed.
Experiments on visual recognition demonstrate that our methods can simultaneously improve the covariance conditioning and generalization.
arXiv Detail & Related papers (2022-07-05T15:39:29Z)
- Transformers are Meta-Reinforcement Learners [0.060917028769172814]
We present TrMRL, a meta-RL agent that mimics the memory reinstatement mechanism using the transformer architecture.
We show that the self-attention computes a consensus representation that minimizes the Bayes Risk at each layer.
Results show that TrMRL presents comparable or superior performance, sample efficiency, and out-of-distribution generalization.
arXiv Detail & Related papers (2022-06-14T06:21:13Z)
- Hyper-Learning for Gradient-Based Batch Size Adaptation [2.944323057176686]
Scheduling the batch size to increase is an effective strategy to control noise when training deep neural networks.
We introduce Arbiter as a new hyper-optimization algorithm to perform batch size adaptations for learnable schedulings.
We demonstrate Arbiter's effectiveness in several illustrative experiments.
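The premise of this entry, growing the batch size over training to control gradient noise, can be shown with a plain step-based schedule. This is a generic sketch of the idea, not Arbiter's gradient-based adaptation; the function name and the doubling parameters are assumptions for illustration.

```python
def batch_size_schedule(step, base=32, growth=2, every=1000, cap=1024):
    """Double the batch size every `every` steps, capped at `cap`.

    Increasing the batch reduces gradient noise as training progresses,
    playing a role similar to decaying the learning rate.
    """
    return min(cap, base * growth ** (step // every))
```

A learnable scheduling, as proposed in the paper, would replace this fixed rule with parameters tuned by a hyper-optimizer.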
arXiv Detail & Related papers (2022-05-17T11:01:14Z)
- Direct Mutation and Crossover in Genetic Algorithms Applied to Reinforcement Learning Tasks [0.9137554315375919]
This paper will focus on applying neuroevolution using a simple genetic algorithm (GA) to find the weights of a neural network that produce optimally behaving agents.
We present two novel modifications that improve the data efficiency and speed of convergence when compared to the initial implementation.
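The simple-GA setting this entry describes, direct mutation and crossover on a network's weights to evolve well-behaving agents, can be sketched as follows. The one-layer "network", the stand-in reward function, and the elitism parameters are illustrative assumptions, not the paper's modifications.

```python
import numpy as np

rng = np.random.default_rng(1)

def policy(weights, obs):
    # Tiny linear "network": one weight matrix mapping observation to action.
    return np.tanh(obs @ weights)

def episode_return(weights):
    # Stand-in environment: reward the policy for outputting +1 on a fixed input.
    obs = np.ones(4)
    action = policy(weights, obs)
    return -np.sum((action - 1.0) ** 2)

def mutate(weights, sigma=0.1):
    # Direct mutation: Gaussian noise added straight to the weights.
    return weights + sigma * rng.standard_normal(weights.shape)

def crossover(a, b):
    # Uniform crossover: each weight is inherited from either parent.
    mask = rng.random(a.shape) < 0.5
    return np.where(mask, a, b)

def run_ga(pop_size=20, elites=4, generations=60):
    population = [rng.standard_normal((4, 2)) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=episode_return, reverse=True)
        parents = ranked[:elites]
        # Elitism: keep the best agents unchanged, refill the rest with
        # crossover of two random elites followed by mutation.
        children = []
        while len(children) < pop_size - elites:
            i, j = rng.choice(elites, size=2, replace=False)
            children.append(mutate(crossover(parents[i], parents[j])))
        population = parents + children
    return max(episode_return(w) for w in population)
```

No gradients are used at any point; fitness ranking alone drives the search, which is what makes this family of methods easy to parallelize.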
arXiv Detail & Related papers (2022-01-13T07:19:28Z)
- Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
- Adam revisited: a weighted past gradients perspective [57.54752290924522]
We propose a novel adaptive method weighted adaptive algorithm (WADA) to tackle the non-convergence issues.
We prove that WADA can achieve a weighted data-dependent regret bound, which could be better than the original regret bound of ADAGRAD.
arXiv Detail & Related papers (2021-01-01T14:01:52Z)
- EOS: a Parallel, Self-Adaptive, Multi-Population Evolutionary Algorithm for Constrained Global Optimization [68.8204255655161]
EOS is a global optimization algorithm for constrained and unconstrained problems of real-valued variables.
It implements a number of improvements to the well-known Differential Evolution (DE) algorithm.
Results show that EOS is capable of achieving increased performance compared to state-of-the-art single-population self-adaptive DE algorithms.
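EOS builds on Differential Evolution; the classic DE/rand/1/bin step it improves upon looks like this. The sphere objective and the fixed F and CR values are illustrative assumptions; EOS itself, per the abstract, self-adapts its parameters across multiple populations.

```python
import numpy as np

rng = np.random.default_rng(2)

def sphere(x):
    # Simple unconstrained test objective to minimize.
    return np.sum(x ** 2)

def de_optimize(dim=5, pop_size=20, generations=100, F=0.8, CR=0.9):
    """Classic DE/rand/1/bin: mutation v = a + F*(b - c), binomial crossover."""
    pop = rng.uniform(-5, 5, size=(pop_size, dim))
    fit = np.array([sphere(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than i.
            a, b, c = pop[rng.choice(
                [k for k in range(pop_size) if k != i], 3, replace=False)]
            mutant = a + F * (b - c)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True  # at least one gene from the mutant
            trial = np.where(cross, mutant, pop[i])
            tf = sphere(trial)
            if tf <= fit[i]:  # greedy one-to-one survivor selection
                pop[i], fit[i] = trial, tf
    return fit.min()
```

A multi-population variant in the spirit of EOS would run several such populations in parallel and periodically migrate individuals between them.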
arXiv Detail & Related papers (2020-07-09T10:19:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.