Related papers: Scaling Policy Gradient Quality-Diversity with Massive Parallelization via Behavioral Variations

URL: http://arxiv.org/abs/2501.18723v1
Date: Thu, 30 Jan 2025 19:56:04 GMT
Title: Scaling Policy Gradient Quality-Diversity with Massive Parallelization via Behavioral Variations
Authors: Konstantinos Mitsides, Maxence Faldor, Antoine Cully
Abstract summary: We introduce a fast, sample-efficient ME-based algorithm capable of scaling up with massive parallelization. Our experiments show that ASCII-ME can generate a diverse collection of high-performing deep neural network policies in less than 250 seconds on a single GPU.
Score: 4.787389127632926
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Quality-Diversity optimization comprises a family of evolutionary algorithms aimed at generating a collection of diverse and high-performing solutions. MAP-Elites (ME), a notable example, is used effectively in fields like evolutionary robotics. However, the reliance of ME on random mutations from Genetic Algorithms limits its ability to evolve high-dimensional solutions. Methods proposed to overcome this include using gradient-based operators like policy gradients or natural evolution strategies. While successful at scaling ME for neuroevolution, these methods often suffer from slow training speeds, or difficulties in scaling with massive parallelization due to high computational demands or reliance on centralized actor-critic training. In this work, we introduce a fast, sample-efficient ME-based algorithm capable of scaling up with massive parallelization, significantly reducing runtimes without compromising performance. Our method, ASCII-ME, unlike existing policy gradient quality-diversity methods, does not rely on centralized actor-critic training. It performs behavioral variations based on time step performance metrics and maps these variations to solutions using policy gradients. Our experiments show that ASCII-ME can generate a diverse collection of high-performing deep neural network policies in less than 250 seconds on a single GPU. Additionally, it operates, on average, five times faster than state-of-the-art algorithms while still maintaining competitive sample efficiency.
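
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a generic MAP-Elites loop with random Gaussian mutation. It is not ASCII-ME and is not taken from the paper; the function name `map_elites`, its parameters, the one-dimensional descriptor grid, and the toy fitness and descriptor functions are all assumptions chosen for illustration.

```python
import random


def map_elites(fitness, descriptor, dim, bins, iterations, seed=0, sigma=0.1):
    """Generic MAP-Elites sketch (hypothetical helper, not the paper's code)."""
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)

    def try_insert(x):
        # Assumption: the descriptor is a scalar that is meaningful on [0, 1].
        d = descriptor(x)
        cell = max(0, min(int(d * bins), bins - 1))
        f = fitness(x)
        # Keep the candidate only if its cell is empty or it beats the elite.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, x)

    # Seed the archive with random solutions.
    for _ in range(bins):
        try_insert([rng.uniform(-1.0, 1.0) for _ in range(dim)])

    # Main loop: pick an elite, mutate it randomly, try to insert the child.
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))
        child = [p + rng.gauss(0.0, sigma) for p in parent]
        try_insert(child)
    return archive


def toy_fitness(x):
    # Toy objective (assumption): higher is better, maximum at the origin.
    return -sum(v * v for v in x)


def toy_descriptor(x):
    # Toy behavior descriptor (assumption): first coordinate mapped to [0, 1].
    return min(1.0, max(0.0, (x[0] + 1.0) / 2.0))


archive = map_elites(toy_fitness, toy_descriptor, dim=4, bins=10, iterations=500)
```

The random mutation step in the main loop is the part the abstract identifies as limiting in high-dimensional search spaces; gradient-based variants replace or complement it with an informed update.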

Related papers

Frankenstein Optimizer: Harnessing the Potential by Revisiting Optimization Tricks [2.932254642052481]
We propose Frankenstein, which combines the mechanisms of various adaptive algorithms. We show that Frankenstein surpasses existing adaptive algorithms and stochastic gradient descent (SGD). This research deepens our understanding of adaptive algorithms through centered kernel alignment analysis and loss landscape visualization during the learning process.
arXiv Detail & Related papers (2025-03-04T00:25:54Z)
Linearly Convergent Mixup Learning [0.0]
We present two novel algorithms that extend to a broader range of binary classification models. Unlike gradient-based approaches, our algorithms do not require hyperparameters such as learning rates, simplifying their implementation and optimization. Our algorithms converge to the optimal solution faster than gradient descent approaches, and mixup data augmentation consistently improves predictive performance across various loss functions.
arXiv Detail & Related papers (2025-01-14T02:33:40Z)
MARS: Unleashing the Power of Variance Reduction for Training Large Models [56.47014540413659]
Adaptive gradient algorithms like Adam, AdamW, and their variants have been central to training large models. We propose a framework that reconciles preconditioned gradient optimization methods with variance reduction via a scaled momentum technique.
arXiv Detail & Related papers (2024-11-15T18:57:39Z)
Classifier-guided Gradient Modulation for Enhanced Multimodal Learning [50.7008456698935]
Classifier-Guided Gradient Modulation (CGGM) is a novel method to balance multimodal learning with gradients. We conduct extensive experiments on four multimodal datasets: UPMC-Food 101, CMU-MOSI, IEMOCAP and BraTS. CGGM consistently outperforms all the baselines and other state-of-the-art methods.
arXiv Detail & Related papers (2024-11-03T02:38:43Z)
Massive Dimensions Reduction and Hybridization with Meta-heuristics in Deep Learning [0.24578723416255746]
Histogram-based Differential Evolution (HBDE) hybridizes gradient-based and gradient-free algorithms to optimize parameters. HBDE outperforms baseline gradient-based and parent gradient-free DE algorithms evaluated on CIFAR-10 and CIFAR-100 datasets.
arXiv Detail & Related papers (2024-08-13T20:28:20Z)
Training Artificial Neural Networks by Coordinate Search Algorithm [0.20971479389679332]
We propose an efficient version of the gradient-free Coordinate Search (CS) algorithm for training neural networks. The proposed algorithm can be used with non-differentiable activation functions and tailored to multi-objective/multi-loss problems. Finding the optimal values for weights of ANNs is a large-scale optimization problem.
arXiv Detail & Related papers (2024-02-20T01:47:25Z)
GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model [69.71629949747884]
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data. In this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples.
arXiv Detail & Related papers (2024-01-11T16:43:16Z)
Improving Gradient Methods via Coordinate Transformations: Applications to Quantum Machine Learning [0.0]
Machine learning algorithms rely heavily on gradient-based optimization algorithms, such as gradient descent and the like. The overall performance depends on the appearance of local minima and barren plateaus, which slow down calculations and lead to non-optimal solutions. In this paper we introduce a generic strategy to accelerate and improve the overall performance of such methods, alleviating the effect of barren plateaus and local minima.
arXiv Detail & Related papers (2023-04-13T18:26:05Z)
Neuroevolution of Physics-Informed Neural Nets: Benchmark Problems and Comparative Results [25.12291688711645]
Physics-informed neural networks (PINNs) are one of the key techniques at the forefront of recent advances. PINNs' unique loss formulations lead to a high degree of complexity and ruggedness that may not be conducive to gradient descent. Neuroevolution algorithms, with their superior global search capacity, may be a better choice for PINNs.
arXiv Detail & Related papers (2022-12-15T05:54:16Z)
Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC). Our algorithm alleviates problems with local minima through a smooth critic function. We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
Learning Sampling Policy for Faster Derivative Free Optimization [100.27518340593284]
We propose a new reinforcement learning based zeroth-order (ZO) algorithm (ZO-RL) that learns the sampling policy for generating the perturbations in ZO optimization instead of using random sampling. Our results show that our ZO-RL algorithm can effectively reduce the variance of the ZO gradient by learning a sampling policy, and converges faster than existing ZO algorithms in different scenarios.
arXiv Detail & Related papers (2021-04-09T14:50:59Z)
GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction [69.94831587339539]
We present a modified Cross-Entropy Method (CEM) that uses a masked auto-regressive neural network for modeling uniform distributions over the solution space. Our algorithm is able to express complicated solution spaces, thus allowing it to track a variety of different solution regions.
arXiv Detail & Related papers (2020-02-17T20:21:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.