Shrink-Perturb Improves Architecture Mixing during Population Based
Training for Neural Architecture Search
- URL: http://arxiv.org/abs/2307.15621v1
- Date: Fri, 28 Jul 2023 15:29:52 GMT
- Title: Shrink-Perturb Improves Architecture Mixing during Population Based
Training for Neural Architecture Search
- Authors: Alexander Chebykin, Arkadiy Dushatskiy, Tanja Alderliesten, Peter A.
N. Bosman
- Abstract summary: We show that simultaneously training and mixing neural networks is a promising way to conduct Neural Architecture Search (NAS).
We propose PBT-NAS, an adaptation of PBT to NAS in which architectures are improved during training by replacing poorly-performing networks in a population with the result of mixing well-performing ones, inheriting the weights via the shrink-perturb technique.
- Score: 62.997667081978825
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we show that simultaneously training and mixing neural networks
is a promising way to conduct Neural Architecture Search (NAS). For
hyperparameter optimization, reusing the partially trained weights allows for
efficient search, as was previously demonstrated by the Population Based
Training (PBT) algorithm. We propose PBT-NAS, an adaptation of PBT to NAS where
architectures are improved during training by replacing poorly-performing
networks in a population with the result of mixing well-performing ones and
inheriting the weights using the shrink-perturb technique. After PBT-NAS
terminates, the created networks can be directly used without retraining.
PBT-NAS is highly parallelizable and effective: on challenging tasks (image
generation and reinforcement learning) PBT-NAS achieves superior performance
compared to baselines (random search and mutation-based PBT).
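To make the mixing step more concrete, below is a minimal sketch (in PyTorch-style Python) of how a replaced network might inherit layers from two well-performing parents, with shrink-perturb applied to the weights taken from the second parent. The helper names, the 50/50 per-layer mixing rule, and the shrink/perturb coefficients are illustrative assumptions, not the exact procedure from the paper.

```python
import copy

import torch


def shrink_perturb(state_dict, shrink=0.4, perturb=0.1):
    """Shrink-perturb: scale inherited weights down and add a small amount of
    freshly drawn noise so the inherited weights stay informative but remain
    trainable. Coefficients are illustrative defaults, not tuned values."""
    return {
        name: shrink * w + perturb * torch.randn_like(w)  # noise stands in for a fresh init draw
        for name, w in state_dict.items()
    }


def mix_parents(parent_a, parent_b, rng):
    """Hypothetical sketch of one replacement step: build a child by taking
    each layer's architecture choice (and its weights) from one of two
    well-performing parents; layers taken from parent_b are shrink-perturbed.
    Each parent is assumed to map layer names to (arch_choice, state_dict)
    pairs with matching layer names; `rng` is a random.Random instance."""
    child = {}
    for layer in parent_a:
        if rng.random() < 0.5:
            child[layer] = copy.deepcopy(parent_a[layer])  # keep this layer as-is
        else:
            arch_choice, weights = parent_b[layer]
            child[layer] = (arch_choice, shrink_perturb(weights))
    return child
```

A poorly-performing population member could then be overwritten with `mix_parents(best, second_best, random.Random(seed))` and continue training from the inherited weights; the shrink step preserves useful information from the parents while the perturbation adds enough noise for the mixed architecture to keep improving.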
Related papers
- Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks [69.2642802272367]
Brain-inspired neuromorphic computing with spiking neural networks (SNNs) is a promising energy-efficient computational approach.
Most recent methods leverage spatial and temporal backpropagation (BP), not adhering to neuromorphic properties.
We propose a novel method, online pseudo-zeroth-order (OPZO) training.
arXiv Detail & Related papers (2024-07-17T12:09:00Z)
- Robustifying and Boosting Training-Free Neural Architecture Search [49.828875134088904]
We propose a robustifying and boosting training-free NAS (RoBoT) algorithm to develop a robust and consistently better-performing metric on diverse tasks.
Remarkably, the expected performance of our RoBoT can be theoretically guaranteed, improving over existing training-free NAS methods.
arXiv Detail & Related papers (2024-03-12T12:24:11Z)
- Online Training Through Time for Spiking Neural Networks [66.7744060103562]
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models.
Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency.
We propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning.
arXiv Detail & Related papers (2022-10-09T07:47:56Z)
- Doing More by Doing Less: How Structured Partial Backpropagation
Improves Deep Learning Clusters [9.17259958324486]
Training deep learning models is resource-intensive, consuming significant compute, memory, and network resources.
We propose Structured Partial Backpropagation (SPB), a technique that controls the amount of backpropagation performed at individual workers in distributed training.
We find that JigSaw, an SPB-based scheduler, can improve large-scale cluster efficiency by as much as 28%.
arXiv Detail & Related papers (2021-11-20T20:34:26Z)
- Faster Improvement Rate Population Based Training [7.661301899629696]
This paper presents Faster Improvement Rate PBT (FIRE PBT), which addresses a shortcoming of Population Based Training (PBT): its greedy focus on short-term improvements.
We derive a novel fitness metric and use it to make some of the population members focus on long-term performance.
Experiments show that FIRE PBT is able to outperform PBT on the ImageNet benchmark and match the performance of networks that were trained with a hand-tuned learning rate schedule.
arXiv Detail & Related papers (2021-09-28T15:30:55Z)
- BN-NAS: Neural Architecture Search with Batch Normalization [116.47802796784386]
We present BN-NAS, neural architecture search with Batch Normalization, to accelerate neural architecture search (NAS).
BN-NAS can significantly reduce the time required by model training and evaluation in NAS.
arXiv Detail & Related papers (2021-08-16T23:23:21Z)
- EPE-NAS: Efficient Performance Estimation Without Training for Neural
Architecture Search [1.1470070927586016]
We propose EPE-NAS, an efficient performance estimation strategy that mitigates the need to train candidate networks in order to evaluate them.
We show that EPE-NAS can produce a robust correlation and that by incorporating it into a simple random sampling strategy, we are able to search for competitive networks, without requiring any training, in a matter of seconds using a single GPU.
arXiv Detail & Related papers (2021-02-16T11:47:05Z)
- Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce the heavy computational cost on embedded devices for edge computing.
An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model and a 40% faster search than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z)
- Regularized Evolutionary Population-Based Training [11.624954122221562]
This paper presents an algorithm called Evolutionary Population-Based Training (EPBT) that interleaves the training of a DNN's weights with the metalearning of loss functions.
EPBT results in faster, more accurate learning on image classification benchmarks.
arXiv Detail & Related papers (2020-02-11T06:28:13Z)