BOMP-NAS: Bayesian Optimization Mixed Precision NAS
- URL: http://arxiv.org/abs/2301.11810v1
- Date: Fri, 27 Jan 2023 16:04:34 GMT
- Title: BOMP-NAS: Bayesian Optimization Mixed Precision NAS
- Authors: David van Son, Floran de Putter, Sebastian Vogel, Henk Corporaal
- Abstract summary: BOMP-NAS is able to find neural networks that achieve state-of-the-art performance at much lower design costs.
BOMP-NAS can find these neural networks in a 6x shorter search time than the closest related work.
- Score: 2.53488567499651
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian Optimization Mixed-Precision Neural Architecture Search (BOMP-NAS)
is an approach to quantization-aware neural architecture search (QA-NAS) that
leverages both Bayesian optimization (BO) and mixed-precision quantization (MP)
to efficiently search for compact, high-performance deep neural networks. The
results show that integrating quantization-aware fine-tuning (QAFT) into the
NAS loop is a necessary step to find networks that perform well under
low-precision quantization: integrating it allows a model size reduction of
nearly 50% on the CIFAR-10 dataset. BOMP-NAS is able to find neural networks
that achieve state-of-the-art performance at much lower design costs. This
study shows that BOMP-NAS can find these neural networks in a 6x shorter search
time than the closest related work.
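To make the search loop concrete, here is a minimal sketch of Bayesian optimization over a joint architecture/bitwidth space in the spirit of BOMP-NAS. The search space, the toy proxy objective, and the use of scikit-optimize's gp_minimize are illustrative assumptions; the actual method trains each candidate and applies quantization-aware fine-tuning (QAFT) inside the loop, which is stubbed out here.
```python
# Minimal sketch, assuming scikit-optimize; NOT the paper's actual setup.
from skopt import gp_minimize
from skopt.space import Integer, Categorical

# Hypothetical joint architecture + precision search space.
space = [
    Integer(2, 8, name="num_blocks"),            # network depth
    Integer(16, 128, name="base_width"),         # channel width
    Categorical([2, 4, 8], name="bits_stage1"),  # weight bitwidths
    Categorical([2, 4, 8], name="bits_stage2"),
]

def evaluate_candidate(params):
    """Stub for: build the candidate, train it, run quantization-aware
    fine-tuning (QAFT), then measure accuracy and quantized model size."""
    num_blocks, base_width, b1, b2 = params
    # Toy proxies standing in for real training results (assumptions).
    capacity = num_blocks * base_width * (b1 + b2)
    acc_proxy = 1.0 - 1.0 / capacity
    size_kb = capacity / 8.0
    # BO minimizes, so return negative accuracy plus a size penalty.
    return -(acc_proxy - 1e-4 * size_kb)

result = gp_minimize(evaluate_candidate, space, n_calls=30, random_state=0)
print("best config:", result.x, "objective:", result.fun)
```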
Related papers
- RNC: Efficient RRAM-aware NAS and Compilation for DNNs on Resource-Constrained Edge Devices [0.30458577208819987]
We aim to develop edge-friendly deep neural networks (DNNs) for accelerators based on resistive random-access memory (RRAM).
We propose an edge compilation and resource-constrained RRAM-aware neural architecture search (NAS) framework to search for optimized neural networks meeting specific hardware constraints.
The model found by the speed-optimized NAS achieved a 5x-30x speedup.
arXiv Detail & Related papers (2024-09-27T15:35:36Z)
- LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization [48.41286573672824]
Spiking Neural Networks (SNNs) mimic the information-processing mechanisms of the human brain and are highly energy-efficient.
We propose a new approach named LitE-SNN that incorporates both spatial and temporal compression into the automated network design process.
arXiv Detail & Related papers (2024-01-26T05:23:11Z)
- Low Precision Quantization-aware Training in Spiking Neural Networks with Differentiable Quantization Function [0.5046831208137847]
This work aims to bridge the gap between recent progress in quantized neural networks and spiking neural networks.
It presents an extensive study on the performance of the quantization function, represented as a linear combination of sigmoid functions.
The presented quantization function demonstrates state-of-the-art performance on four popular benchmarks.
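As an illustration of such a quantization function, the sketch below builds a generic soft staircase from shifted sigmoids; the level grid and temperature are assumptions, not the paper's exact parameterization.
```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_quantize(x, levels, temperature=10.0):
    """Differentiable staircase built as a linear combination of shifted
    sigmoids: one smooth step per threshold between adjacent quantization
    levels. Larger `temperature` approaches hard rounding."""
    thresholds = (levels[:-1] + levels[1:]) / 2.0  # step midpoints
    steps = np.diff(levels)                        # step heights
    y = np.full_like(x, levels[0])
    for t, s in zip(thresholds, steps):
        y = y + s * sigmoid(temperature * (x - t))
    return y

levels = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # assumed 5-level grid
x = np.linspace(-1.5, 1.5, 7)
print(soft_quantize(x, levels))
```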
arXiv Detail & Related papers (2023-05-30T09:42:05Z)
- NAS-PRNet: Neural Architecture Search generated Phase Retrieval Net for Off-axis Quantitative Phase Imaging [5.943105097884823]
We propose a Neural Architecture Search (NAS) generated Phase Retrieval Net (NAS-PRNet).
NAS-PRNet is an encoder-decoder style neural network, automatically found from a large neural network architecture search space.
NAS-PRNet has achieved a Peak Signal-to-Noise Ratio (PSNR) of 36.1 dB, outperforming the widely used U-Net and the original SparseMask-generated neural network.
arXiv Detail & Related papers (2022-10-25T16:16:41Z)
- Neural Architecture Search for Improving Latency-Accuracy Trade-off in Split Computing [5.516431145236317]
Split computing is an emerging machine-learning inference technique that addresses the privacy and latency challenges of deploying deep learning in IoT systems.
In split computing, a neural network model is split and processed cooperatively by IoT devices and edge servers over the network.
This paper proposes a neural architecture search (NAS) method for split computing.
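For readers unfamiliar with the setting, here is a minimal sketch of split computing in general (not the paper's NAS method), assuming a PyTorch model cut at an arbitrary layer index:
```python
import torch
import torch.nn as nn

# Sketch of split computing: the model is cut at an assumed split index,
# the head runs on the IoT device, and its intermediate activation is
# sent to an edge server that runs the tail.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
split_at = 4                    # assumed cut point; a NAS could search it
head, tail = model[:split_at], model[split_at:]

x = torch.randn(1, 3, 32, 32)   # input captured on the IoT device
z = head(x)                     # on-device computation
# ... in practice z is compressed and transmitted over the network ...
logits = tail(z)                # edge-server computation
print(z.shape, logits.shape)
```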
arXiv Detail & Related papers (2022-08-30T03:15:43Z)
- Neural Architecture Search of SPD Manifold Networks [79.45110063435617]
We propose a new neural architecture search (NAS) problem of Symmetric Positive Definite (SPD) manifold networks.
We first introduce a geometrically rich and diverse SPD neural architecture search space for an efficient SPD cell design.
We exploit a differentiable NAS algorithm on our relaxed continuous search space for SPD neural architecture search.
arXiv Detail & Related papers (2020-10-27T18:08:57Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- Searching for Low-Bit Weights in Quantized Neural Networks [129.8319019563356]
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.
We propose regarding the discrete weights in an arbitrary quantized neural network as searchable variables, and utilize a differentiable method to search for them accurately.
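One common differentiable relaxation of this idea is sketched below; the softmax-over-levels formulation, level grid, and toy objective are assumptions rather than the paper's exact method.
```python
import torch

# Sketch: each weight keeps logits over a fixed low-bit level grid; the
# soft expectation is trained by gradient descent, and the argmax level
# is taken at the end to obtain discrete weights.
levels = torch.tensor([-1.0, 0.0, 1.0])            # assumed ternary grid
logits = torch.randn(64, len(levels), requires_grad=True)

def soft_weights(logits, tau=1.0):
    probs = torch.softmax(logits / tau, dim=-1)    # relaxed selection
    return probs @ levels                          # expected weight values

target = torch.randn(64)                           # stand-in objective
opt = torch.optim.Adam([logits], lr=0.1)
for _ in range(200):
    loss = ((soft_weights(logits) - target.clamp(-1, 1)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

hard = levels[logits.argmax(dim=-1)]               # final discrete weights
print(hard[:8])
```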
arXiv Detail & Related papers (2020-09-18T09:13:26Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 is a family of state-of-the-art compact neural networks that outperform both automatically and manually designed competitors.
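A minimal sketch of what a joint architecture-recipe predictor can look like is given below; the encodings, dimensions, and MLP are assumptions, not FBNetV3's actual design.
```python
import torch
import torch.nn as nn

# Sketch: one model scores the concatenation of an architecture embedding
# and a training-recipe embedding, so candidate pairs can be ranked
# cheaply before any full training run.
arch_dim, recipe_dim = 16, 8     # e.g. encoded layer choices / lr, epochs
predictor = nn.Sequential(
    nn.Linear(arch_dim + recipe_dim, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

arch = torch.rand(32, arch_dim)      # batch of encoded architectures
recipe = torch.rand(32, recipe_dim)  # matching training recipes
scores = predictor(torch.cat([arch, recipe], dim=-1)).squeeze(-1)
print(scores.topk(5).indices)        # top-ranked candidate pairs
```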
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of full-precision networks.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in the original full-precision networks to high-dimensional quantization features.
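The widen-quantize-squeeze idea suggested by the title can be sketched as follows; the channel sizes, bitwidth, and fake-quantizer are assumptions.
```python
import torch
import torch.nn as nn

# Sketch: features are projected to a wider space, quantized there, and
# squeezed back, spreading quantization error over more channels.
def fake_quant(x, bits=2):
    scale = x.abs().max() / (2 ** (bits - 1) - 1) + 1e-8
    return torch.round(x / scale) * scale

widen = nn.Conv2d(64, 256, kernel_size=1)    # project features up
squeeze = nn.Conv2d(256, 64, kernel_size=1)  # project back down

x = torch.randn(1, 64, 8, 8)
y = squeeze(fake_quant(widen(x)))
print(y.shape)  # torch.Size([1, 64, 8, 8])
```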
arXiv Detail & Related papers (2020-02-03T04:11:13Z)