BayesFT: Bayesian Optimization for Fault Tolerant Neural Network
Architecture
- URL: http://arxiv.org/abs/2210.01795v1
- Date: Fri, 30 Sep 2022 20:13:05 GMT
- Title: BayesFT: Bayesian Optimization for Fault Tolerant Neural Network
Architecture
- Authors: Nanyang Ye, Jingbiao Mei, Zhicheng Fang, Yuwen Zhang, Ziqing Zhang,
Huaying Wu, Xiaoyao Liang
- Abstract summary: We propose a novel Bayesian optimization method for fault tolerant neural network architecture (BayesFT)
Our framework has outperformed the state-of-the-art methods by up to 10 times on various tasks, such as image classification and object detection.
- Score: 8.005491953251541
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To deploy deep learning algorithms in resource-limited scenarios, an
emerging device, resistive random access memory (ReRAM), has been regarded as
promising for analog computing. However, the practicability of ReRAM is
primarily limited by the weight drifting of ReRAM neural networks, which arises
from multiple factors, including manufacturing variations and thermal noise. In this paper, we
propose a novel Bayesian optimization method for fault tolerant neural network
architecture (BayesFT). To design the search space, instead of conducting
neural architecture search over the whole feasible architecture space, we first
systematically explore the weight-drifting tolerance of different neural
network components, such as dropout, normalization, number of layers, and
activation functions; among these, dropout is found to
improve the neural network robustness to weight drifting. Based on our
analysis, we propose an efficient search space by only searching for dropout
rates for each layer. Then, we use Bayesian optimization to search for the
optimal neural architecture robust to weight drifting. Empirical experiments
demonstrate that our algorithmic framework has outperformed the
state-of-the-art methods by up to 10 times on various tasks, such as image
classification and object detection.
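
The search loop described in the abstract is straightforward to picture. Below is a minimal sketch, assuming scikit-optimize for the Bayesian optimization and a multiplicative log-normal noise model for the simulated weight drift; the backbone, noise level, and toy training loop are illustrative stand-ins, not the authors' implementation.

```python
# Sketch: Bayesian optimization over per-layer dropout rates, scoring each
# candidate architecture by its accuracy under simulated ReRAM weight drift.
import copy

import torch
import torch.nn as nn
from skopt import gp_minimize
from skopt.space import Real

def build_mlp(dropout_rates):
    """Fixed backbone; only the per-layer dropout rates are searched."""
    dims = [784, 256, 256, 10]
    layers = []
    for i, p in enumerate(dropout_rates):  # one rate per hidden layer
        layers += [nn.Linear(dims[i], dims[i + 1]), nn.ReLU(), nn.Dropout(p)]
    layers.append(nn.Linear(dims[-2], dims[-1]))
    return nn.Sequential(*layers)

def train(model, steps=200):
    # Toy training on random data; stands in for the real task.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()

def drift(model, sigma=0.1):
    """Simulate weight drift with multiplicative log-normal noise (assumed model)."""
    noisy = copy.deepcopy(model)
    with torch.no_grad():
        for p in noisy.parameters():
            p.mul_(torch.exp(torch.randn_like(p) * sigma))
    return noisy

def evaluate(model):
    model.eval()
    with torch.no_grad():
        x, y = torch.randn(512, 784), torch.randint(0, 10, (512,))
        return (model(x).argmax(dim=1) == y).float().mean().item()

def objective(dropout_rates):
    model = build_mlp(dropout_rates)
    train(model)
    accs = [evaluate(drift(model)) for _ in range(5)]  # average over drift draws
    return -sum(accs) / len(accs)                      # gp_minimize minimizes

result = gp_minimize(
    objective,
    dimensions=[Real(0.0, 0.8), Real(0.0, 0.8)],  # search space: dropout rates only
    n_calls=30,
)
print("best per-layer dropout rates:", result.x)
```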
Related papers
- Simultaneous Weight and Architecture Optimization for Neural Networks [6.2241272327831485]
We introduce a novel neural network training framework that transforms the training process by learning architecture and parameters simultaneously with gradient descent.
Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds pairs of neural networks with similar functionalities close to each other.
Experiments demonstrate that our framework can discover sparse and compact neural networks that maintain high performance.
arXiv Detail & Related papers (2024-10-10T19:57:36Z) - Neural Networks for Vehicle Routing Problem [0.0]
Route optimization can be viewed as a new challenge for neural networks.
Recent developments in machine learning provide a new toolset for tackling complex problems.
The main areas of application of neural networks have been classification and regression.
arXiv Detail & Related papers (2024-09-17T15:45:30Z) - NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with
Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger.
It decomposes the original learning task into several coarser-resolution subtasks.
We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
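
The core decomposition can be pictured with a short, purely illustrative sketch (not the NeuralStagger implementation): a fine grid is split into staggered coarse subgrids that cheaper coarse-resolution solvers could handle independently, then interleaved back; the helper names are invented for this example.

```python
# Sketch: staggered spatial decomposition of a 2D field.
import torch

def stagger_split(field: torch.Tensor, s: int):
    """Split an (H, W) field into s*s coarse (H//s, W//s) staggered subfields."""
    return [field[i::s, j::s] for i in range(s) for j in range(s)]

def stagger_merge(subfields, s: int):
    """Inverse of stagger_split: interleave subfields back to full resolution."""
    H, W = subfields[0].shape
    full = torch.empty(H * s, W * s, dtype=subfields[0].dtype)
    for k, sub in enumerate(subfields):
        i, j = divmod(k, s)
        full[i::s, j::s] = sub
    return full

x = torch.randn(64, 64)
parts = stagger_split(x, 2)  # four staggered 32x32 views of the same field
assert torch.equal(stagger_merge(parts, 2), x)
```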
arXiv Detail & Related papers (2023-02-20T19:36:52Z) - Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalized from the sample-wise analysis to the real batch setting, the resulting Neural Initialization Optimization (NIO) algorithm automatically finds a better initialization at negligible cost.
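
As a rough illustration of the quantity (the paper's exact definition, norm constraint, and batch generalization may differ), the sketch below scores an initialization by the average pairwise cosine similarity of per-sample gradients.

```python
# Sketch: a GradCosine-style score at initialization.
import torch
import torch.nn as nn

def per_sample_grad(model, x, y):
    model.zero_grad()
    loss = nn.functional.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

def grad_cosine(model, xs, ys):
    grads = torch.stack([per_sample_grad(model, x, y) for x, y in zip(xs, ys)])
    grads = nn.functional.normalize(grads, dim=1)
    sim = grads @ grads.T                   # pairwise cosine-similarity matrix
    n = sim.shape[0]
    return (sim.sum() - n) / (n * (n - 1))  # mean of the off-diagonal entries

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 5))
xs, ys = torch.randn(8, 20), torch.randint(0, 5, (8,))
print("GradCosine at init:", grad_cosine(model, xs, ys).item())
```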
arXiv Detail & Related papers (2022-10-12T06:49:16Z) - D-DARTS: Distributed Differentiable Architecture Search [75.12821786565318]
Differentiable ARchiTecture Search (DARTS) is one of the most popular Neural Architecture Search (NAS) methods.
We propose D-DARTS, a novel solution that addresses DARTS's limitations by nesting several neural networks at the cell level.
arXiv Detail & Related papers (2021-08-20T09:07:01Z) - Differentiable Neural Architecture Learning for Efficient Neural Network
Design [31.23038136038325]
We introduce a novel architecture parameterisation based on a scaled sigmoid function.
We then propose a general Differentiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
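
A minimal sketch of what such a scaled-sigmoid parameterisation could look like (illustrative only; GatedOp, MixedLayer, and the annealing schedule are invented for this example, not DNAL's code): each candidate operation is gated by sigmoid(beta * alpha), and annealing beta upward pushes the soft gates toward binary choices.

```python
# Sketch: scaled-sigmoid gating of candidate operations.
import torch
import torch.nn as nn

class GatedOp(nn.Module):
    def __init__(self, op: nn.Module):
        super().__init__()
        self.op = op
        self.alpha = nn.Parameter(torch.zeros(1))  # architecture parameter
        self.beta = 1.0                            # annealed scale, not learned

    def forward(self, x):
        gate = torch.sigmoid(self.beta * self.alpha)
        return gate * self.op(x)

class MixedLayer(nn.Module):
    """Sum of gated candidates; gradient descent learns weights and gates jointly."""
    def __init__(self, candidates):
        super().__init__()
        self.ops = nn.ModuleList(GatedOp(op) for op in candidates)

    def forward(self, x):
        return sum(op(x) for op in self.ops)

layer = MixedLayer([nn.Conv2d(8, 8, 3, padding=1),
                    nn.Conv2d(8, 8, 5, padding=2),
                    nn.Identity()])
y = layer(torch.randn(1, 8, 16, 16))  # train normally; anneal beta over epochs
for g in layer.ops:
    g.beta *= 1.5                     # harden gates toward a discrete choice
```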
arXiv Detail & Related papers (2021-03-03T02:03:08Z) - NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural
Network Synthesis [53.106414896248246]
We present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge.
Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application.
arXiv Detail & Related papers (2020-09-28T01:48:45Z) - NAS-DIP: Learning Deep Image Prior with Neural Architecture Search [65.79109790446257]
Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior.
We propose to search for neural architectures that capture stronger image priors.
We search for an improved network by leveraging an existing neural architecture search algorithm.
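
For intuition, here is a minimal sketch of the underlying deep-image-prior effect, with a tiny stand-in network rather than a searched architecture; NAS-DIP searches for architectures that make this prior stronger.

```python
# Sketch: a randomly initialized CNN fit to a single noisy image acts as an
# image prior; early stopping yields a denoised estimate because the network
# fits natural structure before it fits noise.
import torch
import torch.nn as nn

noisy = torch.rand(1, 3, 64, 64)  # stand-in for a noisy target image
z = torch.randn(1, 32, 64, 64)    # fixed random input code

net = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(300):  # stop early rather than fitting the noise exactly
    loss = nn.functional.mse_loss(net(z), noisy)
    opt.zero_grad(); loss.backward(); opt.step()

denoised = net(z).detach()
```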
arXiv Detail & Related papers (2020-08-26T17:59:36Z) - VINNAS: Variational Inference-based Neural Network Architecture Search [2.685668802278155]
We present a differentiable variational inference-based NAS method for searching sparse convolutional neural networks.
Our method finds diverse network cells, while showing state-of-the-art accuracy with up to almost 2 times fewer non-zero parameters.
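
A rough sketch of the variational ingredient (a mean-field Gaussian posterior over weights, a KL regularizer, and signal-to-noise pruning; the class below is hypothetical, and the full method applies variational inference in the NAS setting, not just to a single layer):

```python
# Sketch: variational weights whose KL term encourages prunable, sparse solutions.
import torch
import torch.nn as nn

class VariationalLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(d_out, d_in) * 0.1)
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))

    def forward(self, x):
        sigma = self.log_sigma.exp()
        w = self.mu + sigma * torch.randn_like(sigma)  # reparameterization trick
        return x @ w.T

    def kl(self):
        # KL(q(w) || N(0, 1)) summed over all weights.
        s2 = (2 * self.log_sigma).exp()
        return 0.5 * (self.mu**2 + s2 - 2 * self.log_sigma - 1).sum()

    def sparsity_mask(self, thresh=1.0):
        # Keep only high signal-to-noise weights; the rest can be pruned.
        return (self.mu.abs() / self.log_sigma.exp()) > thresh

layer = VariationalLinear(20, 5)
loss = layer(torch.randn(8, 20)).pow(2).mean() + 1e-3 * layer.kl()
loss.backward()
print("kept weights:", layer.sparsity_mask().float().mean().item())
```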
arXiv Detail & Related papers (2020-07-12T21:47:35Z) - Multi-fidelity Neural Architecture Search with Knowledge Distillation [69.09782590880367]
We propose a Bayesian multi-fidelity method for neural architecture search: MF-KD.
Knowledge distillation adds a term to the loss function that forces the network to mimic a teacher network.
We show that training for a few epochs with such a modified loss function leads to a better selection of neural architectures than training for a few epochs with a logistic loss.
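
The modified loss is easy to picture; a minimal sketch follows (the temperature and weighting are illustrative assumptions, not MF-KD's settings): standard cross-entropy plus a distillation term pushing the student's softened predictions toward the teacher's.

```python
# Sketch: cross-entropy plus a knowledge-distillation term.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    ce = F.cross_entropy(student_logits, labels)
    distill = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps the gradient scale comparable across temperatures
    return alpha * ce + (1 - alpha) * distill

student_logits = torch.randn(16, 10, requires_grad=True)
teacher_logits = torch.randn(16, 10)
labels = torch.randint(0, 10, (16,))
print(kd_loss(student_logits, teacher_logits, labels))
```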
arXiv Detail & Related papers (2020-06-15T12:32:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.