Hyperparameter Optimization in Binary Communication Networks for
Neuromorphic Deployment
- URL: http://arxiv.org/abs/2005.04171v1
- Date: Tue, 21 Apr 2020 01:15:45 GMT
- Title: Hyperparameter Optimization in Binary Communication Networks for
Neuromorphic Deployment
- Authors: Maryam Parsa, Catherine D. Schuman, Prasanna Date, Derek C. Rose, Bill
Kay, J. Parker Mitchell, Steven R. Young, Ryan Dellana, William Severa,
Thomas E. Potok, Kaushik Roy
- Abstract summary: Training neural networks for neuromorphic deployment is non-trivial.
We introduce a Bayesian approach for optimizing the hyperparameters of an algorithm for training binary communication networks that can be deployed to neuromorphic hardware.
We show that by optimizing the hyperparameters of this algorithm for each dataset, we can achieve improvements in accuracy over the previous state-of-the-art for this algorithm on each dataset.
- Score: 4.280642750854163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training neural networks for neuromorphic deployment is non-trivial. There
have been a variety of approaches proposed to adapt back-propagation or
back-propagation-like algorithms appropriate for training. Considering that
these networks often have very different performance characteristics than
traditional neural networks, it is often unclear how to set either the network
topology or the hyperparameters to achieve optimal performance. In this work,
we introduce a Bayesian approach for optimizing the hyperparameters of an
algorithm for training binary communication networks that can be deployed to
neuromorphic hardware. We show that by optimizing the hyperparameters on this
algorithm for each dataset, we can achieve improvements in accuracy over the
previous state-of-the-art for this algorithm on each dataset (by up to 15
percent). This jump in performance continues to emphasize the potential when
converting traditional neural networks to binary communication applicable to
neuromorphic hardware.
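For readers who want a concrete picture of the kind of Bayesian hyperparameter search described in the abstract, the sketch below runs Gaussian-process-based optimization (here via scikit-optimize's gp_minimize) over a few illustrative hyperparameters of a placeholder training routine. The function train_binary_network, the hyperparameter names, and their ranges are assumptions made for illustration only, not the authors' actual algorithm or search space; in practice the objective would wrap a full binary-communication training run per dataset and return its validation accuracy.

import math

from skopt import gp_minimize
from skopt.space import Integer, Real


def train_binary_network(learning_rate, hidden_units, sharpness, epochs):
    """Hypothetical stand-in for a binary-communication training routine.

    Returns a synthetic "validation accuracy" so the sketch runs end to end;
    replace this with the real training + evaluation loop for the target
    neuromorphic training algorithm.
    """
    return (
        math.exp(-(math.log10(learning_rate) + 2.0) ** 2)   # peaks near lr = 1e-2
        * math.exp(-((hidden_units - 512) / 512.0) ** 2)     # peaks near 512 units
        * math.exp(-((sharpness - 4.0) / 4.0) ** 2)          # peaks near 4.0
        * math.exp(-((epochs - 50) / 50.0) ** 2)             # peaks near 50 epochs
    )


# Illustrative search space; names and ranges are assumptions.
space = [
    Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate"),
    Integer(64, 1024, name="hidden_units"),
    Real(0.5, 10.0, name="sharpness"),
    Integer(5, 100, name="epochs"),
]


def objective(params):
    learning_rate, hidden_units, sharpness, epochs = params
    accuracy = train_binary_network(learning_rate, hidden_units, sharpness, epochs)
    return -accuracy  # gp_minimize minimizes, so negate the accuracy


# Gaussian-process Bayesian optimization: fit a surrogate to the observed
# (hyperparameters, accuracy) pairs and pick the next configuration with an
# expected-improvement acquisition function.
result = gp_minimize(objective, space, n_calls=30, acq_func="EI", random_state=0)
print("best accuracy:", -result.fun)
print("best hyperparameters:", result.x)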
Related papers
- Training Hamiltonian neural networks without backpropagation [0.0]
We present a backpropagation-free algorithm to accelerate the training of neural networks for approximating Hamiltonian systems.
We show that, on CPUs, our approach is more than 100 times faster than traditionally trained Hamiltonian Neural Networks.
arXiv Detail & Related papers (2024-11-26T15:22:30Z) - Peer-to-Peer Learning Dynamics of Wide Neural Networks [10.179711440042123]
We provide an explicit, non-asymptotic characterization of the learning dynamics of wide neural networks trained using popular DGD algorithms.
We validate our analytical results by accurately predicting error for classification tasks.
arXiv Detail & Related papers (2024-09-23T17:57:58Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Lifted Bregman Training of Neural Networks [28.03724379169264]
We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions.
This formulation is based on Bregman distances, and a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions.
We present several numerical results that demonstrate that these training approaches can be equally well or even better suited for the training of neural network-based classifiers and (denoising) autoencoders with sparse coding.
arXiv Detail & Related papers (2022-08-18T11:12:52Z) - Acceleration techniques for optimization over trained neural network
ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network.
arXiv Detail & Related papers (2021-12-13T20:50:54Z) - LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose LocalDrop, a new approach for regularizing neural networks based on the local Rademacher complexity.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z) - Delta-STN: Efficient Bilevel Optimization for Neural Networks using
Structured Response Jacobians [5.33024001730262]
Self-Tuning Networks (STNs) have recently gained traction due to their ability to amortize the optimization of the inner objective.
We propose the $\Delta$-STN, an improved hypernetwork architecture which stabilizes training.
arXiv Detail & Related papers (2020-10-26T12:12:23Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization for large-scale problems in which the model is a deep neural network.
Our method requires a much smaller number of communication rounds in theory.
Our experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural
Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based training combined with nonconvexity renders learning susceptible to novel problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.