Deep Learning in Memristive Nanowire Networks
- URL: http://arxiv.org/abs/2003.02642v1
- Date: Tue, 3 Mar 2020 20:11:33 GMT
- Title: Deep Learning in Memristive Nanowire Networks
- Authors: Jack D. Kendall, Ross D. Pantone, and Juan C. Nino
- Abstract summary: A new hardware architecture, dubbed the MN3 (Memristive Nanowire Neural Network), was recently described as an efficient architecture for simulating very wide, sparse neural network layers.
We show that the MN3 is capable of performing composition, gradient propagation, and weight updates, which together allow it to function as a deep neural network.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analog crossbar architectures for accelerating neural network training and
inference have made tremendous progress over the past several years. These
architectures are ideal for dense layers with fewer than roughly a thousand
neurons. However, for large sparse layers, crossbar architectures are highly
inefficient. A new hardware architecture, dubbed the MN3 (Memristive Nanowire
Neural Network), was recently described as an efficient architecture for
simulating very wide, sparse neural network layers, on the order of millions of
neurons per layer. The MN3 utilizes a high-density memristive nanowire mesh to
efficiently connect large numbers of silicon neurons with modifiable weights.
Here, in order to explore the MN3's ability to function as a deep neural
network, we describe one algorithm for training deep MN3 models and benchmark
simulations of the architecture on two deep learning tasks. We utilize a simple
piecewise linear memristor model, since we seek to demonstrate that training
is, in principle, possible for randomized nanowire architectures. In future
work, we intend to utilize more realistic memristor models, and we will adapt
the presented algorithm appropriately. We show that the MN3 is capable of
performing composition, gradient propagation, and weight updates, which
together allow it to function as a deep neural network. We show that a
simulated multilayer perceptron (MLP), built from MN3 networks, can obtain a
1.61% error rate on the popular MNIST dataset, comparable to an equivalently
sized software-based network. This work represents, to the authors' knowledge, the
first randomized nanowire architecture capable of reproducing the
backpropagation algorithm.
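To make the three operations highlighted in the abstract concrete (composition of sparse layers, gradient propagation, and weight updates under a piecewise-linear conductance model), below is a minimal, illustrative NumPy sketch. The class SparseMeshLayer, its parameters (density, g_min, g_max), and the toy XOR task are hypothetical stand-ins chosen for this example; they are not the MN3 hardware design or the authors' training algorithm.

```python
# Illustrative sketch only: a toy piecewise-linear "memristor" weight model and a
# sparse, randomly connected layer trained with backpropagation. Names and
# parameters here are hypothetical and not taken from the MN3 paper.
import numpy as np

rng = np.random.default_rng(0)

class SparseMeshLayer:
    """A dense layer masked by a fixed random sparsity pattern, loosely mimicking
    a randomized nanowire mesh. Weights are hard-clipped to a conductance-like
    range, standing in for a piecewise-linear memristor response."""
    def __init__(self, n_in, n_out, density=0.5, g_min=-1.0, g_max=1.0):
        self.mask = (rng.random((n_in, n_out)) < density).astype(float)
        self.W = rng.normal(0.0, 0.1, (n_in, n_out)) * self.mask
        self.g_min, self.g_max = g_min, g_max

    def forward(self, x):
        self.x = x                                      # cache input for backward
        return x @ self.W                               # composition (matrix-vector product)

    def backward(self, grad_out, lr=0.1):
        grad_in = grad_out @ self.W.T                   # gradient propagation to previous layer
        dW = (self.x.T @ grad_out) * self.mask          # only physically present "wires" update
        # Piecewise-linear update: linear in the drive, clipped at the conductance rails.
        self.W = np.clip(self.W - lr * dW, self.g_min, self.g_max) * self.mask
        return grad_in

# Two sparse layers composed into a tiny MLP on a toy XOR-like task.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([[0], [1], [1], [0]], float)
l1, l2 = SparseMeshLayer(2, 64), SparseMeshLayer(64, 1)

for _ in range(2000):
    h = np.tanh(l1.forward(X))
    out = l2.forward(h)
    grad = 2 * (out - y) / len(X)                       # d(MSE)/d(out)
    grad_h = l2.backward(grad) * (1 - h ** 2)           # propagate through tanh
    l1.backward(grad_h)

print(np.round(l2.forward(np.tanh(l1.forward(X))), 2))
```

In this sketch the fixed random mask plays the role of the fixed nanowire connectivity, so only existing connections receive weight updates, and the hard clipping stands in for the saturating ends of a piecewise-linear conductance curve.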
Related papers
- Simultaneous Weight and Architecture Optimization for Neural Networks [6.2241272327831485]
We introduce a novel neural network training framework that transforms the process by learning architecture and parameters simultaneously with gradient descent.
Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds pairs of neural networks with similar functionalities close to each other.
Experiments demonstrate that our framework can discover sparse and compact neural networks maintaining a high performance.
arXiv Detail & Related papers (2024-10-10T19:57:36Z) - DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architecture.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z) - NAR-Former: Neural Architecture Representation Learning towards Holistic
Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z) - GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction [50.248694764703714]
Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction.
These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network based regularization.
We propose Greedy LEarning for Accelerated MRI reconstruction, an efficient training strategy for high-dimensional imaging settings.
arXiv Detail & Related papers (2022-07-18T06:01:29Z) - Noisy Heuristics NAS: A Network Morphism based Neural Architecture
Search using Heuristics [11.726528038065764]
We present a new Network Morphism based NAS called Noisy Heuristics NAS.
We add new neurons randomly and prune away some to select only the best fitting neurons.
Our method generalizes both on toy datasets and on real-world data sets such as MNIST, CIFAR-10, and CIFAR-100.
arXiv Detail & Related papers (2022-07-10T13:58:21Z) - Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z) - CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded
Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a neoteric variant of deep convolutional neural network architecture to ameliorate the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z) - A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z) - Differentiable Neural Architecture Learning for Efficient Neural Network
Design [31.23038136038325]
We introduce a novel architecture parameterisation based on a scaled sigmoid function.
We then propose a general Differentiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
arXiv Detail & Related papers (2021-03-03T02:03:08Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)