Low-Rank Training of Deep Neural Networks for Emerging Memory Technology
- URL: http://arxiv.org/abs/2009.03887v2
- Date: Thu, 15 Jul 2021 03:06:18 GMT
- Title: Low-Rank Training of Deep Neural Networks for Emerging Memory Technology
- Authors: Albert Gural, Phillip Nadeau, Mehul Tikekar, Boris Murmann
- Abstract summary: We address two key challenges for training on edge devices with non-volatile memory: low write density and low auxiliary memory.
We present a low-rank training scheme that addresses these challenges while maintaining computational efficiency.
- Score: 4.456122555367167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent success of neural networks for solving difficult decision tasks
has incentivized incorporating smart decision making "at the edge." However,
this work has traditionally focused on neural network inference, rather than
training, due to memory and compute limitations, especially in emerging
non-volatile memory systems, where writes are energetically costly and reduce
lifespan. Yet, the ability to train at the edge is becoming increasingly
important as it enables real-time adaptability to device drift and
environmental variation, user customization, and federated learning across
devices. In this work, we address two key challenges for training on edge
devices with non-volatile memory: low write density and low auxiliary memory.
We present a low-rank training scheme that addresses these challenges while
maintaining computational efficiency. We then demonstrate the technique on a
representative convolutional neural network across several adaptation problems,
where it outperforms standard SGD both in accuracy and in number of weight
writes.
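The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea behind low-rank training under a write budget: the SGD update of a linear layer is a sum of outer products, so it can be folded into a small rank-r accumulator held in auxiliary memory and flushed to the non-volatile weight array only occasionally. All sizes, the learning rate, the flush period, and the QR/SVD truncation below are illustrative assumptions, not the authors' exact scheme.
```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (assumptions, not from the paper).
d_in, d_out, rank, batch, lr, flush_every = 64, 32, 4, 8, 0.01, 50

W = rng.standard_normal((d_in, d_out)) * 0.1   # weights held in "non-volatile" memory
L = np.zeros((d_in, rank))                     # small auxiliary buffers holding the
R = np.zeros((d_out, rank))                    # pending update as a rank-r product L @ R.T

def truncate(A, B, r):
    """Best rank-r approximation of A @ B.T, computed from the thin factors only."""
    Qa, Ra = np.linalg.qr(A)
    Qb, Rb = np.linalg.qr(B)
    U, s, Vt = np.linalg.svd(Ra @ Rb.T, full_matrices=False)
    return Qa @ (U[:, :r] * s[:r]), Qb @ Vt[:r].T

for step in range(1, 201):
    x = rng.standard_normal((batch, d_in))        # layer input
    delta = rng.standard_normal((batch, d_out))   # stand-in for the backpropagated error
    # The SGD step for a linear layer is -lr * x.T @ delta / batch, i.e. a sum of
    # outer products. Fold it into the rank-r accumulator instead of writing W now.
    L, R = truncate(np.hstack([L, -lr * x.T]),
                    np.hstack([R, (delta / batch).T]), rank)
    if step % flush_every == 0:                   # infrequent write to the weight array
        W += L @ R.T
        L, R = np.zeros_like(L), np.zeros_like(R)
```
Here the random `delta` merely stands in for an error signal computed from a loss; the point of the sketch is that only the small factors change every step, while the large weight array is written infrequently, which is what keeps the write density low.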
Related papers
- DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity [11.624569521079426]
We develop a framework emulating real-world neural network training and identify noise memorization as the primary cause of plasticity loss when warm-starting on stationary data.
Motivated by this, we propose Direction-Aware SHrinking (DASH), a method aiming to mitigate plasticity loss by selectively forgetting noise while preserving learned features.
arXiv Detail & Related papers (2024-10-30T22:57:54Z) - Simple and Effective Transfer Learning for Neuro-Symbolic Integration [50.592338727912946]
A potential solution to this issue is Neuro-Symbolic Integration (NeSy), where neural approaches are combined with symbolic reasoning.
Most of these methods exploit a neural network to map perceptions to symbols and a logical reasoner to predict the output of the downstream task.
They suffer from several issues, including slow convergence, learning difficulties with complex perception tasks, and convergence to local minima.
This paper proposes a simple yet effective method to ameliorate these problems.
arXiv Detail & Related papers (2024-02-21T15:51:01Z) - Enabling On-device Continual Learning with Binary Neural Networks [3.180732240499359]
We propose a solution that combines recent advancements in the field of Continual Learning (CL) and Binary Neural Networks (BNNs).
Specifically, our approach leverages binary latent replay activations and a novel quantization scheme that significantly reduces the number of bits required for gradient computation.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its software, full-precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - Center Loss Regularization for Continual Learning [0.0]
In general, neural networks lack the ability to learn different tasks sequentially.
Our approach remembers old tasks by projecting the representations of new tasks close to those of old tasks (a generic center-loss sketch appears after this list).
We demonstrate that our approach is scalable, effective, and gives competitive performance compared to state-of-the-art continual learning methods.
arXiv Detail & Related papers (2021-10-21T17:46:44Z) - Online Training of Spiking Recurrent Neural Networks with Phase-Change Memory Synapses [1.9809266426888898]
Training spiking recurrent neural networks (RNNs) on dedicated neuromorphic hardware is still an open challenge.
We present a simulation framework of differential-architecture arrays based on an accurate and comprehensive Phase-Change Memory (PCM) device model.
We train a spiking RNN whose weights are emulated in the presented simulation framework, using a recently proposed e-prop learning rule.
arXiv Detail & Related papers (2021-08-04T01:24:17Z) - Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks to fit mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice (a minimal magnitude-pruning sketch appears after this list).
arXiv Detail & Related papers (2021-01-31T22:48:50Z) - Dynamic Hard Pruning of Neural Networks at the Edge of the Internet [11.605253906375424]
The Dynamic Hard Pruning (DynHP) technique incrementally prunes the network during training.
DynHP enables a tunable size reduction of the final neural network and reduces the NN memory occupancy during training.
Freed memory is reused by a dynamic batch sizing approach to counterbalance the accuracy degradation caused by the hard pruning strategy.
arXiv Detail & Related papers (2020-11-17T10:23:28Z) - Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking Neural Networks are cognitive algorithms mimicking neuron and synapse operational principles.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
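The Center Loss Regularization entry above describes pulling representations of new tasks toward those of old tasks; as a reference point, here is a minimal sketch of the classic center-loss regularizer that this idea builds on (a per-class running center plus a squared-distance penalty). The class count, embedding size, update rate alpha, and toy usage are illustrative assumptions, not the paper's continual-learning formulation.
```python
import numpy as np

rng = np.random.default_rng(1)
num_classes, embed_dim, alpha = 10, 16, 0.5   # illustrative sizes / center update rate

centers = np.zeros((num_classes, embed_dim))  # one running center per class

def center_loss(embeddings, labels):
    """Mean squared distance between embeddings and their class centers."""
    diff = embeddings - centers[labels]
    return 0.5 * np.mean(np.sum(diff ** 2, axis=1)), diff

def update_centers(labels, diff):
    """Move each used center a little toward its class's current embeddings."""
    for c in np.unique(labels):
        mask = labels == c
        centers[c] += alpha * diff[mask].mean(axis=0)

# Toy usage: embeddings would normally come from the network's penultimate layer.
emb = rng.standard_normal((32, embed_dim))
labels = rng.integers(0, num_classes, size=32)
loss, diff = center_loss(emb, labels)
update_centers(labels, diff)
# In training, `loss` is added to the task loss (e.g. cross-entropy) with a weight.
```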
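The sparsity survey entry above covers removing and adding network elements; the sketch below shows one of the simplest strategies in that family, one-shot global magnitude pruning to a target sparsity. The layer shapes and the 90% target are illustrative, and real training code would keep applying the returned masks to gradients during subsequent updates.
```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "model": a dict of dense weight matrices (illustrative shapes).
weights = {
    "fc1": rng.standard_normal((128, 64)),
    "fc2": rng.standard_normal((64, 10)),
}

def global_magnitude_prune(weights, sparsity):
    """Zero out the globally smallest-magnitude fraction `sparsity` of weights
    and return binary masks to keep applying during later training steps."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    threshold = np.quantile(all_mags, sparsity)
    masks = {name: (np.abs(w) > threshold).astype(w.dtype) for name, w in weights.items()}
    for name, w in weights.items():
        w *= masks[name]          # prune in place
    return masks

masks = global_magnitude_prune(weights, sparsity=0.9)
kept = sum(int(m.sum()) for m in masks.values())
total = sum(m.size for m in masks.values())
print(f"kept {kept}/{total} weights ({kept / total:.1%})")
```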
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.