Generalized Key-Value Memory to Flexibly Adjust Redundancy in
Memory-Augmented Networks
- URL: http://arxiv.org/abs/2203.06223v1
- Date: Fri, 11 Mar 2022 19:59:43 GMT
- Title: Generalized Key-Value Memory to Flexibly Adjust Redundancy in
Memory-Augmented Networks
- Authors: Denis Kleyko, Geethan Karunaratne, Jan M. Rabaey, Abu Sebastian, and
Abbas Rahimi
- Abstract summary: Memory-augmented neural networks enhance a neural network with an external key-value memory.
We propose a generalized key-value memory that decouples its dimension from the number of support vectors.
We show that adapting this parameter on demand effectively mitigates up to 44% of nonidealities, at equal accuracy and number of devices.
- Score: 6.03025980398201
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Memory-augmented neural networks enhance a neural network with an external
key-value memory whose complexity is typically dominated by the number of
support vectors in the key memory. We propose a generalized key-value memory
that decouples its dimension from the number of support vectors by introducing
a free parameter that can arbitrarily add or remove redundancy to the key
memory representation. In effect, it provides an additional degree of freedom
to flexibly control the trade-off between robustness and the resources required
to store and compute the generalized key-value memory. This is particularly
useful for realizing the key memory on in-memory computing hardware where it
exploits nonideal, but extremely efficient non-volatile memory devices for
dense storage and computation. Experimental results show that adapting this
parameter on demand effectively mitigates up to 44% of nonidealities, at equal
accuracy and number of devices, without any need for neural network retraining.
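The idea lends itself to a short sketch. Below is a minimal NumPy illustration of the general principle, not the authors' exact construction: the original d-dimensional keys are passed through a fixed random projection to an adjustable dimension m, so the stored key memory can be made more redundant (m > d) or leaner (m < d) independently of the number of support vectors; the projection, the additive noise model, and the nearest-key read are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 64, 100                          # feature dimension, number of support vectors
keys = rng.standard_normal((n, d))      # original key memory, one row per support vector
values = rng.integers(0, 5, size=n)     # e.g. class labels held in the value memory

# Hypothetical redundancy parameter m: dimension of the stored key representation.
# m > d adds redundancy (more tolerance to device noise); m < d removes it.
m = 256
proj = rng.standard_normal((d, m)) / np.sqrt(m)   # fixed random projection
key_mem = keys @ proj                             # what would sit on the in-memory hardware

def read(x, noise_std=0.0):
    """Nearest-key read against a (possibly noisy) m-dimensional key memory."""
    noisy_mem = key_mem + noise_std * rng.standard_normal(key_mem.shape)
    sims = noisy_mem @ (x @ proj)       # compare the projected query with every stored key
    return values[int(np.argmax(sims))]

x = keys[7] + 0.1 * rng.standard_normal(d)
print(read(x, noise_std=0.5))           # larger m makes this read more tolerant of memory noise
```

In this toy setup the storage cost scales with n*m, so m is the knob behind the robustness-versus-resources trade-off described in the abstract.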
Related papers
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without large computational overhead.
We evaluate our approach on various image- and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
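As a rough sketch of what attention over a bank of memory tokens can look like (with NumPy standing in for a trained model), the snippet below performs a cross-attention read from the tokens; the token count, dimensions, and residual combination are illustrative assumptions rather than the paper's architecture.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
d, n_tokens = 32, 16
memory_tokens = rng.standard_normal((n_tokens, d))   # would be learnable parameters in practice

def memory_augment(x):
    """Cross-attention of one feature vector over the memory tokens, added back residually."""
    attn = softmax(memory_tokens @ x / np.sqrt(d))    # attention weights over the tokens
    return x + attn @ memory_tokens                   # read from memory and add to the feature

feat = rng.standard_normal(d)
print(memory_augment(feat).shape)                     # (32,)
```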
arXiv Detail & Related papers (2023-10-17T01:05:28Z) - Universal Recurrent Event Memories for Streaming Data [0.0]
We propose a new event memory architecture (MemNet) for recurrent neural networks.
MemNet stores key-value pairs, which separate the information for addressing and for content.
The MemNet architecture can be applied without modification to scalar time series, logic operators on strings, and natural language processing.
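A toy sketch of that separation (assuming nothing about MemNet's actual recurrent architecture): keys are used only for addressing and values only for content, so the two parts can have different shapes and be queried independently of how many events have been stored.

```python
import numpy as np

rng = np.random.default_rng(0)
d_key, d_val = 8, 4
keys, vals = [], []                      # event memory: addressing part and content part

def write(event_key, event_content):
    """Append one event; the key addresses it, the value carries its content."""
    keys.append(event_key)
    vals.append(event_content)

def read(query):
    """Soft addressing over all stored keys, returning blended content."""
    K, V = np.stack(keys), np.stack(vals)
    w = np.exp(K @ query)
    w /= w.sum()
    return w @ V

for _ in range(5):                       # stream five events into the memory
    write(rng.standard_normal(d_key), rng.standard_normal(d_val))
print(read(rng.standard_normal(d_key)).shape)   # (4,)
```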
arXiv Detail & Related papers (2023-07-28T17:40:58Z) - MF-NeRF: Memory Efficient NeRF with Mixed-Feature Hash Table [62.164549651134465]
We propose MF-NeRF, a memory-efficient NeRF framework that employs a Mixed-Feature hash table to improve memory efficiency and reduce training time while maintaining reconstruction quality.
Our experiments with the state-of-the-art Instant-NGP, TensoRF, and DVGO indicate that MF-NeRF can achieve the fastest training time on the same GPU hardware with similar or even higher reconstruction quality.
arXiv Detail & Related papers (2023-04-25T05:44:50Z) - Mesa: A Memory-saving Training Framework for Transformers [58.78933015299703]
We present Mesa, a memory-saving training framework for Transformers.
Mesa uses exact activations during the forward pass while storing a low-precision version of the activations to reduce memory consumption during training.
Experiments on ImageNet, CIFAR-100 and ADE20K demonstrate that Mesa can halve the memory footprint during training.
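The underlying trick can be illustrated in a few lines of NumPy, assuming a plain ReLU layer rather than Mesa's actual Transformer kernels: the exact float32 activation is used in the forward pass, but only a float16 copy is kept around for the backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1024, 1024)).astype(np.float32)

# Forward: compute with the exact activation, but retain only a low-precision copy.
y = np.maximum(x, 0.0)                   # exact ReLU output, used by downstream layers
saved = y.astype(np.float16)             # what actually stays in memory for backward

# Backward (sketch): the saved low-precision activation is enough to form the ReLU mask.
grad_out = rng.standard_normal(y.shape).astype(np.float32)
grad_in = grad_out * (saved.astype(np.float32) > 0)

print(y.nbytes, saved.nbytes)            # 4194304 vs 2097152 bytes: half the stored footprint
```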
arXiv Detail & Related papers (2021-11-22T11:23:01Z) - MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning [72.80896338009579]
We find that the memory bottleneck is due to the imbalanced memory distribution in convolutional neural network (CNN) designs.
We propose a generic patch-by-patch inference scheduling scheme that significantly cuts down peak memory.
We automate the process with neural architecture search to jointly optimize the neural architecture and inference scheduling, leading to MCUNetV2.
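A toy example of why patch-by-patch scheduling lowers peak activation memory (a standalone illustration, not MCUNetV2's searched schedule): a 3x3 blur stands in for the memory-heavy initial layers, and running it per 56x56 tile means only one tile's activations are alive at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((224, 224)).astype(np.float32)

def stage(x):
    """Stand-in for the early, memory-heavy layers: a simple 3x3 mean filter."""
    out = np.empty_like(x)
    padded = np.pad(x, 1, mode="edge")
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

# Whole-image inference keeps a full 224x224 activation resident at once.
full = stage(image)

# Patch-by-patch inference: 56x56 tiles, so the peak live activation is 16x smaller.
tiled = np.empty_like(image)
for i in range(0, 224, 56):
    for j in range(0, 224, 56):
        tiled[i:i + 56, j:j + 56] = stage(image[i:i + 56, j:j + 56])

print(np.abs(full - tiled).max())        # nonzero only near tile borders; real schedulers
                                         # overlap patches to keep the result exact
```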
arXiv Detail & Related papers (2021-10-28T17:58:45Z) - Neural Network Compression for Noisy Storage Devices [71.4102472611862]
Conventionally, model compression and physical storage are decoupled.
This approach forces the storage to treat each bit of the compressed model equally, and to dedicate the same amount of resources to each bit.
We propose a radically different approach that (i) employs analog memories to maximize the capacity of each memory cell, and (ii) jointly optimizes model compression and physical storage to maximize memory utility.
arXiv Detail & Related papers (2021-02-15T18:19:07Z) - CNN with large memory layers [2.368995563245609]
This work is centred around the recently proposed product key memory structure, implemented for a number of computer vision applications.
The memory structure can be regarded as a simple computation primitive suitable to be augmented to nearly all neural network architectures.
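The product-key mechanism it builds on is compact enough to sketch directly: the query is split in half, each half is scored against a small set of sub-keys, and the Cartesian product of the two per-half top-k lists addresses n_sub**2 memory slots with only 2*n_sub comparisons. The sizes and single-head setup below are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_sub = 16, 32                        # half-query dimension; 32*32 = 1024 memory slots
K1 = rng.standard_normal((n_sub, d))     # sub-keys for the first half of the query
K2 = rng.standard_normal((n_sub, d))     # sub-keys for the second half
values = rng.standard_normal((n_sub * n_sub, 64))   # one value vector per (i, j) key pair

def read(query, k=4):
    """Product-key lookup: approximate top-k over all slots via per-half top-k candidates."""
    q1, q2 = query[:d], query[d:]
    s1, s2 = K1 @ q1, K2 @ q2
    top1 = np.argsort(s1)[-k:]           # best candidates for each half
    top2 = np.argsort(s2)[-k:]
    pairs = [(i, j, s1[i] + s2[j]) for i in top1 for j in top2]
    scores = np.array([s for _, _, s in pairs])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    idx = [i * n_sub + j for i, j, _ in pairs]
    return w @ values[idx]               # softmax-weighted mix of the selected values

print(read(rng.standard_normal(2 * d)).shape)   # (64,)
```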
arXiv Detail & Related papers (2021-01-27T20:58:20Z) - Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z) - Robust High-dimensional Memory-augmented Neural Networks [13.82206983716435]
Memory-augmented neural networks enhance neural networks with an explicit external memory.
Access to this explicit memory occurs via soft read and write operations involving every individual memory entry.
We propose a robust architecture that employs a computational memory unit as the explicit memory performing analog in-memory computation on high-dimensional (HD) vectors.
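A small NumPy sketch of such a soft read over bipolar high-dimensional vectors, in the spirit of (but simpler than) the paper's computational memory unit; the sharpening factor beta and the one-hot value memory are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D, n = 2048, 10                                   # HD vector dimension, stored entries
key_mem = rng.choice([-1.0, 1.0], size=(n, D))    # bipolar HD keys, friendly to analog devices
value_mem = np.eye(n)                             # one-hot values (e.g. class identities)

def soft_read(query, beta=8.0):
    """Soft read: every memory entry contributes, weighted by sharpened cosine similarity."""
    sims = key_mem @ query / D                    # cosine similarity for bipolar vectors
    w = np.exp(beta * sims)
    w /= w.sum()
    return w @ value_mem

q = key_mem[3] * np.sign(rng.standard_normal(D) + 2.0)   # entry 3 with ~2% of its bits flipped
print(np.argmax(soft_read(q)))                            # -> 3 despite the corruption
```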
arXiv Detail & Related papers (2020-10-05T12:01:56Z) - Improving Memory Utilization in Convolutional Neural Network
Accelerators [16.340620299847384]
We propose a mapping method that allows activation layers to overlap and thus utilize the memory more efficiently.
Experiments with various real-world object detector networks show that the proposed mapping technique can decrease the activations memory by up to 32.9%.
For higher resolution de-noising networks, we achieve activation memory savings of 48.8%.
arXiv Detail & Related papers (2020-07-20T09:34:36Z) - Efficient Memory Management for Deep Neural Net Inference [0.0]
Deep neural net inference can now be moved to mobile and embedded devices, desired for various reasons ranging from latency to privacy.
These devices are not only limited by their compute power and battery, but also by their inferior physical memory and cache, and thus, an efficient memory manager becomes a crucial component for deep neural net inference at the edge.
arXiv Detail & Related papers (2020-01-10T02:45:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.