CNN with large memory layers
- URL: http://arxiv.org/abs/2101.11685v1
- Date: Wed, 27 Jan 2021 20:58:20 GMT
- Title: CNN with large memory layers
- Authors: Rasul Karimov, Victor Lempitsky
- Abstract summary: This work is centred around the recently proposed product key memory structure \cite{large_memory}, implemented for a number of computer vision applications.
The memory structure can be regarded as a simple computation primitive suitable for augmenting nearly all neural network architectures.
- Score: 2.368995563245609
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work is centred around the recently proposed product key memory
structure \cite{large_memory}, implemented for a number of computer vision
applications. The memory structure can be regarded as a simple computation
primitive suitable for augmenting nearly all neural network architectures.
The memory block enables sparse access to memory whose complexity scales as
the square root of the memory capacity. This scaling is possible because the
key space is decomposed as a Cartesian product for the nearest-neighbour
search. We have tested the memory layer on classification, image
reconstruction and relocalization problems and found that for some of these
tasks the memory layers provide a significant speed/accuracy improvement with
high utilization of the key-value elements, while others require more careful
fine-tuning and suffer from dying keys. To tackle the latter problem we
introduce a simple memory re-initialization technique that eliminates unused
key-value pairs from the memory and engages them in training again. Across
various experiments we obtained improvements in speed and accuracy for the
classification and PoseNet relocalization models.
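
To make the square-root scaling concrete, here is a minimal sketch of a single-query product-key lookup in the spirit of \cite{large_memory}. The tensor names and the single-head, single-query simplification are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def product_key_lookup(q, subkeys1, subkeys2, values, k=32):
    """Sparse memory lookup with O(sqrt(N)) key comparisons.

    q:        (d,) query, split into two halves of size d/2
    subkeys1: (n, d/2) first sub-key table, with n = sqrt(N)
    subkeys2: (n, d/2) second sub-key table
    values:   (n*n, v) value table; slot i*n+j pairs subkeys1[i] with subkeys2[j]
    """
    d = q.shape[0]
    q1, q2 = q[: d // 2], q[d // 2 :]
    n = subkeys1.shape[0]

    # Score each query half against its sub-key table: only 2n comparisons.
    s1 = subkeys1 @ q1                    # (n,)
    s2 = subkeys2 @ q2                    # (n,)

    # Keep the top-k sub-keys on each side, giving a k x k candidate grid.
    top1, idx1 = s1.topk(k)
    top2, idx2 = s2.topk(k)

    # The score of pair (i, j) is s1[i] + s2[j], so the global top-k is
    # guaranteed to lie inside this candidate grid.
    cand = top1[:, None] + top2[None, :]  # (k, k)
    best, pos = cand.reshape(-1).topk(k)

    # Map grid positions back to flat value-table indices i*n + j.
    mem_idx = idx1[pos // k] * n + idx2[pos % k]

    # Sparse read: softmax-weighted sum over the k selected values.
    w = F.softmax(best, dim=0)
    return w @ values[mem_idx], mem_idx
```

Only 2·sqrt(N) key comparisons plus k² additions are needed instead of N full-key comparisons, which is where the square-root complexity claimed in the abstract comes from.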
We show that the re-initialization has a substantial impact on a toy example
with randomly labeled data and observe some gains in performance on the image
classification task. We also demonstrate that the large memory layers preserve
their generalization ability on the relocalization problem, and we observe
spatial correlations between the input images and the selected memory cells.
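
The abstract does not spell out the re-initialization procedure, so the sketch below is only one plausible reading: slots whose access counter stays below a threshold are treated as dead, and their values are re-drawn so queries can reach them again. The counter, the threshold and the re-draw distribution are all assumptions:

```python
import torch

@torch.no_grad()
def reinit_dead_slots(values, usage, min_hits=1):
    """Re-engage unused key-value pairs in training.

    values: (N, v) memory value table
    usage:  (N,) count of how often each slot was selected this epoch
    """
    dead = (usage < min_hits).nonzero(as_tuple=True)[0]
    live = (usage >= min_hits).nonzero(as_tuple=True)[0]
    if dead.numel() == 0 or live.numel() < 2:
        usage.zero_()
        return
    # Assumption: re-draw dead values around the statistics of live ones,
    # so they land in a region of the embedding space that queries reach.
    mean = values[live].mean(dim=0)
    std = values[live].std(dim=0)
    values[dead] = mean + std * torch.randn(dead.numel(), values.shape[1])
    usage.zero_()  # start counting afresh for the next epoch
```

In training, `usage` would be accumulated from the indices returned by the lookup above, e.g. `usage.index_add_(0, mem_idx, torch.ones_like(mem_idx, dtype=usage.dtype))`, with `reinit_dead_slots` called once per epoch.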
Related papers
- Associative Memories in the Feature Space [68.1903319310263]
We propose a class of memory models that store only low-dimensional semantic embeddings and use them to retrieve similar, but not identical, memories.
We demonstrate a proof of concept of this method on a simple task on the MNIST dataset.
arXiv Detail & Related papers (2024-02-16T16:37:48Z)
- Topology-aware Embedding Memory for Continual Learning on Expanding Networks [63.35819388164267]
We present a framework to tackle the memory explosion problem using memory replay techniques.
PDGNNs with Topology-aware Embedding Memory (TEM) significantly outperform state-of-the-art techniques.
arXiv Detail & Related papers (2024-01-24T03:03:17Z)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data [68.63453336523318]
We introduce an attention-based memory module, which learns the importance of each retrieved example from the memory.
Compared to existing approaches, our method removes the influence of the irrelevant retrieved examples, and retains those that are beneficial to the input query.
We show that it achieves state-of-the-art accuracies on the ImageNet-LT, Places-LT and WebVision datasets.
arXiv Detail & Related papers (2023-04-11T12:12:05Z)
- From seeing to remembering: Images with harder-to-reconstruct representations leave stronger memory traces [4.012995481864761]
We present a sparse coding model for compressing feature embeddings of images, and show that the reconstruction residuals from this model predict how well images are encoded into memory.
In an open memorability dataset of scene images, we show that reconstruction error not only explains memory accuracy but also response latencies during retrieval, subsuming, in the latter case, all of the variance explained by powerful vision-only models.
arXiv Detail & Related papers (2023-02-21T01:40:32Z)
- Space Time Recurrent Memory Network [35.06536468525509]
We propose a novel visual memory network architecture for the learning and inference problem in the spatial-temporal domain.
This architecture is benchmarked on the video object segmentation and video prediction problems.
We show that our memory architecture achieves results competitive with the state of the art while maintaining constant memory capacity.
arXiv Detail & Related papers (2021-09-14T06:53:51Z)
- Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory-conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
- Robust High-dimensional Memory-augmented Neural Networks [13.82206983716435]
Memory-augmented neural networks extend conventional neural networks with an explicit memory to overcome these limitations.
Access to this explicit memory occurs via soft read and write operations involving every individual memory entry.
We propose a robust architecture that employs a computational memory unit as the explicit memory performing analog in-memory computation on high-dimensional (HD) vectors.
arXiv Detail & Related papers (2020-10-05T12:01:56Z)
- Video Object Segmentation with Episodic Graph Memory Networks [198.74780033475724]
A graph memory network is developed to address the novel idea of "learning to update the segmentation model".
We exploit an episodic memory network, organized as a fully connected graph, to store frames as nodes and capture cross-frame correlations by edges.
The proposed graph memory network yields a neat yet principled framework that generalizes well to both one-shot and zero-shot video object segmentation tasks.
arXiv Detail & Related papers (2020-07-14T13:19:19Z)