Memory and attention in deep learning
- URL: http://arxiv.org/abs/2107.01390v1
- Date: Sat, 3 Jul 2021 09:21:13 GMT
- Title: Memory and attention in deep learning
- Authors: Hung Le
- Abstract summary: Memory construction for machines is inevitable.
Recent progress on modeling memory in deep learning has revolved around external memory constructions.
The aim of this thesis is to advance the understanding of memory and attention in deep learning.
- Score: 19.70919701635945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Intelligence necessitates memory. Without memory, humans fail to perform various nontrivial tasks such as reading novels, playing games or solving maths. As the ultimate goal of machine learning is to derive intelligent systems that learn and act automatically just like humans, memory construction for machines is inevitable. Artificial neural networks, a typical class of machine learning algorithms that resembles memory structure, model neurons and synapses in the brain by interconnecting computational units via weights. Their descendants with more complicated modeling techniques (a.k.a. deep learning) have been successfully applied to many practical problems and have demonstrated the importance of memory in the learning process of machine systems. Recent progress on modeling memory in deep learning has revolved around external memory constructions, which are strongly inspired by computational Turing models and biological neuronal systems. Attention mechanisms are derived to support acquisition and retention operations on the external memory. Despite the lack of theoretical foundations, these approaches have shown promise in helping machine systems reach a higher level of intelligence. The aim of this thesis is to advance the understanding of memory and attention in deep learning. Its contributions include: (i) presenting a collection of taxonomies for memory, (ii) constructing new memory-augmented neural networks (MANNs) that support multiple control and memory units, (iii) introducing variability via memory in sequential generative models, (iv) searching for optimal writing operations to maximise the memorisation capacity in slot-based memory networks, and (v) simulating the Universal Turing Machine via Neural Stored-program Memory, a new kind of external memory for neural networks.
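The attention-based acquisition (write) and retention (read) operations described above are most commonly realised as content-based addressing over a matrix of memory slots, in the spirit of the Neural Turing Machine and related MANNs. The sketch below is a minimal, illustrative NumPy version under that assumption; the function names, slot shapes and the sharpness parameter `beta` are illustrative choices, not the thesis's exact formulation.

```python
# Minimal sketch of content-based attention over a slot memory (NTM/MANN style).
# All names and shapes are illustrative assumptions, not the thesis's method.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_attention(memory, key, beta=1.0):
    """Softmax over cosine similarity between a query key and each memory slot."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    similarity = memory @ key / norms          # one score per slot
    return softmax(beta * similarity)          # attention weights over slots

def read(memory, weights):
    """Retention/recall: a convex combination of slots weighted by attention."""
    return weights @ memory

def write(memory, weights, erase, add):
    """Acquisition: erase, then add content in proportion to the attention weights."""
    memory = memory * (1 - np.outer(weights, erase))
    return memory + np.outer(weights, add)

# Toy usage: a memory of 8 slots, each holding a 16-dimensional vector.
rng = np.random.default_rng(0)
M = rng.normal(size=(8, 16))
key = rng.normal(size=16)
w = content_attention(M, key, beta=5.0)
M = write(M, w, erase=np.ones(16) * 0.5, add=rng.normal(size=16))
r = read(M, w)
```

In this view, reading is a weighted average of slots and writing erases and adds content in proportion to the same attention weights, which is what keeps the whole memory differentiable end to end.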
Related papers
- Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory [66.88278207591294]
We propose Pointer-Augmented Neural Memory (PANM) to help neural networks understand and apply symbol processing to new, longer sequences of data.
PANM integrates an external neural memory that uses novel physical addresses and pointer manipulation techniques to mimic human and computer symbol processing abilities.
arXiv Detail & Related papers (2024-04-18T03:03:46Z)
- Survey on Memory-Augmented Neural Networks: Cognitive Insights to AI Applications [4.9008611361629955]
Memory-Augmented Neural Networks (MANNs) blend human-like memory processes into AI.
The study investigates advanced architectures such as Hopfield Networks, Neural Turing Machines, Correlation Matrix Memories, Memformer, and Neural Attention Memory (a classical Hopfield-style sketch appears after this list).
It dives into real-world uses of MANNs across Natural Language Processing, Computer Vision, Multimodal Learning, and Retrieval Models.
arXiv Detail & Related papers (2023-12-11T06:05:09Z)
- Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models [49.39276272693035]
Large-scale pre-trained language models have shown remarkable memorizing ability.
Vanilla neural networks without pre-training have long been observed to suffer from the catastrophic forgetting problem.
We find that 1) vanilla language models are forgetful; 2) pre-training leads to retentive language models; 3) knowledge relevance and diversification significantly influence memory formation.
arXiv Detail & Related papers (2023-05-16T03:50:38Z)
- Sequence learning in a spiking neuronal network with memristive synapses [0.0]
A core concept that lies at the heart of brain computation is sequence learning and prediction.
Neuromorphic hardware emulates the way the brain processes information and maps neurons and synapses directly into a physical substrate.
We study the feasibility of using ReRAM devices as a replacement for the biological synapses in the sequence learning model.
arXiv Detail & Related papers (2022-11-29T21:07:23Z)
- A bio-inspired implementation of a sparse-learning spike-based hippocampus memory model [0.0]
We propose a novel bio-inspired memory model based on the hippocampus.
It can learn memories, recall them from a cue and even forget memories when trying to learn others with the same cue.
This work presents the first hardware implementation of a fully functional bio-inspired spike-based hippocampus memory model.
arXiv Detail & Related papers (2022-06-10T07:48:29Z)
- Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity [9.453554184019108]
Hebbian plasticity is believed to play a pivotal role in biological memory.
We introduce a novel spiking neural network architecture that is enriched by Hebbian synaptic plasticity.
We show that Hebbian enrichment renders spiking neural networks surprisingly versatile in terms of their computational as well as learning capabilities.
arXiv Detail & Related papers (2022-05-23T12:48:37Z)
- CogNGen: Constructing the Kernel of a Hyperdimensional Predictive Processing Cognitive Architecture [79.07468367923619]
We present a new cognitive architecture that combines two neurobiologically plausible, computational models.
We aim to develop a cognitive architecture that has the power of modern machine learning techniques.
arXiv Detail & Related papers (2022-03-31T04:44:28Z)
- A Neural Dynamic Model based on Activation Diffusion and a Micro-Explanation for Cognitive Operations [4.416484585765028]
The neural mechanism of memory is closely related to the problem of representation in artificial intelligence.
A computational model was proposed to simulate the network of neurons in the brain and how they process information.
arXiv Detail & Related papers (2020-11-27T01:34:08Z)
- Neurocoder: Learning General-Purpose Computation Using Stored Neural Programs [64.56890245622822]
Neurocoder is an entirely new class of general-purpose conditional computational machines.
It "codes" itself in a data-responsive way by composing relevant programs from a set of shareable, modular programs.
We show new capacity to learn modular programs, handle severe pattern shifts and remember old programs as new ones are learnt.
arXiv Detail & Related papers (2020-09-24T01:39:16Z)
- Reservoir Memory Machines as Neural Computers [70.5993855765376]
Differentiable neural computers extend artificial neural networks with an explicit memory without interference.
We achieve some of the computational capabilities of differentiable neural computers with a model that can be trained very efficiently.
arXiv Detail & Related papers (2020-09-14T12:01:30Z)
- Self-Attentive Associative Memory [69.40038844695917]
We propose to separate the storage of individual experiences (item memory) and their occurring relationships (relational memory).
We achieve competitive results with our proposed two-memory model in a diversity of machine learning tasks.
arXiv Detail & Related papers (2020-02-10T03:27:48Z)
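Several entries above (Hopfield Networks in the MANN survey, the Hebbian-plasticity spiking network) build on the classical idea that Hebbian outer-product learning stores patterns as attractors of a recurrent network. The short sketch below shows the textbook binary Hopfield construction under that assumption; it is a generic illustration, not the mechanism of any specific paper in this list.

```python
# Minimal sketch of a classical binary Hopfield associative memory with
# Hebbian (outer-product) storage. A generic textbook construction, not the
# method of any particular paper cited above.
import numpy as np

def store(patterns):
    """Hebbian learning: weights are the averaged outer products of the patterns."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(W, 0.0)                    # no self-connections
    return W

def recall(W, probe, steps=20):
    """Iteratively update the state toward a stored pattern (attractor dynamics)."""
    state = probe.copy()
    for _ in range(steps):
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state

# Store two +/-1 patterns of 32 units and recall one from a corrupted cue.
rng = np.random.default_rng(1)
patterns = np.sign(rng.normal(size=(2, 32)))
W = store(patterns)
cue = patterns[0].copy()
cue[:8] *= -1                                   # corrupt a quarter of the bits
print(np.array_equal(recall(W, cue), patterns[0]))  # usually True for few patterns
```

With only a few stored patterns relative to the number of units, a corrupted cue typically settles back onto the original pattern; this associative-recall behaviour is what the memory-augmented models listed above generalise and scale up.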