Memory and attention in deep learning
- URL: http://arxiv.org/abs/2107.01390v1
- Date: Sat, 3 Jul 2021 09:21:13 GMT
- Title: Memory and attention in deep learning
- Authors: Hung Le
- Abstract summary: Memory construction for machines is inevitable.
Recent progress on modeling memory in deep learning has revolved around external memory constructions.
The aim of this thesis is to advance the understanding of memory and attention in deep learning.
- Score: 19.70919701635945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Intelligence necessitates memory. Without memory, humans fail to perform various nontrivial tasks such as reading novels, playing games or solving maths. As the ultimate goal of machine learning is to derive intelligent systems that learn and act automatically just like humans, memory construction for machines is inevitable. Artificial neural networks, a typical class of machine learning algorithms that resembles memory structure, model neurons and synapses in the brain by interconnecting computational units via weights. Their descendants with more complicated modeling techniques (a.k.a. deep learning) have been successfully applied to many practical problems and have demonstrated the importance of memory in the learning process of machine systems. Recent progress on modeling memory in deep learning has revolved around external memory constructions, which are strongly inspired by computational Turing models and biological neuronal systems. Attention mechanisms are derived to support acquisition and retention operations on the external memory. Despite the lack of theoretical foundations, these approaches have shown promise in helping machine systems reach a higher level of intelligence. The aim of this thesis is to advance the understanding of memory and attention in deep learning. Its contributions include: (i) presenting a collection of taxonomies for memory, (ii) constructing new memory-augmented neural networks (MANNs) that support multiple control and memory units, (iii) introducing variability via memory in sequential generative models, (iv) searching for optimal writing operations to maximise the memorisation capacity in slot-based memory networks, and (v) simulating the Universal Turing Machine via Neural Stored-program Memory, a new kind of external memory for neural networks.
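The attention-based acquisition (write) and retention (read) operations described above are most commonly realised as content-based addressing over a matrix of memory slots, in the spirit of the Neural Turing Machine and related MANNs. The sketch below is a minimal, illustrative NumPy version under that assumption; the function names, slot shapes and the sharpness parameter `beta` are illustrative choices, not the thesis's exact formulation.

```python
# Minimal sketch of content-based attention over a slot memory (NTM/MANN style).
# All names and shapes are illustrative assumptions, not the thesis's method.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_attention(memory, key, beta=1.0):
    """Softmax over cosine similarity between a query key and each memory slot."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    similarity = memory @ key / norms          # one score per slot
    return softmax(beta * similarity)          # attention weights over slots

def read(memory, weights):
    """Retention/recall: a convex combination of slots weighted by attention."""
    return weights @ memory

def write(memory, weights, erase, add):
    """Acquisition: erase, then add content in proportion to the attention weights."""
    memory = memory * (1 - np.outer(weights, erase))
    return memory + np.outer(weights, add)

# Toy usage: a memory of 8 slots, each holding a 16-dimensional vector.
rng = np.random.default_rng(0)
M = rng.normal(size=(8, 16))
key = rng.normal(size=16)
w = content_attention(M, key, beta=5.0)
M = write(M, w, erase=np.ones(16) * 0.5, add=rng.normal(size=16))
r = read(M, w)
```

In this view, reading is a weighted average of slots and writing erases and adds content in proportion to the same attention weights, which is what keeps the whole memory differentiable end to end.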
Related papers
- Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory [66.88278207591294]
We propose Pointer-Augmented Neural Memory (PANM) to help neural networks understand and apply symbol processing to new, longer sequences of data.
PANM integrates an external neural memory that uses novel physical addresses and pointer manipulation techniques to mimic human and computer symbol processing abilities.
arXiv Detail & Related papers (2024-04-18T03:03:46Z)
- Survey on Memory-Augmented Neural Networks: Cognitive Insights to AI Applications [4.9008611361629955]
Memory-Augmented Neural Networks (MANNs) blend human-like memory processes into AI.
The study investigates advanced architectures such as Hopfield Networks, Neural Turing Machines, Correlation Matrix Memories, Memformer, and Neural Attention Memory (a classical Hopfield-style sketch appears after this list).
It dives into real-world uses of MANNs across Natural Language Processing, Computer Vision, Multimodal Learning, and Retrieval Models.
arXiv Detail & Related papers (2023-12-11T06:05:09Z)
- Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models [49.39276272693035]
Large-scale pre-trained language models have shown remarkable memorizing ability.
Vanilla neural networks without pre-training have long been observed to suffer from the catastrophic forgetting problem.
We find that 1) vanilla language models are forgetful; 2) pre-training leads to retentive language models; 3) knowledge relevance and diversification significantly influence memory formation.
arXiv Detail & Related papers (2023-05-16T03:50:38Z)
- Sequence learning in a spiking neuronal network with memristive synapses [0.0]
A core concept that lies at the heart of brain computation is sequence learning and prediction.
Neuromorphic hardware emulates the way the brain processes information and maps neurons and synapses directly into a physical substrate.
We study the feasibility of using ReRAM devices as a replacement for the biological synapses in the sequence learning model.
arXiv Detail & Related papers (2022-11-29T21:07:23Z)
- A bio-inspired implementation of a sparse-learning spike-based hippocampus memory model [0.0]
We propose a novel bio-inspired memory model based on the hippocampus.
It can learn memories, recall them from a cue and even forget memories when trying to learn others with the same cue.
This work presents the first hardware implementation of a fully functional bio-inspired spike-based hippocampus memory model.
arXiv Detail & Related papers (2022-06-10T07:48:29Z)
- Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity [9.453554184019108]
Hebbian plasticity is believed to play a pivotal role in biological memory.
We introduce a novel spiking neural network architecture that is enriched by Hebbian synaptic plasticity.
We show that Hebbian enrichment renders spiking neural networks surprisingly versatile in terms of their computational as well as learning capabilities.
arXiv Detail & Related papers (2022-05-23T12:48:37Z)
- CogNGen: Constructing the Kernel of a Hyperdimensional Predictive Processing Cognitive Architecture [79.07468367923619]
We present a new cognitive architecture that combines two neurobiologically plausible, computational models.
We aim to develop a cognitive architecture that has the power of modern machine learning techniques.
arXiv Detail & Related papers (2022-03-31T04:44:28Z)
- A Neural Dynamic Model based on Activation Diffusion and a Micro-Explanation for Cognitive Operations [4.416484585765028]
The neural mechanism of memory is closely related to the problem of representation in artificial intelligence.
A computational model was proposed to simulate the network of neurons in the brain and how they process information.
arXiv Detail & Related papers (2020-11-27T01:34:08Z)
- Neurocoder: Learning General-Purpose Computation Using Stored Neural Programs [64.56890245622822]
Neurocoder is an entirely new class of general-purpose conditional computational machines.
It "codes" itself in a data-responsive way by composing relevant programs from a set of shareable, modular programs.
We show new capacity to learn modular programs, handle severe pattern shifts and remember old programs as new ones are learnt.
arXiv Detail & Related papers (2020-09-24T01:39:16Z)
- Reservoir Memory Machines as Neural Computers [70.5993855765376]
Differentiable neural computers extend artificial neural networks with an explicit memory without interference.
We achieve some of the computational capabilities of differentiable neural computers with a model that can be trained very efficiently.
arXiv Detail & Related papers (2020-09-14T12:01:30Z)
- Self-Attentive Associative Memory [69.40038844695917]
We propose to separate the storage of individual experiences (item memory) and their occurring relationships (relational memory).
We achieve competitive results with our proposed two-memory model in a diversity of machine learning tasks.
arXiv Detail & Related papers (2020-02-10T03:27:48Z)
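Several entries above (Hopfield Networks in the MANN survey, the Hebbian-plasticity spiking network) build on the classical idea that Hebbian outer-product learning stores patterns as attractors of a recurrent network. The short sketch below shows the textbook binary Hopfield construction under that assumption; it is a generic illustration, not the mechanism of any specific paper in this list.

```python
# Minimal sketch of a classical binary Hopfield associative memory with
# Hebbian (outer-product) storage. A generic textbook construction, not the
# method of any particular paper cited above.
import numpy as np

def store(patterns):
    """Hebbian learning: weights are the averaged outer products of the patterns."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(W, 0.0)                    # no self-connections
    return W

def recall(W, probe, steps=20):
    """Iteratively update the state toward a stored pattern (attractor dynamics)."""
    state = probe.copy()
    for _ in range(steps):
        state = np.sign(W @ state)
        state[state == 0] = 1
    return state

# Store two +/-1 patterns of 32 units and recall one from a corrupted cue.
rng = np.random.default_rng(1)
patterns = np.sign(rng.normal(size=(2, 32)))
W = store(patterns)
cue = patterns[0].copy()
cue[:8] *= -1                                   # corrupt a quarter of the bits
print(np.array_equal(recall(W, cue), patterns[0]))  # usually True for few patterns
```

With only a few stored patterns relative to the number of units, a corrupted cue typically settles back onto the original pattern; this associative-recall behaviour is what the memory-augmented models listed above generalise and scale up.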