Modern Hopfield Networks and Attention for Immune Repertoire
Classification
- URL: http://arxiv.org/abs/2007.13505v1
- Date: Thu, 16 Jul 2020 20:35:46 GMT
- Title: Modern Hopfield Networks and Attention for Immune Repertoire
Classification
- Authors: Michael Widrich, Bernhard Schäfl, Hubert Ramsauer, Milena Pavlović,
Lukas Gruber, Markus Holzleitner, Johannes Brandstetter, Geir Kjetil Sandve,
Victor Greiff, Sepp Hochreiter, Günter Klambauer
- Abstract summary: We show that the attention mechanism of transformer architectures is actually the update rule of modern Hopfield networks.
We exploit this high storage capacity to solve a challenging multiple instance learning (MIL) problem in computational biology.
We present our novel method DeepRC that integrates transformer-like attention, or equivalently modern Hopfield networks, into deep learning architectures.
- Score: 8.488102471604908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A central mechanism in machine learning is to identify, store, and recognize
patterns. How to learn, access, and retrieve such patterns is crucial in
Hopfield networks and the more recent transformer architectures. We show that
the attention mechanism of transformer architectures is actually the update
rule of modern Hopfield networks that can store exponentially many patterns. We
exploit this high storage capacity of modern Hopfield networks to solve a
challenging multiple instance learning (MIL) problem in computational biology:
immune repertoire classification. Accurate and interpretable machine learning
methods solving this problem could pave the way towards new vaccines and
therapies, which is currently a very relevant research topic intensified by the
COVID-19 crisis. Immune repertoire classification based on the vast number of
immunosequences of an individual is a MIL problem with an unprecedentedly
massive number of instances, two orders of magnitude larger than currently
considered problems, and with an extremely low witness rate. In this work, we
present our novel method DeepRC that integrates transformer-like attention, or
equivalently modern Hopfield networks, into deep learning architectures for
massive MIL such as immune repertoire classification. We demonstrate that
DeepRC outperforms all other methods with respect to predictive performance on
large-scale experiments, including simulated and real-world virus infection
data, and enables the extraction of sequence motifs that are connected to a
given disease class. Source code and datasets: https://github.com/ml-jku/DeepRC
Related papers
- CAF-YOLO: A Robust Framework for Multi-Scale Lesion Detection in Biomedical Imagery [0.0682074616451595]
CAF-YOLO is a nimble yet robust method for medical object detection that leverages the strengths of convolutional neural networks (CNNs) and transformers.
The ACFM module enhances the modeling of both global and local features, enabling the capture of long-term feature dependencies.
The MSNN improves multi-scale information aggregation by extracting features across diverse scales.
arXiv Detail & Related papers (2024-08-04T01:44:44Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- Keep It Simple: CNN Model Complexity Studies for Interference Classification Tasks [7.358050500046429]
We study the trade-off amongst dataset size, CNN model complexity, and classification accuracy under various levels of classification difficulty.
Our study, based on three wireless datasets, shows that a simpler CNN model with fewer parameters can perform just as well as a more complex model.
arXiv Detail & Related papers (2023-03-06T17:53:42Z)
- Parameter-Efficient Masking Networks [61.43995077575439]
Advanced network designs often contain a large number of repetitive structures (e.g., the Transformer).
In this study, we are the first to investigate the representative potential of fixed random weights with limited unique values by learning masks.
This leads to a new paradigm for model compression that reduces model size.
arXiv Detail & Related papers (2022-10-13T03:39:03Z)
- Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection [122.4894940892536]
We present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level.
In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss.
arXiv Detail & Related papers (2022-09-25T04:56:10Z)
- Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z)
- The emergence of a concept in shallow neural networks [0.0]
We consider restricted Boltzmann machines (RBMs) trained on an unstructured dataset made of blurred copies of definite but unavailable "archetypes".
We show that there exists a critical sample size beyond which the RBM can learn archetypes.
arXiv Detail & Related papers (2021-09-01T15:56:38Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture that allows us to infuse a deep neural network with human-powered abstraction at the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- AutoML-Zero: Evolving Machine Learning Algorithms From Scratch [76.83052807776276]
We show that it is possible to automatically discover complete machine learning algorithms just using basic mathematical operations as building blocks.
We demonstrate this by introducing a novel framework that significantly reduces human bias through a generic search space.
We believe these preliminary successes in discovering machine learning algorithms from scratch indicate a promising new direction in the field.
arXiv Detail & Related papers (2020-03-06T19:00:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.