DRILL: Dynamic Representations for Imbalanced Lifelong Learning
- URL: http://arxiv.org/abs/2105.08445v1
- Date: Tue, 18 May 2021 11:36:37 GMT
- Title: DRILL: Dynamic Representations for Imbalanced Lifelong Learning
- Authors: Kyra Ahrens, Fares Abawi, Stefan Wermter
- Abstract summary: Continual or lifelong learning has been a long-standing challenge in machine learning to date.
We introduce DRILL, a novel continual learning architecture for open-domain text classification.
- Score: 15.606651610221416
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual or lifelong learning has been a long-standing challenge in machine
learning to date, especially in natural language processing (NLP). Although
state-of-the-art language models such as BERT have ushered in a new era in this
field due to their outstanding performance in multitask learning scenarios,
they suffer from forgetting when being exposed to a continuous stream of data
with shifting data distributions. In this paper, we introduce DRILL, a novel
continual learning architecture for open-domain text classification. DRILL
leverages a biologically inspired self-organizing neural architecture to
selectively gate latent language representations from BERT in a
task-incremental manner. We demonstrate in our experiments that DRILL
outperforms current methods in a realistic scenario of imbalanced,
non-stationary data without prior knowledge about task boundaries. To the best
of our knowledge, DRILL is the first of its kind to use a self-organizing
neural architecture for open-domain lifelong learning in NLP.
Related papers
- Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning [70.64617500380287]
Continual learning allows models to learn from new data while retaining previously learned knowledge.
The semantic knowledge available in the label information of the images, offers important semantic information that can be related with previously acquired knowledge of semantic classes.
We propose integrating semantic guidance within and across tasks by capturing semantic similarity using text embeddings.
arXiv Detail & Related papers (2024-08-02T07:51:44Z) - CTP: Towards Vision-Language Continual Pretraining via Compatible
Momentum Contrast and Topology Preservation [128.00940554196976]
Vision-Language Continual Pretraining (VLCP) has shown impressive results on diverse downstream tasks by offline training on large-scale datasets.
To support the study of Vision-Language Continual Pretraining (VLCP), we first contribute a comprehensive and unified benchmark dataset P9D.
The data from each industry as an independent task supports continual learning and conforms to the real-world long-tail nature to simulate pretraining on web data.
arXiv Detail & Related papers (2023-08-14T13:53:18Z) - Transfer Learning with Deep Tabular Models [66.67017691983182]
We show that upstream data gives tabular neural networks a decisive advantage over GBDT models.
We propose a realistic medical diagnosis benchmark for tabular transfer learning.
We propose a pseudo-feature method for cases where the upstream and downstream feature sets differ.
arXiv Detail & Related papers (2022-06-30T14:24:32Z) - Reducing Catastrophic Forgetting in Self Organizing Maps with
Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z) - Embedding Symbolic Temporal Knowledge into Deep Sequential Models [21.45383857094518]
Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning.
Deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources.
We construct semantic-based embeddings of automata generated from formula via a Graph Neural Network. Experiments show that these learnt embeddings can lead to improvements in downstream robot tasks such as sequential action recognition and imitation learning.
arXiv Detail & Related papers (2021-01-28T13:17:46Z) - Fine-tuning BERT for Low-Resource Natural Language Understanding via
Active Learning [30.5853328612593]
In this work, we explore fine-tuning methods of BERT -- a pre-trained Transformer based language model.
Our experimental results show an advantage in model performance by maximizing the approximate knowledge gain of the model.
We analyze the benefits of freezing layers of the language model during fine-tuning to reduce the number of trainable parameters.
arXiv Detail & Related papers (2020-12-04T08:34:39Z) - Continual Learning for Natural Language Generation in Task-oriented
Dialog Systems [72.92029584113676]
Natural language generation (NLG) is an essential component of task-oriented dialog systems.
We study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities incrementally.
The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends to forget the knowledge it has learned before.
arXiv Detail & Related papers (2020-10-02T10:32:29Z) - Self-Supervised Learning Aided Class-Incremental Lifelong Learning [17.151579393716958]
We study the issue of catastrophic forgetting in class-incremental learning (Class-IL)
In training procedure of Class-IL, as the model has no knowledge about following tasks, it would only extract features necessary for tasks learned so far, whose information is insufficient for joint classification.
We propose to combine self-supervised learning, which can provide effective representations without requiring labels, with Class-IL to partly get around this problem.
arXiv Detail & Related papers (2020-06-10T15:15:27Z) - A Neural Dirichlet Process Mixture Model for Task-Free Continual
Learning [48.87397222244402]
We propose an expansion-based approach for task-free continual learning.
Our model successfully performs task-free continual learning for both discriminative and generative tasks.
arXiv Detail & Related papers (2020-01-03T02:07:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.