Evaluating Online Continual Learning with CALM
- URL: http://arxiv.org/abs/2004.03340v2
- Date: Mon, 1 Feb 2021 12:20:27 GMT
- Title: Evaluating Online Continual Learning with CALM
- Authors: Germán Kruszewski, Ionut-Teodor Sorodoc, Tomas Mikolov
- Abstract summary: Online Continual Learning studies learning over a continuous data stream without observing any single example more than once.
We propose a new benchmark for OCL based on language modelling in which input alternates between different languages and domains without any explicit delimitation.
We also propose new metrics to study catastrophic forgetting in this setting and evaluate multiple baseline models based on compositions of experts.
- Score: 3.49781504808707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online Continual Learning (OCL) studies learning over a continuous data
stream without observing any single example more than once, a setting that is
closer to the experience of humans and systems that must learn "in the wild".
Yet, commonly available benchmarks are far from these real-world conditions,
because they explicitly signal different tasks, lack latent similarity
structure or assume temporal independence between different examples. Here, we
propose a new benchmark for OCL based on language modelling in which input
alternates between different languages and domains without any explicit
delimitation. Additionally, we propose new metrics to study catastrophic
forgetting in this setting and evaluate multiple baseline models based on
compositions of experts. Finally, we introduce a simple gating technique that
learns the latent similarities between different inputs, improving the
performance of a Products of Experts model.
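The abstract describes combining expert language models multiplicatively (a Products of Experts) with a gate that learns how much each expert should contribute for a given input. Below is a minimal, hypothetical PyTorch sketch of such a gated PoE language model; the class and parameter names (GatedPoE, n_experts, hidden) are illustrative assumptions and do not reproduce the paper's actual architecture.

```python
# Hypothetical sketch of a gated Products of Experts (PoE) language model:
# several expert predictors are combined in log space, and a learned gate
# weighs each expert per input so experts can specialise on latent
# domains/languages. Not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedPoE(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 128, n_experts: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        # Each expert is an independent recurrent language model.
        self.experts = nn.ModuleList(
            [nn.GRU(hidden, hidden, batch_first=True) for _ in range(n_experts)]
        )
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, vocab_size) for _ in range(n_experts)]
        )
        # The gate maps the current token representation to one weight per expert.
        self.gate = nn.Linear(hidden, n_experts)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer ids; returns next-token log-probabilities.
        x = self.embed(tokens)
        expert_logps = []
        for rnn, head in zip(self.experts, self.heads):
            h, _ = rnn(x)                                   # (batch, seq, hidden)
            expert_logps.append(F.log_softmax(head(h), dim=-1))
        logps = torch.stack(expert_logps, dim=-2)           # (batch, seq, experts, vocab)

        # Softmax gate: a convex combination over experts for each position.
        g = F.softmax(self.gate(x), dim=-1).unsqueeze(-1)   # (batch, seq, experts, 1)

        # Product of Experts in log space: weighted sum of expert log-probs,
        # renormalised over the vocabulary.
        combined = (g * logps).sum(dim=-2)
        return combined - torch.logsumexp(combined, dim=-1, keepdim=True)


if __name__ == "__main__":
    model = GatedPoE(vocab_size=100)
    batch = torch.randint(0, 100, (2, 16))
    log_probs = model(batch)                                # (2, 16, 100)
    loss = F.nll_loss(log_probs[:, :-1].reshape(-1, 100), batch[:, 1:].reshape(-1))
    print(loss.item())
```

In an online continual learning stream, a model like this would be trained on each incoming batch exactly once; the gate, rather than an explicit task label, is what lets experts align with the latent language or domain of the current input.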
Related papers
- LLMTemporalComparator: A Tool for Analysing Differences in Temporal Adaptations of Large Language Models [17.021220773165016]
This study addresses the challenges of analyzing temporal discrepancies in large language models (LLMs) trained on data from different time periods.
We propose a novel system that systematically compares the outputs of two LLM versions based on user-defined queries.
arXiv Detail & Related papers (2024-10-05T15:17:07Z)
- Few Shot Class Incremental Learning using Vision-Language models [24.930246674021525]
In this study, we introduce an innovative few-shot class incremental learning (FSCIL) framework that utilizes a language regularizer and a subspace regularizer.
Our proposed framework not only empowers the model to embrace novel classes with limited data, but also ensures the preservation of performance on base classes.
arXiv Detail & Related papers (2024-05-02T06:52:49Z)
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning aims to overcome the catastrophic forgetting of previously acquired knowledge when learning new tasks.
This paper presents a comprehensive survey of the latest advancements in PTM-based CL.
arXiv Detail & Related papers (2024-01-29T18:27:52Z)
- Continual Contrastive Spoken Language Understanding [33.09005399967931]
COCONUT is a class-incremental learning (CIL) method that relies on the combination of experience replay and contrastive learning.
We show that COCONUT can be combined with methods that operate on the decoder side of the model, resulting in further improvements on the evaluated metrics.
arXiv Detail & Related papers (2023-10-04T10:09:12Z)
- RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z)
- OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving-object trajectories, human motion, driving scenes, traffic flow, and weather forecasting.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
- On the Compositional Generalization Gap of In-Context Learning [73.09193595292233]
We look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.
We evaluate four model families (OPT, BLOOM, CodeGen, and Codex) on three semantic parsing datasets.
arXiv Detail & Related papers (2022-11-15T19:56:37Z)
- A Multi-level Supervised Contrastive Learning Framework for Low-Resource Natural Language Inference [54.678516076366506]
Natural Language Inference (NLI) is an increasingly important task in natural language understanding.
Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference.
arXiv Detail & Related papers (2022-05-31T05:54:18Z)
- A Survey on Contrastive Self-supervised Learning [0.0]
Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets.
Contrastive learning has recently become a dominant component in self-supervised learning methods for computer vision, natural language processing (NLP), and other domains.
This paper provides an extensive review of self-supervised methods that follow the contrastive approach.
arXiv Detail & Related papers (2020-10-31T21:05:04Z)
- DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning [83.48587570246231]
Visual Similarity plays an important role in many computer vision applications.
Deep metric learning (DML) is a powerful framework for learning such similarities.
We propose and study multiple complementary learning tasks, targeting conceptually different data relationships.
We learn a single model to aggregate their training signals, resulting in strong generalization and state-of-the-art performance.
arXiv Detail & Related papers (2020-04-28T12:26:50Z)