Continuum: Simple Management of Complex Continual Learning Scenarios
- URL: http://arxiv.org/abs/2102.06253v1
- Date: Thu, 11 Feb 2021 20:29:13 GMT
- Title: Continuum: Simple Management of Complex Continual Learning Scenarios
- Authors: Arthur Douillard and Timoth\'ee Lesort
- Abstract summary: Continual learning is a machine learning sub-field specialized in settings with non-iid data.
Continual learning's challenge is to create algorithms able to learn an ever-growing amount of knowledge while dealing with data distribution drifts.
Small errors in data loaders have a critical impact on algorithm results.
- Score: 1.52292571922932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning is a machine learning sub-field specialized in settings
with non-iid data. Hence, the training data distribution is not static and
drifts through time. Those drifts might cause interferences in the trained
model and knowledge learned on previous states of the data distribution might
be forgotten. Continual learning's challenge is to create algorithms able to
learn an ever-growing amount of knowledge while dealing with data distribution
drifts.
One implementation difficulty in these field is to create data loaders that
simulate non-iid scenarios. Indeed, data loaders are a key component for
continual algorithms. They should be carefully designed and reproducible. Small
errors in data loaders have a critical impact on algorithm results, e.g. with
bad preprocessing, wrong order of data or bad test set. Continuum is a simple
and efficient framework with numerous data loaders that avoid researcher to
spend time on designing data loader and eliminate time-consuming errors. Using
our proposed framework, it is possible to directly focus on the model design by
using the multiple scenarios and evaluation metrics implemented. Furthermore
the framework is easily extendable to add novel settings for specific needs.
Related papers
- Beyond Data Scarcity: A Frequency-Driven Framework for Zero-Shot Forecasting [15.431513584239047]
Time series forecasting is critical in numerous real-world applications.
Traditional forecasting techniques struggle when data is scarce or not available at all.
Recent advancements often leverage large-scale foundation models for such tasks.
arXiv Detail & Related papers (2024-11-24T07:44:39Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for
Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, it is not expected in practice considering the memory constraint or data privacy issue.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Robust Machine Learning by Transforming and Augmenting Imperfect
Training Data [6.928276018602774]
This thesis explores several data sensitivities of modern machine learning.
We first discuss how to prevent ML from codifying prior human discrimination measured in the training data.
We then discuss the problem of learning from data containing spurious features, which provide predictive fidelity during training but are unreliable upon deployment.
arXiv Detail & Related papers (2023-12-19T20:49:28Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Task-Aware Machine Unlearning and Its Application in Load Forecasting [4.00606516946677]
This paper introduces the concept of machine unlearning which is specifically designed to remove the influence of part of the dataset on an already trained forecaster.
A performance-aware algorithm is proposed by evaluating the sensitivity of local model parameter change using influence function and sample re-weighting.
We tested the unlearning algorithms on linear, CNN, andMixer based load forecasters with a realistic load dataset.
arXiv Detail & Related papers (2023-08-28T08:50:12Z) - Federated Learning for Data Streams [12.856037831335994]
Federated learning (FL) is an effective solution to train machine learning models on the increasing amount of data generated by IoT devices and smartphones.
Most previous work on federated learning assumes that clients operate on static datasets collected before training starts.
We propose a general FL algorithm to learn from data streams through an opportune weighted empirical risk minimization.
arXiv Detail & Related papers (2023-01-04T11:10:48Z) - On Measuring the Intrinsic Few-Shot Hardness of Datasets [49.37562545777455]
We show that few-shot hardness may be intrinsic to datasets, for a given pre-trained model.
We propose a simple and lightweight metric called "Spread" that captures the intuition that few-shot learning is made possible.
Our metric better accounts for few-shot hardness compared to existing notions of hardness, and is 8-100x faster to compute.
arXiv Detail & Related papers (2022-11-16T18:53:52Z) - Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z) - The Challenges of Continuous Self-Supervised Learning [40.941767578622745]
Self-supervised learning (SSL) aims to eliminate one of the major bottlenecks in representation learning - the need for human annotations.
We show that a direct application of current methods to such continuous setup is inefficient both computationally and in the amount of data required.
We propose the use of replay buffers as an approach to alleviate the issues of inefficiency and temporal correlations.
arXiv Detail & Related papers (2022-03-23T20:05:06Z) - Injecting Knowledge in Data-driven Vehicle Trajectory Predictors [82.91398970736391]
Vehicle trajectory prediction tasks have been commonly tackled from two perspectives: knowledge-driven or data-driven.
In this paper, we propose to learn a "Realistic Residual Block" (RRB) which effectively connects these two perspectives.
Our proposed method outputs realistic predictions by confining the residual range and taking into account its uncertainty.
arXiv Detail & Related papers (2021-03-08T16:03:09Z) - DeGAN : Data-Enriching GAN for Retrieving Representative Samples from a
Trained Classifier [58.979104709647295]
We bridge the gap between the abundance of available data and lack of relevant data, for the future learning tasks of a trained network.
We use the available data, that may be an imbalanced subset of the original training dataset, or a related domain dataset, to retrieve representative samples.
We demonstrate that data from a related domain can be leveraged to achieve state-of-the-art performance.
arXiv Detail & Related papers (2019-12-27T02:05:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.