BOWLL: A Deceptively Simple Open World Lifelong Learner
- URL: http://arxiv.org/abs/2402.04814v1
- Date: Wed, 7 Feb 2024 13:04:35 GMT
- Title: BOWLL: A Deceptively Simple Open World Lifelong Learner
- Authors: Roshni Kamath, Rupert Mitchell, Subarnaduti Paul, Kristian Kersting,
Martin Mundt
- Abstract summary: We propose a deceptively simple yet highly effective way to repurpose standard models for open world lifelong learning.
Our approach should serve as a future standard for models that are able to effectively maintain their knowledge, selectively focus on informative data, and accelerate future learning.
- Score: 22.375833943808995
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The quest to improve scalar performance numbers on predetermined benchmarks
seems to be deeply ingrained in deep learning. However, the real world is seldom
carefully curated and applications are seldom limited to excelling on test
sets. A practical system is generally required to recognize novel concepts,
refrain from actively including uninformative data, and retain previously
acquired knowledge throughout its lifetime. Despite these key elements being
rigorously researched individually, the study of their conjunction, open world
lifelong learning, is only a recent trend. To accelerate this multifaceted
field's exploration, we introduce its first monolithic and much-needed
baseline. Leveraging the ubiquitous use of batch normalization across deep
neural networks, we propose a deceptively simple yet highly effective way to
repurpose standard models for open world lifelong learning. Through extensive
empirical evaluation, we highlight why our approach should serve as a future
standard for models that are able to effectively maintain their knowledge,
selectively focus on informative data, and accelerate future learning.
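As a rough illustration of the idea, consider how a network's batch normalization running statistics can flag novel inputs. The sketch below is a minimal PyTorch illustration under assumptions of ours (the backbone choice, hook placement, and z-score rule are illustrative, not the authors' exact method):

```python
# Hedged sketch: score a batch's novelty by how far its activation
# statistics drift from each BatchNorm layer's running statistics.
# Illustrative only; NOT the exact BOWLL mechanism.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=None)  # any BN-equipped backbone
model.eval()

def bn_novelty_score(model: nn.Module, x: torch.Tensor) -> float:
    """Mean per-channel z-score between batch statistics and the stored
    BatchNorm running statistics, averaged over all BN layers."""
    scores, hooks = [], []

    def make_hook(bn: nn.BatchNorm2d):
        def hook(module, inputs, output):
            act = inputs[0]
            mu = act.mean(dim=(0, 2, 3))  # per-channel batch mean
            z = (mu - bn.running_mean).abs() / (bn.running_var + bn.eps).sqrt()
            scores.append(z.mean().item())
        return hook

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(make_hook(m)))
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    return sum(scores) / len(scores)

# Higher scores suggest the batch deviates from what the BN layers have
# seen; thresholding such a score could gate what a learner trains on.
score = bn_novelty_score(model, torch.randn(16, 3, 224, 224))
```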
Related papers
- Towards Few-Shot Learning in the Open World: A Review and Beyond [52.41344813375177]
Few-shot learning aims to mimic human intelligence by enabling significant generalization and transferability.
This paper presents a review of recent advancements designed to adapt FSL for use in open-world settings.
We categorize existing methods into three distinct types of open-world few-shot learning: those involving varying instances, varying classes, and varying distributions.
arXiv Detail & Related papers (2024-08-19T06:23:21Z)
- Learning to Continually Learn with the Bayesian Principle [36.75558255534538]
In this work, we adopt the meta-learning paradigm to combine the strong representational power of neural networks and simple statistical models' robustness to forgetting.
Since the neural networks remain fixed during continual learning, they are protected from catastrophic forgetting.
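A minimal sketch of this recipe, a frozen encoder feeding a simple statistical head updated in closed form, might look as follows (the Gaussian naive Bayes head is our illustrative assumption, not the paper's exact construction):

```python
# Hedged sketch: per-class diagonal Gaussians over frozen features,
# updated online in closed form (Welford-style), so no gradient step
# ever touches the head or the encoder.
import torch

class GaussianNaiveBayesHead:
    def __init__(self, num_classes: int, dim: int):
        self.n = torch.zeros(num_classes)
        self.mean = torch.zeros(num_classes, dim)
        self.m2 = torch.ones(num_classes, dim)  # running squared deviations

    def update(self, feats: torch.Tensor, labels: torch.Tensor):
        for f, y in zip(feats, labels):
            self.n[y] += 1
            delta = f - self.mean[y]
            self.mean[y] += delta / self.n[y]
            self.m2[y] += delta * (f - self.mean[y])

    def predict(self, feats: torch.Tensor) -> torch.Tensor:
        var = self.m2 / self.n.clamp(min=1.0).unsqueeze(1)
        log_p = -0.5 * (((feats.unsqueeze(1) - self.mean) ** 2) / var
                        + var.log()).sum(-1)  # per-class log-likelihood
        return log_p.argmax(dim=1)
```

Because the update is exact rather than gradient-based, new classes never overwrite old ones in the head, which is one way such a design avoids forgetting.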
arXiv Detail & Related papers (2024-05-29T04:53:31Z)
- Open-world Machine Learning: A Review and New Outlooks [83.6401132743407]
This paper aims to provide a comprehensive introduction to the emerging open-world machine learning paradigm.
It aims to help researchers build more powerful AI systems in their respective fields, and to promote the development of artificial general intelligence.
arXiv Detail & Related papers (2024-03-04T06:25:26Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- The Trifecta: Three simple techniques for training deeper Forward-Forward networks [0.0]
We propose a collection of three techniques that synergize exceptionally well and drastically improve the Forward-Forward algorithm on deeper networks.
Our experiments demonstrate that our models are on par with similarly structured, backpropagation-based models in both training speed and test accuracy on simple datasets.
arXiv Detail & Related papers (2023-11-29T22:44:32Z)
- Window-based Model Averaging Improves Generalization in Heterogeneous Federated Learning [29.140054600391917]
Federated Learning (FL) aims to learn a global model from distributed users while protecting their privacy.
We propose WIMA (Window-based Model Averaging), which aggregates global models from different rounds using a window-based approach.
Our experiments demonstrate the robustness of WIMA against distribution shifts and bad client sampling, resulting in smoother and more stable learning trends.
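A minimal sketch of window-based averaging (the uniform window and deque bookkeeping are our assumptions; WIMA's exact weighting may differ):

```python
# Hedged sketch: keep the last W global models from federated rounds and
# serve the parameter-wise average of the window.
from collections import deque
import copy
import torch

class WindowAveragedModel:
    def __init__(self, window: int = 5):
        self.history = deque(maxlen=window)  # last W global state dicts

    def push(self, state_dict: dict):
        self.history.append(copy.deepcopy(state_dict))

    def averaged_state_dict(self) -> dict:
        # Uniform average of each parameter tensor across stored rounds.
        return {k: torch.stack([sd[k].float() for sd in self.history]).mean(0)
                for k in self.history[0]}
```

Averaging across rounds smooths out round-to-round oscillations caused by heterogeneous client sampling, which is consistent with the stability the summary reports.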
arXiv Detail & Related papers (2023-10-02T17:30:14Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
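The family of prior-based methods BAdam belongs to can be sketched as a quadratic penalty pulling parameters toward their post-task values, scaled by per-parameter importance (this generic form is our illustration, not BAdam itself):

```python
# Hedged sketch: generic prior-based continual learning penalty.
# total_loss = task_loss + prior_penalty(model, prev_params, importance)
import torch

def prior_penalty(model: torch.nn.Module, prev_params: dict,
                  importance: dict, strength: float = 1.0) -> torch.Tensor:
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        # Important parameters are anchored to their old values.
        loss = loss + (importance[name] * (p - prev_params[name]) ** 2).sum()
    return strength * loss
```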
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Conditional Online Learning for Keyword Spotting [0.0]
This work investigates a simple but effective online continual learning method that updates a keyword spotter on-device via SGD as new data becomes available.
Experiments demonstrate that, compared to a naive online learning implementation, conditional model updates based on its performance in a small hold-out set drawn from the training distribution mitigate catastrophic forgetting.
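A minimal sketch of such a conditional update (the snapshot-and-rollback criterion is our illustrative reading of the abstract, not necessarily the paper's exact rule):

```python
# Hedged sketch: take an SGD step on new data, keep it only if accuracy
# on a small hold-out set from the training distribution does not drop.
import copy
import torch

def holdout_accuracy(model, holdout) -> float:
    x, y = holdout
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

def conditional_update(model, optimizer, loss_fn, new_batch, holdout,
                       tolerance: float = 0.01):
    snapshot = copy.deepcopy(model.state_dict())
    before = holdout_accuracy(model, holdout)

    x, y = new_batch
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()

    if holdout_accuracy(model, holdout) < before - tolerance:
        model.load_state_dict(snapshot)  # reject the update: forgetting
```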
arXiv Detail & Related papers (2023-05-19T15:46:31Z)
- Label-efficient Time Series Representation Learning: A Review [19.218833228063392]
Label-efficient time series representation learning is crucial for deploying deep learning models in real-world applications.
To address the scarcity of labeled time series data, various strategies, e.g., transfer learning, self-supervised learning, and semi-supervised learning, have been developed.
We introduce a novel taxonomy for the first time, categorizing existing approaches as in-domain or cross-domain, based on their reliance on external data sources.
arXiv Detail & Related papers (2023-02-13T15:12:15Z)
- DITTO: Offline Imitation Learning with World Models [21.419536711242962]
DITTO is an offline imitation learning algorithm, built on learned world models, that addresses the key challenges of learning from demonstrations without online interaction.
It measures a multi-step divergence between agent and expert trajectories in the world model's latent space and optimizes it using standard reinforcement learning algorithms.
Our results show how creative use of world models can lead to a simple, robust, and highly-performant policy-learning framework.
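The central reward idea can be sketched simply (the squared-L2 distance is an assumption; DITTO's divergence measure may differ):

```python
# Hedged sketch: reward the agent for staying close to an expert
# trajectory in the world model's latent space; any standard RL
# algorithm can then maximize this reward.
import torch

def latent_matching_reward(agent_latents: torch.Tensor,
                           expert_latents: torch.Tensor) -> torch.Tensor:
    """Per-step reward for aligned [T, D] latent rollouts: negative
    squared distance between agent and expert latent states."""
    return -(agent_latents - expert_latents).pow(2).sum(dim=-1)
```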
arXiv Detail & Related papers (2023-02-06T19:41:18Z)
- Learning and Retrieval from Prior Data for Skill-based Imitation Learning [47.59794569496233]
We develop a skill-based imitation learning framework that extracts temporally extended sensorimotor skills from prior data.
We identify several key design choices that significantly improve performance on novel tasks.
arXiv Detail & Related papers (2022-10-20T17:34:59Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
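The iterative, column-wise idea can be sketched as follows (the fixed random-forest learner is our simplification; HyperImpute's point is to select each column's model automatically):

```python
# Hedged sketch: MICE-style column-wise iterative imputation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def iterative_impute(X: np.ndarray, n_iters: int = 5) -> np.ndarray:
    X = X.copy()
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.where(missing)[1])  # mean-fill init
    for _ in range(n_iters):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue
            other = np.delete(X, j, axis=1)       # all other columns
            model = RandomForestRegressor(n_estimators=50)
            model.fit(other[~rows], X[~rows, j])  # fit on observed rows
            X[rows, j] = model.predict(other[rows])  # re-impute the rest
    return X
```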
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- BERT WEAVER: Using WEight AVERaging to enable lifelong learning for transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model.
We show that applying WEAVER in a sequential manner results in similar word embedding distributions as doing a combined training on all data at once.
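A minimal sketch of such post-hoc weight averaging (the uniform convex combination is our assumption; WEAVER's exact weighting may differ):

```python
# Hedged sketch: blend a newly fine-tuned model's parameters with the
# previous model's to retain old knowledge, entirely post hoc.
import torch

def weight_average(old_state: dict, new_state: dict,
                   alpha: float = 0.5) -> dict:
    """Interpolate two state dicts: alpha * old + (1 - alpha) * new."""
    return {k: alpha * old_state[k].float()
               + (1 - alpha) * new_state[k].float()
            for k in new_state}

# model.load_state_dict(weight_average(old_sd, new_sd, alpha=0.5))
```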
arXiv Detail & Related papers (2022-02-21T10:34:41Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online.
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
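The test-then-train protocol itself is easy to state in code (a generic sketch, not the benchmark's implementation):

```python
# Hedged sketch: each incoming batch is evaluated first, then trained on,
# so every prediction is made before the model has seen that data.
import torch

def online_continual_loop(model, optimizer, loss_fn, stream):
    correct, seen = 0, 0
    for x, y in stream:                  # non-stationary data stream
        with torch.no_grad():            # 1) test on the incoming batch
            correct += (model(x).argmax(dim=1) == y).sum().item()
            seen += y.numel()
        optimizer.zero_grad()            # 2) then train on the same batch
        loss_fn(model(x), y).backward()
        optimizer.step()
    return correct / max(seen, 1)        # online (next-batch) accuracy
```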
arXiv Detail & Related papers (2021-08-20T06:17:20Z)
- Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where the task is not purely opaque: partial knowledge of the underlying system is often available.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z)
- Self-Damaging Contrastive Learning [92.34124578823977]
Unlabeled data in reality is commonly imbalanced and shows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning to automatically balance the representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
arXiv Detail & Related papers (2021-06-06T00:04:49Z)
- Generalising via Meta-Examples for Continual Learning in the Wild [24.09600678738403]
We develop a novel strategy to deal with neural networks that "learn in the wild".
We equip it with MEML (Meta-Example Meta-Learning), a new module that alleviates catastrophic forgetting.
We extend it by adopting a technique that creates various augmented tasks and optimises over the hardest.
arXiv Detail & Related papers (2021-01-28T15:51:54Z)
- Deep Bayesian Active Learning, A Brief Survey on Recent Advances [6.345523830122166]
Active learning starts by training the model on a small amount of labeled data.
Standard deep learning methods are not capable of representing or manipulating model uncertainty.
Deep Bayesian active learning frameworks address this by providing practical ways to capture model uncertainty.
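A common instantiation of this loop scores pool points by MC-dropout predictive entropy; the sketch below is one such illustrative choice, not a summary of every framework in the survey:

```python
# Hedged sketch: estimate predictive uncertainty with MC dropout and
# pick the most uncertain unlabeled points for annotation.
import torch

def mc_dropout_entropy(model, x: torch.Tensor, samples: int = 10):
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=1)
                             for _ in range(samples)])
    mean = probs.mean(dim=0)  # Monte Carlo predictive average
    return -(mean * mean.clamp_min(1e-12).log()).sum(dim=1)  # entropy

def select_for_labeling(model, pool_x: torch.Tensor, budget: int):
    scores = mc_dropout_entropy(model, pool_x)
    return scores.topk(budget).indices  # indices of points to annotate
```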
arXiv Detail & Related papers (2020-12-15T02:06:07Z)
- A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning [8.188575923130662]
We argue that notable lessons from open set recognition, the identification of statistically deviating data outside of the observed dataset, and the adjacent field of active learning, are frequently overlooked in the deep learning era.
Our results show that this not only benefits each individual paradigm, but also highlights the natural synergies in a common framework.
arXiv Detail & Related papers (2020-09-03T16:56:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.