Encoders and Ensembles for Task-Free Continual Learning
- URL: http://arxiv.org/abs/2105.13327v1
- Date: Thu, 27 May 2021 17:34:31 GMT
- Title: Encoders and Ensembles for Task-Free Continual Learning
- Authors: Murray Shanahan and Christos Kaplanis and Jovana Mitrović
- Abstract summary: We present an architecture that is effective for continual learning in an especially demanding setting, where task boundaries do not exist or are unknown.
We show that models trained with the architecture are state-of-the-art for the task-free setting on standard image classification continual learning benchmarks.
We also show that the architecture learns well in a fully incremental setting, where one class is learned at a time, and we demonstrate its effectiveness in this setting with up to 100 classes.
- Score: 15.831773437720429
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an architecture that is effective for continual learning in an
especially demanding setting, where task boundaries do not exist or are
unknown. Our architecture comprises an encoder, pre-trained on a separate
dataset, and an ensemble of simple one-layer classifiers. Two main innovations
are required to make this combination work. First, the provision of suitably
generic pre-trained encoders has been made possible thanks to recent progress
in self-supervised training methods. Second, pairing each classifier in the
ensemble with a key, where the key-space is identical to the latent space of
the encoder, allows them to be used collectively, yet selectively, via
k-nearest neighbour lookup. We show that models trained with the
encoders-and-ensembles architecture are state-of-the-art for the task-free
setting on standard image classification continual learning benchmarks, and
improve on prior state-of-the-art by a large margin in the most challenging
cases. We also show that the architecture learns well in a fully incremental
setting, where one class is learned at a time, and we demonstrate its
effectiveness in this setting with up to 100 classes. Finally, we show that the
architecture works in a task-free continual learning context where the data
distribution changes gradually, and existing approaches requiring knowledge of
task boundaries cannot be applied.
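The key-per-classifier mechanism is concrete enough to sketch. Below is a minimal NumPy illustration of the idea, assuming a frozen pre-trained encoder and a plain softmax cross-entropy update; the member count, key initialisation, update rule, and the question of whether keys themselves adapt during training are all simplifications, not the paper's exact procedure.

```python
import numpy as np

class EncoderEnsemble:
    """Minimal sketch: a frozen pre-trained encoder feeds an ensemble of
    one-layer classifiers, each paired with a key in the encoder's
    latent space; k-NN lookup over keys decides which members are used
    and updated. Illustrative, not the paper's exact procedure."""

    def __init__(self, encoder, n_members, latent_dim, n_classes,
                 k=8, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.encoder = encoder                   # pre-trained, never updated
        self.keys = rng.normal(size=(n_members, latent_dim))
        self.W = np.zeros((n_members, n_classes, latent_dim))
        self.k, self.lr = k, lr

    def _nearest(self, z):
        # k-nearest-neighbour lookup in key space
        return np.argsort(np.linalg.norm(self.keys - z, axis=1))[: self.k]

    def predict(self, x):
        z = self.encoder(x)
        return (self.W[self._nearest(z)] @ z).mean(axis=0)  # averaged logits

    def update(self, x, y):
        # Task-free: one online step per example, no task boundaries
        z = self.encoder(x)
        for i in self._nearest(z):
            logits = self.W[i] @ z
            p = np.exp(logits - logits.max())
            p /= p.sum()
            grad = np.outer(p, z)                # softmax cross-entropy
            grad[y] -= z                         # gradient w.r.t. W[i]
            self.W[i] -= self.lr * grad

# Toy usage with an identity 'encoder' standing in for a pre-trained one
model = EncoderEnsemble(encoder=lambda x: x, n_members=64,
                        latent_dim=32, n_classes=10)
x = np.random.default_rng(1).normal(size=32)
model.update(x, y=3)
print(model.predict(x).argmax())
```

Because only the k members nearest the current encoding are consulted and updated, learning stays localised in key space, which is what lets the ensemble act collectively yet selectively as the data distribution drifts, without task labels.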
Related papers
- Task agnostic continual learning with Pairwise layer architecture [0.0]
We show that continual learning performance improves when the final layer of the network is replaced with our pairwise interaction layer.
Networks using this architecture show competitive performance in MNIST- and FashionMNIST-based continual image classification experiments.
arXiv Detail & Related papers (2024-05-22T13:30:01Z)
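The summary above does not pin down the layer itself, so the following is only a generic reading of "pairwise interaction": class scores computed from products of feature pairs rather than from raw features. The dimensions and weights are invented for illustration.

```python
import numpy as np

def pairwise_interaction_readout(x, w):
    """Generic pairwise-interaction readout: logits are a weighted sum
    over products of all feature pairs. One plausible reading of a
    'pairwise interaction layer'; the paper's exact form may differ."""
    i, j = np.triu_indices(len(x), k=1)
    return w @ (x[i] * x[j])                 # (n_classes,) logits

rng = np.random.default_rng(0)
x = rng.normal(size=16)                      # penultimate-layer features
w = rng.normal(size=(10, 16 * 15 // 2))      # one weight per feature pair
logits = pairwise_interaction_readout(x, w)
```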
- Reusable Architecture Growth for Continual Stereo Matching [92.36221737921274]
We introduce a Reusable Architecture Growth (RAG) framework to learn new scenes continually in both supervised and self-supervised manners.
RAG maintains high reusability during growth by reusing previous units, while still achieving good performance.
We also present a Scene Router module to adaptively select the scene-specific architecture path at inference.
arXiv Detail & Related papers (2024-03-30T13:24:58Z)
- Semi-supervised Multimodal Representation Learning through a Global Workspace [2.8948274245812335]
"Global Workspace" is a shared representation for two input modalities.
This architecture is amenable to self-supervised training via cycle-consistency.
We show that such an architecture can be trained to align and translate between two modalities with very little need for matched data.
arXiv Detail & Related papers (2023-06-27T12:41:36Z)
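Cycle-consistency is the sketchable part of the Global Workspace entry above. The toy below, with invented layer shapes and linear encoders standing in for real domain modules, shows the unsupervised signal: translate modality A into the shared workspace, out to modality B, and back, then penalise the round-trip error.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the real encoders are domain-specific modules.
D_A, D_B, D_GW = 64, 32, 16

enc_a, dec_a = nn.Linear(D_A, D_GW), nn.Linear(D_GW, D_A)
enc_b, dec_b = nn.Linear(D_B, D_GW), nn.Linear(D_GW, D_B)

def cycle_loss(x_a):
    """Cycle consistency for unmatched data: A -> workspace -> B ->
    workspace -> A, demanding we recover the input. A matched pair,
    when available, would instead supervise enc_a(x_a) ~ enc_b(x_b)."""
    z = enc_a(x_a)            # into the shared workspace
    x_b_hat = dec_b(z)        # translate to modality B
    z_back = enc_b(x_b_hat)   # re-enter the workspace from B
    x_a_hat = dec_a(z_back)   # and translate back to A
    return ((x_a - x_a_hat) ** 2).mean()

loss = cycle_loss(torch.randn(8, D_A))
loss.backward()
```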
- Improving Time Series Encoding with Noise-Aware Self-Supervised Learning and an Efficient Encoder [15.39384259348351]
We propose a training strategy that promotes consistent representation learning while accounting for noise-prone signals in natural time series.
We also propose an encoder architecture that incorporates dilated convolution within the Inception block, resulting in a scalable and robust network with a wide receptive field.
arXiv Detail & Related papers (2023-06-11T04:00:11Z)
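"Dilated convolution within the Inception block" from the time-series entry above is standard enough to sketch. Channel counts, kernel size, and dilation rates below are assumptions; the point is parallel branches whose growing dilations widen the receptive field without deepening the network.

```python
import torch
import torch.nn as nn

class DilatedInceptionBlock(nn.Module):
    """Sketch of an Inception-style 1-D block whose parallel branches
    use increasing dilation rates; the paper's exact branch layout and
    channel counts are not given here."""

    def __init__(self, in_ch, branch_ch, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, branch_ch, kernel_size=3,
                      dilation=d, padding=d)     # length-preserving padding
            for d in dilations
        )
        self.act = nn.ReLU()

    def forward(self, x):                        # x: (batch, in_ch, time)
        return self.act(torch.cat([b(x) for b in self.branches], dim=1))

block = DilatedInceptionBlock(in_ch=8, branch_ch=16)
out = block(torch.randn(4, 8, 128))              # (4, 64, 128)
```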
- Learning from Mistakes: Self-Regularizing Hierarchical Representations in Point Cloud Semantic Segmentation [15.353256018248103]
LiDAR semantic segmentation has gained attention as a route to fine-grained scene understanding.
We present a coarse-to-fine setup that LEArns from classification mistaKes (LEAK) derived from a standard model.
Our LEAK approach is very general and can be seamlessly applied on top of any segmentation architecture.
arXiv Detail & Related papers (2023-01-26T14:52:30Z)
- Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations [63.04466647849211]
Existing methods typically encode task information by prefixing the encoder input with a simple dataset name.
We propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization.
We show this not only allows the model to better learn shared knowledge across different tasks at training, but also allows us to control the model by composing new configurations.
arXiv Detail & Related papers (2022-12-17T02:20:14Z)
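The table-to-text entry above contrasts a bare dataset-name prefix with compositional task configurations. A toy illustration, with an invented prompt vocabulary: each attribute is a reusable fragment, so a new task can be described by recombining fragments rather than by a new opaque name.

```python
# Toy illustration of composing task configurations as encoder prompts;
# the prompt vocabulary here is invented, not the paper's.
def make_config_prompt(task_type, input_schema, output_schema):
    # Each attribute is an independently reusable fragment, so unseen
    # combinations can be composed at test time.
    return (f"[TASK: {task_type}] "
            f"[INPUT: {input_schema}] "
            f"[OUTPUT: {output_schema}] ")

def encode_example(table_text):
    # Unlike a bare dataset-name prefix, the composed prompt exposes
    # structure shared across tasks to the encoder.
    prompt = make_config_prompt("table_qa", "table+question", "short_answer")
    return prompt + table_text

print(encode_example("page_title: ... | header: ... | rows: ..."))
```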
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
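The key/value disentanglement in the Dangle entry above can be sketched. In the toy below (stand-in linear encoders, arbitrary period), values are encoded once, while keys, conditioned on the current decoder state, are refreshed only every few steps rather than at every step, which is where the compute and memory savings come from.

```python
import torch
import torch.nn as nn

D = 32
value_enc = nn.Linear(D, D)        # stand-ins for real source encoders
key_enc = nn.Linear(D, D)
attn = nn.MultiheadAttention(D, num_heads=4, batch_first=True)

def decode(src, steps=12, period=4):
    values = value_enc(src)                    # values: encoded once
    keys = key_enc(src)                        # keys: refreshed periodically
    query = torch.zeros(src.size(0), 1, D)
    outputs = []
    for t in range(steps):
        if t > 0 and t % period == 0:
            # re-encode keys conditioned on the current decoder state,
            # instead of adaptively re-encoding at every single step
            keys = key_enc(src + query)
        query, _ = attn(query, keys, values)
        outputs.append(query)
    return torch.cat(outputs, dim=1)

out = decode(torch.randn(2, 10, D))            # (2, 12, 32)
```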
- Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z)
- Neural Ensemble Search for Uncertainty Estimation and Dataset Shift [67.57720300323928]
Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift.
We propose two methods for automatically constructing ensembles with varying architectures.
We show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift.
arXiv Detail & Related papers (2020-06-15T17:38:15Z)
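To ground the ensemble entry above: the claim is that members with varying architectures beat deep ensembles of a single fixed architecture. A minimal stand-in follows (the search procedure itself is omitted; the member architectures here are arbitrary MLPs), showing prediction averaging over heterogeneous members.

```python
import torch
import torch.nn as nn

def mlp(widths):
    # Build an MLP of arbitrary depth/width from a list of layer sizes
    layers = []
    for a, b in zip(widths, widths[1:]):
        layers += [nn.Linear(a, b), nn.ReLU()]
    return nn.Sequential(*layers[:-1])         # drop the final ReLU

# Varying architectures: different depths and widths per member
members = [mlp([32, 64, 10]), mlp([32, 128, 64, 10]), mlp([32, 10])]

def ensemble_predict(x):
    # Averaging member probabilities gives the ensemble output whose
    # accuracy, calibration, and shift-robustness are being compared
    probs = torch.stack([m(x).softmax(dim=-1) for m in members])
    return probs.mean(dim=0)

pred = ensemble_predict(torch.randn(4, 32))    # (4, 10)
```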
- Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns disjoint representations for task-invariant and task-specific features.
Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)
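Of the two ingredients in the Adversarial Continual Learning entry above, the replay half is easy to sketch. The buffer below uses reservoir sampling, which is an assumption (the summary does not specify a sampling scheme), so the stored examples remain an approximately uniform sample of everything seen so far.

```python
import random

class ReservoirReplay:
    """Sketch of the experience-replay half of the hybrid framework:
    a reservoir buffer preserves shared skills across tasks, while
    architecture growth (not shown) protects task-specific ones."""

    def __init__(self, capacity=1000, seed=0):
        self.capacity, self.buffer = capacity, []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Reservoir sampling: keep a uniform sample of the stream
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))
```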
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.