Centroids Matching: an efficient Continual Learning approach operating
in the embedding space
- URL: http://arxiv.org/abs/2208.02048v1
- Date: Wed, 3 Aug 2022 13:17:16 GMT
- Title: Centroids Matching: an efficient Continual Learning approach operating
in the embedding space
- Authors: Jary Pomponi, Simone Scardapane, Aurelio Uncini
- Abstract summary: Catastrophic forgetting (CF) occurs when a neural network loses the information previously learned while training on a set of samples from a different distribution.
We propose a novel regularization method called Centroids Matching, which fights CF by operating in the feature space produced by the neural network.
- Score: 15.705568893476947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Catastrophic forgetting (CF) occurs when a neural network loses the
information previously learned while training on a set of samples from a
different distribution, i.e., a new task. Existing approaches have achieved
remarkable results in mitigating CF, especially in a scenario called task
incremental learning. However, this scenario is not realistic, and limited work
has been done to achieve good results in more realistic ones. In this
paper, we propose a novel regularization method called Centroids Matching,
that, inspired by meta-learning approaches, fights CF by operating in the
feature space produced by the neural network, achieving good results while
requiring a small memory footprint. Specifically, the approach classifies the
samples directly using the feature vectors produced by the neural network, by
matching those vectors with the centroids representing the classes from the
current task, or all the tasks up to that point. Centroids Matching is faster
than competing baselines and mitigates CF efficiently by preserving the
distances between the embedding space the model produced when each past task
ended and the one it currently produces. The result is a method that achieves
high accuracy on all tasks, using no external memory in easy scenarios and
only a small one in more realistic ones.
Extensive experiments demonstrate that Centroids Matching achieves accuracy
gains on multiple datasets and scenarios.
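As a rough illustration of the two mechanisms described above, the sketch below shows how samples can be classified by matching their embeddings against per-class centroids, and how a distance-preservation penalty can tie the current embedding space to a snapshot taken when a past task ended. This is a minimal sketch in PyTorch under assumed conventions; the function names and the exact loss form are illustrative, not the authors' implementation.

```python
# Hypothetical sketch of nearest-centroid classification plus a
# distance-preservation regularizer (illustrative, not the paper's code).
import torch
import torch.nn.functional as F

def class_centroids(embeddings: torch.Tensor, labels: torch.Tensor,
                    num_classes: int) -> torch.Tensor:
    """One centroid per class: the mean embedding of that class's samples."""
    dim = embeddings.size(1)
    centroids = torch.zeros(num_classes, dim, device=embeddings.device)
    for c in range(num_classes):
        centroids[c] = embeddings[labels == c].mean(dim=0)
    return centroids

def centroid_logits(embeddings: torch.Tensor,
                    centroids: torch.Tensor) -> torch.Tensor:
    """Negative squared Euclidean distance to each centroid; the nearest
    centroid wins, so argmax over these logits is the predicted class."""
    return -torch.cdist(embeddings, centroids) ** 2

def distance_preservation_loss(current_emb: torch.Tensor,
                               snapshot_emb: torch.Tensor) -> torch.Tensor:
    """Penalize drift between pairwise distances in the current embedding
    space and in the space snapshotted when a past task ended."""
    return F.mse_loss(torch.pdist(current_emb), torch.pdist(snapshot_emb))

# Assumed usage: cross-entropy over centroid logits for the current task,
# plus the regularizer on buffered past-task samples embedded by both the
# frozen snapshot network and the live one, e.g.
#   loss = F.cross_entropy(centroid_logits(z, centroids), y) \
#        + lam * distance_preservation_loss(live(x_buf), snapshot(x_buf))
```

The small buffer `x_buf` corresponds to the abstract's note that more realistic scenarios use a small external memory.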
Related papers
- Diffusion Generative Flow Samplers: Improving learning signals through
partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in terms of accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - TADIL: Task-Agnostic Domain-Incremental Learning through Task-ID
Inference using Transformer Nearest-Centroid Embeddings [0.0]
We propose a novel pipeline for identifying tasks in domain-incremental learning scenarios without supervision.
We leverage the lightweight computational requirements of the pipeline to devise an algorithm that decides in an online fashion when to learn a new task (a toy nearest-centroid sketch appears after this list).
arXiv Detail & Related papers (2023-06-21T00:55:02Z) - Multi-Level Contrastive Learning for Dense Prediction Task [59.591755258395594]
We present Multi-Level Contrastive Learning for Dense Prediction Task (MCL), an efficient self-supervised method for learning region-level feature representation for dense prediction tasks.
Our method is motivated by the three key factors in detection: localization, scale consistency and recognition.
Our method consistently outperforms recent state-of-the-art methods on various datasets by significant margins.
arXiv Detail & Related papers (2023-04-04T17:59:04Z) - Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data.
The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task.
We show that this protocol is often sub-optimal, and that the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen.
arXiv Detail & Related papers (2023-03-02T17:32:11Z) - Compare Where It Matters: Using Layer-Wise Regularization To Improve
Federated Learning on Heterogeneous Data [0.0]
Federated Learning is a widely adopted method to train neural networks over distributed data.
One main limitation is the performance degradation that occurs when data is heterogeneously distributed.
We present FedCKA: a framework that outperforms previous state-of-the-art methods on various deep learning tasks.
arXiv Detail & Related papers (2021-12-01T10:46:13Z) - Dense Unsupervised Learning for Video Segmentation [49.46930315961636]
We present a novel approach to unsupervised learning for video object segmentation (VOS).
Unlike previous work, our formulation allows learning dense feature representations directly in a fully convolutional regime.
Our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.
arXiv Detail & Related papers (2021-11-11T15:15:11Z) - Hyperdimensional Computing for Efficient Distributed Classification with
Randomized Neural Networks [5.942847925681103]
We study distributed classification, which can be employed in situations where data cannot be stored at a central location or shared.
We propose a more efficient solution for distributed classification by making use of a lossy compression approach applied when sharing the local classifiers with other agents.
arXiv Detail & Related papers (2021-06-02T01:33:56Z) - Conditional Meta-Learning of Linear Representations [57.90025697492041]
Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks.
In this work we move beyond a single shared representation by inferring a conditioning function, mapping the tasks' side information into a representation tailored to the task at hand.
We propose a meta-algorithm capable of leveraging this advantage in practice.
arXiv Detail & Related papers (2021-03-30T12:02:14Z) - Few-shot Weakly-Supervised Object Detection via Directional Statistics [55.97230224399744]
We propose a probabilistic multiple instance learning approach for few-shot Common Object Localization (COL) and few-shot Weakly Supervised Object Detection (WSOD).
Our model simultaneously learns the distribution of the novel objects and localizes them via expectation-maximization steps.
Our experiments show that the proposed method, despite being simple, outperforms strong baselines in few-shot COL and WSOD, as well as large-scale WSOD tasks.
arXiv Detail & Related papers (2021-03-25T22:34:16Z) - SpaceNet: Make Free Space For Continual Learning [15.914199054779438]
We propose a novel architecture-based method, referred to as SpaceNet, for the class-incremental learning scenario.
SpaceNet trains sparse deep neural networks from scratch in an adaptive way that compresses the sparse connections of each task in a compact number of neurons.
Experimental results show the robustness of our proposed method against catastrophic forgetting of old tasks and the efficiency of SpaceNet in utilizing the available capacity of the model.
arXiv Detail & Related papers (2020-07-15T11:21:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.