Efficient Feature Transformations for Discriminative and Generative
Continual Learning
- URL: http://arxiv.org/abs/2103.13558v1
- Date: Thu, 25 Mar 2021 01:48:14 GMT
- Title: Efficient Feature Transformations for Discriminative and Generative
Continual Learning
- Authors: Vinay Kumar Verma, Kevin J Liang, Nikhil Mehta, Piyush Rai, Lawrence
Carin
- Abstract summary: We propose a simple task-specific feature map transformation strategy for continual learning.
These transformations provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
- Score: 98.10425163678082
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As neural networks are increasingly being applied to real-world applications,
mechanisms to address distributional shift and sequential task learning without
forgetting are critical. Methods incorporating network expansion have shown
promise by naturally adding model capacity for learning new tasks while
simultaneously avoiding catastrophic forgetting. However, the growth in the
number of additional parameters of many of these types of methods can be
computationally expensive at larger scales, at times prohibitively so. Instead,
we propose a simple task-specific feature map transformation strategy for
continual learning, which we call Efficient Feature Transformations (EFTs).
These EFTs provide powerful flexibility for learning new tasks, achieved with
minimal parameters added to the base architecture. We further propose a feature
distance maximization strategy, which significantly improves task prediction in
class incremental settings, without needing expensive generative models. We
demonstrate the efficacy and efficiency of our method with an extensive set of
experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative (LSUN,
CUB-200, Cats) sequences of tasks. Even with low single-digit parameter growth
rates, EFTs can outperform many other continual learning methods in a wide
range of settings.
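To make the parameter-efficiency idea concrete, below is a minimal PyTorch sketch of the general pattern the abstract describes: a shared base convolution is frozen after its first task, and each new task only adds a cheap depthwise transform over the shared feature map. This is an illustrative approximation under stated assumptions, not the authors' implementation; the module and argument names (EFTBlock, n_tasks, task_id) are hypothetical.

```python
# Illustrative sketch only: shared base conv + one lightweight task-specific
# transform per task. Not the paper's actual EFT code; names are assumptions.
import torch
import torch.nn as nn


class EFTBlock(nn.Module):
    """Shared base conv plus one lightweight, task-specific transform per task."""

    def __init__(self, channels: int, n_tasks: int):
        super().__init__()
        # Shared base convolution; in a continual setting it would typically be
        # trained on the first task and then frozen (see freeze_base below).
        self.base = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # One cheap depthwise 3x3 transform per task; groups=channels keeps the
        # added parameter count small relative to the base layer.
        self.task_transforms = nn.ModuleList(
            [nn.Conv2d(channels, channels, kernel_size=3, padding=1,
                       groups=channels) for _ in range(n_tasks)]
        )
        self.act = nn.ReLU(inplace=True)

    def freeze_base(self) -> None:
        # Protect shared weights from being overwritten by later tasks.
        for p in self.base.parameters():
            p.requires_grad_(False)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        h = self.base(x)
        # Task-specific refinement of the shared feature map.
        h = h + self.task_transforms[task_id](h)
        return self.act(h)


if __name__ == "__main__":
    block = EFTBlock(channels=64, n_tasks=5)
    block.freeze_base()                      # e.g. after training on task 0
    x = torch.randn(2, 64, 32, 32)
    y = block(x, task_id=3)                  # route through task 3's transform
    print(y.shape)                           # torch.Size([2, 64, 32, 32])
```

In this sketch, with 64 channels each per-task depthwise transform adds about 640 parameters against roughly 37K in the base 3x3 convolution, i.e. low single-digit percentage growth per task, which is the regime the abstract refers to. The paper's feature distance maximization strategy for task prediction is not shown here.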
Related papers
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Dynamic Transformer Architecture for Continual Learning of Multimodal
Tasks [27.59758964060561]
Transformer neural networks are increasingly replacing prior architectures in a wide range of applications in different data modalities.
Continual learning (CL) emerges as a solution by facilitating the transfer of knowledge across tasks that arrive sequentially for an autonomously learning agent.
We propose a transformer-based CL framework focusing on learning tasks that involve both vision and language.
arXiv Detail & Related papers (2024-01-27T03:03:30Z) - Efficient Expansion and Gradient Based Task Inference for Replay Free
Incremental Learning [5.760774528950479]
Recent expansion-based models show promising results for task incremental learning (TIL).
For class incremental learning (CIL), predicting the task id is a crucial challenge.
We propose a robust task prediction method that leverages entropy-weighted data augmentations and the model's gradient computed with pseudo labels.
arXiv Detail & Related papers (2023-12-02T17:28:52Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - E2-AEN: End-to-End Incremental Learning with Adaptively Expandable
Network [57.87240860624937]
We propose an end-to-end trainable adaptively expandable network named E2-AEN.
It dynamically generates lightweight structures for new tasks without any accuracy drop in previous tasks.
E2-AEN reduces cost and can be built upon any feed-forward architectures in an end-to-end manner.
arXiv Detail & Related papers (2022-07-14T09:04:51Z) - An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale
Multitask Learning Systems [4.675744559395732]
Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer.
State-of-the-art ML models rely on high customization for each task and leverage size and data scale rather than scaling the number of tasks.
We propose an evolutionary method that can generate a large scale multitask model and can support the dynamic and continuous addition of new tasks.
arXiv Detail & Related papers (2022-05-25T13:10:47Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Learning to Continuously Optimize Wireless Resource In Episodically
Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment.
We propose to build the notion of continual learning into the modeling process of learning wireless systems.
Our design is based on a novel min-max formulation which ensures a certain "fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z) - Parameter-Efficient Transfer from Sequential Behaviors for User Modeling
and Recommendation [111.44445634272235]
In this paper, we develop a parameter efficient transfer learning architecture, termed as PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks; a generic sketch of this frozen-backbone, injected-module pattern follows after this entry.
arXiv Detail & Related papers (2020-01-13T14:09:54Z)