DLCFT: Deep Linear Continual Fine-Tuning for General Incremental
Learning
- URL: http://arxiv.org/abs/2208.08112v1
- Date: Wed, 17 Aug 2022 06:58:14 GMT
- Title: DLCFT: Deep Linear Continual Fine-Tuning for General Incremental
Learning
- Authors: Hyounguk Shon, Janghyeon Lee, Seung Hwan Kim, Junmo Kim
- Abstract summary: We propose an alternative framework to incremental learning where we continually fine-tune the model from a pre-trained representation.
Our method takes advantage of the linearization of a pre-trained neural network for simple and effective continual learning.
To show that our method applies to general continual learning settings, we evaluate it on data-incremental, task-incremental, and class-incremental learning problems.
- Score: 29.80680408934347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained representation is one of the key elements in the success of
modern deep learning. However, existing works on continual learning methods
have mostly focused on learning models incrementally from scratch. In this
paper, we explore an alternative framework to incremental learning where we
continually fine-tune the model from a pre-trained representation. Our method
takes advantage of the linearization of a pre-trained neural network for simple
and effective continual learning. We show that this allows us to design a linear
model in which quadratic parameter regularization is the optimal continual
learning policy, while at the same time enjoying the high performance of neural
networks. We also show that the proposed algorithm
enables parameter regularization methods to be applied to class-incremental
problems. Additionally, we provide a theoretical reason why the existing
parameter-space regularization algorithms such as EWC underperform on neural
networks trained with cross-entropy loss. We show that the proposed method can
prevent forgetting while achieving high continual fine-tuning performance on
image classification tasks. To show that our method can be applied to general
continual learning settings, we evaluate our method in data-incremental,
task-incremental, and class-incremental learning problems.
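As a concrete illustration of the idea described above, the following is a minimal sketch, not the authors' released code, of continual fine-tuning on a linearized network: the pre-trained model is expanded to first order around its pre-trained weights, so the resulting model is linear in the parameter offset, and a quadratic penalty on that offset serves as the regularizer. The toy two-layer network, the squared-error loss, and the plain isotropic L2 penalty (standing in for the curvature-weighted quadratic regularizer discussed in the abstract) are assumptions made purely for illustration.

```python
# A minimal sketch of continual fine-tuning on a linearized pre-trained model.
# NOT the authors' released code: the toy two-layer network, squared-error
# loss, and plain isotropic L2 penalty are illustrative assumptions (the paper
# argues for a curvature-weighted quadratic regularizer).
import jax
import jax.numpy as jnp


def mlp(params, x):
    """Toy stand-in for a pre-trained backbone plus linear head."""
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return h @ w2 + b2


def linearized(params0, delta, x):
    """First-order Taylor expansion around the pre-trained weights params0:
    f_lin(x; w0 + d) = f(x; w0) + J_w f(x; w0) @ d, i.e. linear in delta."""
    f0, f_tan = jax.jvp(lambda p: mlp(p, x), (params0,), (delta,))
    return f0 + f_tan


def loss(delta, params0, x, y, lam=1e-2):
    """Task loss on the linearized model plus a quadratic penalty on the
    parameter offset delta = w - w0."""
    pred = linearized(params0, delta, x)
    reg = sum(jnp.sum(d ** 2) for d in jax.tree_util.tree_leaves(delta))
    return jnp.mean((pred - y) ** 2) + lam * reg


# Random stand-ins for pre-trained weights and one task's data.
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params0 = (jax.random.normal(k1, (8, 16)), jnp.zeros(16),
           jax.random.normal(k2, (16, 3)), jnp.zeros(3))
x, y = jax.random.normal(k3, (32, 8)), jnp.zeros((32, 3))

# One gradient step on the offset; only delta is trained, params0 stays fixed.
delta = jax.tree_util.tree_map(jnp.zeros_like, params0)
grads = jax.grad(loss)(delta, params0, x, y)
delta = jax.tree_util.tree_map(lambda d, g: d - 0.1 * g, delta, grads)
```

Because the linearized model is linear in the parameter offset, the training objective is quadratic in the parameters, which is what lets a quadratic parameter penalty describe previous tasks exactly rather than only approximately, as the abstract argues.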
Related papers
- Active Learning of Deep Neural Networks via Gradient-Free Cutting Planes [40.68266398473983]
In this work, we investigate an active learning scheme via a novel cutting-plane method for ReLU networks of arbitrary depth.
We show that these algorithms can be extended to deep neural networks despite their non-linearity.
We demonstrate the effectiveness of our proposed active learning method against popular deep active learning baselines via both data experiments and classification on real datasets.
arXiv Detail & Related papers (2024-10-03T02:11:35Z) - SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training [68.7896349660824]
We present an in-depth analysis of the progressive overfitting problem through the lens of Seq FT.
Considering that overly fast representation learning and a biased classification layer constitute this particular problem, we introduce the Slow Learner with Classifier Alignment (SLCA++) framework.
Our approach involves a Slow Learner that selectively reduces the learning rate of backbone parameters, and a Classifier Alignment that aligns the disjoint classification layers in a post-hoc fashion.
arXiv Detail & Related papers (2024-08-15T17:50:07Z) - On Newton's Method to Unlearn Neural Networks [44.85793893441989]
We seek approximate unlearning algorithms for neural networks (NNs) that return identical models to the retrained oracle.
We propose CureNewton's method, a principled approach that leverages cubic regularization to handle the Hessian degeneracy effectively.
Experiments across different models and datasets show that our method can achieve competitive unlearning performance to the state-of-the-art algorithm in practical unlearning settings.
arXiv Detail & Related papers (2024-06-20T17:12:20Z) - RanDumb: A Simple Approach that Questions the Efficacy of Continual Representation Learning [68.42776779425978]
We show that existing online continually trained deep networks produce inferior representations compared to a simple pre-defined random transform.
We then train a simple linear classifier on top without storing any exemplars, processing one sample at a time in an online continual learning setting (a minimal sketch of this setup appears after this list).
Our study reveals the significant limitations of representation learning, particularly in low-exemplar and online continual learning scenarios.
arXiv Detail & Related papers (2024-02-13T22:07:29Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - Continual Learning with Pretrained Backbones by Tuning in the Input
Space [44.97953547553997]
The intrinsic difficulty in adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks.
We propose a novel strategy to make the fine-tuning procedure more effective by not updating the pre-trained part of the network and by learning not only the usual classification head but also a set of newly introduced learnable parameters.
arXiv Detail & Related papers (2023-06-05T15:11:59Z) - Neural Architecture for Online Ensemble Continual Learning [6.241435193861262]
We present a fully differentiable ensemble method that allows us to efficiently train an ensemble of neural networks in the end-to-end regime.
The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods.
arXiv Detail & Related papers (2022-11-27T23:17:08Z) - Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z) - AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of "knowledge gain" and "mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS).
Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z) - Continual Deep Learning by Functional Regularisation of Memorable Past [95.97578574330934]
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
We propose a new functional-regularisation approach that utilises a few memorable past examples crucial to avoid forgetting.
Our method achieves state-of-the-art performance on standard benchmarks and opens a new direction for life-long learning where regularisation and memory-based methods are naturally combined.
arXiv Detail & Related papers (2020-04-29T10:47:54Z)
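As referenced in the RanDumb entry above, the following is a minimal sketch, not the authors' code, of the random-transform baseline it describes: each incoming sample is embedded with a fixed, randomly initialized transform, and only a linear classifier is updated, one sample at a time, with no stored exemplars. The cosine random features, dimensions, learning rate, and single-sample softmax update are illustrative assumptions.

```python
# A minimal sketch of the random-transform baseline summarized in the RanDumb
# entry above. NOT the authors' code: the cosine random features, dimensions,
# learning rate, and single-sample softmax update are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_feat, n_classes = 64, 512, 10

# Fixed random transform, drawn once and never trained.
W_rand = rng.normal(size=(d_in, d_feat)) / np.sqrt(d_in)
b_rand = rng.uniform(0.0, 2.0 * np.pi, size=d_feat)


def random_features(x):
    """Random-Fourier-style embedding of a single sample."""
    return np.cos(x @ W_rand + b_rand)


# Linear classifier updated online, one sample at a time, no exemplar buffer.
W_cls = np.zeros((d_feat, n_classes))
lr = 0.05


def online_step(x, y):
    """One SGD step of softmax cross-entropy on a single (sample, label)."""
    global W_cls
    z = random_features(x)           # (d_feat,)
    logits = z @ W_cls               # (n_classes,)
    p = np.exp(logits - logits.max())
    p /= p.sum()                     # softmax probabilities
    p[y] -= 1.0                      # gradient of cross-entropy w.r.t. logits
    W_cls -= lr * np.outer(z, p)     # rank-one update of the classifier


# Simulated stream of samples arriving one at a time.
for _ in range(100):
    online_step(rng.normal(size=d_in), int(rng.integers(n_classes)))
```

The feature transform here is never trained; the RanDumb entry above reports that such a fixed random embedding, paired with an online linear classifier, can outperform representations learned by online continual training, particularly in low-exemplar settings.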