DLCFT: Deep Linear Continual Fine-Tuning for General Incremental
Learning
- URL: http://arxiv.org/abs/2208.08112v1
- Date: Wed, 17 Aug 2022 06:58:14 GMT
- Title: DLCFT: Deep Linear Continual Fine-Tuning for General Incremental
Learning
- Authors: Hyounguk Shon, Janghyeon Lee, Seung Hwan Kim, Junmo Kim
- Abstract summary: We propose an alternative framework to incremental learning where we continually fine-tune the model from a pre-trained representation.
Our method takes advantage of the linearization of a pre-trained neural network for simple and effective continual learning.
To show that our method applies to general continual learning settings, we evaluate it on data-incremental, task-incremental, and class-incremental learning problems.
- Score: 29.80680408934347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained representation is one of the key elements in the success of
modern deep learning. However, existing works on continual learning methods
have mostly focused on learning models incrementally from scratch. In this
paper, we explore an alternative framework to incremental learning where we
continually fine-tune the model from a pre-trained representation. Our method
takes advantage of the linearization of a pre-trained neural network for simple
and effective continual learning. We show that this allows us to design a linear
model in which quadratic parameter regularization is the optimal continual
learning policy, while at the same time enjoying the high performance of neural
networks. We also show that the proposed algorithm
enables parameter regularization methods to be applied to class-incremental
problems. Additionally, we provide a theoretical reason why the existing
parameter-space regularization algorithms such as EWC underperform on neural
networks trained with cross-entropy loss. We show that the proposed method can
prevent forgetting while achieving high continual fine-tuning performance on
image classification tasks. To show that our method can be applied to general
continual learning settings, we evaluate our method in data-incremental,
task-incremental, and class-incremental learning problems.
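As a concrete illustration of the idea described above, the following is a minimal sketch, not the authors' released code, of continual fine-tuning on a linearized network: the pre-trained model is expanded to first order around its pre-trained weights, so the resulting model is linear in the parameter offset, and a quadratic penalty on that offset serves as the regularizer. The toy two-layer network, the squared-error loss, and the plain isotropic L2 penalty (standing in for the curvature-weighted quadratic regularizer discussed in the abstract) are assumptions made purely for illustration.

```python
# A minimal sketch of continual fine-tuning on a linearized pre-trained model.
# NOT the authors' released code: the toy two-layer network, squared-error
# loss, and plain isotropic L2 penalty are illustrative assumptions (the paper
# argues for a curvature-weighted quadratic regularizer).
import jax
import jax.numpy as jnp


def mlp(params, x):
    """Toy stand-in for a pre-trained backbone plus linear head."""
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return h @ w2 + b2


def linearized(params0, delta, x):
    """First-order Taylor expansion around the pre-trained weights params0:
    f_lin(x; w0 + d) = f(x; w0) + J_w f(x; w0) @ d, i.e. linear in delta."""
    f0, f_tan = jax.jvp(lambda p: mlp(p, x), (params0,), (delta,))
    return f0 + f_tan


def loss(delta, params0, x, y, lam=1e-2):
    """Task loss on the linearized model plus a quadratic penalty on the
    parameter offset delta = w - w0."""
    pred = linearized(params0, delta, x)
    reg = sum(jnp.sum(d ** 2) for d in jax.tree_util.tree_leaves(delta))
    return jnp.mean((pred - y) ** 2) + lam * reg


# Random stand-ins for pre-trained weights and one task's data.
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
params0 = (jax.random.normal(k1, (8, 16)), jnp.zeros(16),
           jax.random.normal(k2, (16, 3)), jnp.zeros(3))
x, y = jax.random.normal(k3, (32, 8)), jnp.zeros((32, 3))

# One gradient step on the offset; only delta is trained, params0 stays fixed.
delta = jax.tree_util.tree_map(jnp.zeros_like, params0)
grads = jax.grad(loss)(delta, params0, x, y)
delta = jax.tree_util.tree_map(lambda d, g: d - 0.1 * g, delta, grads)
```

Because the linearized model is linear in the parameter offset, the training objective is quadratic in the parameters, which is what lets a quadratic parameter penalty describe previous tasks exactly rather than only approximately, as the abstract argues.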
Related papers
- Active Learning of Deep Neural Networks via Gradient-Free Cutting Planes [40.68266398473983]
In this work, we investigate an active learning scheme via a novel cutting-plane method for ReLU networks of arbitrary depth.
We show that these algorithms can be extended to deep neural networks despite their non-linearity.
We demonstrate the effectiveness of our proposed active learning method against popular deep active learning baselines via both data experiments and classification on real datasets.
arXiv Detail & Related papers (2024-10-03T02:11:35Z) - SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training [68.7896349660824]
We present an in-depth analysis of the progressive overfitting problem through the lens of Seq FT.
Considering that overly fast representation learning and a biased classification layer constitute this particular problem, we introduce the Slow Learner with Classifier Alignment (SLCA++) framework.
Our approach involves a Slow Learner that selectively reduces the learning rate of backbone parameters, and a Classifier Alignment that aligns the disjoint classification layers in a post-hoc fashion.
arXiv Detail & Related papers (2024-08-15T17:50:07Z) - On Newton's Method to Unlearn Neural Networks [44.85793893441989]
We seek approximate unlearning algorithms for neural networks (NNs) that return identical models to the retrained oracle.
We propose CureNewton's method, a principled approach that leverages cubic regularization to handle the Hessian degeneracy effectively.
Experiments across different models and datasets show that our method can achieve competitive unlearning performance to the state-of-the-art algorithm in practical unlearning settings.
arXiv Detail & Related papers (2024-06-20T17:12:20Z) - RanDumb: A Simple Approach that Questions the Efficacy of Continual Representation Learning [68.42776779425978]
We show that existing online continually trained deep networks produce inferior representations compared to a simple pre-defined random transform.
We then train a simple linear classifier on top without storing any exemplars, processing one sample at a time in an online continual learning setting (a minimal sketch of this setup appears after this list).
Our study reveals the significant limitations of representation learning, particularly in low-exemplar and online continual learning scenarios.
arXiv Detail & Related papers (2024-02-13T22:07:29Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - Continual Learning with Pretrained Backbones by Tuning in the Input
Space [44.97953547553997]
The intrinsic difficulty in adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks.
We propose a novel strategy to make the fine-tuning procedure more effective by not updating the pre-trained part of the network and by learning not only the usual classification head but also a set of newly introduced learnable parameters.
arXiv Detail & Related papers (2023-06-05T15:11:59Z) - Neural Architecture for Online Ensemble Continual Learning [6.241435193861262]
We present a fully differentiable ensemble method that allows us to efficiently train an ensemble of neural networks in the end-to-end regime.
The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods.
arXiv Detail & Related papers (2022-11-27T23:17:08Z) - Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z) - AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of "knowledge gain" and "mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS).
Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z) - Continual Deep Learning by Functional Regularisation of Memorable Past [95.97578574330934]
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
We propose a new functional-regularisation approach that utilises a few memorable past examples crucial to avoid forgetting.
Our method achieves state-of-the-art performance on standard benchmarks and opens a new direction for life-long learning where regularisation and memory-based methods are naturally combined.
arXiv Detail & Related papers (2020-04-29T10:47:54Z)
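As referenced in the RanDumb entry above, the following is a minimal sketch, not the authors' code, of the random-transform baseline it describes: each incoming sample is embedded with a fixed, randomly initialized transform, and only a linear classifier is updated, one sample at a time, with no stored exemplars. The cosine random features, dimensions, learning rate, and single-sample softmax update are illustrative assumptions.

```python
# A minimal sketch of the random-transform baseline summarized in the RanDumb
# entry above. NOT the authors' code: the cosine random features, dimensions,
# learning rate, and single-sample softmax update are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_feat, n_classes = 64, 512, 10

# Fixed random transform, drawn once and never trained.
W_rand = rng.normal(size=(d_in, d_feat)) / np.sqrt(d_in)
b_rand = rng.uniform(0.0, 2.0 * np.pi, size=d_feat)


def random_features(x):
    """Random-Fourier-style embedding of a single sample."""
    return np.cos(x @ W_rand + b_rand)


# Linear classifier updated online, one sample at a time, no exemplar buffer.
W_cls = np.zeros((d_feat, n_classes))
lr = 0.05


def online_step(x, y):
    """One SGD step of softmax cross-entropy on a single (sample, label)."""
    global W_cls
    z = random_features(x)           # (d_feat,)
    logits = z @ W_cls               # (n_classes,)
    p = np.exp(logits - logits.max())
    p /= p.sum()                     # softmax probabilities
    p[y] -= 1.0                      # gradient of cross-entropy w.r.t. logits
    W_cls -= lr * np.outer(z, p)     # rank-one update of the classifier


# Simulated stream of samples arriving one at a time.
for _ in range(100):
    online_step(rng.normal(size=d_in), int(rng.integers(n_classes)))
```

The feature transform here is never trained; the RanDumb entry above reports that such a fixed random embedding, paired with an online linear classifier, can outperform representations learned by online continual training, particularly in low-exemplar settings.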