Continual Learning with Guarantees via Weight Interval Constraints
- URL: http://arxiv.org/abs/2206.07996v1
- Date: Thu, 16 Jun 2022 08:28:37 GMT
- Title: Continual Learning with Guarantees via Weight Interval Constraints
- Authors: Maciej Wołczyk, Karol J. Piczak, Bartosz Wójcik, Łukasz
Pustelnik, Paweł Morawiecki, Jacek Tabor, Tomasz Trzciński,
Przemysław Spurek
- Abstract summary: We introduce a new training paradigm that enforces interval constraints on neural network parameter space to control forgetting.
We show how to put bounds on forgetting by reformulating continual learning of a model as a continual contraction of its parameter space.
- Score: 18.791232422083265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a new training paradigm that enforces interval constraints on
neural network parameter space to control forgetting. Contemporary Continual
Learning (CL) methods focus on training neural networks efficiently from a
stream of data, while reducing the negative impact of catastrophic forgetting,
yet they do not provide any firm guarantees that network performance will not
deteriorate uncontrollably over time. In this work, we show how to put bounds
on forgetting by reformulating continual learning of a model as a continual
contraction of its parameter space. To that end, we propose Hyperrectangle
Training, a new training methodology where each task is represented by a
hyperrectangle in the parameter space, fully contained in the hyperrectangles
of the previous tasks. This formulation reduces the NP-hard CL problem back to
polynomial time while providing full resilience against forgetting. We validate
our claim by developing the InterContiNet (Interval Continual Learning) algorithm,
which leverages interval arithmetic to effectively model parameter regions as
hyperrectangles. Through experimental results, we show that our approach
performs well in a continual learning setup without storing data from previous
tasks.
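The interval idea in the abstract can be illustrated with a minimal sketch. It is not the InterContiNet implementation: the function names `interval_linear` and `shrink_into` are hypothetical, and the code assumes a single linear layer with a point input and elementwise weight intervals given as a center and a non-negative radius. It shows two things: how interval arithmetic bounds the layer output over every weight setting in a hyperrectangle, and how a proposed hyperrectangle for a new task can be clipped so that it stays inside the previous task's hyperrectangle.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def interval_linear(center: Matrix, radius: Matrix, x: Vector) -> Tuple[Vector, Vector]:
    """Bound y = W x over all W with |W - center| <= radius (elementwise)."""
    lower, upper = [], []
    for c_row, r_row in zip(center, radius):
        mid = sum(c * xi for c, xi in zip(c_row, x))
        # Worst-case deviation: each weight can move by at most r, scaled by |x_i|.
        dev = sum(r * abs(xi) for r, xi in zip(r_row, x))
        lower.append(mid - dev)
        upper.append(mid + dev)
    return lower, upper


def shrink_into(prev_lo: Vector, prev_hi: Vector,
                new_lo: Vector, new_hi: Vector) -> Tuple[Vector, Vector]:
    """Clip a proposed box so it lies inside the previous box (prev_lo <= prev_hi)."""
    lo = [min(max(n, p_lo), p_hi) for n, p_lo, p_hi in zip(new_lo, prev_lo, prev_hi)]
    # The upper corner is clipped to [lo, prev_hi] so the box never inverts.
    hi = [min(max(n, l), p_hi) for n, l, p_hi in zip(new_hi, lo, prev_hi)]
    return lo, hi


# One output unit, weights in [0.5, 1.5] x [-2.5, -1.5], input (2, 1).
print(interval_linear([[1.0, -2.0]], [[0.5, 0.5]], [2.0, 1.0]))  # ([-1.5], [1.5])

# A proposed box that sticks out of the unit box is pulled back inside it.
print(shrink_into([0.0, 0.0], [1.0, 1.0], [-0.5, 0.25], [0.5, 2.0]))
# ([0.0, 0.25], [0.5, 1.0])
```

Because every parameter setting inside the clipped box also lies inside the earlier box, any output bound computed for the earlier task still holds for the new box, which is the intuition behind the nesting described in the abstract.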
Related papers
- Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy, with the added advantage of a zero exemplar buffer and 1.02x the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
- Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z)
- Robust Stochastically-Descending Unrolled Networks [85.6993263983062]
Deep unrolling is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network.
We show that convergence guarantees and generalizability of the unrolled networks are still open theoretical problems.
We numerically assess unrolled architectures trained under the proposed constraints in two different applications.
arXiv Detail & Related papers (2023-12-25T18:51:23Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning [29.80680408934347]
We propose an alternative framework to incremental learning where we continually fine-tune the model from a pre-trained representation.
Our method takes advantage of a linearization technique applied to a pre-trained neural network for simple and effective continual learning.
We show that our method can be applied to general continual learning settings, and we evaluate it on data-incremental, task-incremental, and class-incremental learning problems.
arXiv Detail & Related papers (2022-08-17T06:58:14Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- FFNB: Forgetting-Free Neural Blocks for Deep Continual Visual Learning [14.924672048447338]
We devise a dynamic network architecture for continual learning based on a novel forgetting-free neural block (FFNB).
Training FFNB features on new tasks is achieved using a novel procedure that constrains the underlying parameters in the null-space of the previous tasks.
arXiv Detail & Related papers (2021-11-22T17:23:34Z)
- Training Networks in Null Space of Feature Covariance for Continual Learning [34.095874368589904]
We propose a novel network training algorithm called Adam-NSCL, which sequentially optimizes network parameters in the null space of previous tasks.
We apply our approach to training networks for continual learning on benchmark datasets of CIFAR-100 and TinyImageNet.
arXiv Detail & Related papers (2021-03-12T07:21:48Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.