Towards Continual Learning Desiderata via HSIC-Bottleneck
Orthogonalization and Equiangular Embedding
- URL: http://arxiv.org/abs/2401.09067v1
- Date: Wed, 17 Jan 2024 09:01:29 GMT
- Title: Towards Continual Learning Desiderata via HSIC-Bottleneck
Orthogonalization and Equiangular Embedding
- Authors: Depeng Li, Tianqi Wang, Junwei Chen, Qining Ren, Kenji Kawaguchi,
Zhigang Zeng
- Abstract summary: We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while requiring zero exemplar buffer and only 1.02x the size of the base model.
- Score: 55.107555305760954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are susceptible to catastrophic forgetting when trained
on sequential tasks. Various continual learning (CL) methods often rely on
exemplar buffers and/or network expansion to balance model stability and
plasticity, which, however, compromises their practical value due to privacy
and memory concerns. Instead, this paper considers a strict yet realistic
setting, where the training data from previous tasks is unavailable and the
model size remains relatively constant during sequential training. To achieve
such desiderata, we propose a conceptually simple yet effective method that
attributes forgetting to layer-wise parameter overwriting and the resulting
decision boundary distortion. This is achieved by the synergy between two key
components: HSIC-Bottleneck Orthogonalization (HBO) implements non-overwritten
parameter updates mediated by the Hilbert-Schmidt independence criterion in an
orthogonal space, and EquiAngular Embedding (EAE) enhances decision boundary
adaptation between old and new tasks with predefined basis vectors. Extensive
experiments demonstrate that our method achieves competitive accuracy, even
with the absolute advantages of a zero exemplar buffer and a model size of
only 1.02x that of the base model.
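The abstract names the Hilbert-Schmidt independence criterion (HSIC) as the quantity mediating HBO's parameter updates, but does not spell out the update rule itself. For reference, below is a minimal NumPy sketch of the standard biased empirical HSIC estimator, HSIC(X, Y) = tr(K H L H) / (n - 1)^2 with RBF kernels; the function names and the bandwidth sigma are illustrative choices, not the paper's implementation.

```python
import numpy as np

def rbf_kernel(Z, sigma=1.0):
    """Pairwise RBF (Gaussian) kernel matrix for the rows of Z."""
    sq = np.sum(Z ** 2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * Z @ Z.T           # squared Euclidean distances
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC estimator: tr(K H L H) / (n - 1)^2."""
    n = X.shape[0]
    K, L = rbf_kernel(X, sigma), rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2
```

In an HSIC-bottleneck style objective, one typically drives the dependence between a layer's representation and nuisance variables down while keeping its dependence on the labels high; how HBO combines such a criterion with updates in an orthogonal space is specific to the paper and not reproduced here.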
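Similarly, EAE is described only as relying on predefined basis vectors for old and new classes. A common way to predefine maximally separated, equiangular class vectors is the simplex equiangular tight frame (ETF); the sketch below generates such vectors under the assumption that this (or something like it) is what is meant, which the abstract does not confirm.

```python
import numpy as np

def simplex_etf(num_classes, dim, seed=0):
    """Predefined equiangular class vectors (simplex ETF).

    Returns a (num_classes, dim) array of unit vectors whose pairwise
    cosine similarity is exactly -1 / (num_classes - 1).
    Requires dim >= num_classes.
    """
    assert dim >= num_classes
    rng = np.random.default_rng(seed)
    # Random orthonormal columns U: dim x num_classes.
    U, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    C = num_classes
    center = np.eye(C) - np.ones((C, C)) / C       # remove the common mean
    M = np.sqrt(C / (C - 1)) * (U @ center)        # dim x C, unit columns
    return M.T
```

Fixing such targets in advance lets old and new classes be assigned maximally separated directions up front, which is the usual motivation for fixed equiangular classifiers; the paper's exact construction and assignment scheme may differ.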
Related papers
- Dynamic Position Transformation and Boundary Refinement Network for Left Atrial Segmentation [17.09918110723713]
Left atrial (LA) segmentation is a crucial technique for irregular heartbeat (i.e., atrial fibrillation) diagnosis.
Most current methods for LA segmentation strictly assume that the input data is acquired using object-oriented center cropping.
We propose a novel Dynamic Position transformation and Boundary refinement Network (DPBNet) to tackle these issues.
arXiv Detail & Related papers (2024-07-07T22:09:35Z)
- Test-Time Training for Semantic Segmentation with Output Contrastive Loss [12.535720010867538]
Deep learning-based segmentation models have achieved impressive performance on public benchmarks, but generalizing well to unseen environments remains a major challenge.
This paper introduces Output Contrastive Loss (OCL), known for its capability to learn robust and generalized representations, to stabilize the adaptation process.
Our method excels even when applied to models initially pre-trained using domain adaptation methods on test domain data, showcasing its resilience and adaptability.
arXiv Detail & Related papers (2023-11-14T03:13:47Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce Bayesian Adaptive Moment Regularization (BAdam), a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and robustness to task order.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification [57.36281142038042]
We present a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance.
We also present a new training protocol based on Coordinate-Descent called UpperCaSE that exploits meta-trained CaSE blocks and fine-tuning routines for efficient adaptation.
arXiv Detail & Related papers (2022-06-20T15:25:08Z)
- Continual Learning with Guarantees via Weight Interval Constraints [18.791232422083265]
We introduce a new training paradigm that enforces interval constraints on neural network parameter space to control forgetting.
We show how to put bounds on forgetting by reformulating continual learning of a model as a continual contraction of its parameter space (a minimal interval-projection sketch appears after this list).
arXiv Detail & Related papers (2022-06-16T08:28:37Z)
- Stabilizing Equilibrium Models by Jacobian Regularization [151.78151873928027]
Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer.
We propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models.
We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains.
arXiv Detail & Related papers (2021-06-28T00:14:11Z)
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks offer improved accuracy and a significant reduction in memory consumption.
However, they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
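For the Weight Interval Constraints entry above, the abstract-level idea of a continual contraction of the parameter space can be illustrated with a very simple projection step: record a per-parameter box around the weights reached after the previous task and clamp the weights back into that box after every optimizer step on the new task. The PyTorch sketch below follows that reading; the fixed radius, the function names, and the projection itself are assumptions for illustration, not the cited paper's actual training procedure.

```python
import torch

def snapshot_intervals(model, radius=0.05):
    """Record a per-parameter interval [w - radius, w + radius] around the
    weights reached after the previous task (radius is a hypothetical choice)."""
    return {name: (p.detach().clone() - radius, p.detach().clone() + radius)
            for name, p in model.named_parameters()}

def project_into_intervals(model, intervals):
    """After each optimizer step on the new task, clamp every parameter back
    into its recorded interval so training cannot leave the allowed box."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            lo, hi = intervals[name]
            p.copy_(torch.maximum(torch.minimum(p, hi), lo))
```

Calling project_into_intervals after each optimizer step keeps the weights inside a region chosen when the previous task finished; how the cited paper selects and contracts these intervals to obtain formal guarantees on forgetting is beyond this sketch.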
This list is automatically generated from the titles and abstracts of the papers on this site.