Continual Learning in Linear Classification on Separable Data
- URL: http://arxiv.org/abs/2306.03534v1
- Date: Tue, 6 Jun 2023 09:34:11 GMT
- Title: Continual Learning in Linear Classification on Separable Data
- Authors: Itay Evron, Edward Moroshko, Gon Buzaglo, Maroun Khriesh, Badea
Marjieh, Nathan Srebro, Daniel Soudry
- Abstract summary: We show that learning with weak regularization reduces to solving a sequential max-margin problem.
We then develop upper bounds on the forgetting and other quantities of interest under various settings.
We discuss several practical implications for popular training practices like regularization scheduling and weighting.
- Score: 34.78569443156924
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We analyze continual learning on a sequence of separable linear
classification tasks with binary labels. We show theoretically that learning
with weak regularization reduces to solving a sequential max-margin problem,
corresponding to a special case of the Projection Onto Convex Sets (POCS)
framework. We then develop upper bounds on the forgetting and other quantities
of interest under various settings with recurring tasks, including cyclic and
random orderings of tasks. We discuss several practical implications for popular
training practices like regularization scheduling and weighting. We point out
several theoretical differences between our continual classification setting
and a recently studied continual regression setting.
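The reduction the abstract describes is easy to simulate. The sketch below (our illustration, not the authors' code) runs the sequential max-margin iteration: starting from zero, each task update is the Euclidean projection of the previous weights onto that task's margin-feasible set, which is exactly a POCS step over convex sets. The toy data generator, the dimensions, and the shared ground-truth direction w_star (which makes the tasks jointly separable, so the feasible sets intersect) are our assumptions, and cvxpy is used for the projection step.
```python
# Sequential max-margin via POCS-style projections (illustrative sketch).
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_star = rng.normal(size=d)              # shared ground truth => jointly separable tasks

def make_task(n=20):
    """Toy separable binary task: Gaussian inputs, labels from w_star."""
    X = rng.normal(size=(n, d))
    return X, np.sign(X @ w_star)

def project_onto_task(w_prev, X, y):
    """w_t = argmin ||w - w_prev||^2  s.t.  y_i <x_i, w> >= 1 (task's margin set)."""
    w = cp.Variable(d)
    cons = [cp.multiply(y, X @ w) >= 1]
    cp.Problem(cp.Minimize(cp.sum_squares(w - w_prev)), cons).solve()
    return w.value

tasks = [make_task() for _ in range(3)]
w = np.zeros(d)                          # start at zero
for cycle in range(5):                   # cyclic task ordering
    for X, y in tasks:
        w = project_onto_task(w, X, y)   # POCS step onto this task's convex set
    worst = min((y * (X @ w)).min() for X, y in tasks)
    print(f"cycle {cycle}: worst margin over all tasks = {worst:.3f}")
```
Because the tasks are jointly separable here, the worst margin over previously seen tasks improves across cycles, mirroring the forgetting bounds the paper develops for recurring tasks.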
Related papers
- On the Convergence of Continual Learning with Adaptive Methods [4.351356718501137]
We propose an adaptive sequential method for nonconvex continual learning (NCCL).
We demonstrate that the proposed method improves the performance of existing continual learning methods on several image classification tasks.
arXiv Detail & Related papers (2024-04-08T14:28:27Z)
- Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants [166.916517335816]
In this paper, we offer a unified solution to the misalignment dilemma in class-incremental learning and its two variants.
We propose neural collapse terminus that is a fixed structure with the maximal equiangular inter-class separation for the whole label space.
Our method holds the neural collapse optimality in an incremental fashion regardless of data imbalance or data scarcity.
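A fixed classifier with maximal equiangular inter-class separation is, in the neural-collapse literature, a simplex equiangular tight frame (ETF). The sketch below constructs one to illustrate the structure the summary describes; it is not the paper's code, and the dimensions and QR-based construction are our choices.
```python
# Simplex ETF: K unit-norm prototypes with pairwise cosine -1/(K-1).
import numpy as np

def simplex_etf(num_classes, feat_dim, seed=0):
    assert feat_dim >= num_classes
    rng = np.random.default_rng(seed)
    # U: feat_dim x K with orthonormal columns (reduced QR of a Gaussian matrix)
    U, _ = np.linalg.qr(rng.normal(size=(feat_dim, num_classes)))
    K = num_classes
    return np.sqrt(K / (K - 1)) * U @ (np.eye(K) - np.ones((K, K)) / K)

M = simplex_etf(num_classes=10, feat_dim=64)   # columns = fixed class prototypes
G = M.T @ M                                     # Gram matrix of the prototypes
print(np.allclose(np.diag(G), 1.0))             # unit norms
print(np.allclose(G[0, 1:], -1.0 / 9.0))        # maximal equiangular separation
```
The cosine -1/(K-1) between any two prototypes is the largest pairwise separation achievable by K unit vectors, which is why such a fixed structure can be laid out for the whole label space in advance.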
arXiv Detail & Related papers (2023-08-03T13:09:59Z)
- Mitigating Catastrophic Forgetting in Task-Incremental Continual Learning with Adaptive Classification Criterion [50.03041373044267]
We propose a Supervised Contrastive learning framework with adaptive classification criterion for Continual Learning.
Experiments show that CFL achieves state-of-the-art performance and a stronger ability to overcome catastrophic forgetting than the classification baselines.
arXiv Detail & Related papers (2023-05-20T19:22:40Z)
- Multi-Task Learning with Prior Information [5.770309971945476]
We propose a multi-task learning framework, where we utilize prior knowledge about the relations between features.
We also impose a penalty on the coefficients changing for each specific feature to ensure related tasks have similar coefficients on common features shared among them.
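One plausible instantiation of that penalty (our reading, not necessarily the paper's exact objective) is a fused regularizer that charges squared differences between tasks' coefficients on the shared features. Everything in this sketch, including the data generator, the `shared` index set, and the penalty weight, is assumed for illustration.
```python
# Fused penalty tying coefficients of shared features across tasks (sketch).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
T, d, n = 3, 6, 40                       # tasks, features, samples per task
shared = np.array([0, 1, 2])             # assumed indices of common features
w_true = rng.normal(size=d)
Xs = [rng.normal(size=(n, d)) for _ in range(T)]
ys = [X @ w_true + 0.1 * rng.normal(size=n) for X in Xs]

def objective(w_flat, lam=0.5):
    W = w_flat.reshape(T, d)
    fit = sum(np.mean((X @ w - y) ** 2) for w, X, y in zip(W, Xs, ys))
    # penalize differences between tasks' coefficients on the shared features
    fuse = sum(np.sum((W[s, shared] - W[t, shared]) ** 2)
               for s in range(T) for t in range(s + 1, T))
    return fit + lam * fuse

W_hat = minimize(objective, np.zeros(T * d)).x.reshape(T, d)
print(W_hat[:, shared].std(axis=0))      # near zero: shared coefficients agree
```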
arXiv Detail & Related papers (2023-01-04T12:48:05Z)
- Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference [0.0]
We introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task.
Our constrained MDL-like objective promotes competition among a large set of possible classes, preserving only the effective classes that best fit the data of a given few-shot task.
arXiv Detail & Related papers (2022-10-26T08:06:57Z)
- A Multi-label Continual Learning Framework to Scale Deep Learning Approaches for Packaging Equipment Monitoring [57.5099555438223]
We study multi-label classification in the continual scenario for the first time.
We propose an efficient approach that has a logarithmic complexity with regard to the number of tasks.
We validate our approach on a real-world multi-label forecasting problem from the packaging industry.
arXiv Detail & Related papers (2022-08-08T15:58:39Z)
- Contrastive learning of strong-mixing continuous-time stochastic processes [53.82893653745542]
Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.
We show that a properly constructed contrastive learning task can be used to estimate the transition kernel for small-to-mid-range intervals in the diffusion case.
arXiv Detail & Related papers (2021-03-03T23:06:47Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional asymptotic analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Continual Learning in Low-rank Orthogonal Subspaces [86.36417214618575]
In continual learning (CL), a learner is faced with a sequence of tasks, arriving one after the other, and the goal is to remember all the tasks once the learning experience is finished.
The prior art in CL uses episodic memory, parameter regularization or network structures to reduce interference among tasks, but in the end, all the approaches learn different tasks in a joint vector space.
We propose to learn tasks in different (low-rank) vector subspaces that are kept orthogonal to each other in order to minimize interference.
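A minimal way to realize mutually orthogonal low-rank subspaces is to partition the columns of a single orthogonal matrix among tasks and restrict each task's updates to its own block. The sketch below shows only this geometric core; the dimensions and random basis are our assumptions, and the paper's full method involves more than this projection.
```python
# Mutually orthogonal task subspaces from one orthogonal matrix (sketch).
import numpy as np

d, r, num_tasks = 32, 8, 4                       # requires r * num_tasks <= d
Q, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(d, r * num_tasks)))
bases = [Q[:, t * r:(t + 1) * r] for t in range(num_tasks)]

def project(update, t):
    """Restrict an update vector to task t's low-rank subspace: B_t B_t^T u."""
    B = bases[t]
    return B @ (B.T @ update)

u0 = project(np.ones(d), 0)
u1 = project(np.ones(d), 1)
print(abs(u0 @ u1) < 1e-10)                      # True: no cross-task interference
```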
arXiv Detail & Related papers (2020-10-22T12:07:43Z)