Continual Learning with Distributed Optimization: Does CoCoA Forget?
- URL: http://arxiv.org/abs/2211.16994v4
- Date: Tue, 5 Dec 2023 07:18:56 GMT
- Title: Continual Learning with Distributed Optimization: Does CoCoA Forget?
- Authors: Martin Hellkvist, Ayça Özçelikkale, and Anders Ahlén
- Abstract summary: We focus on the continual learning problem where the tasks arrive sequentially.
The aim is to perform well on the newly arrived task without performance degradation on the previously seen tasks.
We consider the well-established distributed learning algorithm CoCoA.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We focus on the continual learning problem where the tasks arrive
sequentially and the aim is to perform well on the newly arrived task without
performance degradation on the previously seen tasks. In contrast to the
continual learning literature focusing on the centralized setting, we
investigate the distributed estimation framework. We consider the
well-established distributed learning algorithm CoCoA. We derive closed-form
expressions for the iterations in the overparameterized case. We illustrate the
convergence and the error performance of the algorithm based on the
over/under-parameterization of the problem. Our results show that, depending on
the problem dimensions and data generation assumptions, CoCoA can perform
continual learning over a sequence of tasks, i.e., it can learn a new task
without forgetting previously learned tasks, with access to only one task at a
time.
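The iteration structure described above can be sketched for the least-squares case: the model coordinates (features) are partitioned across nodes, each node solves its local subproblem against the shared residual, and the local updates are combined by averaging; sequential tasks are handled by warm-starting each task from the previous solution. A minimal sketch under these assumptions, not the paper's exact update rule or notation:

```python
import numpy as np

def cocoa_task(A, y, x0, num_nodes=4, iters=500):
    """One CoCoA-style pass over a single least-squares task, with the
    model coordinates (features) partitioned across nodes and each
    node's exact local update combined by averaging."""
    n, p = A.shape
    x = x0.copy()
    blocks = np.array_split(np.arange(p), num_nodes)
    for _ in range(iters):
        r = y - A @ x                        # shared residual
        updates = []
        for blk in blocks:                   # in practice: one solve per node
            d, *_ = np.linalg.lstsq(A[:, blk], r, rcond=None)
            updates.append((blk, d))
        for blk, d in updates:
            x[blk] += d / num_nodes          # conservative averaging step
    return x

# Sequential tasks, warm-starting each from the previous solution
# (overparameterized regime: more features p than samples n).
rng = np.random.default_rng(0)
n, p = 20, 50
x = np.zeros(p)
for _ in range(3):
    A = rng.standard_normal((n, p))
    y = A @ rng.standard_normal(p)
    x = cocoa_task(A, y, x)
```

Averaging the exact block solves keeps the combined step conservative, so the shared residual shrinks monotonically for quadratic objectives.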
Related papers
- Continual Learning of Numerous Tasks from Long-tail Distributions [17.706669222987273]
Continual learning focuses on developing models that learn and adapt to new tasks while retaining previously acquired knowledge.
Existing continual learning algorithms usually involve a small number of tasks with uniform sizes and may not accurately represent real-world learning scenarios.
We propose a method that reuses the states in Adam by maintaining a weighted average of the second moments from previous tasks.
We demonstrate that our method, compatible with most existing continual learning algorithms, effectively reduces forgetting with only a small amount of additional computational or memory costs.
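The second-moment reuse idea can be sketched as follows; the class name and blending rule here are our assumptions for illustration, not the paper's code:

```python
import numpy as np

class AdamWithMomentReuse:
    """Minimal Adam that, at task boundaries, folds the current second
    moment into a running average over past tasks and warm-starts the
    next task from that average instead of zeros. Illustrative sketch
    only; the paper's exact scheme may differ."""

    def __init__(self, dim, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(dim)        # first moment
        self.v = np.zeros(dim)        # second moment
        self.v_avg = np.zeros(dim)    # weighted average over past tasks
        self.t = 0                    # step counter (runs across tasks)
        self.tasks_seen = 0

    def step(self, x, grad):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2
        m_hat = self.m / (1 - self.b1 ** self.t)   # bias correction
        v_hat = self.v / (1 - self.b2 ** self.t)
        return x - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

    def end_task(self):
        n = self.tasks_seen
        self.v_avg = (n * self.v_avg + self.v) / (n + 1)  # running average
        self.tasks_seen += 1
        self.m[:] = 0.0               # first moment restarts per task
        self.v = self.v_avg.copy()    # second moment carries over
```

Carrying the second moment over preserves the per-coordinate step-size scaling learned on previous tasks, which is the mechanism the summary credits with reducing forgetting.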
arXiv Detail & Related papers (2024-04-03T13:56:33Z) - Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
In the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
In the instance level, we measure the difficulty of all instances per task, then divide them into the easy-to-difficult mini-batches for training.
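The instance-level step can be sketched as follows, assuming a per-instance difficulty score is already available (the scoring itself and the exact batching scheme are Data-CUBE design details not reproduced here):

```python
import numpy as np

def easy_to_difficult_batches(difficulty, batch_size):
    """Sort one task's instances by difficulty score (easiest first)
    and split the ordering into mini-batches for training."""
    order = np.argsort(difficulty)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# e.g., four instances with assumed difficulty scores, batches of two
batches = easy_to_difficult_batches(np.array([0.9, 0.1, 0.5, 0.3]), 2)
```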
arXiv Detail & Related papers (2024-01-07T18:12:20Z) - Distributed Continual Learning with CoCoA in High-dimensional Linear Regression [0.0]
We consider estimation under scenarios where the signals of interest exhibit change of characteristics over time.
In particular, we consider the continual learning problem where different tasks, e.g., data with different distributions, arrive sequentially.
We consider the well-established distributed learning algorithm CoCoA, which distributes the model parameters and the corresponding features over the network.
arXiv Detail & Related papers (2023-12-04T10:35:46Z) - Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees [6.008132390640294]
The incremental learning of a growing sequence of tasks holds promise to enable accurate classification.
This paper presents incremental minimax risk classifiers (IMRCs) that effectively exploit forward and backward learning.
IMRCs can result in a significant performance improvement, especially for reduced sample sizes.
arXiv Detail & Related papers (2023-10-24T16:21:41Z) - Clustering-based Domain-Incremental Learning [4.835091081509403]
A key challenge in continual learning is the so-called "catastrophic forgetting" problem.
We propose an online clustering-based approach on a dynamically updated finite pool of samples or gradients.
We demonstrate the effectiveness of the proposed strategy and its promising performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-09-21T13:49:05Z) - Self-paced Weight Consolidation for Continual Learning [39.27729549041708]
Continual learning algorithms are popular in preventing catastrophic forgetting in sequential task learning settings.
We propose a self-paced Weight Consolidation (spWC) framework to attain continual learning.
arXiv Detail & Related papers (2023-07-20T13:07:41Z) - ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Learning multiple tasks simultaneously can sometimes result in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches over varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z) - Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provide the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
arXiv Detail & Related papers (2022-12-30T12:32:43Z) - Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
arXiv Detail & Related papers (2022-10-13T17:59:16Z) - Rectification-based Knowledge Retention for Continual Learning [49.1447478254131]
Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting.
We propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner.
Our approach can be used in both the zero-shot and non-zero-shot task incremental learning settings.
arXiv Detail & Related papers (2021-03-30T18:11:30Z) - Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains helps improve the learning performance on every other task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
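The coupling described above can be sketched as a regularized objective: each task keeps its own parameter vector, but a penalty pulls all of them toward their common mean. This is a hypothetical formulation of the idea; the paper's exact coupling may differ:

```python
import numpy as np

def cross_learning_objective(W, task_losses, lam):
    """Sum of per-task losses plus a coupling penalty that keeps each
    task's parameter vector (one row of W) close to the mean across
    tasks; lam controls how tightly the tasks are coupled."""
    w_bar = W.mean(axis=0)
    fit = sum(loss(w) for loss, w in zip(task_losses, W))
    coupling = lam * sum(np.sum((w - w_bar) ** 2) for w in W)
    return fit + coupling
```

With lam = 0 the tasks decouple into independent regressions; as lam grows, the formulation approaches a single shared parameter vector.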
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.