Related papers: Towards a General Framework for Continual Learning with Pre-training

Towards a General Framework for Continual Learning with Pre-training

URL: http://arxiv.org/abs/2310.13888v2
Date: Tue, 9 Jul 2024 00:56:12 GMT
Title: Towards a General Framework for Continual Learning with Pre-training
Authors: Liyuan Wang, Jingyi Xie, Xingxing Zhang, Hang Su, Jun Zhu,
Abstract summary: We present a general framework for continual learning of sequentially arrived tasks with the use of pre-training. We decompose its objective into three hierarchical components, including within-task prediction, task-identity inference, and task-adaptive prediction. We propose an innovative approach to explicitly optimize these components with parameter-efficient fine-tuning (PEFT) techniques and representation statistics.
Score: 55.88910947643436
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this work, we present a general framework for continual learning of sequentially arrived tasks with the use of pre-training, which has emerged as a promising direction for artificial intelligence systems to accommodate real-world dynamics. From a theoretical perspective, we decompose its objective into three hierarchical components, including within-task prediction, task-identity inference, and task-adaptive prediction. Then we propose an innovative approach to explicitly optimize these components with parameter-efficient fine-tuning (PEFT) techniques and representation statistics. We empirically demonstrate the superiority and generality of our approach in downstream continual learning, and further explore the applicability of PEFT techniques in upstream continual learning. We also discuss the biological basis of the proposed framework with recent advances in neuroscience.

Related papers

Sequencing to Mitigate Catastrophic Forgetting in Continual Learning [1.1724961392643483]
Catastrophic forgetting (CF) is a major challenge to the progress of Continual Learning approaches.<n>We consider the role of task sequencing in mitigating CF and propose a method for determining the optimal task order.<n>Results demonstrate that intelligent task sequencing can substantially reduce CF.
arXiv Detail & Related papers (2025-12-18T18:40:58Z)
Forecast-Then-Optimize Deep Learning Methods [10.067896857251162]
Time series forecasting underpins vital decision-making across various sectors, yet raw predictions from sophisticated models often harbor systematic errors and biases.<n>We examine the Forecast-Then-Then (FTO) framework, pioneering its systematic synopsis.<n>Deep learning and large language models have established superiority over traditional parametric forecasting models for most enterprise applications.
arXiv Detail & Related papers (2025-06-16T02:02:30Z)
Latenrgy: Model Agnostic Latency and Energy Consumption Prediction for Binary Classifiers [0.0]
Machine learning systems increasingly drive innovation across scientific fields and industry. Yet challenges in compute overhead, specifically during inference, limit their scalability and sustainability. This study addresses critical gaps in the literature, chiefly the lack of generalized predictive techniques for latency and energy consumption.
arXiv Detail & Related papers (2024-12-26T14:51:24Z)
Exploring the Precise Dynamics of Single-Layer GAN Models: Leveraging Multi-Feature Discriminators for High-Dimensional Subspace Learning [0.0]
We study the training dynamics of a single-layer GAN model from the perspective of subspace learning. By bridging our analysis to the realm of subspace learning, we systematically compare the efficacy of GAN-based methods against conventional approaches.
arXiv Detail & Related papers (2024-11-01T10:21:12Z)
Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective [125.00228936051657]
We introduce NTK-CL, a novel framework that eliminates task-specific parameter storage while adaptively generating task-relevant features. By fine-tuning optimizable parameters with appropriate regularization, NTK-CL achieves state-of-the-art performance on established PEFT-CL benchmarks.
arXiv Detail & Related papers (2024-07-24T09:30:04Z)
HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning [55.88910947643436]
We propose a unified framework for continual learning (CL) with pre-trained models (PTMs) and parameter-efficient tuning (PET) We present Hierarchical Decomposition PET (HiDe-PET), an innovative approach that explicitly optimize the objective through incorporating task-specific and task-shared knowledge. Our approach demonstrates remarkably superior performance over a broad spectrum of recent strong baselines.
arXiv Detail & Related papers (2024-07-07T01:50:25Z)
On the Generalization Ability of Unsupervised Pretraining [53.06175754026037]
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization. This paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase. Our results contribute to a better understanding of unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
arXiv Detail & Related papers (2024-03-11T16:23:42Z)
Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality [55.88910947643436]
Self-supervised pre-training is essential for handling vast quantities of unlabeled data in practice. HiDe-Prompt is an innovative approach that explicitly optimize the hierarchical components with an ensemble of task-specific prompts and statistics. Our experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning.
arXiv Detail & Related papers (2023-10-11T06:51:46Z)
A Novel Neural-symbolic System under Statistical Relational Learning [50.747658038910565]
We propose a general bi-level probabilistic graphical reasoning framework called GBPGR. In GBPGR, the results of symbolic reasoning are utilized to refine and correct the predictions made by the deep learning models. Our approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks.
arXiv Detail & Related papers (2023-09-16T09:15:37Z)
Predictive Experience Replay for Continual Visual Control and Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting. We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting. Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
Target-Embedding Autoencoders for Supervised Representation Learning [111.07204912245841]
This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional. We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
arXiv Detail & Related papers (2020-01-23T02:37:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.