Hyperparameters in Continual Learning: a Reality Check
- URL: http://arxiv.org/abs/2403.09066v1
- Date: Thu, 14 Mar 2024 03:13:01 GMT
- Title: Hyperparameters in Continual Learning: a Reality Check
- Authors: Sungmin Cha, Kyunghyun Cho
- Abstract summary: It has been common practice to train a CL algorithm on a CL scenario constructed with a benchmark dataset.
In this paper, we contend that this evaluation protocol is not only impractical but also incapable of effectively assessing the CL capability of a CL algorithm.
- Score: 53.30082523545212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Various algorithms for continual learning (CL) have been designed with the goal of effectively alleviating the trade-off between stability and plasticity during the CL process. To achieve this goal, tuning appropriate hyperparameters for each algorithm is essential. As an evaluation protocol, it has been common practice to train a CL algorithm using diverse hyperparameter values on a CL scenario constructed with a benchmark dataset. Subsequently, the best performance attained with the optimal hyperparameter value serves as the criterion for evaluating the CL algorithm. In this paper, we contend that this evaluation protocol is not only impractical but also incapable of effectively assessing the CL capability of a CL algorithm. Returning to the fundamental principles of model evaluation in machine learning, we propose an evaluation protocol that involves Hyperparameter Tuning and Evaluation phases. Those phases consist of different datasets but share the same CL scenario. In the Hyperparameter Tuning phase, each algorithm is iteratively trained with different hyperparameter values to find the optimal hyperparameter values. Subsequently, in the Evaluation phase, the optimal hyperparameter values are directly applied for training each algorithm, and their performance in the Evaluation phase serves as the criterion for evaluating them. Through experiments on CIFAR-100 and ImageNet-100 based on the proposed protocol in class-incremental learning, we not only observe that the existing evaluation method fails to properly assess the CL capability of each algorithm but also observe that some recently proposed state-of-the-art algorithms, which reported superior performance, actually exhibit inferior performance compared to the previous algorithm.
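The two-phase protocol described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `TrainFn` signature, the dataset and scenario labels, and the toy scoring function are all hypothetical stand-ins, and the toy scores are synthetic rather than results from the paper.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical signature: (hyperparameter, dataset name, scenario name) -> final score.
TrainFn = Callable[[float, str, str], float]


def tune(train_and_score: TrainFn, candidates: Sequence[float],
         dataset: str, scenario: str) -> Tuple[float, float]:
    """Train once per candidate value and return (best value, best score)."""
    scored = [(train_and_score(hp, dataset, scenario), hp) for hp in candidates]
    best_score, best_hp = max(scored)
    return best_hp, best_score


def two_phase_protocol(train_and_score: TrainFn, candidates: Sequence[float],
                       tuning_dataset: str, eval_dataset: str,
                       scenario: str) -> Tuple[float, float]:
    """Tune on one dataset, then report the score on a different dataset.

    Both phases share the same CL scenario; only the dataset changes.
    """
    # Hyperparameter Tuning phase: search over candidates on the tuning dataset.
    best_hp, _ = tune(train_and_score, candidates, tuning_dataset, scenario)
    # Evaluation phase: apply the chosen value directly, with no further search.
    return best_hp, train_and_score(best_hp, eval_dataset, scenario)


def conventional_protocol(train_and_score: TrainFn, candidates: Sequence[float],
                          dataset: str, scenario: str) -> Tuple[float, float]:
    """Report the best score found while tuning on the same dataset."""
    return tune(train_and_score, candidates, dataset, scenario)


# Toy stand-in for a real CL training run (assumption): each dataset has a
# different ideal hyperparameter value, and the score drops as we move away.
TOY_OPTIMA = {"dataset_A": 0.1, "dataset_B": 0.5}


def toy_train_and_score(hp: float, dataset: str, scenario: str) -> float:
    return -(hp - TOY_OPTIMA[dataset]) ** 2


candidates = [0.01, 0.1, 0.5, 1.0]
scenario = "10-task class-incremental"  # hypothetical label, shared by both phases

hp_two, score_two = two_phase_protocol(
    toy_train_and_score, candidates, "dataset_A", "dataset_B", scenario)
hp_conv, score_conv = conventional_protocol(
    toy_train_and_score, candidates, "dataset_B", scenario)

print("two-phase:    hp =", hp_two, "score =", score_two)
print("conventional: hp =", hp_conv, "score =", score_conv)
```

In this toy setup the conventional protocol can never report a lower score than the two-phase one, because it searches on the same dataset it reports on. That gap is the kind of optimism the abstract argues against.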
Related papers
- CLoRA: Parameter-Efficient Continual Learning with Low-Rank Adaptation [14.2843647693986]
Low-Rank Adaptation (LoRA) is a parameter-efficient fine-tuning method for class-incremental semantic segmentation.
CLoRA significantly reduces the hardware requirements for training, making it well-suited for CL in resource-constrained environments after deployment.
arXiv Detail & Related papers (2025-07-26T09:36:05Z)
- ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially.
Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks.
However, such methods lack theoretical guarantees, making them prone to unexpected failures.
We bridge this gap by integrating an empirically strong approach into a principled framework, designed to prevent forgetting.
arXiv Detail & Related papers (2024-10-01T12:58:37Z)
- CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models [23.398619576886375]
Continual learning (CL) aims to help deep neural networks learn new knowledge while retaining what has been learned.
Our work proposes Continual LeArning with Probabilistic finetuning (CLAP) - a probabilistic modeling framework over visual-guided text features per task.
arXiv Detail & Related papers (2024-03-28T04:15:58Z)
- Density Distribution-based Learning Framework for Addressing Online Continual Learning Challenges [4.715630709185073]
We introduce a density distribution-based learning framework for online Continual Learning.
Our framework achieves superior average accuracy and time-space efficiency.
Our method outperforms popular CL approaches by a significant margin.
arXiv Detail & Related papers (2023-11-22T09:21:28Z)
- Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates [13.983410740333788]
Continual learning (CL) refers to the ability of an intelligent system to sequentially acquire and retain knowledge from a stream of data with as little computational overhead as possible.
Dynamic Sparse Training (DST) is a prominent way to find these sparse networks and isolate them for each task.
This paper is the first empirical study investigating the effect of different DST components under the CL paradigm.
arXiv Detail & Related papers (2023-08-28T18:31:09Z)
- Ada-QPacknet -- adaptive pruning with bit width reduction as an efficient continual learning method without forgetting [0.8681331155356999]
In this work, a new architecture-based approach, Ada-QPacknet, is described.
It incorporates pruning to extract a sub-network for each task.
Results show that the proposed approach outperforms most of the CL strategies in task- and class-incremental scenarios.
arXiv Detail & Related papers (2023-08-14T12:17:11Z)
- Optimizing Hyperparameters with Conformal Quantile Regression [7.316604052864345]
We propose to leverage conformalized quantile regression which makes minimal assumptions about the observation noise.
This translates to quicker HPO convergence on empirical benchmarks.
arXiv Detail & Related papers (2023-05-05T15:33:39Z)
- Computationally Budgeted Continual Learning: What Does Matter? [128.0827987414154]
Continual Learning (CL) aims to sequentially train models on streams of incoming data that vary in distribution by preserving previous knowledge while adapting to new data.
Current CL literature focuses on restricted access to previously seen data, while imposing no constraints on the computational budget for training.
We revisit this problem with a large-scale benchmark and analyze the performance of traditional CL approaches in a compute-constrained setting.
arXiv Detail & Related papers (2023-03-20T14:50:27Z)
- From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning [9.104068727716294]
Continual learning (CL) is one of the most promising trends in machine learning research.
We introduce two novel CL benchmarks that involve multiple heterogeneous tasks from six image datasets.
We additionally structure our benchmarks so that tasks are presented in increasing and decreasing order of complexity.
arXiv Detail & Related papers (2023-03-16T18:11:19Z)
- Real-Time Evaluation in Online Continual Learning: A New Hope [104.53052316526546]
We evaluate current Continual Learning (CL) methods with respect to their computational costs.
A simple baseline outperforms state-of-the-art CL methods under this evaluation.
This surprisingly suggests that the majority of existing CL literature is tailored to a specific class of streams that is not practical.
arXiv Detail & Related papers (2023-02-02T12:21:10Z)
- Do Pre-trained Models Benefit Equally in Continual Learning? [25.959813589169176]
Existing work on continual learning (CL) is primarily devoted to developing algorithms for models trained from scratch.
Despite their encouraging performance on contrived benchmarks, these algorithms show dramatic performance drops in real-world scenarios.
This paper advocates the systematic introduction of pre-training to CL.
arXiv Detail & Related papers (2022-10-27T18:03:37Z)
- Actor-Critic based Improper Reinforcement Learning [61.430513757337486]
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process.
We propose two algorithms: (1) a Policy Gradient-based approach; and (2) an algorithm that can switch between a simple Actor-Critic scheme and a Natural Actor-Critic scheme.
arXiv Detail & Related papers (2022-07-19T05:55:02Z)
- Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods [61.49061000562676]
We introduce Cluster Learnability (CL) to assess learnability.
CL is measured in terms of the performance of a KNN trained to predict labels obtained by clustering the representations with K-means.
We find that CL better correlates with in-distribution model performance than other competing recent evaluation schemes.
arXiv Detail & Related papers (2022-06-02T19:05:13Z)
- The CLEAR Benchmark: Continual LEArning on Real-World Imagery [77.98377088698984]
Continual learning (CL) is widely regarded as a crucial challenge for lifelong AI.
We introduce CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts.
We find that a simple unsupervised pre-training step can already boost state-of-the-art CL algorithms.
arXiv Detail & Related papers (2022-01-17T09:09:09Z)
- Continual Learning for Recurrent Neural Networks: a Review and Empirical Evaluation [12.27992745065497]
Continual Learning with recurrent neural networks could pave the way to a large number of applications where incoming data is non-stationary.
We organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks.
We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications.
arXiv Detail & Related papers (2021-03-12T19:25:28Z)
- Phase Retrieval using Expectation Consistent Signal Recovery Algorithm based on Hypernetwork [73.94896986868146]
Phase retrieval is an important component in modern computational imaging systems.
Recent advances in deep learning have opened up a new possibility for robust and fast PR.
We develop a novel framework for deep unfolding to overcome the existing limitations.
arXiv Detail & Related papers (2021-01-12T08:36:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.