Related papers: Optimal Continual Learning has Perfect Memory and is NP-hard

Optimal Continual Learning has Perfect Memory and is NP-hard

URL: http://arxiv.org/abs/2006.05188v1
Date: Tue, 9 Jun 2020 11:20:38 GMT
Title: Optimal Continual Learning has Perfect Memory and is NP-hard
Authors: Jeremias Knoblauch, Hisham Husain, Tom Diethe
Abstract summary: Continual Learning (CL) algorithms incrementally learn a predictor or representation across multiple sequentially observed tasks. The current paper develops a theoretical approach that explains why. We derive the computational properties which CL algorithms would have to possess in order to avoid catastrophic forgetting.
Score: 19.629732320437856
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Continual Learning (CL) algorithms incrementally learn a predictor or representation across multiple sequentially observed tasks. Designing CL algorithms that perform reliably and avoid so-called catastrophic forgetting has proven a persistent challenge. The current paper develops a theoretical approach that explains why. In particular, we derive the computational properties which CL algorithms would have to possess in order to avoid catastrophic forgetting. Our main finding is that such optimal CL algorithms generally solve an NP-hard problem and will require perfect memory to do so. The findings are of theoretical interest, but also explain the excellent performance of CL algorithms using experience replay, episodic memory and core sets relative to regularization-based approaches.

Related papers

A General Online Algorithm for Optimizing Complex Performance Metrics [5.726378955570775]
We introduce and analyze a general online algorithm that can be used in a straightforward way with a variety of complex performance metrics in binary, multi-class, and multi-label classification problems. The algorithm's update and prediction rules are appealingly simple and computationally efficient without the need to store any past data.
arXiv Detail & Related papers (2024-06-20T21:24:47Z)
Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation [123.4883806344334]
We study a realistic Continual Learning setting where learning algorithms are granted a restricted computational budget per time step while training. We apply this setting to large-scale semi-supervised Continual Learning scenarios with sparse label rates. Our extensive analysis and ablations demonstrate that DietCL is stable under a full spectrum of label sparsity, computational budget, and various other ablations.
arXiv Detail & Related papers (2024-04-19T10:10:39Z)
Hyperparameters in Continual Learning: A Reality Check [53.30082523545212]
Continual learning (CL) aims to train a model on a sequence of tasks while balancing the trade-off between plasticity (learning new tasks) and stability (retaining prior knowledge)
arXiv Detail & Related papers (2024-03-14T03:13:01Z)
Sketch and shift: a robust decoder for compressive clustering [17.627195350266796]
Compressive learning is an emerging approach to drastically reduce the memory footprint of large-scale learning. We propose an alternative decoder offering substantial improvements over CL-OMPR. The proposed algorithm can extract clustering information from a sketch of the MNIST dataset that is 10 times smaller than previously.
arXiv Detail & Related papers (2023-12-15T16:53:55Z)
SALSA-CLRS: A Sparse and Scalable Benchmark for Algorithmic Reasoning [20.706469085872516]
We introduce an extension to the CLRS algorithmic learning benchmark, prioritizing scalability and the utilization of sparse representations. Our approach includes adapted algorithms from the original CLRS benchmark and introduces new problems from distributed and randomized algorithms.
arXiv Detail & Related papers (2023-09-21T16:57:09Z)
Do Pre-trained Models Benefit Equally in Continual Learning? [25.959813589169176]
Existing work on continual learning (CL) is primarily devoted to developing algorithms for models trained from scratch. Despite their encouraging performance on contrived benchmarks, these algorithms show dramatic performance drops in real-world scenarios. This paper advocates the systematic introduction of pre-training to CL.
arXiv Detail & Related papers (2022-10-27T18:03:37Z)
Lifelong Bandit Optimization: No Prior and No Regret [70.94238868711952]
We develop LIBO, an algorithm which adapts to the environment by learning from past experience. We assume a kernelized structure where the kernel is unknown but shared across all tasks. Our algorithm can be paired with any kernelized or linear bandit algorithm and guarantees optimal performance.
arXiv Detail & Related papers (2022-10-27T14:48:49Z)
Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting. We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
Learning to Hash Robustly, with Guarantees [79.68057056103014]
In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms. We evaluate the algorithm's ability to optimize for a given dataset both theoretically and practically. Our algorithm has a 1.8x and 2.1x better recall on the worst-performing queries to the MNIST and ImageNet datasets.
arXiv Detail & Related papers (2021-08-11T20:21:30Z)
Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting. We show that our algorithm achieves for any k an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.