More Is Better: An Analysis of Instance Quantity/Quality Trade-off in
Rehearsal-based Continual Learning
- URL: http://arxiv.org/abs/2105.14106v1
- Date: Fri, 28 May 2021 21:05:51 GMT
- Title: More Is Better: An Analysis of Instance Quantity/Quality Trade-off in
Rehearsal-based Continual Learning
- Authors: Francesco Pelosin and Andrea Torsello
- Abstract summary: The key issue of Continual Learning has become that of addressing the stability-plasticity dilemma of connectionist systems.
We propose an analysis of the memory quantity/quality trade-off adopting various data reduction approaches to increase the number of instances storable in memory.
Our findings suggest that the optimal trade-off is severely skewed toward instance quantity, where rehearsal approaches with several heavily compressed instances easily outperform state-of-the-art approaches.
- Score: 3.9596068699962315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The design of machines and algorithms capable of learning in a dynamically
changing environment has become an increasingly topical problem with the
increase of the size and heterogeneity of data available to learning systems.
As a consequence, the key issue of Continual Learning has become that of
addressing the stability-plasticity dilemma of connectionist systems, as they
need to adapt their model without forgetting previously acquired knowledge.
Within this context, rehearsal-based methods, i.e., solutions in which the
learner exploits memory to revisit past data, have proven to be very effective,
leading to state-of-the-art performance. In our study, we propose an
analysis of the memory quantity/quality trade-off adopting various data
reduction approaches to increase the number of instances storable in memory. In
particular, we investigate complex instance compression techniques such as deep
encoders, but also trivial approaches such as image resizing and linear
dimensionality reduction. Our findings suggest that the optimal trade-off is
severely skewed toward instance quantity, where rehearsal approaches with
several heavily compressed instances easily outperform state-of-the-art
approaches with the same amount of memory at their disposal. Further, in high
memory configurations, deep approaches extracting spatial structure combined
with extreme resizing (of the order of $8\times8$ images) yield the best
results, while in memory-constrained configurations where deep approaches
cannot be used due to their memory requirement in training, Extreme Learning
Machines (ELM) offer a clear advantage.
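To make the quantity/quality trade-off concrete, the short sketch below contrasts how many full-resolution instances versus $8\times8$-resized instances fit in a fixed rehearsal buffer. The byte budget, image shapes, and the average-pooling "compressor" are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumed shapes and budget): under a fixed rehearsal-memory
# budget, heavily resized instances let the buffer hold many more samples.
import numpy as np

MEMORY_BUDGET_BYTES = 64 * 32 * 32 * 3   # room for 64 full CIFAR-like images (assumption)
FULL_SHAPE = (32, 32, 3)                 # original instance
SMALL_SHAPE = (8, 8, 3)                  # the "extreme resizing" regime from the abstract

def downsample(img: np.ndarray, out_hw: int = 8) -> np.ndarray:
    """Compress an image by average pooling (a trivial stand-in for resizing)."""
    h, w, c = img.shape
    f = h // out_hw
    return img.reshape(out_hw, f, out_hw, f, c).mean(axis=(1, 3)).astype(np.uint8)

def capacity(budget_bytes: int, shape: tuple) -> int:
    """How many uint8 instances of `shape` fit in the rehearsal buffer."""
    return budget_bytes // int(np.prod(shape))

if __name__ == "__main__":
    print("full-resolution instances:", capacity(MEMORY_BUDGET_BYTES, FULL_SHAPE))   # 64
    print("8x8 resized instances:   ", capacity(MEMORY_BUDGET_BYTES, SMALL_SHAPE))   # 1024

    # Fill the buffer with compressed instances instead of full-resolution ones.
    rng = np.random.default_rng(0)
    stream = rng.integers(0, 256, size=(100, *FULL_SHAPE), dtype=np.uint8)  # fake task data
    buffer = [downsample(x) for x in stream[: capacity(MEMORY_BUDGET_BYTES, SMALL_SHAPE)]]
```

Under the same byte budget the compressed buffer holds 16 times more instances, which mirrors the quantity-skewed regime the abstract reports as most effective.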
Related papers
- On Improving the Algorithm-, Model-, and Data- Efficiency of Self-Supervised Learning [18.318758111829386]
We propose an efficient single-branch SSL method based on non-parametric instance discrimination.
We also propose a novel self-distillation loss that minimizes the KL divergence between the probability distribution and its square root version.
arXiv Detail & Related papers (2024-04-30T06:39:04Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM)
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without large computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper-bound complexity and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
- A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition [9.414818018857316]
We propose a method to effectively compress Recurrent Neural Networks (RNNs) used for Human Action Recognition (HAR).
We use a Variational Information Bottleneck (VIB) theory-based pruning approach to limit the information flow through the sequential cells of RNNs to a small subset.
We combine our pruning method with a specific group-lasso regularization technique that significantly improves compression.
It is shown that our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy for the task of action recognition on UCF11.
arXiv Detail & Related papers (2020-10-03T12:41:51Z)
- Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization [12.617078020344618]
We introduce an approach that combines rare events sampling techniques with neural network optimization to optimize objective functions dominated by rare events.
We show that importance sampling reduces the variance of the solution to a learning problem, suggesting benefits for generalization.
Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high-dimensional and rare data.
arXiv Detail & Related papers (2020-08-11T23:38:09Z)
- Neuromodulated Neural Architectures with Local Error Signals for Memory-Constrained Online Continual Learning [4.2903672492917755]
We develop a biologically inspired, lightweight neural network architecture that incorporates local learning and neuromodulation.
We demonstrate the efficacy of our approach in both single-task and continual learning settings.
arXiv Detail & Related papers (2020-07-16T07:41:23Z)
- Dataset Condensation with Gradient Matching [36.14340188365505]
We propose a training set synthesis technique for data-efficient learning, called Dataset Condensation, that learns to condense a large dataset into a small set of informative synthetic samples for training deep neural networks from scratch.
We rigorously evaluate its performance in several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-06-10T16:30:52Z)
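For the Dataset Condensation entry above, the following is a minimal sketch of the gradient-matching idea it describes: learn synthetic samples whose training gradients mimic those of real data. The tiny fixed network, data shapes, and single matching loop are simplifying assumptions for illustration, not the authors' implementation (which also updates the network and matches gradients per class).

```python
# Illustrative gradient-matching sketch (assumed shapes/model, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
real_x = torch.randn(256, 1, 8, 8)                    # stand-in "real" data
real_y = torch.randint(0, 10, (256,))
syn_x = torch.randn(10, 1, 8, 8, requires_grad=True)  # one learnable synthetic image per class
syn_y = torch.arange(10)

model = nn.Sequential(nn.Flatten(), nn.Linear(64, 10))  # tiny fixed network for illustration
opt_syn = torch.optim.SGD([syn_x], lr=0.1)

def layer_grads(x, y):
    """Gradients of the classification loss w.r.t. the network parameters."""
    loss = F.cross_entropy(model(x), y)
    return torch.autograd.grad(loss, model.parameters(), create_graph=True)

for step in range(100):
    g_real = [g.detach() for g in layer_grads(real_x, real_y)]
    g_syn = layer_grads(syn_x, syn_y)
    # Gradient-matching objective: make synthetic-data gradients align with
    # real-data gradients (cosine distance, summed over layers).
    match_loss = sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
                     for a, b in zip(g_real, g_syn))
    opt_syn.zero_grad()
    match_loss.backward()
    opt_syn.step()
```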
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.