Overcoming Catastrophic Forgetting with Gaussian Mixture Replay
- URL: http://arxiv.org/abs/2104.09220v1
- Date: Mon, 19 Apr 2021 11:41:34 GMT
- Title: Overcoming Catastrophic Forgetting with Gaussian Mixture Replay
- Authors: Benedikt Pfülb, Alexander Gepperth
- Abstract summary: We present a rehearsal-based approach for continual learning (CL) based on Gaussian Mixture Models (GMM).
We mitigate catastrophic forgetting (CF) by generating samples from previous tasks and merging them with current training data.
We evaluate GMR on multiple image datasets, which are divided into class-disjoint sub-tasks.
- Score: 79.0660895390689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Gaussian Mixture Replay (GMR), a rehearsal-based approach for
continual learning (CL) based on Gaussian Mixture Models (GMM). CL approaches
are intended to tackle the problem of catastrophic forgetting (CF), which
occurs for Deep Neural Networks (DNNs) when sequentially training them on
successive sub-tasks. GMR mitigates CF by generating samples from previous
tasks and merging them with current training data. GMMs serve several purposes
here: sample generation, density estimation (e.g., for detecting outliers or
recognizing task boundaries) and providing a high-level feature representation
for classification. GMR has several conceptual advantages over existing
replay-based CL approaches. First, GMR achieves sample generation,
classification and density estimation in a single network structure with
strongly reduced memory requirements. Second, it can be trained at constant
time complexity w.r.t. the number of sub-tasks, making it particularly suitable
for life-long learning. Furthermore, GMR minimizes a differentiable loss
function and seems to avoid mode collapse. In addition, task boundaries can be
detected by applying GMM density estimation. Lastly, GMR does not require
access to sub-tasks lying in the future for hyper-parameter tuning, allowing CL
under real-world constraints. We evaluate GMR on multiple image datasets, which
are divided into class-disjoint sub-tasks.
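The replay loop the abstract describes is compact enough to sketch. Below is a minimal, hedged Python illustration using scikit-learn's GaussianMixture as a stand-in for the paper's GMM (the paper trains its model by SGD on a differentiable loss, while scikit-learn uses EM; class and parameter names are assumptions for the sketch, not the authors' code):

```python
# Minimal GMR-style replay loop (a sketch, not the authors' implementation).
import numpy as np
from sklearn.mixture import GaussianMixture

class GMRSketch:
    def __init__(self, n_components=25):
        self.n_components = n_components
        self.gmm = None  # a single model is reused across all sub-tasks

    def fit_task(self, X_current):
        """Train on a new sub-task: merge generated samples representing
        previous tasks with the current training data."""
        if self.gmm is None:
            X_train = X_current
        else:
            X_replay, _ = self.gmm.sample(n_samples=len(X_current))
            X_train = np.vstack([X_current, X_replay])
        self.gmm = GaussianMixture(n_components=self.n_components).fit(X_train)

    def looks_like_new_task(self, X_batch, log_lik_threshold):
        """Density estimation: a drop in mean log-likelihood can flag
        outliers or a task boundary."""
        return self.gmm.score_samples(X_batch).mean() < log_lik_threshold
```

The one fitted model covers the three roles the abstract lists: sample() for replay, score_samples() for density-based outlier and task-boundary detection, and the component responsibilities as a high-level feature representation for classification.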
Related papers
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- Uncertainty Quantification using Generative Approach [4.4858968464373845]
We present the Incremental Generative Monte Carlo (IGMC) method to measure uncertainty in deep neural networks.
IGMC iteratively trains generative models, adding their output to the dataset, to compute the posterior distribution of the expectation of a random variable.
We empirically study the behavior of IGMC on the MNIST digit classification task.
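As a rough illustration of the loop this summary describes, here is a hedged sketch with a GaussianMixture standing in for the generative model (all names, model choices, and counts are assumptions):

```python
# Sketch of an IGMC-style loop: repeatedly fit a generative model, add its
# samples to the dataset, and track Monte Carlo estimates of E[f(X)].
import numpy as np
from sklearn.mixture import GaussianMixture

def igmc_sketch(X, f, n_rounds=10, n_samples=200):
    data = X.copy()
    estimates = []
    for _ in range(n_rounds):
        gen = GaussianMixture(n_components=5).fit(data)
        synth, _ = gen.sample(n_samples)
        data = np.vstack([data, synth])
        estimates.append(f(synth).mean())
    # The spread of the per-round estimates serves as an uncertainty proxy.
    return np.array(estimates)
```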
arXiv Detail & Related papers (2023-10-13T18:05:25Z)
- DOCTOR: A Multi-Disease Detection Continual Learning Framework Based on Wearable Medical Sensors [3.088223994180069]
We propose DOCTOR, a multi-disease detection continual learning framework based on wearable medical sensors (WMSs).
It employs a multi-headed deep neural network (DNN) and a replay-style CL algorithm.
It achieves 1.43 times better average test accuracy, 1.25 times better F1-score, and 0.41 higher backward transfer than the naive fine-tuning framework.
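A minimal PyTorch sketch of the multi-headed architecture mentioned above (layer widths and head layout are assumptions; the replay-style CL algorithm is omitted):

```python
# Illustrative multi-headed DNN: a shared trunk with one output head per
# detection task. Sizes are assumed for the sketch, not taken from the paper.
import torch.nn as nn

class MultiHeadDNN(nn.Module):
    def __init__(self, in_dim, n_tasks, n_classes):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
        )
        self.heads = nn.ModuleList(
            [nn.Linear(64, n_classes) for _ in range(n_tasks)]
        )

    def forward(self, x, task_id):
        # Route the shared representation to the head of the given task.
        return self.heads[task_id](self.trunk(x))
```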
arXiv Detail & Related papers (2023-05-09T19:33:17Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
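One plausible reading of this idea, sketched with NumPy and scikit-learn (an interpretation of the summary, not the paper's exact normalization): each sample is normalized against every mixture component, and the results are blended by the component responsibilities.

```python
# Hedged sketch of mixture-based feature normalization.
import numpy as np
from sklearn.mixture import GaussianMixture

def compound_normalize(X, n_components=3, eps=1e-5):
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type='diag').fit(X)
    resp = gmm.predict_proba(X)                # (N, K) responsibilities
    mu, var = gmm.means_, gmm.covariances_     # (K, D) each for 'diag'
    # Normalize against each component, then blend by responsibility.
    per_comp = (X[:, None, :] - mu[None]) / np.sqrt(var[None] + eps)
    return (resp[:, :, None] * per_comp).sum(axis=1)
```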
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Reconstruction of Sentinel-2 Time Series Using Robust Gaussian Mixture Models -- Application to the Detection of Anomalous Crop Development in wheat and rapeseed crops [14.849671539394272]
Missing data is a recurrent problem in remote sensing, mainly due to cloud coverage for multispectral images and acquisition problems.
This paper proposes a Gaussian Mixture Model (GMM) for the reconstruction of parcel-level features extracted from multispectral images.
Additional features extracted from Synthetic Aperture Radar (SAR) images using Sentinel-1 data are also used to provide complementary information.
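A standard way to realize GMM-based reconstruction is conditional-Gaussian imputation, sketched below: missing dimensions are filled with a posterior-weighted mixture of conditional expectations given the observed dimensions. The paper uses a robust GMM on parcel-level time-series features, so the estimator and names here are assumptions.

```python
# Hedged sketch of GMM imputation for one feature vector x with missing
# entries at integer indices mis_idx and observed entries at obs_idx.
import numpy as np

def gmm_impute(x, obs_idx, mis_idx, gmm):
    """gmm: a fitted sklearn GaussianMixture with covariance_type='full'."""
    x_o = x[obs_idx]
    log_w = np.log(gmm.weights_)   # fresh array; safe to modify in place
    cond_means = []
    for k in range(gmm.n_components):
        mu, S = gmm.means_[k], gmm.covariances_[k]
        S_oo = S[np.ix_(obs_idx, obs_idx)]
        S_mo = S[np.ix_(mis_idx, obs_idx)]
        diff = x_o - mu[obs_idx]
        sol = np.linalg.solve(S_oo, diff)
        # E[x_m | x_o, k] = mu_m + S_mo S_oo^{-1} (x_o - mu_o)
        cond_means.append(mu[mis_idx] + S_mo @ sol)
        # log p(x_o | k), used to weight the components a posteriori
        _, logdet = np.linalg.slogdet(S_oo)
        log_w[k] += -0.5 * (diff @ sol + logdet
                            + len(obs_idx) * np.log(2 * np.pi))
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    x_hat = x.copy()
    x_hat[mis_idx] = w @ np.array(cond_means)
    return x_hat
```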
arXiv Detail & Related papers (2021-10-22T13:35:54Z)
- Score-based Generative Modeling in Latent Space [93.8985523558869]
Score-based generative models (SGMs) have recently demonstrated impressive results in terms of both sample quality and distribution coverage.
Here, we propose the Latent Score-based Generative Model (LSGM), a novel approach that trains SGMs in a latent space.
Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space.
arXiv Detail & Related papers (2021-06-10T17:26:35Z)
- Continual Learning with Fully Probabilistic Models [70.3497683558609]
We present an approach for continual learning based on fully probabilistic (or generative) models of machine learning.
We propose a pseudo-rehearsal approach using a Gaussian Mixture Model (GMM) instance for both generator and classifier functionalities.
We show that GMR achieves state-of-the-art performance on common class-incremental learning problems at very competitive time and memory complexity.
arXiv Detail & Related papers (2021-04-19T12:26:26Z)
- Gradient Coding with Dynamic Clustering for Straggler-Tolerant Distributed Learning [55.052517095437]
Distributed gradient descent (GD) is widely employed to parallelize the learning task by distributing the dataset across multiple workers.
A significant performance bottleneck for the per-iteration completion time in distributed synchronous GD is straggling workers.
Coded distributed techniques have been introduced recently to mitigate stragglers and to speed up GD iterations by assigning redundant computations to workers.
We propose a novel dynamic gradient coding (GC) scheme, which assigns redundant data to workers to acquire the flexibility to choose from among a set of possible codes depending on the past straggling behavior.
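For intuition, here is a hedged sketch of the static, repetition-based gradient-coding baseline that such dynamic schemes build on (assignment and recovery only; the paper's dynamic code selection and clustering are not reproduced):

```python
# Each data partition is replicated on r workers, so the master can recover
# the full gradient from the fastest responders. A simplified sketch.
import numpy as np

def assign_partitions(n_workers, r):
    """Worker i holds partitions i, i+1, ..., i+r-1 (mod n_workers)."""
    return [[(i + j) % n_workers for j in range(r)] for i in range(n_workers)]

def recover_gradient(responses, n_partitions):
    """responses: {worker_id: {partition_id: gradient}} from non-stragglers.
    Returns the summed gradient if every partition was covered."""
    seen = {}
    for grads in responses.values():
        for pid, g in grads.items():
            seen.setdefault(pid, g)
    if len(seen) < n_partitions:
        raise RuntimeError("too many stragglers: some partitions are missing")
    return np.sum(list(seen.values()), axis=0)
```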
arXiv Detail & Related papers (2021-03-01T18:51:29Z)