Continual Learning with Optimal Transport based Mixture Model
- URL: http://arxiv.org/abs/2211.16780v1
- Date: Wed, 30 Nov 2022 06:40:29 GMT
- Title: Continual Learning with Optimal Transport based Mixture Model
- Authors: Quyen Tran, Hoang Phan, Khoat Than, Dinh Phung, Trung Le
- Abstract summary: We propose an online mixture model learning approach built on well-established properties of optimal transport theory (OT-MM).
Our proposed method can significantly outperform the current state-of-the-art baselines.
- Score: 17.398605698033656
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Online Class Incremental Learning (CIL) is a challenging setting in Continual
Learning (CL), wherein data for new tasks arrive in incoming streams and online
learning models must handle these streams without revisiting previous ones.
Existing works use a single centroid, adapted with the incoming data stream, to
characterize each class. This approach becomes limiting when the incoming data
stream of a class is naturally multimodal. To address this issue, we first
propose an online mixture model learning approach built on well-established
properties of optimal transport theory (OT-MM). Specifically, the centroids and
covariance matrices of the mixture model are adapted incrementally according to
the incoming data streams. The advantages are two-fold: (i) complex data
streams can be characterized more accurately, and (ii) using the per-class
centroids produced by OT-MM, the similarity of an unseen example to each class
can be estimated more reliably at inference time. Moreover, to combat
catastrophic forgetting in the CIL scenario, we further propose Dynamic
Preservation. In particular, after applying the dynamic preservation technique
across data streams, the latent representations of classes from old and new
tasks become more compact and better separated from each other. Together with a
contraction feature extractor, this technique helps the model mitigate
catastrophic forgetting. Experimental results on real-world datasets show that
our proposed method significantly outperforms the current state-of-the-art
baselines.
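The abstract describes the mechanics only at a high level: mini-batches from the stream are softly assigned to per-class centroids via optimal transport, centroids and covariance matrices are updated incrementally without revisiting old data, and inference compares an unseen example against the learned per-class centroids. The sketch below is an illustration of that general recipe under stated assumptions, not the authors' implementation: the entropic Sinkhorn solver, the exponential-moving-average update rule, the Mahalanobis scoring, and all names and hyperparameters (n_components, reg, eta) are assumptions made purely for the example.

```python
import numpy as np

def sinkhorn_assign(X, centroids, reg=0.1, n_iters=100):
    """Entropic-OT soft assignment of a batch X (n, d) to centroids (K, d)."""
    n, K = X.shape[0], centroids.shape[0]
    # Squared Euclidean cost, rescaled for numerical stability.
    cost = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    cost = cost / (cost.max() + 1e-12)
    Kmat = np.exp(-cost / reg)                         # Gibbs kernel
    a = np.full(n, 1.0 / n)                            # uniform sample marginal
    b = np.full(K, 1.0 / K)                            # uniform centroid marginal
    u = np.ones(n)
    for _ in range(n_iters):                           # Sinkhorn iterations
        v = b / (Kmat.T @ u + 1e-12)
        u = a / (Kmat @ v + 1e-12)
    P = u[:, None] * Kmat * v[None, :]                 # transport plan, shape (n, K)
    return P / (P.sum(axis=1, keepdims=True) + 1e-12)  # per-sample responsibilities

class OnlineClassMixture:
    """Per-class mixture whose centroids/covariances track an incoming stream."""

    def __init__(self, n_components, dim, eta=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.mu = rng.normal(size=(n_components, dim))      # centroids
        self.cov = np.stack([np.eye(dim)] * n_components)   # covariance matrices
        self.eta = eta                                       # streaming update rate

    def partial_fit(self, X):
        """One incremental update from a mini-batch X of this class's features."""
        R = sinkhorn_assign(X, self.mu)                  # soft assignments (n, K)
        for k in range(self.mu.shape[0]):
            w = R[:, k] / (R[:, k].sum() + 1e-8)         # normalized weights
            mu_k = w @ X                                 # weighted batch mean
            diff = X - mu_k
            cov_k = (w[:, None] * diff).T @ diff         # weighted batch covariance
            # Exponential moving average: previous data is never revisited.
            self.mu[k] = (1 - self.eta) * self.mu[k] + self.eta * mu_k
            self.cov[k] = (1 - self.eta) * self.cov[k] + self.eta * cov_k

    def score(self, x):
        """Similarity of a single example to this class (higher = more similar)."""
        d = x.shape[0]
        dists = []
        for k in range(self.mu.shape[0]):
            diff = x - self.mu[k]
            # Mahalanobis distance to the nearest component of the mixture.
            dists.append(diff @ np.linalg.solve(self.cov[k] + 1e-3 * np.eye(d), diff))
        return -min(dists)

# Toy usage: one mixture per class; classify by the highest class score.
rng = np.random.default_rng(1)
class_models = {c: OnlineClassMixture(n_components=3, dim=16) for c in range(2)}
for _ in range(5):                                       # simulated stream batches
    for c, model in class_models.items():
        model.partial_fit(rng.normal(loc=c, size=(64, 16)))
query = rng.normal(loc=1, size=16)
pred = max(class_models, key=lambda c: class_models[c].score(query))
```

The moving-average update here is only a simple stand-in that respects the online constraint of never revisiting earlier batches; the paper derives its actual assignment and update rules from OT-MM, and Dynamic Preservation is not modeled in this sketch.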
Related papers
- Continual Learning for Multimodal Data Fusion of a Soft Gripper [1.0589208420411014]
A model trained on one data modality often fails when tested with a different modality.
We introduce a continual learning algorithm capable of incrementally learning different data modalities.
We evaluate the algorithm's effectiveness on a challenging custom multimodal dataset.
arXiv Detail & Related papers (2024-09-20T09:53:27Z) - Few-Shot Class-Incremental Learning with Non-IID Decentralized Data [12.472285188772544]
Few-shot class-incremental learning is crucial for developing scalable and adaptive intelligent systems.
This paper introduces federated few-shot class-incremental learning, a decentralized machine learning paradigm.
We present a synthetic data-driven framework that leverages replay buffer data to maintain existing knowledge and facilitate the acquisition of new knowledge.
arXiv Detail & Related papers (2024-09-18T02:48:36Z) - DANCE: Dual-View Distribution Alignment for Dataset Condensation [39.08022095906364]
We propose a new DM-based method named Dual-view distribution AligNment for dataset CondEnsation (DANCE).
Specifically, from the inner-class view, we construct multiple "middle encoders" to perform pseudo long-term distribution alignment.
From the inter-class view, we use expert models to perform distribution calibration.
arXiv Detail & Related papers (2024-06-03T07:22:17Z) - Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts [104.9871176044644]
Masked Autoencoder (MAE) is a prevailing self-supervised learning method that achieves promising results in model pre-training.
We propose a novel MAE-based pre-training paradigm, Mixture of Cluster-conditional Experts (MoCE).
MoCE trains each expert only with semantically relevant images by using cluster-conditional gates.
arXiv Detail & Related papers (2024-02-08T03:46:32Z) - AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z) - Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z) - CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but the further adaptation of CLIP on downstream tasks undesirably degrades OOD performances.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z) - Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z) - Lightweight Conditional Model Extrapolation for Streaming Data under Class-Prior Shift [27.806085423595334]
We introduce LIMES, a new method for learning with non-stationary streaming data.
We learn a single set of model parameters from which a classifier for any specific data distribution can be derived.
Experiments on a set of exemplary tasks using Twitter data show that LIMES achieves higher accuracy than alternative approaches.
arXiv Detail & Related papers (2022-06-10T15:19:52Z) - Tackling Online One-Class Incremental Learning by Removing Negative Contrasts [12.048166025000976]
Distinct from other continual learning settings, the learner is presented with new samples only once.
ER-AML achieved strong performance in this setting by applying an asymmetric loss based on contrastive learning to the incoming data and replayed data.
We adapt a recently proposed approach from self-supervised learning to the supervised learning setting, unlocking the constraint on contrasts.
arXiv Detail & Related papers (2022-03-24T19:17:29Z) - Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification.
A normalizing flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)