Mixture of basis for interpretable continual learning with distribution shifts
- URL: http://arxiv.org/abs/2201.01853v1
- Date: Wed, 5 Jan 2022 22:53:15 GMT
- Title: Mixture of basis for interpretable continual learning with distribution shifts
- Authors: Mengda Xu, Sumitra Ganesh, Pranay Pasula
- Abstract summary: Continual learning in environments with shifting data distributions is a challenging problem with several real-world applications.
We propose a novel approach called Mixture of Basis models (MoB) for addressing this problem setting.
- Score: 1.6114012813668934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning in environments with shifting data distributions is a
challenging problem with several real-world applications. In this paper we
consider settings in which the data distribution (task) shifts abruptly and the
timing of these shifts is not known. Furthermore, we consider a
semi-supervised task-agnostic setting in which the learning algorithm has
access to both task-segmented and unsegmented data for offline training. We
propose a novel approach called Mixture of Basis models (MoB) for addressing
this problem setting. The core idea is to learn a small set of basis models and
to construct a dynamic, task-dependent mixture of the models to predict for the
current task. We also propose a new methodology to detect observations that are
out-of-distribution with respect to the existing basis models and to
instantiate new models as needed. We test our approach in multiple domains and
show that it attains better prediction error than existing methods in most
cases while using fewer models than other multiple model approaches. Moreover,
we analyze the latent task representations learned by MoB and show that similar
tasks tend to cluster in the latent space and that the latent representation
shifts at the task boundaries when tasks are dissimilar.
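To make the core idea concrete, here is a minimal NumPy sketch of the two mechanisms the abstract describes: a dynamic, task-dependent mixture over a small set of basis models, and a likelihood-based out-of-distribution test that instantiates a new basis model when no existing one explains the incoming data. The linear basis models, the OOD threshold, and the exponential smoothing of mixture evidence are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

class BasisModel:
    """One basis predictor: an online ridge regressor (illustrative stand-in)."""
    def __init__(self, dim):
        self.A = np.eye(dim)        # regularized Gram matrix
        self.b = np.zeros(dim)
        self.w = np.zeros(dim)

    def update(self, x, y):
        self.A += np.outer(x, x)
        self.b += y * x
        self.w = np.linalg.solve(self.A, self.b)

    def log_likelihood(self, x, y, noise=1.0):
        err = y - self.w @ x
        return -0.5 * err ** 2 / noise  # Gaussian log-likelihood, up to a constant

class MoB:
    """Dynamic mixture of basis models with likelihood-based OOD detection."""
    def __init__(self, dim, ood_threshold=-4.0, temp=1.0, smooth=0.9):
        self.dim, self.ood_threshold = dim, ood_threshold
        self.temp, self.smooth = temp, smooth
        self.models = [BasisModel(dim)]
        self.scores = np.zeros(1)   # running per-model log-likelihood evidence

    def predict(self, x):
        # Task-dependent mixture weights: softmax over recent model evidence.
        w = np.exp(self.temp * (self.scores - self.scores.max()))
        w /= w.sum()
        return sum(wi * (m.w @ x) for wi, m in zip(w, self.models))

    def observe(self, x, y):
        ll = np.array([m.log_likelihood(x, y) for m in self.models])
        if ll.max() < self.ood_threshold:
            # No existing basis model explains the observation: spawn a new one.
            self.models.append(BasisModel(self.dim))
            self.scores = np.append(self.scores, 0.0)
            ll = np.append(ll, 0.0)
        self.scores = self.smooth * self.scores + (1 - self.smooth) * ll
        self.models[int(np.argmax(ll))].update(x, y)  # adapt the responsible model

# A stream with one abrupt task shift; the model set tends to grow at the boundary.
rng = np.random.default_rng(0)
w1, w2 = np.array([1.0, 0.0, -1.0]), np.array([-2.0, 1.0, 0.0])
mob = MoB(dim=3)
for t in range(2000):
    x = rng.normal(size=3)
    y = (w1 if t < 1000 else w2) @ x + 0.1 * rng.normal()
    mob.observe(x, y)
```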
Related papers
- The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse [25.002218722102505]
Model merging aims to efficiently combine the weights of multiple expert models, each trained on a specific task, into a single multi-task model.
This work explores the more challenging scenario of "non-local" merging.
Standard merging techniques often fail to generalize effectively in this non-local setting.
We propose a multi-task technique to re-scale and shift the output activations of the merged model for each task, aligning its output statistics with those of the corresponding task-specific expert models.
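As a rough sketch of that alignment step, the per-task correction can be read as matching the first and second moments of the merged model's output activations to those of the corresponding expert on the same inputs; the exact procedure in the paper may differ.

```python
import numpy as np

def fit_output_alignment(merged_acts, expert_acts, eps=1e-8):
    """Fit a per-feature affine map aligning merged-model output statistics
    to a task expert's. Both inputs: (num_samples, num_features) activations
    collected on the same (unlabeled) inputs for one task."""
    scale = expert_acts.std(0) / (merged_acts.std(0) + eps)
    shift = expert_acts.mean(0) - scale * merged_acts.mean(0)
    return scale, shift

# At inference on that task: corrected = merged_output * scale + shift
```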
arXiv Detail & Related papers (2024-10-16T17:41:59Z)
- Task Groupings Regularization: Data-Free Meta-Learning with Heterogeneous Pre-trained Models [83.02797560769285]
Data-Free Meta-Learning (DFML) aims to derive knowledge from a collection of pre-trained models without accessing their original data.
Current methods often overlook the heterogeneity among pre-trained models, which leads to performance degradation due to task conflicts.
We propose Task Groupings Regularization, a novel approach that benefits from model heterogeneity by grouping and aligning conflicting tasks.
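One common way to operationalize "conflicting tasks" is pairwise gradient cosine similarity; the greedy grouping below is an illustrative reading of the idea, not necessarily the paper's criterion.

```python
import numpy as np

def group_tasks(task_grads, threshold=0.0):
    """Greedily group tasks whose flattened gradients do not conflict.
    task_grads: list of 1-D arrays, one aggregate gradient per task.
    Tasks conflict when cosine similarity falls below the threshold."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    groups = []
    for t, g in enumerate(task_grads):
        home = next((grp for grp in groups
                     if all(cos(g, task_grads[u]) >= threshold for u in grp)), None)
        if home is not None:
            home.append(t)
        else:
            groups.append([t])   # conflicting task starts its own group
    return groups
```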
arXiv Detail & Related papers (2024-05-26T13:11:55Z)
- Continuous Unsupervised Domain Adaptation Using Stabilized Representations and Experience Replay [23.871860648919593]
We introduce an algorithm for tackling the problem of unsupervised domain adaptation (UDA) in continual learning (CL) scenarios.
Our solution is based on stabilizing the learned internal distribution to enhance the model's generalization on new domains.
We leverage experience replay to overcome the problem of catastrophic forgetting, where the model loses previously acquired knowledge when learning new tasks.
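Experience replay here is the standard device of mixing a small buffer of past-domain samples into each new-domain batch; a minimal sketch with a reservoir buffer (the buffer policy is an assumption).

```python
import random

class ReplayBuffer:
    """Fixed-size reservoir of past (input, label-or-pseudo-label) pairs."""
    def __init__(self, capacity=1000):
        self.capacity, self.data, self.seen = capacity, [], 0

    def add(self, sample):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = random.randrange(self.seen)  # reservoir sampling: uniform over stream
            if j < self.capacity:
                self.data[j] = sample

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

# Per training step on a new domain:
#   batch = new_domain_batch + buffer.sample(k)
# so the learned internal distribution stays anchored to past domains.
```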
arXiv Detail & Related papers (2024-01-31T05:09:14Z)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging)
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
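Layer-wise merging with learned coefficients can be pictured as a per-layer weighted sum of task vectors added to a shared pretrained base; the sketch below shows only the forward construction, with the coefficient-learning objective (reportedly entropy minimization on unlabeled test data) left abstract.

```python
import numpy as np

def merge_layerwise(base, experts, coeffs):
    """Merge expert weights with per-layer, per-task coefficients.
    base:    dict layer_name -> array of pretrained weights
    experts: list of dicts with the same keys (task-specific fine-tuned weights)
    coeffs:  array (num_layers, num_experts), e.g. learned on unlabeled test data
    """
    merged = {}
    for li, name in enumerate(sorted(base)):
        task_vectors = [e[name] - base[name] for e in experts]  # task arithmetic
        merged[name] = base[name] + sum(
            coeffs[li, ti] * tv for ti, tv in enumerate(task_vectors))
    return merged
```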
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Combat Data Shift in Few-shot Learning with Knowledge Graph [42.59886121530736]
In real-world applications, few-shot learning paradigm often suffers from data shift.
Most existing few-shot learning approaches are not designed with the consideration of data shift.
We propose a novel metric-based meta-learning framework to extract task-specific representations and task-shared representations.
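The "metric-based" part typically means classifying queries by distance to class prototypes built from support embeddings (here imagined as a concatenation of task-shared and task-specific features); a generic sketch, not the paper's specific framework.

```python
import numpy as np

def prototype_predict(support_x, support_y, query_x):
    """Nearest-class-prototype classification over embedded examples.
    support_x: (n, d) support embeddings; support_y: (n,) labels; query_x: (m, d)."""
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(0) for c in classes])
    dists = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return classes[dists.argmin(1)]
```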
arXiv Detail & Related papers (2021-01-27T12:35:18Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
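One way to read "only model relevant quantities" is to train the dynamics model to predict a goal-conditioned functional of the next state (e.g. distance to goal) rather than the full state; a PyTorch sketch under that reading, with all names illustrative.

```python
import torch
import torch.nn as nn

class GoalAwareDynamics(nn.Module):
    """Predicts a goal-relevant scalar of the next state, not the state itself."""
    def __init__(self, state_dim, action_dim, goal_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))  # e.g. predicted distance-to-goal at t+1

    def forward(self, state, action, goal):
        return self.net(torch.cat([state, action, goal], dim=-1))

# Training target: a goal-relevant functional of the observed next state, e.g.
#   target = torch.norm(next_state[..., :goal_dim] - goal, dim=-1, keepdim=True)
# so the model is free to ignore goal-irrelevant parts of the state space.
```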
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes [25.513074215377696]
This paper proposes a continual online model-based reinforcement learning approach.
It does not require pre-training to solve task-agnostic problems with unknown task boundaries.
In experiments, our approach outperforms alternative methods in non-stationary tasks.
arXiv Detail & Related papers (2020-06-19T23:52:45Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
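A standard way to pressure an ensemble toward distinct solutions is a repulsive penalty on pairwise prediction similarity over the same batch; the cosine-based term below is an assumed form, not necessarily the paper's objective.

```python
import torch
import torch.nn.functional as F

def diversity_loss(preds):
    """Mean pairwise cosine similarity between ensemble members' predictions.
    preds: list of >= 2 tensors, each (batch, out_dim). Lower is more diverse."""
    loss, pairs = 0.0, 0
    for i in range(len(preds)):
        for j in range(i + 1, len(preds)):
            a, b = F.normalize(preds[i], dim=-1), F.normalize(preds[j], dim=-1)
            loss = loss + (a * b).sum(-1).mean()
            pairs += 1
    return loss / max(pairs, 1)

# total = sum(task_loss(p, y) for p in preds) + lam * diversity_loss(preds)
```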
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.