NPCL: Neural Processes for Uncertainty-Aware Continual Learning
- URL: http://arxiv.org/abs/2310.19272v1
- Date: Mon, 30 Oct 2023 05:10:00 GMT
- Title: NPCL: Neural Processes for Uncertainty-Aware Continual Learning
- Authors: Saurav Jha and Dong Gong and He Zhao and Lina Yao
- Abstract summary: Continual learning (CL) aims to train deep neural networks efficiently on streaming data while limiting the forgetting caused by new tasks.
We propose handling CL tasks with neural processes (NPs), a class of meta-learners that encode different tasks into probabilistic distributions over functions.
- Score: 26.642662729915234
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning (CL) aims to train deep neural networks efficiently on
streaming data while limiting the forgetting caused by new tasks. However,
learning transferable knowledge with less interference between tasks is
difficult, and real-world deployment of CL models is limited by their inability
to measure predictive uncertainties. To address these issues, we propose
handling CL tasks with neural processes (NPs), a class of meta-learners that
encode different tasks into probabilistic distributions over functions all
while providing reliable uncertainty estimates. Specifically, we propose an
NP-based CL approach (NPCL) with task-specific modules arranged in a
hierarchical latent variable model. We tailor regularizers on the learned
latent distributions to alleviate forgetting. The uncertainty estimation
capabilities of the NPCL can also be used to handle the task head/module
inference challenge in CL. Our experiments show that the NPCL outperforms
previous CL approaches. We validate the effectiveness of uncertainty estimation
in the NPCL for identifying novel data and evaluating instance-level model
confidence. Code is available at https://github.com/srvCodes/NPCL.
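The abstract notes that uncertainty estimates can help infer which task head/module to use at test time. The following is a minimal sketch of that general idea, not the NPCL implementation: it assumes each task head yields several Monte Carlo logit samples for an input, uses predictive entropy as the uncertainty measure, and picks the head with the lowest entropy. The names `select_head`, `mean_prediction`, and the toy logits are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_prediction(sampled_logits):
    """Average the class probabilities over Monte Carlo samples."""
    probs = [softmax(sample) for sample in sampled_logits]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]


def select_head(per_head_samples):
    """Return (index, entropies) of the head with the lowest predictive entropy.

    Assumption: lower entropy means the head is more likely the right one.
    """
    entropies = [entropy(mean_prediction(s)) for s in per_head_samples]
    best = min(range(len(entropies)), key=entropies.__getitem__)
    return best, entropies


# Toy input: two hypothetical heads, two sampled logit vectors each.
head_a = [[4.0, 0.0, 0.0], [3.5, 0.2, 0.1]]  # confident predictions
head_b = [[0.1, 0.0, 0.2], [0.0, 0.3, 0.1]]  # near-uniform predictions
best, ents = select_head([head_a, head_b])
```

Here `best` is the index of the head whose averaged prediction is most peaked. The paper may use a different uncertainty measure or selection rule; see the linked repository for the authors' code.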
Related papers
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z)
- Federated Continual Learning Goes Online: Leveraging Uncertainty for Modality-Agnostic Class-Incremental Learning [13.867793835583463]
We propose a new modality-agnostic approach to deal with the online scenario where new data arrive in streams of mini-batches that can only be processed once.
In particular, we suggest using an estimator based on the Bregman Information (BI) to compute the model's variance at the sample level.
arXiv Detail & Related papers (2024-05-29T09:29:39Z)
- CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models [23.398619576886375]
Continual learning (CL) aims to help deep neural networks to learn new knowledge while retaining what has been learned.
Recently, pre-trained vision-language models such as CLIP, with powerful generalizability, have been gaining traction as practical CL candidates.
Our work proposes Continual LeArning with Probabilistic finetuning (CLAP).
arXiv Detail & Related papers (2024-03-28T04:15:58Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach [14.674494335647841]
We propose a two-stage training paradigm for CL that intertwines task-agnostic and task-specific learning.
We show that our training paradigm can be easily added to memory- or regularization-based approaches.
arXiv Detail & Related papers (2022-07-13T15:16:51Z)
- Semantic Probabilistic Layers for Neuro-Symbolic Learning [83.25785999205932]
We design a predictive layer for structured-output prediction (SOP).
It can be plugged into any neural network, guaranteeing that its predictions are consistent with a set of predefined symbolic constraints.
Our Semantic Probabilistic Layer (SPL) can model intricate correlations, and hard constraints, over a structured output space.
arXiv Detail & Related papers (2022-06-01T12:02:38Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)