Interactive Continual Learning: Fast and Slow Thinking
- URL: http://arxiv.org/abs/2403.02628v2
- Date: Tue, 19 Mar 2024 02:19:52 GMT
- Title: Interactive Continual Learning: Fast and Slow Thinking
- Authors: Biqing Qi, Xingquan Chen, Junqi Gao, Dong Li, Jianxing Liu, Ligang Wu, Bowen Zhou
- Abstract summary: This paper presents a novel Interactive Continual Learning framework, enabled by collaborative interactions among models of various sizes.
To improve memory retrieval in System1, we introduce the CL-vMF mechanism, based on the von Mises-Fisher (vMF) distribution.
Comprehensive evaluation of our proposed ICL demonstrates significant resistance to forgetting and superior performance relative to existing methods.
- Score: 19.253164551254734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advanced life forms, sustained by the synergistic interaction of neural cognitive mechanisms, continually acquire and transfer knowledge throughout their lifespan. In contrast, contemporary machine learning paradigms exhibit limitations in emulating the facets of continual learning (CL). Nonetheless, the emergence of large language models (LLMs) presents promising avenues for realizing CL via interactions with these models. Drawing on Complementary Learning System theory, this paper presents a novel Interactive Continual Learning (ICL) framework, enabled by collaborative interactions among models of various sizes. Specifically, we assign the ViT model as System1 and multimodal LLM as System2. To enable the memory module to deduce tasks from class information and enhance Set2Set retrieval, we propose the Class-Knowledge-Task Multi-Head Attention (CKT-MHA). Additionally, to improve memory retrieval in System1 through enhanced geometric representation, we introduce the CL-vMF mechanism, based on the von Mises-Fisher (vMF) distribution. Meanwhile, we introduce the von Mises-Fisher Outlier Detection and Interaction (vMF-ODI) strategy to identify hard examples, thus enhancing collaboration between System1 and System2 for complex reasoning realization. Comprehensive evaluation of our proposed ICL demonstrates significant resistance to forgetting and superior performance relative to existing methods. Code is available at github.com/ICL.
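As a rough illustration of how the mechanisms named in the abstract could fit together, below is a minimal sketch (not the authors' released code) of a vMF-style class memory for System1 retrieval and a margin-based rule for deferring hard examples to System2. The names (`VMFClassMemory`, `route_hard_examples`), the concentration value `kappa`, and the margin threshold are illustrative assumptions, simplified stand-ins for the paper's CL-vMF and vMF-ODI mechanisms.

```python
# Illustrative sketch only: vMF-style class prototypes for System1 retrieval and a
# simple outlier rule for routing hard examples to System2. Not the paper's code.
import torch
import torch.nn.functional as F


class VMFClassMemory:
    """One unit-norm prototype (mean direction) per class; queries are scored by
    kappa * cosine similarity, i.e. the vMF log-density exp(kappa * mu^T x) up to
    a class-independent normalizer."""

    def __init__(self, feature_dim: int, kappa: float = 16.0):
        self.feature_dim = feature_dim
        self.kappa = kappa  # shared concentration parameter (assumed value)
        self.prototypes = {}  # class_id -> unit-norm direction of shape (feature_dim,)

    def update(self, class_id: int, features: torch.Tensor) -> None:
        # Mean direction of the class features, renormalized onto the unit sphere.
        mu = F.normalize(features, dim=-1).mean(dim=0)
        self.prototypes[class_id] = F.normalize(mu, dim=0)

    def logits(self, query: torch.Tensor) -> tuple[torch.Tensor, list[int]]:
        class_ids = list(self.prototypes.keys())
        protos = torch.stack([self.prototypes[c] for c in class_ids])  # (C, D)
        q = F.normalize(query, dim=-1)                                 # (B, D)
        return self.kappa * q @ protos.T, class_ids                    # (B, C)


def route_hard_examples(logits: torch.Tensor, threshold: float = 0.2) -> torch.Tensor:
    """Flag queries whose top-1 and top-2 scores are too close; such hard examples
    would be handed to System2 (the multimodal LLM). This margin rule and its
    threshold are illustrative stand-ins for the paper's vMF-ODI strategy."""
    top2 = logits.topk(2, dim=-1).values
    margin = top2[:, 0] - top2[:, 1]
    return margin < threshold  # True -> defer to System2


# Usage: System1 (e.g. a ViT) produces the features; confident queries are
# answered from memory, the rest are deferred.
memory = VMFClassMemory(feature_dim=768)
memory.update(class_id=0, features=torch.randn(8, 768))
memory.update(class_id=1, features=torch.randn(8, 768))
scores, class_ids = memory.logits(torch.randn(4, 768))
defer_mask = route_hard_examples(scores)
```

In the actual framework, the features would come from the ViT assigned as System1, and deferred queries, together with the class and task information retrieved via CKT-MHA, would be passed to the multimodal LLM acting as System2.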
Related papers
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, showing emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition [12.382193259575805]
We propose a novel multi-modality co-learning (MMCL) framework for efficient skeleton-based action recognition.
Our MMCL framework engages in multi-modality co-learning during the training stage and remains efficient by employing only concise skeletons at inference.
arXiv Detail & Related papers (2024-07-22T15:16:47Z)
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding [27.004817441034795]
Collaborative decoding between large language models (LLMs) and small language models (SLMs) presents a promising strategy for more efficient generation.
Inspired by dual-process cognitive theory, we propose a unified framework, termed Fast and Slow Generating (FS-GEN).
Within this framework, LLMs are categorized as System 2 (slow and deliberate), while independent SLMs are designated as System 1.
arXiv Detail & Related papers (2024-06-18T05:59:28Z) - SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models [71.78800549517298]
Continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world.
Existing methods devise a learning module that acquires task-specific knowledge with a parameter-efficient tuning (PET) block and a selection module that picks out the corresponding block for the test input.
We propose a novel Shared Attention Framework (SAPT) to align the PET learning and selection via the Shared Attentive Learning & Selection module.
arXiv Detail & Related papers (2024-01-16T11:45:03Z) - Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs).
Specifically, our framework delineates the in-context learning process into two distinct stages: a Deep-Thinking stage and a test stage.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text
Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - Continual Learning, Fast and Slow [75.53144246169346]
According to the Complementary Learning Systems theory, humans achieve effective continual learning through two complementary systems.
We propose DualNets (for Dual Networks), a general continual learning framework comprising a fast learning system for supervised learning of specific tasks and a slow learning system for learning task-agnostic general representations via Self-Supervised Learning (SSL).
We demonstrate the promising results of DualNets on a wide range of continual learning protocols, ranging from the standard offline, task-aware setting to the challenging online, task-free scenario.
arXiv Detail & Related papers (2022-09-06T10:48:45Z) - Learning an evolved mixture model for task-free continual learning [11.540150938141034]
We address Task-Free Continual Learning (TFCL), in which a model is trained on non-stationary data streams with no explicit task information.
We introduce two simple dropout mechanisms to selectively remove stored examples in order to avoid memory overload.
arXiv Detail & Related papers (2022-07-11T16:01:27Z) - Mixture-of-Variational-Experts for Continual Learning [0.0]
We propose an optimality principle that facilitates a trade-off between learning and forgetting.
We propose a neural network layer for continual learning, called Mixture-of-Variational-Experts (MoVE).
Our experiments on variants of the MNIST and CIFAR10 datasets demonstrate the competitive performance of MoVE layers.
arXiv Detail & Related papers (2021-10-25T06:32:06Z)
- Self-organizing Democratized Learning: Towards Large-scale Distributed Learning Systems [71.14339738190202]
Democratized learning (Dem-AI) lays out a holistic philosophy with underlying principles for building large-scale distributed and democratized machine learning systems.
Inspired by the Dem-AI philosophy, a novel distributed learning approach is proposed in this paper.
The proposed algorithms achieve better generalization performance of the agents' learning models than conventional federated learning (FL) algorithms.
arXiv Detail & Related papers (2020-07-07T08:34:48Z)