KOPPA: Improving Prompt-based Continual Learning with Key-Query
Orthogonal Projection and Prototype-based One-Versus-All
- URL: http://arxiv.org/abs/2311.15414v2
- Date: Thu, 30 Nov 2023 15:26:20 GMT
- Title: KOPPA: Improving Prompt-based Continual Learning with Key-Query
Orthogonal Projection and Prototype-based One-Versus-All
- Authors: Quyen Tran, Lam Tran, Khoat Than, Toan Tran, Dinh Phung, Trung Le
- Abstract summary: We introduce a novel key-query learning strategy to enhance prompt matching efficiency and address the challenge of shifting features.
Our method empowers the model to achieve results surpassing those of current state-of-the-art approaches by a large margin of up to 20%.
- Score: 26.506535205897443
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Drawing inspiration from prompt tuning techniques applied to Large Language
Models, recent methods based on pre-trained ViT networks have achieved
remarkable results in the field of Continual Learning. Specifically, these
approaches propose to maintain a set of prompts and allocate a subset of them
to learn each task using a key-query matching strategy. However, they may
encounter limitations when lacking control over the correlations between old
task queries and keys of future tasks, the shift of features in the latent
space, and the relative separation of latent vectors learned in independent
tasks. In this work, we introduce a novel key-query learning strategy based on
orthogonal projection, inspired by model-agnostic meta-learning, to enhance
prompt matching efficiency and address the challenge of shifting features.
Furthermore, we introduce a One-Versus-All (OVA) prototype-based component that
enhances the classification head distinction. Experimental results on benchmark
datasets demonstrate that our method empowers the model to achieve results
surpassing those of current state-of-the-art approaches by a large margin of up
to 20%.
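The abstract describes two mechanisms only at a high level: key-query matching for choosing prompts, and an orthogonal projection that controls how keys of new tasks correlate with queries of old tasks. The following is a minimal sketch of one plausible reading of those ideas, not the authors' implementation. All function names (`cosine`, `select_prompts`, `orthonormal_basis`, `project_out`) and the toy vectors are hypothetical, and plain Gram-Schmidt is assumed here as the way to build the old-query subspace.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def select_prompts(query, keys, top_k=1):
    """Key-query matching: indices of the top_k keys most similar to the query."""
    scores = [cosine(query, k) for k in keys]
    order = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


def project_out(vec, basis):
    """Remove from vec its components along each orthonormal basis vector."""
    w = list(vec)
    for b in basis:
        c = dot(w, b)
        w = [x - c * y for x, y in zip(w, b)]
    return w


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis of the span of the given vectors."""
    basis = []
    for v in vectors:
        w = project_out(v, basis)
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already (numerically) in the span
            basis.append([x / norm for x in w])
    return basis


if __name__ == "__main__":
    # Hypothetical queries produced by samples of earlier tasks.
    old_queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    basis = orthonormal_basis(old_queries)

    # Hypothetical update to a new task's key (assumed to start from zero),
    # restricted to the orthogonal complement of the old-query subspace.
    raw_update = [0.5, 0.2, 0.7]
    new_key = project_out(raw_update, basis)

    # The projected key has no component along any old query.
    print([dot(new_key, q) for q in old_queries])
    print(select_prompts([0.9, 0.1, 0.0], [[1.0, 0.0, 0.0], new_key]))
```

In this sketch, projecting the key update onto the orthogonal complement of the old-query subspace means that old-task queries score zero against the new key, so adding the new key cannot change which prompt an old-task query selects. Whether the paper applies the projection to keys, to gradients, or to both is not stated in the abstract.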
Related papers
- Task Consistent Prototype Learning for Incremental Few-shot Semantic Segmentation [20.49085411104439]
Incremental Few-Shot Semantic Segmentation (iFSS) tackles a task that requires a model to continually expand its segmentation capability on novel classes.
This study introduces a meta-learning-based prototype approach that encourages the model to learn how to adapt quickly while preserving previous knowledge.
Experiments on iFSS datasets built upon PASCAL and COCO benchmarks show the advanced performance of the proposed approach.
arXiv Detail & Related papers (2024-10-16T23:42:27Z)
- Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z)
- Improving Feature Generalizability with Multitask Learning in Class Incremental Learning [12.632121107536843]
Many deep learning applications, like keyword spotting, require the incorporation of new concepts (classes) over time, referred to as Class Incremental Learning (CIL).
The major challenge in CIL is avoiding catastrophic forgetting, i.e., preserving as much of the old knowledge as possible while learning new tasks.
We propose multitask learning during base model training to improve the feature generalizability.
Our approach enhances the average incremental learning accuracy by up to 5.5%, which enables more reliable and accurate keyword spotting over time.
arXiv Detail & Related papers (2022-04-26T07:47:54Z)
- Dual Path Structural Contrastive Embeddings for Learning Novel Objects [6.979491536753043]
Recent research shows that gaining information on a good feature space can be an effective solution to achieve favorable performance on few-shot tasks.
We propose a simple but effective paradigm that decouples the tasks of learning feature representations and classifiers.
Our method can still achieve promising results for both standard and generalized few-shot problems in either an inductive or transductive inference setting.
arXiv Detail & Related papers (2021-12-23T04:43:31Z)
- Contrastive Prototype Learning with Augmented Embeddings for Few-Shot Learning [58.2091760793799]
We propose a novel contrastive prototype learning with augmented embeddings (CPLAE) model.
With a class prototype as an anchor, CPL aims to pull the query samples of the same class closer and those of different classes further away.
Extensive experiments on several benchmarks demonstrate that our proposed CPLAE achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-01-23T13:22:44Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- Lifelong Learning Without a Task Oracle [13.331659934508764]
Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned.
We propose and compare several candidate task-assigning mappers which require very little memory overhead.
Best-performing variants only impose an average cost of 1.7% parameter memory increase.
arXiv Detail & Related papers (2020-11-09T21:30:31Z)
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)