Few-shot Tuning of Foundation Models for Class-incremental Learning
- URL: http://arxiv.org/abs/2405.16625v1
- Date: Sun, 26 May 2024 16:41:03 GMT
- Title: Few-shot Tuning of Foundation Models for Class-incremental Learning
- Authors: Shuvendu Roy, Elham Dolatabadi, Arash Afkanpour, Ali Etemad
- Abstract summary: We propose a new approach to continually tune foundation models for new classes in few-shot settings.
CoACT shows up to 13.5% improvement in standard FSCIL over the current SOTA on benchmark evaluations.
- Score: 19.165004570789755
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: For the first time, we explore few-shot tuning of vision foundation models for class-incremental learning. Unlike existing few-shot class incremental learning (FSCIL) methods, which train an encoder on a base session to ensure forward compatibility for future continual learning, foundation models are generally trained on large unlabelled data without such considerations. This renders prior methods from traditional FSCIL incompatible for FSCIL with the foundation model. To this end, we propose Consistency-guided Asynchronous Contrastive Tuning (CoACT), a new approach to continually tune foundation models for new classes in few-shot settings. CoACT comprises three components: (i) asynchronous contrastive tuning, which learns new classes by including LoRA modules in the pre-trained encoder, while enforcing consistency between two asynchronous encoders; (ii) controlled fine-tuning, which facilitates effective tuning of a subset of the foundation model; and (iii) consistency-guided incremental tuning, which enforces additional regularization during later sessions to reduce forgetting of the learned classes. We perform an extensive study on 16 diverse datasets and demonstrate the effectiveness of CoACT, outperforming the best baseline method by 2.47% on average and with up to 12.52% on individual datasets. Additionally, CoACT shows reduced forgetting and robustness in low-shot experiments. As an added bonus, CoACT shows up to 13.5% improvement in standard FSCIL over the current SOTA on benchmark evaluations. We make our code publicly available at https://github.com/ShuvenduRoy/CoACT-FSCIL.
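For illustration, the sketch below shows one way component (i) could look in PyTorch: the pre-trained encoder stays frozen and is tuned through low-rank (LoRA) adapters, while a second, asynchronously (EMA-)updated copy of the encoder provides a consistency target. The class names, hyper-parameters, and loss form here (LoRALinear, AsyncConsistencyTuner, the rank, momentum, and cosine consistency term) are assumptions made for this sketch, not details taken from the paper; the authors' actual implementation is in the linked repository.

```python
# Minimal, illustrative sketch of asynchronous contrastive tuning with LoRA
# adapters and an EMA ("asynchronous") teacher used as a consistency target.
# NOTE: names and hyper-parameters are illustrative assumptions, not CoACT's code.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha/r) B A x."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep pre-trained weights frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # adapters start as a zero update
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


class AsyncConsistencyTuner(nn.Module):
    """LoRA-augmented student encoder plus an EMA-updated teacher copy."""

    def __init__(self, encoder: nn.Module, ema_momentum: float = 0.999):
        super().__init__()
        self.student = encoder                 # trainable via its LoRA adapters
        self.teacher = copy.deepcopy(encoder)  # updated only through EMA
        for p in self.teacher.parameters():
            p.requires_grad = False
        self.m = ema_momentum

    @torch.no_grad()
    def ema_update(self):
        for p_s, p_t in zip(self.student.parameters(), self.teacher.parameters()):
            p_t.mul_(self.m).add_(p_s, alpha=1.0 - self.m)

    def consistency_loss(self, x):
        z_s = F.normalize(self.student(x), dim=-1)
        with torch.no_grad():
            z_t = F.normalize(self.teacher(x), dim=-1)
        # Encourage agreement between the two asynchronous views of the input.
        return (1.0 - (z_s * z_t).sum(dim=-1)).mean()
```

A toy training step under those assumptions might look like:

```python
# Stand-in MLP encoder for demonstration; a ViT foundation model would be used in practice.
encoder = nn.Sequential(
    LoRALinear(nn.Linear(512, 512)), nn.ReLU(), LoRALinear(nn.Linear(512, 256))
)
tuner = AsyncConsistencyTuner(encoder)
optimizer = torch.optim.AdamW(
    [p for p in tuner.student.parameters() if p.requires_grad], lr=1e-3
)

x = torch.randn(8, 512)              # a batch of few-shot sample features
loss = tuner.consistency_loss(x)     # combined with a classification loss in practice
loss.backward()
optimizer.step()
optimizer.zero_grad()
tuner.ema_update()
```

In a full FSCIL loop, this consistency term would be combined with the few-shot classification loss for each session's new classes, while components (ii) and (iii) of CoACT restrict which parts of the foundation model are tuned and add further regularization in later sessions to limit forgetting.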
Related papers
- DATA: Decomposed Attention-based Task Adaptation for Rehearsal-Free Continual Learning [22.386864304549285]
Continual learning (CL) is essential for Large Language Models (LLMs) to adapt to evolving real-world demands.
Recent rehearsal-free methods employ model-based and regularization-based strategies to address this issue.
We propose Decomposed Attention-based Task Adaptation (DATA).
DATA explicitly decouples and learns both task-specific and task-shared knowledge using high-rank and low-rank task adapters.
arXiv Detail & Related papers (2025-02-17T06:35:42Z) - Towards Compatible Fine-tuning for Vision-Language Model Updates [114.25776195225494]
Class-conditioned Context Optimization (ContCoOp) integrates learnable prompts with class embeddings using an attention layer before inputting them into the text encoder.
Our experiments over 15 datasets show that our ContCoOp achieves the highest compatibility over the baseline methods, and exhibits robust out-of-distribution generalization.
arXiv Detail & Related papers (2024-12-30T12:06:27Z) - High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z) - Combining Denoising Autoencoders with Contrastive Learning to fine-tune Transformer Models [0.0]
This work proposes a three-phase technique to adjust a base model for a classification task.
We adapt the model's signal to the data distribution by performing further training with a Denoising Autoencoder (DAE).
In addition, we introduce a new data augmentation approach for Supervised Contrastive Learning to correct for unbalanced datasets.
arXiv Detail & Related papers (2024-05-23T11:08:35Z) - Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS).
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z) - Generalized Logit Adjustment: Calibrating Fine-tuned Models by Removing Label Bias in Foundation Models [75.9543301303586]
Foundation models like CLIP allow zero-shot transfer on various tasks without additional training data.
Fine-tuning and ensembling are also commonly adopted to better fit the downstream tasks.
However, we argue that prior work has overlooked the inherent biases in foundation models.
arXiv Detail & Related papers (2023-10-12T08:01:11Z) - Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We conduct empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Modeling Inter-Class and Intra-Class Constraints in Novel Class
Discovery [20.67503042774617]
Novel class discovery (NCD) aims at learning a model that transfers the common knowledge from a class-disjoint labelled dataset to another unlabelled dataset.
We propose to model both inter-class and intra-class constraints in NCD based on the symmetric Kullback-Leibler divergence (sKLD).
arXiv Detail & Related papers (2022-10-07T14:46:32Z) - Rethinking Few-Shot Class-Incremental Learning with Open-Set Hypothesis
in Hyperbolic Geometry [21.38183613466714]
Few-Shot Class-Incremental Learning (FSCIL) aims at incrementally learning novel classes from a few labeled samples.
In this paper, we rethink the configuration of FSCIL with the open-set hypothesis by reserving the possibility in the first session for incoming categories.
To achieve strong performance on both closed-set and open-set recognition, a Hyperbolic Reciprocal Point Learning module (Hyper-RPL) is built on Reciprocal Point Learning (RPL) with hyperbolic neural networks.
arXiv Detail & Related papers (2022-07-20T15:13:48Z)