Enhancing CLIP with CLIP: Exploring Pseudolabeling for Limited-Label
Prompt Tuning
- URL: http://arxiv.org/abs/2306.01669v2
- Date: Fri, 8 Mar 2024 03:19:39 GMT
- Title: Enhancing CLIP with CLIP: Exploring Pseudolabeling for Limited-Label
Prompt Tuning
- Authors: Cristina Menghini, Andrew Delworth, Stephen H. Bach
- Abstract summary: We study the use of pseudolabels, i.e., heuristic labels for unlabeled data, to enhance CLIP via prompt tuning.
We observe that learning paradigms such as semi-supervised, transductive zero-shot, and unsupervised learning can all be seen as optimizing the same loss function.
We find that (1) unexplored prompt tuning strategies that iteratively refine pseudolabels consistently improve CLIP accuracy, by 19.5 points in semi-supervised learning, by 28.4 points in transductive zero-shot learning, and by 15.2 points in unsupervised learning.
- Score: 11.284317518288153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning vision-language models (VLMs) like CLIP to downstream tasks is
often necessary to optimize their performance. However, a major obstacle is the
limited availability of labeled data. We study the use of pseudolabels, i.e.,
heuristic labels for unlabeled data, to enhance CLIP via prompt tuning.
Conventional pseudolabeling trains a model on labeled data and then generates
labels for unlabeled data. VLMs' zero-shot capabilities enable a "second
generation" of pseudolabeling approaches that do not require task-specific
training on labeled data. By using zero-shot pseudolabels as a source of
supervision, we observe that learning paradigms such as semi-supervised,
transductive zero-shot, and unsupervised learning can all be seen as optimizing
the same loss function. This unified view enables the development of versatile
training strategies that are applicable across learning paradigms. We
investigate them on image classification tasks where CLIP exhibits limitations,
by varying prompt modalities, e.g., textual or visual prompts, and learning
paradigms. We find that (1) unexplored prompt tuning strategies that
iteratively refine pseudolabels consistently improve CLIP accuracy, by 19.5
points in semi-supervised learning, by 28.4 points in transductive zero-shot
learning, and by 15.2 points in unsupervised learning, and (2) unlike
conventional semi-supervised pseudolabeling, which exacerbates model biases
toward classes with higher-quality pseudolabels, prompt tuning leads to a more
equitable distribution of per-class accuracy. The code to reproduce the
experiments is at https://github.com/BatsResearch/menghini-neurips23-code.
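The core pseudolabeling step the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: toy L2-normalised vectors stand in for CLIP image embeddings, and identity rows stand in for prompt (class) embeddings. The top-k-per-class selection and the function name `top_k_pseudolabels` are assumptions for the sketch; iterative strategies would re-run this step as the prompts are tuned.

```python
import numpy as np

def top_k_pseudolabels(image_feats, class_embeds, top_k=2):
    """Zero-shot pseudolabeling: score each image against class (prompt)
    embeddings and keep only the top-k most confident images per class."""
    sims = image_feats @ class_embeds.T          # cosine similarity, shape (N, C)
    preds = sims.argmax(axis=1)                  # zero-shot label per image
    conf = sims.max(axis=1)                      # confidence of that label
    kept = []
    for c in range(class_embeds.shape[0]):
        idx = np.flatnonzero(preds == c)
        # Most confident examples of class c first; keep at most top_k.
        kept.extend(idx[np.argsort(-conf[idx])][:top_k].tolist())
    kept = np.array(sorted(kept))
    return kept, preds[kept]

# Toy stand-ins: 8 "images" in a 4-d embedding space, 3 classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
prompts = np.eye(4)[:3]                          # hypothetical class embeddings
kept, labels = top_k_pseudolabels(feats, prompts)
```

Because at most `top_k` examples are kept per class, the resulting pseudolabeled set is class-balanced by construction, which relates to the paper's observation about a more equitable per-class accuracy distribution.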
Related papers
- Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data [9.132277138594652]
We propose a Candidate Pseudolabel Learning method to fine-tune vision-language models with abundant unlabeled data.
Our method can result in better performance in true label inclusion and class-balanced instance selection.
arXiv Detail & Related papers (2024-06-15T04:50:20Z)
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Learning with Partial Labels from Semi-supervised Perspective [28.735185883881172]
Partial Label (PL) learning refers to the task of learning from partially labeled data.
We propose a novel PL learning method, namely Partial Label learning with Semi-Supervised Perspective (PLSP)
PLSP significantly outperforms the existing PL baseline methods, especially on high ambiguity levels.
arXiv Detail & Related papers (2022-11-24T15:12:16Z)
- Pseudo-Label Noise Suppression Techniques for Semi-Supervised Semantic Segmentation [21.163070161951868]
Semi-supervised learning (SSL) can reduce the need for large labelled datasets by incorporating unsupervised data into the training.
Current SSL approaches use an initially supervised trained model to generate predictions for unlabelled images, called pseudo-labels.
We use three mechanisms to control pseudo-label noise and errors.
arXiv Detail & Related papers (2022-10-19T09:46:27Z)
- Transductive CLIP with Class-Conditional Contrastive Learning [68.51078382124331]
We propose Transductive CLIP, a novel framework for learning a classification network with noisy labels from scratch.
A class-conditional contrastive learning mechanism is proposed to mitigate the reliance on pseudo labels.
An ensemble-labels strategy is adopted to update pseudo labels, stabilizing the training of deep neural networks with noisy labels.
arXiv Detail & Related papers (2022-06-13T14:04:57Z)
- Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach that leverages two different and complementary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z)
- A Closer Look at Self-training for Zero-Label Semantic Segmentation [53.4488444382874]
Being able to segment unseen classes not observed during training is an important technical challenge in deep learning.
Prior zero-label semantic segmentation works approach this task by learning visual-semantic embeddings or generative models.
We propose a consistency regularizer to filter out noisy pseudo-labels by taking the intersections of the pseudo-labels generated from different augmentations of the same image.
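The intersection idea described above can be sketched in a few lines. This is a toy illustration with hypothetical per-pixel class probabilities, not the paper's code: a pixel keeps its pseudo-label only when predictions from two augmentations of the same image agree, and is marked as ignored otherwise.

```python
import numpy as np

def intersect_pseudolabels(probs_aug1, probs_aug2, ignore_index=-1):
    """Keep a pixel's pseudo-label only when the predictions from two
    augmentations agree; disagreeing pixels get ignore_index."""
    labels1 = probs_aug1.argmax(axis=-1)
    labels2 = probs_aug2.argmax(axis=-1)
    agree = labels1 == labels2
    return np.where(agree, labels1, ignore_index)

# Toy per-pixel class probabilities for a 2x2 "image", 3 classes.
p1 = np.array([[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
               [[0.3, 0.3, 0.4], [0.2, 0.2, 0.6]]])
p2 = np.array([[[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]],
               [[0.1, 0.2, 0.7], [0.1, 0.1, 0.8]]])
pseudo = intersect_pseudolabels(p1, p2)
# The top-right pixel disagrees between augmentations, so it is ignored.
```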
arXiv Detail & Related papers (2021-04-21T14:34:33Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
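A simplified sketch of uncertainty-aware selection is below. It is not the UPS implementation: predictive entropy stands in as a proxy for the framework's uncertainty estimate, and the thresholds are illustrative. A pseudo-label is kept only when the prediction is both confident (high max probability) and certain (low entropy).

```python
import numpy as np

def select_pseudolabels(probs, conf_thresh=0.7, unc_thresh=0.5):
    """Keep pseudo-labels that are confident AND low-uncertainty.
    Entropy is a stand-in for a proper uncertainty estimate."""
    conf = probs.max(axis=1)                                  # max class probability
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)    # predictive entropy
    mask = (conf >= conf_thresh) & (entropy <= unc_thresh)
    return np.flatnonzero(mask), probs.argmax(axis=1)[mask]

probs = np.array([[0.90, 0.05, 0.05],   # confident and low-entropy: kept
                  [0.40, 0.30, 0.30],   # unconfident: dropped
                  [0.75, 0.20, 0.05]])  # confident but high-entropy: dropped
idx, labels = select_pseudolabels(probs)
```

The third row shows why the two filters differ: its max probability clears the confidence threshold, but its spread-out tail makes the entropy too high, so it is rejected.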
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
- Iterative label cleaning for transductive and semi-supervised few-shot learning [16.627512688664513]
Few-shot learning amounts to learning representations and acquiring knowledge such that novel tasks may be solved with both supervision and data being limited.
We introduce a new algorithm that leverages the manifold structure of the labeled and unlabeled data distribution to predict pseudo-labels.
Our solution surpasses or matches the state of the art results on four benchmark datasets.
arXiv Detail & Related papers (2020-12-14T21:54:11Z)
- Boosting the Performance of Semi-Supervised Learning with Unsupervised Clustering [10.033658645311188]
We show that ignoring labels altogether for whole epochs intermittently during training can significantly improve performance in the small sample regime.
We demonstrate our method's efficacy in boosting several state-of-the-art SSL algorithms.
arXiv Detail & Related papers (2020-12-01T14:19:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.