Towards Reliable Zero Shot Classification in Self-Supervised Models with
Conformal Prediction
- URL: http://arxiv.org/abs/2210.15805v1
- Date: Thu, 27 Oct 2022 23:52:14 GMT
- Title: Towards Reliable Zero Shot Classification in Self-Supervised Models with
Conformal Prediction
- Authors: Bhawesh Kumar, Anil Palepu, Rudraksh Tuwani and Andrew Beam
- Abstract summary: We develop a conformal prediction procedure to assess when a given test caption may be reliably used.
We show that our proposed conformal procedure improves the reliability of CLIP-style models in the zero-shot classification setting.
- Score: 0.688204255655161
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised models trained with a contrastive loss such as CLIP have
been shown to be very powerful in zero-shot classification settings. However, to be
used as a zero-shot classifier, these models require the user to provide new
captions over a fixed set of labels at test time. In many settings, it is hard
or impossible to know if a new query caption is compatible with the source
captions used to train the model. We address these limitations by framing the
zero-shot classification task as an outlier detection problem and develop a
conformal prediction procedure to assess when a given test caption may be
reliably used. On a real-world medical example, we show that our proposed
conformal procedure improves the reliability of CLIP-style models in the
zero-shot classification setting, and we provide an empirical analysis of the
factors that may affect its performance.
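For intuition, a minimal sketch of the underlying idea follows: treat the new query caption as a potential outlier relative to a held-out calibration set of source captions and score it with a conformal p-value. The embedding model, the nonconformity score (negative maximum cosine similarity), and the 0.05 level are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def conformal_caption_pvalue(query_emb: np.ndarray, calib_embs: np.ndarray) -> float:
    """Conformal p-value for whether a query caption resembles the source captions.

    query_emb  : (d,)   unit-normalized text embedding of the new caption.
    calib_embs : (n, d) unit-normalized embeddings of held-out source captions.
    Returns a p-value in (0, 1]; small values suggest the caption is an outlier.
    """
    # Nonconformity score: negative max cosine similarity to the other
    # calibration captions (leave-one-out within the calibration set).
    sims = calib_embs @ calib_embs.T
    np.fill_diagonal(sims, -np.inf)
    calib_scores = -sims.max(axis=1)              # larger = more unusual

    query_score = -(calib_embs @ query_emb).max()

    # Fraction of calibration scores at least as extreme as the query's.
    return (1 + np.count_nonzero(calib_scores >= query_score)) / (len(calib_scores) + 1)

# Usage: refuse to run the zero-shot query when the p-value is below a chosen level.
# if conformal_caption_pvalue(query_emb, calib_embs) < 0.05:
#     print("query caption may be incompatible with the source captions")
```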
Related papers
- Estimating Uncertainty in Multimodal Foundation Models using Public
Internet Data [15.365603519829088]
Foundation models are trained on vast amounts of data at scale using self-supervised learning.
In this paper, we address the problem of quantifying uncertainty in zero-shot predictions.
We propose an approach for uncertainty estimation in zero-shot settings using conformal prediction with web data.
arXiv Detail & Related papers (2023-10-15T19:24:52Z)
- Image-free Classifier Injection for Zero-Shot Classification [72.66409483088995]
Zero-shot learning models achieve remarkable results on image classification for samples from classes that were not seen during training.
We aim to equip pre-trained models with zero-shot classification capabilities without the use of image data.
We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS).
arXiv Detail & Related papers (2023-08-21T09:56:48Z)
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
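As a rough illustration of the deferral mechanism described in the entry above (not the paper's analysis), a confidence-based cascade can be sketched as follows, using maximum softmax probability against per-stage thresholds; the model callables and threshold values are assumed inputs.

```python
import numpy as np

def cascade_predict(x, models, thresholds):
    """Confidence-based cascade: run the cheap models first and defer to the
    next (more expensive) model only when the current one is not confident.

    models     : list of callables, each returning class probabilities for x.
    thresholds : confidence threshold for every model except the last.
    """
    for model, tau in zip(models[:-1], thresholds):
        probs = np.asarray(model(x))
        if probs.max() >= tau:              # confident enough: stop here
            return int(probs.argmax())
    # Otherwise fall through to the final, most expensive model.
    return int(np.asarray(models[-1](x)).argmax())
```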
- ProTeCt: Prompt Tuning for Taxonomic Open Set Classification [59.59442518849203]
Few-shot adaptation methods do not fare well in the taxonomic open set (TOS) setting.
We propose Prompt Tuning for Hierarchical Consistency (ProTeCt), a prompt tuning technique that calibrates the hierarchical consistency of model predictions across label-set granularities.
arXiv Detail & Related papers (2023-06-04T02:55:25Z)
- Zero-shot Model Diagnosis [80.36063332820568]
A common approach to evaluating deep learning models is to build a labeled test set with attributes of interest and assess how well the model performs on it.
This paper argues that Zero-shot Model Diagnosis (ZOOM) is possible without the need for a test set or labeling.
arXiv Detail & Related papers (2023-03-27T17:59:33Z)
- Enabling Calibration In The Zero-Shot Inference of Large Vision-Language Models [58.720142291102135]
We measure calibration across relevant variables like prompt, dataset, and architecture, and find that zero-shot inference with CLIP is miscalibrated.
For each specific CLIP model, a single learned temperature generalizes across inference datasets and prompt choices.
arXiv Detail & Related papers (2023-03-11T17:14:04Z)
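A minimal sketch of the single-temperature calibration described in the entry above, assuming PyTorch and a held-out labeled set of zero-shot logits; this is standard temperature scaling, not the paper's released code.

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor,
                    steps: int = 200, lr: float = 0.01) -> float:
    """Learn one temperature that rescales zero-shot logits (e.g., CLIP
    image-text similarities) to minimize NLL on a held-out labeled set."""
    log_t = torch.zeros(1, requires_grad=True)       # optimize log-temperature so t > 0
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        opt.step()
    return log_t.exp().item()

# Usage: divide logits by the learned temperature before the softmax at test time.
# t = fit_temperature(val_logits, val_labels)
# calibrated_probs = (test_logits / t).softmax(dim=-1)
```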
- Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z)
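A hypothetical sketch of the idea in the entry above: a prompted language model wrapped as a weak-supervision labeling function. The `query_llm` client, the prompt template, and the label map are placeholders, and the label-model aggregation step is only referenced in a comment.

```python
# `query_llm` is a hypothetical placeholder for any LLM client that returns a string.
ABSTAIN = -1

def make_prompt_lf(prompt_template, label_map, query_llm):
    """Wrap a prompted language model as a weak-supervision labeling function
    that votes on unlabeled examples (or abstains)."""
    def lf(example_text: str) -> int:
        answer = query_llm(prompt_template.format(text=example_text)).strip().lower()
        return label_map.get(answer, ABSTAIN)    # abstain when the answer is unmapped
    return lf

# Example (hypothetical prompt and labels):
# lf = make_prompt_lf("Is this review positive or negative?\n{text}\nAnswer:",
#                     {"positive": 1, "negative": 0}, query_llm)
# Votes from several such labeling functions can then be combined by a label
# model (Snorkel-style) into probabilistic training labels for a downstream model.
```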
- Estimating the Robustness of Classification Models by the Structure of the Learned Feature-Space [10.418647759223964]
We argue that fixed test sets capture only a small portion of possible data variations and are therefore limited and prone to producing overfitted solutions.
To overcome these drawbacks, we suggest estimating the robustness of a model directly from the structure of its learned feature-space.
arXiv Detail & Related papers (2021-06-23T10:52:29Z)
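The entry above leaves the estimator unspecified; as a generic stand-in (not the paper's measure), feature-space structure could be scored with a class-separability statistic such as the silhouette score over penultimate-layer features.

```python
import numpy as np
from sklearn.metrics import silhouette_score

def feature_space_separability(features: np.ndarray, labels: np.ndarray) -> float:
    """Illustrative proxy for feature-space structure: silhouette score of
    penultimate-layer features grouped by class (higher = better separated).
    This is a generic separability measure, not the paper's robustness estimator.
    """
    return silhouette_score(features, labels)

# Usage: compare the score across models or training runs as a rough,
# test-set-free indicator of how cleanly classes separate in feature space.
```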
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)