Image-free Classifier Injection for Zero-Shot Classification
- URL: http://arxiv.org/abs/2308.10599v1
- Date: Mon, 21 Aug 2023 09:56:48 GMT
- Title: Image-free Classifier Injection for Zero-Shot Classification
- Authors: Anders Christensen, Massimiliano Mancini, A. Sophia Koepke, Ole Winther, Zeynep Akata
- Abstract summary: Zero-shot learning models achieve remarkable results on image classification for samples from classes that were not seen during training.
We aim to equip pre-trained models with zero-shot classification capabilities without the use of image data.
We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS).
- Score: 72.66409483088995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot learning models achieve remarkable results on image classification
for samples from classes that were not seen during training. However, such
models must be trained from scratch with specialised methods: therefore, access
to a training dataset is required when the need for zero-shot classification
arises. In this paper, we aim to equip pre-trained models with zero-shot
classification capabilities without the use of image data. We achieve this with
our proposed Image-free Classifier Injection with Semantics (ICIS) that injects
classifiers for new, unseen classes into pre-trained classification models in a
post-hoc fashion without relying on image data. Instead, the existing
classifier weights and simple class-wise descriptors, such as class names or
attributes, are used. ICIS has two encoder-decoder networks that learn to
reconstruct classifier weights from descriptors (and vice versa), exploiting
(cross-)reconstruction and cosine losses to regularise the decoding process.
Notably, ICIS can be cheaply trained and applied directly on top of pre-trained
classification models. Experiments on benchmark ZSL datasets show that ICIS
produces unseen classifier weights that achieve strong (generalised) zero-shot
classification performance. Code is available at
https://github.com/ExplainableML/ImageFreeZSL .
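The abstract's description of ICIS (two encoder-decoder networks trained with (cross-)reconstruction and cosine losses, then used to produce weights for unseen classes) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear maps, the dictionary keys (`enc_a`, `enc_w`, `dec_a`, `dec_w`), the function names, the toy dimensions, and the equal weighting of the loss terms are all assumptions made for illustration, and no training loop is shown. The official code is at the repository linked above.

```python
import math
import random


def matvec(matrix, vector):
    """Apply a linear map (list of rows) to a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def pair_loss(pred, target):
    """Squared-error reconstruction term plus a cosine-distance term."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    return mse + (1.0 - cosine(pred, target))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def icis_style_loss(weight, descriptor, nets):
    """Sum of within-modality and cross-modality reconstruction losses
    for one seen class (hypothetical, equally weighted terms)."""
    z_a = matvec(nets["enc_a"], descriptor)  # latent code from the descriptor
    z_w = matvec(nets["enc_w"], weight)      # latent code from the classifier weight
    loss = pair_loss(matvec(nets["dec_w"], z_a), weight)       # descriptor -> weight
    loss += pair_loss(matvec(nets["dec_a"], z_w), descriptor)  # weight -> descriptor
    loss += pair_loss(matvec(nets["dec_w"], z_w), weight)      # weight -> weight
    loss += pair_loss(matvec(nets["dec_a"], z_a), descriptor)  # descriptor -> descriptor
    return loss


def inject(seen_weights, unseen_descriptors, nets):
    """Predict weights for unseen classes from their descriptors and
    append them to the existing classifier weight rows."""
    predicted = [matvec(nets["dec_w"], matvec(nets["enc_a"], a))
                 for a in unseen_descriptors]
    return seen_weights + predicted


# Toy setup: 3 seen classes, 2 unseen classes (all sizes are arbitrary).
rng = random.Random(0)
WEIGHT_DIM, DESC_DIM, LATENT_DIM = 6, 4, 3
nets = {
    "enc_a": random_matrix(LATENT_DIM, DESC_DIM, rng),
    "enc_w": random_matrix(LATENT_DIM, WEIGHT_DIM, rng),
    "dec_a": random_matrix(DESC_DIM, LATENT_DIM, rng),
    "dec_w": random_matrix(WEIGHT_DIM, LATENT_DIM, rng),
}
seen_weights = random_matrix(3, WEIGHT_DIM, rng)
seen_descriptors = random_matrix(3, DESC_DIM, rng)
unseen_descriptors = random_matrix(2, DESC_DIM, rng)

# Loss that a training procedure would minimise over the seen classes.
total_loss = sum(icis_style_loss(w, a, nets)
                 for w, a in zip(seen_weights, seen_descriptors))

# After training, the descriptor -> weight path yields the injected classifiers.
all_weights = inject(seen_weights, unseen_descriptors, nets)
```

In this sketch only the classifier weight rows and the class descriptors of the seen classes enter the loss, which mirrors the image-free setting described in the abstract; the extended `all_weights` matrix would then replace the final classification layer of the pre-trained model.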
Related papers
- Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection [4.0208298639821525]
Few-shot object detection, which focuses on detecting novel objects with few labels, is an emerging challenge in the community.
Recent studies show that adapting a pre-trained model or modified loss function can improve performance.
We propose Re-scoring using Image-language Similarity for Few-shot object detection (RISF) which extends Faster R-CNN.
arXiv Detail & Related papers (2023-11-01T04:04:34Z) - Text-to-Image Diffusion Models are Zero-Shot Classifiers [8.26990105697146]
We investigate text-to-image diffusion models by proposing a method for evaluating them as zero-shot classifiers.
We apply our method to Stable Diffusion and Imagen, using it to probe fine-grained aspects of the models' knowledge.
They perform competitively with CLIP on a wide range of zero-shot image classification datasets.
arXiv Detail & Related papers (2023-03-27T14:15:17Z) - GMM-IL: Image Classification using Incrementally Learnt, Independent
Probabilistic Models for Small Sample Sizes [0.4511923587827301]
We present a novel two stage architecture which couples visual feature learning with probabilistic models to represent each class.
We outperform a benchmark of an equivalent network with a Softmax head, obtaining increased accuracy for sample sizes smaller than 12 and increased weighted F1 score for 3 imbalanced class profiles.
arXiv Detail & Related papers (2022-12-01T15:19:42Z) - Prediction Calibration for Generalized Few-shot Semantic Segmentation [101.69940565204816]
Generalized Few-shot Semantic Segmentation (GFSS) aims to segment each image pixel into either base classes with abundant training examples or novel classes with only a handful of (e.g., 1-5) training images per class.
We build a cross-attention module that guides the classifier's final prediction using the fused multi-level features.
Our PCN outperforms the state-of-the-art alternatives by large margins.
arXiv Detail & Related papers (2022-10-15T13:30:12Z) - Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach which leverages two different and complementary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z) - Generating Representative Samples for Few-Shot Classification [8.62483598990205]
Few-shot learning aims to learn new categories with a few visual samples per class.
Few-shot class representations are often biased due to data scarcity.
We generate visual samples based on semantic embeddings using a conditional variational autoencoder model.
arXiv Detail & Related papers (2022-05-05T20:58:33Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with state-of-the-art methods.
arXiv Detail & Related papers (2021-10-22T01:55:01Z) - Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight
Transformer [112.95747173442754]
A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier.
Most existing methods meta-learn all three model components for fast adaptation to a new class.
In this work we propose to simplify the meta-learning task by focusing solely on the simplest component, the classifier.
arXiv Detail & Related papers (2021-08-06T10:20:08Z) - A Multi-class Approach -- Building a Visual Classifier based on Textual
Descriptions using Zero-Shot Learning [0.34265828682659694]
We overcome the two main hurdles of Machine Learning, i.e. scarcity of data and constrained prediction of the classification model.
We train a classifier by mapping labelled images to their textual description instead of training it for specific classes.
arXiv Detail & Related papers (2020-11-18T12:06:55Z) - Background Splitting: Finding Rare Classes in a Sea of Background [55.03789745276442]
We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories.
In these scenarios, almost all images belong to the background category (>95% of the dataset is background).
We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance.
arXiv Detail & Related papers (2020-08-28T23:05:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.