Related papers: Image-free Classifier Injection for Zero-Shot Classification

URL: http://arxiv.org/abs/2308.10599v1
Date: Mon, 21 Aug 2023 09:56:48 GMT
Title: Image-free Classifier Injection for Zero-Shot Classification
Authors: Anders Christensen, Massimiliano Mancini, A. Sophia Koepke, Ole Winther, Zeynep Akata
Abstract summary: Zero-shot learning models achieve remarkable results on image classification for samples from classes that were not seen during training. We aim to equip pre-trained models with zero-shot classification capabilities without the use of image data. We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS).
Score: 72.66409483088995
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Zero-shot learning models achieve remarkable results on image classification for samples from classes that were not seen during training. However, such models must be trained from scratch with specialised methods: therefore, access to a training dataset is required when the need for zero-shot classification arises. In this paper, we aim to equip pre-trained models with zero-shot classification capabilities without the use of image data. We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS) that injects classifiers for new, unseen classes into pre-trained classification models in a post-hoc fashion without relying on image data. Instead, the existing classifier weights and simple class-wise descriptors, such as class names or attributes, are used. ICIS has two encoder-decoder networks that learn to reconstruct classifier weights from descriptors (and vice versa), exploiting (cross-)reconstruction and cosine losses to regularise the decoding process. Notably, ICIS can be cheaply trained and applied directly on top of pre-trained classification models. Experiments on benchmark ZSL datasets show that ICIS produces unseen classifier weights that achieve strong (generalised) zero-shot classification performance. Code is available at https://github.com/ExplainableML/ImageFreeZSL .
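
Illustrative sketch (not from the paper or its repository): the abstract describes predicting classifier weights for unseen classes from class descriptors and adding them to an existing classifier, without using any images. The toy Python example below shows that general idea only. It does not implement the ICIS encoder-decoder networks. As a plainly different stand-in, it mixes the seen-class weight vectors according to descriptor similarity. The function names, the toy data, the softmax temperature, and the cosine-scored classifier are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_loss(predicted, target):
    """1 - cosine similarity: zero when both vectors point the same way."""
    return 1.0 - cosine_similarity(predicted, target)


def predict_unseen_weights(unseen_descriptor, seen_descriptors, seen_weights,
                           temperature=10.0):
    """Hypothetical stand-in for a learned descriptor-to-weight mapping.

    NOT the ICIS networks: this mixes the seen-class weight vectors using a
    softmax over descriptor cosine similarities (temperature is arbitrary).
    """
    sims = [cosine_similarity(unseen_descriptor, d) for d in seen_descriptors]
    exps = [math.exp(temperature * s) for s in sims]
    total = sum(exps)
    mix = [e / total for e in exps]
    dim = len(seen_weights[0])
    return [sum(m * w[i] for m, w in zip(mix, seen_weights)) for i in range(dim)]


def classify(feature, weights):
    """Return the index of the weight row most cosine-similar to the feature.

    Assumes a cosine-scored classifier, which is a choice made for this toy.
    """
    scores = [cosine_similarity(feature, row) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy setup (all values hypothetical): two seen classes, one unseen class.
seen_descriptors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # e.g. attribute vectors
seen_weights = [[2.0, 0.0], [0.0, 2.0]]                # pre-trained classifier rows
unseen_descriptor = [0.7, 0.7, 0.1]                    # descriptor of the new class

# "Inject" the predicted row: the classifier now has three classes,
# and no image was used to create the third one.
new_row = predict_unseen_weights(unseen_descriptor, seen_descriptors, seen_weights)
injected_weights = seen_weights + [new_row]

print(classify([1.0, 0.9], injected_weights))  # feature lying between the seen classes
print(classify([1.0, 0.1], injected_weights))  # feature close to the first seen class
```

In this toy, the unseen descriptor is equally similar to both seen descriptors, so the injected row lies midway between the two seen weight vectors. The `cosine_loss` helper only illustrates the kind of cosine objective the abstract mentions and is not used for any training here.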

Related papers

Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection [4.0208298639821525]
Few-shot object detection, which focuses on detecting novel objects with few labels, is an emerging challenge in the community. Recent studies show that adapting a pre-trained model or modifying the loss function can improve performance. We propose Re-scoring using Image-language Similarity for Few-shot object detection (RISF), which extends Faster R-CNN.
arXiv Detail & Related papers (2023-11-01T04:04:34Z)
Text-to-Image Diffusion Models are Zero-Shot Classifiers [8.26990105697146]
We investigate text-to-image diffusion models by proposing a method for evaluating them as zero-shot classifiers. We apply our method to Stable Diffusion and Imagen, using it to probe fine-grained aspects of the models' knowledge. They perform competitively with CLIP on a wide range of zero-shot image classification datasets.
arXiv Detail & Related papers (2023-03-27T14:15:17Z)
GMM-IL: Image Classification using Incrementally Learnt, Independent Probabilistic Models for Small Sample Sizes [0.4511923587827301]
We present a novel two-stage architecture which couples visual feature learning with probabilistic models to represent each class. We outperform a benchmark of an equivalent network with a Softmax head, obtaining increased accuracy for sample sizes smaller than 12 and increased weighted F1 score for 3 imbalanced class profiles.
arXiv Detail & Related papers (2022-12-01T15:19:42Z)
Prediction Calibration for Generalized Few-shot Semantic Segmentation [101.69940565204816]
Generalized Few-shot Semantic Segmentation (GFSS) aims to segment each image pixel into either base classes with abundant training examples or novel classes with only a handful of (e.g., 1-5) training images per class. We build a cross-attention module that guides the classifier's final prediction using the fused multi-level features. Our PCN outperforms the state-of-the-art alternatives by large margins.
arXiv Detail & Related papers (2022-10-15T13:30:12Z)
Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach which leverages two different and complementary sources of supervision: pseudo-labels and raw images. MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z)
Generating Representative Samples for Few-Shot Classification [8.62483598990205]
Few-shot learning aims to learn new categories with a few visual samples per class. Few-shot class representations are often biased due to data scarcity. We generate visual samples based on semantic embeddings using a conditional variational autoencoder model.
arXiv Detail & Related papers (2022-05-05T20:58:33Z)
Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network. Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced. We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer [112.95747173442754]
A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier. Most existing methods meta-learn all three model components for fast adaptation to a new class. In this work we propose to simplify the meta-learning task by focusing solely on the simplest component, the classifier.
arXiv Detail & Related papers (2021-08-06T10:20:08Z)
A Multi-class Approach -- Building a Visual Classifier based on Textual Descriptions using Zero-Shot Learning [0.34265828682659694]
We overcome the two main hurdles of Machine Learning, i.e. scarcity of data and constrained prediction of the classification model. We train a classifier by mapping labelled images to their textual description instead of training it for specific classes.
arXiv Detail & Related papers (2020-11-18T12:06:55Z)
Background Splitting: Finding Rare Classes in a Sea of Background [55.03789745276442]
We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories. In these scenarios, almost all images belong to the background category in the dataset (>95% of the dataset is background). We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance.
arXiv Detail & Related papers (2020-08-28T23:05:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.