RAFIC: Retrieval-Augmented Few-shot Image Classification
- URL: http://arxiv.org/abs/2312.06868v1
- Date: Mon, 11 Dec 2023 22:28:51 GMT
- Title: RAFIC: Retrieval-Augmented Few-shot Image Classification
- Authors: Hangfei Lin, Li Miao, Amir Ziai
- Abstract summary: Few-shot image classification is the task of classifying unseen images to one of N mutually exclusive classes.
We have developed a method for augmenting the set of K with an additional set of A retrieved images.
We demonstrate that RAFIC markedly improves performance of few-shot image classification across two challenging datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot image classification is the task of classifying unseen images to one
of N mutually exclusive classes, using only a small number of training examples
for each class. The limited availability of these examples (denoted as K)
presents a significant challenge to classification accuracy in some cases. To
address this, we have developed a method for augmenting the set of K with an
additional set of A retrieved images. We call this system Retrieval-Augmented
Few-shot Image Classification (RAFIC). Through a series of experiments, we
demonstrate that RAFIC markedly improves performance of few-shot image
classification across two challenging datasets. RAFIC consists of two main
components: (a) a retrieval component which uses CLIP, LAION-5B, and faiss, in
order to efficiently retrieve images similar to the supplied images, and (b)
retrieval meta-learning, which learns to judiciously utilize the retrieved
images. Code and data are available at github.com/amirziai/rafic.
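The retrieval component described in the abstract boils down to nearest-neighbor search over CLIP embeddings. Below is a minimal sketch of that idea: random vectors stand in for real CLIP image features, and brute-force numpy cosine similarity stands in for a faiss index over LAION-5B. All function names and dimensions here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """L2-normalize rows so a dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def retrieve_top_a(support_embeddings: np.ndarray,
                   index_embeddings: np.ndarray,
                   a: int) -> np.ndarray:
    """Return indices of the A index images most similar to the mean
    embedding of one few-shot class's K support images."""
    query = normalize(support_embeddings).mean(axis=0)
    query = query / np.linalg.norm(query)          # re-normalize the centroid
    sims = normalize(index_embeddings) @ query     # cosine similarity to every index image
    return np.argsort(-sims)[:a]                   # top-A most similar

rng = np.random.default_rng(0)
support = rng.normal(size=(5, 512))    # K=5 support images, 512-d CLIP-like features
index = rng.normal(size=(1000, 512))   # tiny stand-in for a LAION-scale index
top = retrieve_top_a(support, index, a=20)
print(top.shape)  # (20,)
```

At LAION-5B scale, the brute-force matrix product above would be replaced by an approximate faiss index; the retrieved images would then be passed to the meta-learning component, which decides how much to trust them.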
Related papers
- Semantic Compositions Enhance Vision-Language Contrastive Learning [46.985865191341944]
We show that the zero-shot classification and retrieval capabilities of CLIP-like models can be improved significantly through the introduction of semantically composite examples during pretraining.
Our method fuses the captions and blends 50% of each image to form a new composite sample.
The benefits of CLIP-C are particularly pronounced in settings with relatively limited pretraining data.
arXiv Detail & Related papers (2024-07-01T15:58:20Z)
- Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by T2I models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
- Advancing Image Retrieval with Few-Shot Learning and Relevance Feedback [5.770351255180495]
Image Retrieval with Relevance Feedback (IRRF) involves iterative human interaction during the retrieval process.
We propose a new scheme based on a hyper-network, that is tailored to the task and facilitates swift adjustment to user feedback.
We show that our method can attain SoTA results in few-shot one-class classification and reach comparable results in the binary classification task of few-shot open-set recognition.
arXiv Detail & Related papers (2023-12-18T10:20:28Z)
- Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection [4.0208298639821525]
Few-shot object detection, which focuses on detecting novel objects with few labels, is an emerging challenge in the community.
Recent studies show that adapting a pre-trained model or modified loss function can improve performance.
We propose Re-scoring using Image-language Similarity for Few-shot object detection (RISF) which extends Faster R-CNN.
arXiv Detail & Related papers (2023-11-01T04:04:34Z)
- Label-Free Event-based Object Recognition via Joint Learning with Image Reconstruction from Events [42.71383489578851]
We study label-free event-based object recognition where category labels and paired images are not available.
Our method first reconstructs images from events and performs object recognition through Contrastive Language-Image Pre-training (CLIP).
Since the category information is essential in reconstructing images, we propose category-guided attraction loss and category-agnostic repulsion loss.
arXiv Detail & Related papers (2023-08-18T08:28:17Z)
- FewSAR: A Few-shot SAR Image Classification Benchmark [17.24173332659616]
Few-shot learning is one of the significant and hard problems in the field of image classification.
FewSAR consists of an open-source Python code library of 15 classic methods in three categories for few-shot SAR image classification.
By analyzing the quantitative results and runtime under the same setting, we observe that metric learning methods achieve the best accuracy.
arXiv Detail & Related papers (2023-06-16T02:35:00Z)
- Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811]
We propose a visual re-ranking method by contextual similarity aggregation with self-attention.
We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
arXiv Detail & Related papers (2021-10-26T06:20:31Z)
- Disentangled Feature Representation for Few-shot Image Classification [64.40410801469106]
We propose a novel Disentangled Feature Representation framework, dubbed DFR, for few-shot learning applications.
DFR can adaptively decouple the discriminative features that are modeled by the classification branch, from the class-irrelevant component of the variation branch.
In general, most of the popular deep few-shot learning methods can be plugged in as the classification branch, thus DFR can boost their performance on various few-shot tasks.
arXiv Detail & Related papers (2021-09-26T09:53:11Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically figure out foreground objects at both pretraining and evaluation stage.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels.
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z)
- Compact Deep Aggregation for Set Retrieval [87.52470995031997]
We focus on retrieving images containing multiple faces from a large scale dataset of images.
Here the set consists of the face descriptors in each image, and given a query for multiple identities, the goal is then to retrieve, in order, images which contain all the identities.
We show that this compact descriptor has minimal loss of discriminability up to two faces per image, and degrades slowly after that.
arXiv Detail & Related papers (2020-03-26T08:43:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.