FSL-Rectifier: Rectify Outliers in Few-Shot Learning via Test-Time Augmentation
- URL: http://arxiv.org/abs/2402.18292v6
- Date: Thu, 19 Dec 2024 03:03:58 GMT
- Title: FSL-Rectifier: Rectify Outliers in Few-Shot Learning via Test-Time Augmentation
- Authors: Yunwei Bai, Ying Kiat Tan, Shiming Chen, Yao Shu, Tsuhan Chen
- Abstract summary: Few-shot learning (FSL) commonly requires a model to identify images (queries) that belong to classes unseen during training.
We generate additional test-class samples by combining original samples with suitable train-class samples via a generative image combiner.
Then, we obtain averaged features via an augmentor; averaging over the augmented copies yields more typical, outlier-resistant representations.
- Abstract: Few-shot learning (FSL) commonly requires a model to identify images (queries) that belong to classes unseen during training, based on a few labelled samples of the new classes (support set) as reference. Many existing algorithms rely on training-time data augmentation to improve the generalization capability of FSL models, but outlier queries or support images encountered at inference can still pose serious generalization challenges. In this work, to reduce the bias caused by such outlier samples, we generate additional test-class samples by combining original samples with suitable train-class samples via a generative image combiner. We then obtain averaged features via an augmentor, which yields more typical representations through the averaging. We demonstrate the effectiveness of our method both experimentally and theoretically, obtaining a test-accuracy improvement proportion of around 10% (e.g., from 46.86% to 53.28%) for trained FSL models. Importantly, given a pretrained image combiner, our method is training-free for off-the-shelf FSL models, whose performance can be improved without extra datasets or further training of the models themselves. Code is available at https://github.com/WendyBaiYunwei/FSL-Rectifier-Pub.
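As a concrete picture of the pipeline, here is a minimal sketch of test-time rectification. The `combiner` argument stands in for the paper's pretrained generative image combiner and `encoder` for any off-the-shelf FSL feature extractor; both callables, and the nearest-prototype classifier around them, are illustrative assumptions rather than the released implementation.

```python
import torch

def rectify_features(encoder, image, neighbours, combiner, n_aug=4):
    """Average the features of an image and its augmented copies.

    `combiner(image, neighbour)` is assumed to return an augmented image that
    mixes the test sample with a suitable train-class neighbour.
    """
    views = [image] + [combiner(image, nb) for nb in neighbours[:n_aug]]
    feats = torch.stack([encoder(v.unsqueeze(0)).squeeze(0) for v in views])
    return feats.mean(dim=0)  # averaged, more "typical" representation

def classify_query(encoder, query, support_sets, neighbours, combiner):
    """Nearest-prototype classification on rectified features."""
    q = rectify_features(encoder, query, neighbours, combiner)
    protos = []
    for support_images in support_sets:  # one list of images per class
        feats = [rectify_features(encoder, s, neighbours, combiner)
                 for s in support_images]
        protos.append(torch.stack(feats).mean(dim=0))
    dists = torch.cdist(q.unsqueeze(0), torch.stack(protos)).squeeze(0)
    return int(dists.argmin())  # predicted class index
```

Because rectification only touches inputs and features, the encoder itself needs no retraining, which is what makes the method training-free for off-the-shelf models.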
Related papers
- Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think [72.48325960659822]
One main bottleneck in training large-scale diffusion models for generation lies in effectively learning high-quality internal representations.
We study this by introducing a straightforward regularization called REPresentation Alignment (REPA), which aligns the projections of noisy input hidden states in denoising networks with clean image representations obtained from external, pretrained visual encoders.
The results are striking: our simple strategy yields significant improvements in both training efficiency and generation quality when applied to popular diffusion and flow-based transformers, such as DiTs and SiTs.
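A minimal sketch of a REPA-style regularizer, assuming access to an intermediate denoiser hidden state, a small trainable projection head, and a frozen external visual encoder; the negative-cosine alignment measure is one plausible choice, not necessarily the paper's exact one.

```python
import torch
import torch.nn.functional as F

def repa_loss(hidden_states, clean_images, proj_head, external_encoder):
    """Align projected denoiser hidden states with clean-image features.

    hidden_states: (B, D_h) features from an intermediate denoiser layer.
    external_encoder: frozen pretrained visual encoder (placeholder).
    """
    with torch.no_grad():
        target = external_encoder(clean_images)  # (B, D_e), no gradients
    pred = proj_head(hidden_states)              # (B, D_e)
    # negative cosine similarity: zero when perfectly aligned
    return 1.0 - F.cosine_similarity(pred, target, dim=-1).mean()
```

During training, this term would be added, with a weighting coefficient, to the usual denoising objective.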
arXiv Detail & Related papers (2024-10-09T14:34:53Z)
- Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for applying pre-trained models, and experiment on image classification tasks.
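One way to picture the "pre-trained annotating" step: a vision-language model scores class prompts against an image, and the top matches are kept as a candidate (partial) label set for subsequent weakly-supervised learning. The random embeddings below are stand-ins for real VLM outputs.

```python
import torch
import torch.nn.functional as F

def partial_labels(image_feat, text_feats, top_k=3):
    """Return the indices of the top-k classes as a candidate label set.

    image_feat: (D,) normalised image embedding from a pretrained VLM.
    text_feats: (C, D) normalised embeddings of class-name prompts.
    """
    sims = text_feats @ image_feat  # (C,) cosine similarities
    return sims.topk(top_k).indices.tolist()

# Toy usage with random unit vectors standing in for real embeddings.
img = F.normalize(torch.randn(512), dim=0)
txt = F.normalize(torch.randn(10, 512), dim=1)
print(partial_labels(img, txt))  # e.g. [7, 2, 4]
```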
arXiv Detail & Related papers (2024-05-23T17:17:27Z)
- FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage several relatively small, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
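A toy sketch of one training-free ingredient such a pipeline can use: clustering per-pixel diffusion features into class-agnostic mask proposals with plain k-means. The feature source, cluster count, and iteration budget are assumptions for illustration, not FreeSeg-Diff's exact recipe.

```python
import torch

def cluster_masks(pixel_feats, n_clusters=5, n_iter=10):
    """Group per-pixel features into class-agnostic masks via k-means.

    pixel_feats: (H*W, D) features, e.g. extracted from a diffusion model.
    Returns an (H*W,) tensor of cluster ids, one candidate mask per id.
    """
    centroids = pixel_feats[torch.randperm(len(pixel_feats))[:n_clusters]]
    for _ in range(n_iter):
        assign = torch.cdist(pixel_feats, centroids).argmin(dim=1)
        for k in range(n_clusters):
            members = pixel_feats[assign == k]
            if len(members) > 0:  # guard against empty clusters
                centroids[k] = members.mean(dim=0)
    return assign  # each mask can then be matched to text, e.g. via CLIP

ids = cluster_masks(torch.randn(32 * 32, 64))  # toy 32x32 feature map
```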
arXiv Detail & Related papers (2024-03-29T10:38:25Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Generating Representative Samples for Few-Shot Classification [8.62483598990205]
Few-shot learning aims to learn new categories with a few visual samples per class.
Few-shot class representations are often biased due to data scarcity.
We generate visual samples based on semantic embeddings using a conditional variational autoencoder model.
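A minimal sketch of the generation idea: the decoder half of a conditional VAE maps noise plus a class semantic embedding to synthetic features, which can then be pooled with the few real support features to de-bias the class prototype. All dimensions and the word-embedding conditioning are illustrative.

```python
import torch
import torch.nn as nn

class ConditionalDecoder(nn.Module):
    """Decoder half of a CVAE: (noise, class semantics) -> visual feature."""
    def __init__(self, z_dim=64, sem_dim=300, feat_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + sem_dim, 1024), nn.ReLU(),
            nn.Linear(1024, feat_dim),
        )

    def forward(self, z, semantics):
        return self.net(torch.cat([z, semantics], dim=-1))

# Hallucinate extra features for a novel class from its semantic embedding.
decoder = ConditionalDecoder()
semantics = torch.randn(1, 300)                    # e.g. a word embedding
z = torch.randn(16, 64)
fake_feats = decoder(z, semantics.expand(16, -1))  # 16 generated samples
```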
arXiv Detail & Related papers (2022-05-05T20:58:33Z)
- KNN-Diffusion: Image Generation via Large-Scale Retrieval [40.6656651653888]
Learning to adapt through retrieval enables several new capabilities.
Adapting a trained model to new samples can be achieved by simply adding those samples to the retrieval table.
Our diffusion-based model trains on images only, by leveraging a joint Text-Image multi-modal metric.
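A toy sketch of the retrieval-table idea: adaptation amounts to appending rows to a nearest-neighbour index, with no gradient updates. The class and the inner-product similarity below are illustrative, not the paper's implementation.

```python
import torch

class RetrievalTable:
    """A toy nearest-neighbour table; adding rows adapts retrieval."""
    def __init__(self, dim):
        self.embeddings = torch.empty(0, dim)

    def add(self, new_embeddings):
        # "fine-tune" to new samples by appending them, no training needed
        self.embeddings = torch.cat([self.embeddings, new_embeddings])

    def query(self, q, k=3):
        sims = self.embeddings @ q  # inner-product similarity
        return sims.topk(min(k, len(self.embeddings))).indices

table = RetrievalTable(dim=8)
table.add(torch.randn(100, 8))
neighbours = table.query(torch.randn(8))  # conditioning set for generation
```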
arXiv Detail & Related papers (2022-04-06T14:13:35Z)
- Semantics-driven Attentive Few-shot Learning over Clean and Noisy Samples [0.0]
We aim to train meta-learner models that can leverage prior semantic knowledge about novel classes to guide the classifier synthesis process.
In particular, we propose semantically-conditioned feature attention and sample attention mechanisms that estimate the importance of representation dimensions and training instances.
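A minimal sketch of the two mechanisms, assuming a per-class semantic embedding: one head scores representation dimensions (feature attention) and another scores support samples (sample attention). All module names and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class SemanticAttention(nn.Module):
    """Semantically-conditioned feature and sample attention (illustrative)."""
    def __init__(self, sem_dim=300, feat_dim=512):
        super().__init__()
        self.dim_scorer = nn.Linear(sem_dim, feat_dim)         # feature attention
        self.sample_scorer = nn.Linear(sem_dim + feat_dim, 1)  # sample attention

    def forward(self, support_feats, semantics):
        # support_feats: (N, feat_dim); semantics: (sem_dim,) for the class
        dim_w = torch.sigmoid(self.dim_scorer(semantics))      # (feat_dim,)
        joint = torch.cat([semantics.expand(len(support_feats), -1),
                           support_feats], dim=-1)
        sample_w = torch.softmax(self.sample_scorer(joint).squeeze(-1), dim=0)
        # weighted prototype: down-weights noisy samples and dimensions
        return (sample_w.unsqueeze(-1) * support_feats * dim_w).sum(dim=0)

att = SemanticAttention()
proto = att(torch.randn(5, 512), torch.randn(300))  # one class prototype
```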
arXiv Detail & Related papers (2022-01-09T16:16:23Z)
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to this issue, where the objective is to quickly adapt to novel categories given only a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
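A hedged sketch of how such a joint objective could look: an invariance term pulls features of transformed views together, while an equivariance head must classify which transform produced each view. The heads, transform set, and loss choices are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def inv_eq_losses(encoder, inv_head, eq_head, images, transforms):
    """Invariance + equivariance terms over a list of transform callables.

    eq_head(features) must output (B, len(transforms)) transform logits.
    """
    feats = [encoder(t(images)) for t in transforms]  # one view per transform
    anchor = encoder(images)
    inv_loss = sum(1 - F.cosine_similarity(inv_head(f), inv_head(anchor),
                                           dim=-1).mean()
                   for f in feats) / len(feats)
    logits = torch.stack([eq_head(f) for f in feats])  # (T, B, T)
    labels = torch.arange(len(transforms)).unsqueeze(1).expand(-1, len(images))
    eq_loss = F.cross_entropy(logits.flatten(0, 1), labels.flatten())
    return inv_loss, eq_loss
```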
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- Hybrid Consistency Training with Prototype Adaptation for Few-Shot Learning [11.873143649261362]
Few-Shot Learning aims to improve a model's generalization capability in low-data regimes.
Recent FSL works have made steady progress via metric learning, meta-learning, representation learning, etc.
arXiv Detail & Related papers (2020-11-19T19:51:33Z)
- TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification [50.358839666165764]
We show that the Task-Adaptive Feature Sub-Space Learning (TAFSSL) can significantly boost the performance in Few-Shot Learning scenarios.
Specifically, we show that on the challenging miniImageNet and tieredImageNet benchmarks, TAFSSL can improve the current state-of-the-art in both transductive and semi-supervised FSL settings by more than 5%.
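A minimal sketch of the sub-space idea: project the features of a single episode onto their own principal components and measure distances there. PCA is one plausible instantiation, not necessarily the paper's exact procedure.

```python
import torch

def task_adaptive_subspace(task_feats, n_components=10):
    """Project one episode's features onto its own PCA sub-space.

    task_feats: (N, D) pooled support + query features for a single task.
    """
    centred = task_feats - task_feats.mean(dim=0, keepdim=True)
    # principal directions of the task's own feature covariance
    _, _, v = torch.pca_lowrank(centred, q=n_components)
    return centred @ v  # (N, n_components)

proj = task_adaptive_subspace(torch.randn(30, 512))  # toy episode
```

Because the sub-space is recomputed per task, it adapts to whatever classes the episode happens to contain.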
arXiv Detail & Related papers (2020-03-14T16:59:17Z)
- AdarGCN: Adaptive Aggregation GCN for Few-Shot Learning [112.95742995816367]
We propose a new few-shot few-shot learning (FSFSL) setting.
Under FSFSL, both the source and target classes have limited training samples.
We also propose a graph convolutional network (GCN)-based label denoising (LDN) method to remove irrelevant images.
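A toy sketch of what GCN-based label cleaning can look like: noisy one-hot labels are smoothed over an image-similarity graph with one normalized graph-convolution step, and low-confidence rows flag likely-irrelevant images. The graph construction and the single propagation step are invented for illustration.

```python
import torch

def gcn_propagate(adj, features, weight):
    """One step of D^{-1/2} (A + I) D^{-1/2} X W with ReLU."""
    a = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a.sum(dim=1).pow(-0.5)
    a_norm = d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
    return torch.relu(a_norm @ features @ weight)

adj = (torch.randn(6, 6).abs() > 1.0).float()
adj = ((adj + adj.t()) > 0).float()            # symmetric similarity graph
noisy_labels = torch.eye(3).repeat(2, 1)       # (6, 3) noisy one-hot labels
scores = gcn_propagate(adj, noisy_labels, torch.randn(3, 3))
confidence = scores.softmax(dim=1).max(dim=1).values  # low => candidate to drop
```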
arXiv Detail & Related papers (2020-02-28T10:34:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.