A Universal Representation Transformer Layer for Few-Shot Image
Classification
- URL: http://arxiv.org/abs/2006.11702v4
- Date: Wed, 2 Sep 2020 22:35:10 GMT
- Title: A Universal Representation Transformer Layer for Few-Shot Image
Classification
- Authors: Lu Liu, William Hamilton, Guodong Long, Jing Jiang, Hugo Larochelle
- Abstract summary: Few-shot classification aims to recognize unseen classes when presented with only a small number of samples.
We consider the problem of multi-domain few-shot image classification, where unseen classes and examples come from diverse data sources.
Here, we propose a Universal Representation Transformer (URT) layer that meta-learns to leverage universal features for few-shot classification.
- Score: 43.31379752656756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot classification aims to recognize unseen classes when presented with
only a small number of samples. We consider the problem of multi-domain
few-shot image classification, where unseen classes and examples come from
diverse data sources. This problem has seen growing interest and has inspired
the development of benchmarks such as Meta-Dataset. A key challenge in this
multi-domain setting is to effectively integrate the feature representations
from the diverse set of training domains. Here, we propose a Universal
Representation Transformer (URT) layer that meta-learns to leverage universal
features for few-shot classification by dynamically re-weighting and composing
the most appropriate domain-specific representations. In experiments, we show
that URT sets a new state-of-the-art result on Meta-Dataset. Specifically, it
achieves top performance on the highest number of data sources compared to
competing methods. We analyze variants of URT and present a visualization of
the attention score heatmaps that sheds light on how the model performs
cross-domain generalization. Our code is available at
https://github.com/liulu112601/URT.
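The abstract's core mechanism, dynamically re-weighting domain-specific representations with attention, can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation (see the linked repository for that); the function name `urt_combine`, the single fixed query vector, and the scaling are all assumptions made for the sketch:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def urt_combine(domain_features, query, temperature=1.0):
    """Attention-weighted combination of domain-specific features.

    domain_features: (n_domains, d) array, one feature vector per
        pre-trained domain-specific backbone for a single image.
    query: (d,) vector; in URT this would be produced by a meta-learned
        module, here it is just a placeholder input.
    Returns the re-weighted "universal" representation of shape (d,).
    """
    d = len(query)
    # Scaled dot-product scores between the query and each domain feature.
    scores = domain_features @ query / (temperature * np.sqrt(d))
    weights = softmax(scores)            # (n_domains,) attention weights
    return weights @ domain_features     # convex combination of features
```

The attention weights here are the quantities visualized in the paper's heatmaps: they indicate which training domain's backbone the layer considers most relevant for a given task.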
Related papers
- Diagnose Like a Pathologist: Transformer-Enabled Hierarchical
Attention-Guided Multiple Instance Learning for Whole Slide Image
Classification [39.41442041007595]
Multiple Instance Learning and transformers are increasingly popular in histopathology Whole Slide Image (WSI) classification.
We propose a Hierarchical Attention-Guided Multiple Instance Learning framework to fully exploit the WSIs.
Within this framework, an Integrated Attention Transformer is proposed to further enhance the performance of the transformer.
arXiv Detail & Related papers (2023-01-19T15:38:43Z)
- Multi-Domain Long-Tailed Learning by Augmenting Disentangled
Representations [80.76164484820818]
There is an inescapable long-tailed class-imbalance issue in many real-world classification problems.
We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains.
Built upon a proposed selective balanced sampling strategy, TALLY achieves this by mixing the semantic representation of one example with the domain-associated nuisances of another.
arXiv Detail & Related papers (2022-10-25T21:54:26Z)
- Diverse Instance Discovery: Vision-Transformer for Instance-Aware
Multi-Label Image Recognition [24.406654146411682]
This work builds on the Vision Transformer (ViT).
Our goal is to leverage ViT's patch tokens and self-attention mechanism to mine rich instances in multi-label images.
We propose a weakly supervised object localization-based approach to extract multi-scale local features.
arXiv Detail & Related papers (2022-04-22T14:38:40Z)
- Multi-Representation Adaptation Network for Cross-domain Image
Classification [20.615155915233693]
In image classification, it is often expensive and time-consuming to acquire sufficient labels.
Existing approaches mainly align the distributions of representations extracted by a single structure.
We propose Multi-Representation Adaptation which can dramatically improve the classification accuracy for cross-domain image classification.
arXiv Detail & Related papers (2022-01-04T06:34:48Z)
- Semi-Supervised Domain Adaptation with Prototypical Alignment and
Consistency Learning [86.6929930921905]
This paper studies how much labeling a few target samples can further help address domain shifts.
To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module which calculates a target prototype for each class from the landmarks.
Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
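The prototype computation the PA module's summary describes, a per-class mean over the labeled landmark features, can be sketched minimally as below. The function name and array shapes are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def class_prototypes(features, labels, n_classes):
    """Compute one prototype (mean feature vector) per class.

    features: (n, d) array of landmark feature vectors.
    labels:   (n,) integer class labels in [0, n_classes).
    Returns an (n_classes, d) array of class prototypes.
    """
    protos = np.zeros((n_classes, features.shape[1]))
    for c in range(n_classes):
        protos[c] = features[labels == c].mean(axis=0)
    return protos
```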
arXiv Detail & Related papers (2021-04-19T08:46:08Z)
- Universal Representation Learning from Multiple Domains for Few-shot
Classification [41.821234589075445]
We propose to learn a single set of universal deep representations by distilling knowledge of multiple separately trained networks.
We show that the universal representations can be further refined for previously unseen domains by an efficient adaptation step.
arXiv Detail & Related papers (2021-03-25T13:49:12Z)
- Selecting Relevant Features from a Multi-domain Representation for
Few-shot Classification [91.67977602992657]
We propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches.
We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training.
arXiv Detail & Related papers (2020-03-20T15:44:17Z)
- Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN [117.80737222754306]
We present a novel universal object detector called Universal-RCNN.
We first generate a global semantic pool by integrating the high-level semantic representations of all categories.
An Intra-Domain Reasoning Module learns and propagates the sparse graph representation within one dataset guided by a spatial-aware GCN.
arXiv Detail & Related papers (2020-02-18T07:57:45Z)
- Cross-Domain Few-Shot Classification via Learned Feature-Wise
Transformation [109.89213619785676]
Few-shot classification aims to recognize novel categories with only a few labeled images in each class.
Existing metric-based few-shot classification algorithms predict categories by comparing the feature embeddings of query images with those from a few labeled images.
While promising performance has been demonstrated, these methods often fail to generalize to unseen domains.
arXiv Detail & Related papers (2020-01-23T18:55:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.