Subspace Representation Learning for Few-shot Image Classification
- URL: http://arxiv.org/abs/2105.00379v2
- Date: Wed, 5 May 2021 01:57:40 GMT
- Title: Subspace Representation Learning for Few-shot Image Classification
- Authors: Ting-Yao Hu, Zhi-Qi Cheng, Alexander G. Hauptmann
- Abstract summary: We propose a subspace representation learning framework to tackle few-shot image classification tasks.
It exploits a subspace in local CNN feature space to represent an image, and measures the similarity between two images according to a weighted subspace distance (WSD).
- Score: 105.7788602565317
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we propose a subspace representation learning (SRL) framework
to tackle few-shot image classification tasks. It exploits a subspace in local
CNN feature space to represent an image, and measures the similarity between
two images according to a weighted subspace distance (WSD). When K images are
available for each class, we develop two types of template subspaces to
aggregate K-shot information: the prototypical subspace (PS) and the
discriminative subspace (DS). Based on the SRL framework, we extend metric
learning based techniques from vector to subspace representation. While most
previous works adopted global vector representation, using subspace
representation can effectively preserve the spatial structure and diversity
within an image. We demonstrate the effectiveness of the SRL framework on three
public benchmark datasets: MiniImageNet, TieredImageNet and Caltech-UCSD
Birds-200-2011 (CUB), and the experimental results illustrate
competitive/superior performance of our method compared to the previous
state-of-the-art.
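The core idea can be sketched in a few lines: treat the local CNN features of an image (one vector per spatial position) as columns of a matrix, take the dominant left singular vectors as the image's subspace, and compare two images via the principal angles between their subspaces. Note this is a minimal, hypothetical illustration: the abstract does not specify the WSD weighting, so the sketch below uses the standard unweighted projection metric instead, and all function names are our own, not the authors' code.

```python
import numpy as np

def local_features_to_subspace(features, k=5):
    """Represent an image by a k-dimensional subspace of its local features.

    features: (d, n) array of n local CNN feature vectors of dimension d
    (e.g. the n = H*W spatial positions of a conv feature map).
    Returns a (d, k) orthonormal basis spanning the dominant subspace.
    """
    # Truncated SVD: the top-k left singular vectors span the subspace
    # that best captures the variation across spatial locations.
    u, _, _ = np.linalg.svd(features, full_matrices=False)
    return u[:, :k]

def subspace_distance(basis_a, basis_b):
    """Projection-metric distance between two subspaces.

    basis_a, basis_b: (d, k) orthonormal bases. The singular values of
    basis_a.T @ basis_b are cos(theta_i) for the principal angles theta_i;
    the distance is sqrt(sum_i sin^2(theta_i)). The paper's WSD would
    weight these angle terms; here they are unweighted.
    """
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    cosines = np.clip(cosines, -1.0, 1.0)
    return float(np.sqrt(np.sum(1.0 - cosines ** 2)))

# An image compared with itself yields distance 0; fully orthogonal
# k-dimensional subspaces yield the maximum distance sqrt(k).
rng = np.random.default_rng(0)
feats = rng.standard_normal((64, 49))   # 64-dim features on a 7x7 grid
a = local_features_to_subspace(feats, k=5)
print(subspace_distance(a, a))          # ~0.0
```

A K-shot template subspace (the paper's PS/DS variants) could then be built from the pooled local features of all K support images, but the aggregation details are specific to the paper.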
Related papers
- Finetuning CLIP to Reason about Pairwise Differences [52.028073305958074]
We propose an approach to train vision-language models such as CLIP in a contrastive manner to reason about differences in embedding space.
We first demonstrate that our approach yields significantly improved capabilities in ranking images by a certain attribute.
We also show that the resulting embeddings exhibit stronger geometric structure in embedding space.
arXiv Detail & Related papers (2024-09-15T13:02:14Z)
- Selective Vision-Language Subspace Projection for Few-shot CLIP [55.361337202198925]
We introduce a method called Selective Vision-Language Subspace Projection (SSP)
SSP incorporates local image features and utilizes them as a bridge to enhance the alignment between image-text pairs.
Our approach entails only training-free matrix calculations and can be seamlessly integrated into advanced CLIP-based few-shot learning frameworks.
arXiv Detail & Related papers (2024-07-24T03:45:35Z)
- Spatial Latent Representations in Generative Adversarial Networks for Image Generation [0.0]
We define a family of spatial latent spaces for StyleGAN2.
We show that our spaces are effective for image manipulation and encode semantic information well.
arXiv Detail & Related papers (2023-03-25T20:01:11Z)
- Mining Contextual Information Beyond Image for Semantic Segmentation [37.783233906684444]
The paper studies the context aggregation problem in semantic image segmentation.
It proposes to mine the contextual information beyond individual images to further augment the pixel representations.
The proposed method could be effortlessly incorporated into existing segmentation frameworks.
arXiv Detail & Related papers (2021-08-26T14:34:23Z)
- Low-Rank Subspaces in GANs [101.48350547067628]
This work introduces low-rank subspaces that enable more precise control of GAN generation.
LowRankGAN can find a low-dimensional representation of the attribute manifold.
Experiments on state-of-the-art GAN models (including StyleGAN2 and BigGAN) trained on various datasets demonstrate the effectiveness of our LowRankGAN.
arXiv Detail & Related papers (2021-06-08T16:16:32Z)
- Isometric Propagation Network for Generalized Zero-shot Learning [72.02404519815663]
A popular strategy is to learn a mapping between the semantic space of class attributes and the visual space of images based on the seen classes and their data.
We propose Isometric Propagation Network (IPN), which learns to strengthen the relation between classes within each space and to align the class dependency across the two spaces.
IPN achieves state-of-the-art performance on three popular zero-shot learning benchmarks.
arXiv Detail & Related papers (2021-02-03T12:45:38Z)
- DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning [122.51237307910878]
We develop methods for few-shot image classification from a new perspective of optimal matching between image regions.
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations.
To generate importance weights for the elements in the formulation, we design a cross-reference mechanism.
arXiv Detail & Related papers (2020-03-15T08:13:16Z)
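The EMD named in this entry measures the minimum cost of transforming one distribution of mass into another. DeepEMD applies it between dense image-region representations with learned weights; the minimal sketch below shows only the underlying metric in its simplest 1-D form, where EMD reduces to accumulating the mass that must be carried between adjacent bins. The function `emd_1d` is our own hypothetical helper, not the paper's implementation.

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms of equal total mass.

    With unit ground distance between adjacent bins, 1-D EMD equals the
    sum of absolute differences of the cumulative distributions.
    """
    assert abs(sum(p) - sum(q)) < 1e-9, "histograms must have equal mass"
    carry, cost = 0.0, 0.0
    for pi, qi in zip(p, q):
        carry += pi - qi      # mass still waiting to be moved rightward
        cost += abs(carry)    # moving it one bin costs |carry|
    return cost

# One unit of mass must travel two bins, so the cost is 2.
print(emd_1d([1, 0, 0], [0, 0, 1]))   # 2.0
```

For 2-D image regions, as in DeepEMD, the ground distances no longer have this cumulative shortcut and the EMD is solved as a linear program, which the paper makes differentiable for end-to-end training.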
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.