Few-Shot Learning by Integrating Spatial and Frequency Representation
- URL: http://arxiv.org/abs/2105.05348v1
- Date: Tue, 11 May 2021 21:44:31 GMT
- Title: Few-Shot Learning by Integrating Spatial and Frequency Representation
- Authors: Xiangyu Chen and Guanghui Wang
- Abstract summary: We propose to integrate the frequency information into the learning model to boost the discrimination ability of the system.
We employ the Discrete Cosine Transform (DCT) to generate the frequency representation, then integrate the features from both the spatial domain and the frequency domain for classification.
- Score: 25.11147383752403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human beings can recognize new objects with only a few labeled examples;
however, few-shot learning remains a challenging problem for machine learning
systems. Most previous algorithms in few-shot learning only utilize spatial
information of the images. In this paper, we propose to integrate the frequency
information into the learning model to boost the discrimination ability of the
system. We employ the Discrete Cosine Transform (DCT) to generate the
frequency representation, then integrate the features from both the spatial
domain and the frequency domain for classification. The proposed strategy and its
effectiveness are validated with different backbones, datasets, and algorithms.
Extensive experiments demonstrate that the frequency information is
complementary to the spatial representations in few-shot classification. The
classification accuracy is boosted significantly by integrating features from
both the spatial and frequency domains in different few-shot learning tasks.
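The two steps named in the abstract (a DCT-based frequency representation, then integration with spatial features) can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 8x8 block size, the helper names (`dct_matrix`, `blockwise_dct`, `spatial_feature`, `frequency_feature`, `integrated_feature`), the hand-crafted features standing in for learned backbone embeddings, and the use of plain concatenation as the integration step.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def blockwise_dct(image, block=8):
    """2D DCT of each non-overlapping block of a grayscale image.

    Returns an array of shape (H/block, W/block, block, block).
    """
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image sides must be multiples of the block size")
    d = dct_matrix(block)
    # Split the image into a grid of (block x block) tiles.
    tiles = image.reshape(h // block, block, w // block, block)
    tiles = tiles.transpose(0, 2, 1, 3)
    # Separable 2D transform: rows, then columns, broadcast over all tiles.
    return d @ tiles @ d.T


def spatial_feature(image, block=8):
    """Hypothetical stand-in for a spatial embedding: block-averaged pixels."""
    h, w = image.shape
    tiles = image.reshape(h // block, block, w // block, block)
    return tiles.mean(axis=(1, 3)).ravel()


def frequency_feature(image, block=8):
    """Hypothetical stand-in for a frequency embedding:
    mean absolute DCT coefficient per frequency, averaged over blocks."""
    return np.abs(blockwise_dct(image, block)).mean(axis=(0, 1)).ravel()


def integrated_feature(image, block=8):
    """Integrate both domains by concatenation (an assumed fusion rule)."""
    return np.concatenate(
        [spatial_feature(image, block), frequency_feature(image, block)]
    )


# Usage: a random 32x32 "image" yields 16 spatial + 64 frequency values.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
print(integrated_feature(image).shape)  # (80,)
```

In a few-shot setting, a vector of this kind would be computed for every support and query image and passed to whichever classifier the chosen algorithm uses; a learned backbone per domain would replace the two hand-crafted feature functions.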
Related papers
- Frequency-Spatial Entanglement Learning for Camouflaged Object Detection [34.426297468968485]
Existing methods attempt to reduce the impact of pixel similarity by maximizing the distinguishing ability of spatial features with complicated design.
We propose a new approach to address this issue by jointly exploring the representation in the frequency and spatial domains, introducing the Frequency-Spatial Entanglement Learning (FSEL) method.
Our experiments demonstrate the superiority of our FSEL over 21 state-of-the-art methods, through comprehensive quantitative and qualitative comparisons in three widely-used datasets.
arXiv Detail & Related papers (2024-09-03T07:58:47Z)
- High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning [54.86882315023791]
We propose an innovative approach called High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL).
HDAFL utilizes multiple convolutional kernels to automatically learn discriminative regions highly correlated with attributes in images.
We also introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.
arXiv Detail & Related papers (2024-04-07T13:17:47Z)
- Sketched Multi-view Subspace Learning for Hyperspectral Anomalous Change Detection [12.719327447589345]
A sketched multi-view subspace learning model is proposed for anomalous change detection.
The proposed model preserves major information from the image pairs and improves computational complexity.
Experiments are conducted on a benchmark hyperspectral remote sensing dataset and a natural hyperspectral dataset.
arXiv Detail & Related papers (2022-10-09T14:08:17Z)
- SATS: Self-Attention Transfer for Continual Semantic Segmentation [50.51525791240729]
Continual semantic segmentation suffers from the same catastrophic forgetting issue as continual classification learning.
This study proposes to transfer a new type of information relevant to knowledge, i.e. the relationships between elements within each image.
The relationship information can be effectively obtained from the self-attention maps in a Transformer-style segmentation model.
arXiv Detail & Related papers (2022-03-15T06:09:28Z)
- Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
- Residual Attention: A Simple but Effective Method for Multi-Label Recognition [29.18904701720024]
We propose an embarrassingly simple module, named class-specific residual attention (CSRA).
CSRA generates class-specific features for every category by proposing a simple spatial attention score, and then combines it with the class-agnostic average pooling feature.
With only 4 lines of code, CSRA also leads to consistent improvement across many diverse pretrained models and datasets without any extra training.
arXiv Detail & Related papers (2021-08-05T08:45:57Z)
- Generalized Zero-Shot Learning using Multimodal Variational Auto-Encoder with Semantic Concepts [0.9054540533394924]
Recent techniques try to learn a cross-modal mapping between the semantic space and the image space.
We propose a Multimodal Variational Auto-Encoder (M-VAE) which can learn the shared latent space of image features and the semantic space.
Our results show that our proposed model outperforms the current state-of-the-art approaches for generalized zero-shot learning.
arXiv Detail & Related papers (2021-06-26T20:08:37Z)
- Anomalous Sound Detection Using a Binary Classification Model and Class Centroids [47.856367556856554]
We propose a binary classification model that is developed by using not only normal data but also outlier data in the other domains as pseudo-anomalous sound data.
We also investigate the effectiveness of additionally using anomalous sound data for further improving the binary classification model.
arXiv Detail & Related papers (2021-06-11T03:35:06Z)
- Spatial-spectral Hyperspectral Image Classification via Multiple Random Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- Frequency learning for image classification [1.9336815376402716]
This paper presents a new approach for exploring the Fourier transform of the input images, which is composed of trainable frequency filters.
We propose a slicing procedure to allow the network to learn both global and local features from the frequency-domain representations of the image blocks.
arXiv Detail & Related papers (2020-06-28T00:32:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.