Enhancing Few-Shot Image Classification through Learnable Multi-Scale Embedding and Attention Mechanisms
- URL: http://arxiv.org/abs/2409.07989v1
- Date: Thu, 12 Sep 2024 12:34:29 GMT
- Title: Enhancing Few-Shot Image Classification through Learnable Multi-Scale Embedding and Attention Mechanisms
- Authors: Fatemeh Askari, Amirreza Fateh, Mohammad Reza Mohammadi,
- Abstract summary: In the context of few-shot classification, the goal is to train a classifier using a limited number of samples.
Traditional metric-based methods exhibit certain limitations in achieving this objective.
Our approach involves utilizing multi-output embedding network that maps samples into distinct feature spaces.
- Score: 1.1557852082644071
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the context of few-shot classification, the goal is to train a classifier using a limited number of samples while maintaining satisfactory performance. However, traditional metric-based methods exhibit certain limitations in achieving this objective. These methods typically rely on a single distance value between the query feature and support feature, thereby overlooking the contribution of shallow features. To overcome this challenge, we propose a novel approach in this paper. Our approach involves utilizing multi-output embedding network that maps samples into distinct feature spaces. The proposed method extract feature vectors at different stages, enabling the model to capture both global and abstract features. By utilizing these diverse feature spaces, our model enhances its performance. Moreover, employing a self-attention mechanism improves the refinement of features at each stage, leading to even more robust representations and improved overall performance. Furthermore, assigning learnable weights to each stage significantly improved performance and results. We conducted comprehensive evaluations on the MiniImageNet and FC100 datasets, specifically in the 5-way 1-shot and 5-way 5-shot scenarios. Additionally, we performed a cross-domain task from MiniImageNet to the CUB dataset, achieving high accuracy in the testing domain. These evaluations demonstrate the efficacy of our proposed method in comparison to state-of-the-art approaches. https://github.com/FatemehAskari/MSENet
Related papers
- Sample Less, Learn More: Efficient Action Recognition via Frame Feature
Restoration [59.6021678234829]
We propose a novel method to restore the intermediate features for two sparsely sampled and adjacent video frames.
With the integration of our method, the efficiency of three commonly used baselines has been improved by over 50%, with a mere 0.5% reduction in recognition accuracy.
arXiv Detail & Related papers (2023-07-27T13:52:42Z) - CAD: Co-Adapting Discriminative Features for Improved Few-Shot
Classification [11.894289991529496]
Few-shot classification is a challenging problem that aims to learn a model that can adapt to unseen classes given a few labeled samples.
Recent approaches pre-train a feature extractor, and then fine-tune for episodic meta-learning.
We propose a strategy to cross-attend and re-weight discriminative features for few-shot classification.
arXiv Detail & Related papers (2022-03-25T06:14:51Z) - Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain,
Active and Continual Few-Shot Learning [41.07029317930986]
We propose a variance-sensitive class of models that operates in a low-label regime.
The first method, Simple CNAPS, employs a hierarchically regularized Mahalanobis-distance based classifier.
We further extend this approach to a transductive learning setting, proposing Transductive CNAPS.
arXiv Detail & Related papers (2022-01-13T18:59:02Z) - SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive
Background Prototypes [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples.
Most of advanced solutions exploit a metric learning framework that performs segmentation through matching each pixel to a learned foreground prototype.
This framework suffers from biased classification due to incomplete construction of sample pairs with the foreground prototype only.
arXiv Detail & Related papers (2021-04-19T11:21:47Z) - Distribution Alignment: A Unified Framework for Long-tail Visual
Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weight method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z) - Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z) - A Few-Shot Sequential Approach for Object Counting [63.82757025821265]
We introduce a class attention mechanism that sequentially attends to objects in the image and extracts their relevant features.
The proposed technique is trained on point-level annotations and uses a novel loss function that disentangles class-dependent and class-agnostic aspects of the model.
We present our results on a variety of object-counting/detection datasets, including FSOD and MS COCO.
arXiv Detail & Related papers (2020-07-03T18:23:39Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z) - SimPropNet: Improved Similarity Propagation for Few-shot Image
Segmentation [14.419517737536706]
Recent deep neural network based FSS methods leverage high-dimensional feature similarity between the foreground features of the support images and the query image features.
We propose to jointly predict the support and query masks to force the support features to share characteristics with the query features.
Our method achieves state-of-the-art results for one-shot and five-shot segmentation on the PASCAL-5i dataset.
arXiv Detail & Related papers (2020-04-30T17:56:48Z) - Objectness-Aware Few-Shot Semantic Segmentation [31.13009111054977]
We show how to increase overall model capacity to achieve improved performance.
We introduce objectness, which is class-agnostic and so not prone to overfitting.
Given only one annotated example of an unseen category, experiments show that our method outperforms state-of-art methods with respect to mIoU.
arXiv Detail & Related papers (2020-04-06T19:12:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.