NDPNet: A novel non-linear data projection network for few-shot
fine-grained image classification
- URL: http://arxiv.org/abs/2106.06988v2
- Date: Tue, 15 Jun 2021 04:22:17 GMT
- Title: NDPNet: A novel non-linear data projection network for few-shot
fine-grained image classification
- Authors: Weichuan Zhang, Xuefang Liu, Zhe Xue, Yongsheng Gao, Changming Sun
- Abstract summary: We introduce the non-linear data projection concept into the design of metric-based fine-grained image classification architecture.
Our proposed architecture can be easily embedded into any episodic training mechanisms for end-to-end training from scratch.
- Score: 33.71025164816078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Metric-based few-shot fine-grained image classification (FSFGIC) aims to
learn a transferable feature embedding network by estimating the similarities
between query images and support classes from very few examples. In this work,
we propose, for the first time, to introduce the non-linear data projection
concept into the design of FSFGIC architecture in order to address the limited
sample problem in few-shot learning and at the same time to increase the
discriminability of the model for fine-grained image classification.
Specifically, we first design a feature re-abstraction embedding network that
has the ability to not only obtain the required semantic features for effective
metric learning but also re-enhance such features with finer details from input
images. Then the descriptors of the query images and the support classes are
projected into different non-linear spaces in our proposed similarity metric
learning network to learn discriminative projection factors. This design can
effectively operate in the challenging and restricted condition of a FSFGIC
task for making the distance between the samples within the same class smaller
and the distance between samples from different classes larger and for reducing
the coupling relationship between samples from different categories.
Furthermore, a novel similarity measure based on the proposed non-linear data
project is presented for evaluating the relationships of feature information
between a query image and a support set. It is worth to note that our proposed
architecture can be easily embedded into any episodic training mechanisms for
end-to-end training from scratch. Extensive experiments on FSFGIC tasks
demonstrate the superiority of the proposed methods over the state-of-the-art
benchmarks.
Related papers
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z) - Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based
Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR)
It aims to use sketches from unseen categories as queries to match the images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment for zero-shot sketch-based image retrieval (SBKA)
arXiv Detail & Related papers (2023-12-16T04:50:34Z) - Semantic Representation and Dependency Learning for Multi-Label Image
Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category.
Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide model.
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z) - BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z) - Query Adaptive Few-Shot Object Detection with Heterogeneous Graph
Convolutional Networks [33.446875089255876]
Few-shot object detection (FSOD) aims to detect never-seen objects using few examples.
We propose a novel FSOD model using heterogeneous graph convolutional networks.
arXiv Detail & Related papers (2021-12-17T22:08:15Z) - A Compositional Feature Embedding and Similarity Metric for
Ultra-Fine-Grained Visual Categorization [16.843126268445726]
Fine-grained visual categorization (FGVC) aims at classifying objects with small inter-class variances.
This paper proposes a novel compositional feature embedding and similarity metric ( CECS) for ultra-fine-grained visual categorization.
Experimental results on two ultra-FGVC datasets and one FGVC dataset with recent benchmark methods consistently demonstrate that the proposed CECS method achieves the state-the-art performance.
arXiv Detail & Related papers (2021-09-25T15:05:25Z) - Exploiting the relationship between visual and textual features in
social networks for image classification with zero-shot deep learning [0.0]
In this work, we propose a classifier ensemble based on the transferable learning capabilities of the CLIP neural network architecture.
Our experiments, based on image classification tasks according to the labels of the Places dataset, are performed by first considering only the visual part.
Considering the associated texts to the images can help to improve the accuracy depending on the goal.
arXiv Detail & Related papers (2021-07-08T10:54:59Z) - Adaptive Prototypical Networks with Label Words and Joint Representation
Learning for Few-Shot Relation Classification [17.237331828747006]
This work focuses on few-shot relation classification (FSRC)
We propose an adaptive mixture mechanism to add label words to the representation of the class prototype.
Experiments have been conducted on FewRel under different few-shot (FS) settings.
arXiv Detail & Related papers (2021-01-10T11:25:42Z) - Region Comparison Network for Interpretable Few-shot Image
Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.