Boosting Few-shot Fine-grained Recognition with Background Suppression
and Foreground Alignment
- URL: http://arxiv.org/abs/2210.01439v1
- Date: Tue, 4 Oct 2022 07:54:40 GMT
- Title: Boosting Few-shot Fine-grained Recognition with Background Suppression
and Foreground Alignment
- Authors: Zican Zha, Hao Tang, Yunlian Sun, and Jinhui Tang
- Abstract summary: Few-shot fine-grained recognition (FS-FGR) aims to recognize novel fine-grained categories with the help of limited available samples.
We propose a two-stage background suppression and foreground alignment framework, which is composed of a background activation suppression (BAS) module, a foreground object alignment (FOA) module, and a local to local (L2L) similarity metric.
Experiments conducted on multiple popular fine-grained benchmarks demonstrate that our method outperforms the existing state-of-the-art by a large margin.
- Score: 53.401889855278704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot fine-grained recognition (FS-FGR) aims to recognize novel
fine-grained categories with the help of limited available samples.
Undoubtedly, this task inherits the main challenges from both few-shot learning
and fine-grained recognition. First, the lack of labeled samples makes the
learned model easy to overfit. Second, it also suffers from high intra-class
variance and low inter-class difference in the datasets. To address this
challenging task, we propose a two-stage background suppression and foreground
alignment framework, which is composed of a background activation suppression
(BAS) module, a foreground object alignment (FOA) module, and a local to local
(L2L) similarity metric. Specifically, the BAS is introduced to generate a
foreground mask for localization to weaken background disturbance and enhance
dominative foreground objects. What's more, considering the lack of labeled
samples, we compute the pairwise similarity of feature maps using both the raw
image and the refined image. The FOA then reconstructs the feature map of each
support sample according to its correction to the query ones, which addresses
the problem of misalignment between support-query image pairs. To enable the
proposed method to have the ability to capture subtle differences in confused
samples, we present a novel L2L similarity metric to further measure the local
similarity between a pair of aligned spatial features in the embedding space.
Extensive experiments conducted on multiple popular fine-grained benchmarks
demonstrate that our method outperforms the existing state-of-the-art by a
large margin.
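The alignment and local matching steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each feature map is flattened into a list of local descriptors (one vector per spatial position), reads the FOA step as a softmax-weighted reconstruction of the support descriptors driven by their cosine similarity to each query position, and reads the L2L metric as the mean position-wise cosine similarity after alignment. The function names and the `temperature` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def align_support_to_query(support, query, temperature=0.1):
    """Rebuild the support map so position i holds the support content most
    similar to query position i (an assumed, attention-style reading of FOA)."""
    channels = len(support[0])
    aligned = []
    for q in query:
        # Similarity of this query position to every support position.
        sims = [cosine(q, s) / temperature for s in support]
        # Numerically stable softmax over support positions.
        peak = max(sims)
        weights = [math.exp(x - peak) for x in sims]
        total = sum(weights)
        # Weighted average of support descriptors, channel by channel.
        aligned.append([
            sum(w * s[c] for w, s in zip(weights, support)) / total
            for c in range(channels)
        ])
    return aligned


def l2l_similarity(support, query):
    """Mean cosine similarity between descriptors at the same position
    (an assumed form of a local-to-local score)."""
    scores = [cosine(s, q) for s, q in zip(support, query)]
    return sum(scores) / len(scores)


# Toy example: the same two "parts" appear in swapped positions.
query = [[1.0, 0.0], [0.0, 1.0]]
support = [[0.0, 1.0], [1.0, 0.0]]

unaligned_score = l2l_similarity(support, query)
aligned_score = l2l_similarity(align_support_to_query(support, query), query)
```

In the toy example the position-wise score is zero before alignment because the matching parts sit at different positions, and it rises close to one after alignment. This is the misalignment problem between support-query pairs that the abstract says the FOA module addresses.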
Related papers
- Skeleton-Guided Instance Separation for Fine-Grained Segmentation in
Microscopy [23.848474219551818]
One of the fundamental challenges in microscopy (MS) image analysis is instance segmentation (IS).
We propose a novel one-stage framework named A2B-IS to address this challenge and enhance the accuracy of IS in MS images.
Our method has been thoroughly validated on two large-scale MS datasets.
arXiv Detail & Related papers (2024-01-18T11:14:32Z)
- Dense Affinity Matching for Few-Shot Segmentation [83.65203917246745]
Few-Shot Segmentation (FSS) aims to segment images of novel classes using only a few samples.
We propose a dense affinity matching framework to exploit the support-query interaction.
We show that our framework performs very competitively under different settings with only 0.68M parameters.
arXiv Detail & Related papers (2023-07-17T12:27:15Z)
- Multi-Granularity Denoising and Bidirectional Alignment for Weakly
Supervised Semantic Segmentation [75.32213865436442]
We propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy label and multi-class generalization issues.
The MDBA model can reach the mIoU of 69.5% and 70.2% on validation and test sets for the PASCAL VOC 2012 dataset.
arXiv Detail & Related papers (2023-05-09T03:33:43Z)
- Embedding contrastive unsupervised features to cluster in- and
out-of-distribution noise in corrupted image datasets [18.19216557948184]
Using search engines for web image retrieval is a tempting alternative to manual curation when creating an image dataset.
Their main drawback remains the proportion of incorrect (noisy) samples retrieved.
We propose a two stage algorithm starting with a detection step where we use unsupervised contrastive feature learning.
We find that the alignment and uniformity principles of contrastive learning allow OOD samples to be linearly separated from ID samples on the unit hypersphere.
arXiv Detail & Related papers (2022-07-04T16:51:56Z)
- BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z)
- Inter-class Discrepancy Alignment for Face Recognition [55.578063356210144]
We propose a unified framework called Inter-class Discrepancy Alignment (IDA).
IDA-DAO is used to align the similarity scores considering the discrepancy between the images and its neighbors.
IDA-SSE can provide convincing inter-class neighbors by introducing virtual candidate images generated with GAN.
arXiv Detail & Related papers (2021-03-02T08:20:08Z)
- ANL: Anti-Noise Learning for Cross-Domain Person Re-Identification [25.035093667770052]
We propose an Anti-Noise Learning (ANL) approach, which contains two modules.
FDA module is designed to gather the id-related samples and disperse id-unrelated samples, through the camera-wise contrastive learning and adversarial adaptation.
The Reliable Sample Selection (RSS) module utilizes an Auxiliary Model to correct noisy labels and select reliable samples for the Main Model.
arXiv Detail & Related papers (2020-12-27T02:38:45Z)
- Deep Semantic Matching with Foreground Detection and Cycle-Consistency [103.22976097225457]
We address weakly supervised semantic matching based on a deep network.
We explicitly estimate the foreground regions to suppress the effect of background clutter.
We develop cycle-consistent losses to enforce the predicted transformations across multiple images to be geometrically plausible and consistent.
arXiv Detail & Related papers (2020-03-31T22:38:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.