Spatial-Scale Aligned Network for Fine-Grained Recognition
- URL: http://arxiv.org/abs/2001.01211v1
- Date: Sun, 5 Jan 2020 11:12:08 GMT
- Title: Spatial-Scale Aligned Network for Fine-Grained Recognition
- Authors: Lizhao Gao, Haihua Xu, Chong Sun, Junling Liu, Yu-Wing Tai
- Abstract summary: Existing approaches for fine-grained visual recognition focus on learning marginal region-based representations.
We propose the spatial-scale aligned network (SSANET) and implicitly address misalignments during the recognition process.
- Score: 42.71878867504503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing approaches for fine-grained visual recognition focus on learning
marginal region-based representations while neglecting the spatial and scale
misalignments, leading to inferior performance. In this paper, we propose the
spatial-scale aligned network (SSANET) and implicitly address misalignments
during the recognition process. Especially, SSANET consists of 1) a
self-supervised proposal mining formula with Morphological Alignment
Constraints; 2) a discriminative scale mining (DSM) module, which exploits the
feature pyramid via a circulant matrix, and provides the Fourier solver for
fast scale alignments; 3) an oriented pooling (OP) module, that performs the
pooling operation in several pre-defined orientations. Each orientation defines
one kind of spatial alignment, and the network automatically determines which
is the optimal alignments through learning. With the proposed two modules, our
algorithm can automatically determine the accurate local proposal regions and
generate more robust target representations being invariant to various
appearance variances. Extensive experiments verify that SSANET is competent at
learning better spatial-scale invariant target representations, yielding
superior performance on the fine-grained recognition task on several
benchmarks.
Related papers
- Double-Shot 3D Shape Measurement with a Dual-Branch Network [14.749887303860717]
We propose a dual-branch Convolutional Neural Network (CNN)-Transformer network (PDCNet) to process different structured light (SL) modalities.
Within PDCNet, a Transformer branch is used to capture global perception in the fringe images, while a CNN branch is designed to collect local details in the speckle images.
We show that our method can reduce fringe order ambiguity while producing high-accuracy results on a self-made dataset.
arXiv Detail & Related papers (2024-07-19T10:49:26Z) - Task-Oriented Sensing, Computation, and Communication Integration for
Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-intelligent edge artificial-latency (AI) system, which jointly exploits the AI model split inference and integrated sensing and communication (ISAC)
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z) - Distribution Regularized Self-Supervised Learning for Domain Adaptation
of Semantic Segmentation [3.284878354988896]
This paper proposes a pixel-level distribution regularization scheme (DRSL) for self-supervised domain adaptation of semantic segmentation.
In a typical setting, the classification loss forces the semantic segmentation model to greedily learn the representations that capture inter-class variations.
We capture pixel-level intra-class variations through class-aware multi-modal distribution learning.
arXiv Detail & Related papers (2022-06-20T09:52:49Z) - Learning towards Synchronous Network Memorizability and Generalizability
for Continual Segmentation across Multiple Sites [52.84959869494459]
In clinical practice, a segmentation network is often required to continually learn on a sequential data stream from multiple sites.
Existing methods are usually restricted in either network memorizability on previous sites or generalizability on unseen sites.
This paper aims to tackle the problem of Synchronous Memorizability and Generalizability with a novel proposed SMG-learning framework.
arXiv Detail & Related papers (2022-06-14T13:04:36Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones.
We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains.
Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z) - Inter-class Discrepancy Alignment for Face Recognition [55.578063356210144]
We propose a unified framework calledInter-class DiscrepancyAlignment(IDA)
IDA-DAO is used to align the similarity scores considering the discrepancy between the images and its neighbors.
IDA-SSE can provide convincing inter-class neighbors by introducing virtual candidate images generated with GAN.
arXiv Detail & Related papers (2021-03-02T08:20:08Z) - Align Deep Features for Oriented Object Detection [40.28244152216309]
We propose a single-shot Alignment Network (S$2$A-Net) consisting of two modules: a Feature Alignment Module (FAM) and an Oriented Detection Module (ODM)
The FAM can generate high-quality anchors with an Anchor Refinement Network and adaptively align the convolutional features according to the anchor boxes with a novel Alignment Convolution.
The ODM first adopts active rotating filters to encode the orientation information and then produces orientation-sensitive and orientation-invariant features to alleviate the inconsistency between classification score and localization accuracy.
arXiv Detail & Related papers (2020-08-21T09:55:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.