Invariant Deep Compressible Covariance Pooling for Aerial Scene
Categorization
- URL: http://arxiv.org/abs/2011.05702v1
- Date: Wed, 11 Nov 2020 11:13:07 GMT
- Title: Invariant Deep Compressible Covariance Pooling for Aerial Scene
Categorization
- Authors: Shidong Wang, Yi Ren, Gerard Parr, Yu Guan and Ling Shao
- Abstract summary: We propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
- Score: 80.55951673479237
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning discriminative and invariant feature representation is the key to
visual image categorization. In this article, we propose a novel invariant deep
compressible covariance pooling (IDCCP) to solve nuisance variations in aerial
scene categorization. We consider transforming the input image according to a
finite transformation group that consists of multiple confounding orthogonal
matrices, such as the D4 group. Then, we adopt a Siamese-style network to
transfer the group structure to the representation space, where we can derive a
trivial representation that is invariant under the group action. A linear
classifier trained on the trivial representation then inherits this
invariance. To further improve the discriminative power of the representation,
we
extend the representation to the tensor space while imposing orthogonal
constraints on the transformation matrix to effectively reduce feature
dimensions. We conduct extensive experiments on the publicly released aerial
scene image data sets and demonstrate the superiority of this method compared
with state-of-the-art methods. In particular, using the ResNet architecture,
our IDCCP model can reduce the dimension of the tensor representation by about
98% without sacrificing accuracy (i.e., a drop of less than 0.5%).
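The group-averaging idea at the core of the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: `phi` is a hypothetical stand-in for the shared-weight (Siamese) feature extractor, and averaging its outputs over the D4 orbit of the input yields a representation that is exactly invariant under the group action.

```python
import numpy as np

def d4_orbit(img):
    """All 8 images in the D4 orbit: 4 rotations of img and of its mirror."""
    out = []
    for base in (img, np.fliplr(img)):
        for k in range(4):
            out.append(np.rot90(base, k))
    return out

def phi(img):
    # Hypothetical stand-in for the shared (Siamese) feature extractor;
    # any deterministic map works for illustrating group averaging.
    return np.concatenate([img.ravel(), (img ** 2).ravel()])

def trivial_representation(img):
    # Averaging phi over the D4 orbit projects onto the trivial
    # (invariant) component: transforming the input only permutes
    # the orbit, so the mean is unchanged.
    return np.mean([phi(g) for g in d4_orbit(img)], axis=0)

x = np.arange(16.0).reshape(4, 4)
assert np.allclose(trivial_representation(x), trivial_representation(np.rot90(x)))
assert np.allclose(trivial_representation(x), trivial_representation(np.fliplr(x)))
```

Because the invariant representation feeds a linear classifier, the classifier's decisions are unchanged under any D4 transformation of the input; the paper's tensor extension and orthogonal dimension reduction are not modeled here.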
Related papers
- Mind the Gap Between Prototypes and Images in Cross-domain Finetuning [64.97317635355124]
We propose a contrastive prototype-image adaptation (CoPA) to adapt different transformations respectively for prototypes and images.
Experiments on Meta-Dataset demonstrate that CoPA achieves the state-of-the-art performance more efficiently.
arXiv Detail & Related papers (2024-10-16T11:42:11Z) - Affine-Transformation-Invariant Image Classification by Differentiable
Arithmetic Distribution Module [8.125023712173686]
Convolutional Neural Networks (CNNs) have achieved promising results in image classification.
CNNs are vulnerable to affine transformations including rotation, translation, flip and shuffle.
In this work, we introduce a more robust substitute by incorporating distribution learning techniques.
arXiv Detail & Related papers (2023-09-01T22:31:32Z) - Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z) - Deep Diversity-Enhanced Feature Representation of Hyperspectral Images [87.47202258194719]
We rectify 3D convolution by modifying its topology to enhance the rank upper-bound.
We also propose a novel diversity-aware regularization (DA-Reg) term that acts on the feature maps to maximize independence among elements.
To demonstrate the superiority of the proposed Re$3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks.
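As a rough illustration of what a diversity-aware regularizer can look like, the sketch below penalizes covariance between feature channels so that a minimizer pushes channels toward independence. The function name `diversity_penalty` is hypothetical and the exact DA-Reg formulation in the paper may differ.

```python
import numpy as np

def diversity_penalty(F):
    """Hedged sketch of a diversity-aware regularizer.
    F has shape (n_samples, n_channels); the penalty is the squared
    Frobenius norm of the off-diagonal channel covariance, which is
    zero iff the channels are uncorrelated."""
    Fc = F - F.mean(axis=0, keepdims=True)
    C = Fc.T @ Fc / max(len(F) - 1, 1)      # channel covariance matrix
    off_diag = C - np.diag(np.diag(C))
    return np.sum(off_diag ** 2)

rng = np.random.default_rng(1)
independent = rng.normal(size=(100, 4))               # decorrelated channels
duplicated = np.repeat(rng.normal(size=(100, 1)), 4, axis=1)  # redundant channels
assert diversity_penalty(duplicated) > diversity_penalty(independent)
```

Added to a task loss, such a term discourages redundant feature maps, which matches the stated goal of maximizing independence among elements.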
arXiv Detail & Related papers (2023-01-15T16:19:18Z) - Difference of Anisotropic and Isotropic TV for Segmentation under Blur
and Poisson Noise [2.6381163133447836]
We adopt a smoothing-and-thresholding (SaT) segmentation framework that finds a piecewise-smooth solution, followed by $k$-means to segment the image.
Specifically, for the image smoothing step, we replace the total variation in the Mumford-Shah model with the difference of anisotropic and isotropic total variation (AITV) as the regularization.
Convergence analysis is provided to validate the efficacy of the scheme.
arXiv Detail & Related papers (2023-01-06T01:14:56Z) - Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z) - Group Equivariant Subsampling [60.53371517247382]
Subsampling is used in convolutional neural networks (CNNs) in the form of pooling or strided convolutions.
We first introduce translation equivariant subsampling/upsampling layers that can be used to construct exact translation equivariant CNNs.
We then generalise these layers beyond translations to general groups, thus proposing group equivariant subsampling/upsampling.
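A toy 1-D version of the idea: plain stride-2 subsampling is not shift equivariant, but selecting the sampling phase by a shift-invariant score restores equivariance up to a coarser shift. This is a minimal sketch under simplified assumptions (circular shifts, stride 2, an energy-based score), not the paper's general-group construction.

```python
import numpy as np

def equivariant_subsample(x):
    """Stride-2 subsampling with input-dependent phase selection.
    The phase score (signal energy) is invariant to circular shifts,
    so shifting the input selects the correspondingly shifted phase."""
    phases = [x[0::2], x[1::2]]
    scores = [np.sum(p ** 2) for p in phases]
    return phases[int(np.argmax(scores))]

rng = np.random.default_rng(0)
x = rng.normal(size=8)
y = equivariant_subsample(x)
y_shifted = equivariant_subsample(np.roll(x, 1))
# The outputs agree up to a circular shift (coarser-group equivariance).
assert any(np.allclose(np.roll(y, k), y_shifted) for k in range(len(y)))
```

With a fixed phase instead, the same test fails for odd shifts, which is the failure mode the equivariant layers are designed to remove.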
arXiv Detail & Related papers (2021-06-10T16:14:00Z) - Deep Transformation-Invariant Clustering [24.23117820167443]
We present an approach that does not rely on abstract features but instead learns to predict image transformations.
This learning process naturally fits in the gradient-based training of K-means and Gaussian mixture model.
We demonstrate that our novel approach yields competitive and highly promising results on standard image clustering benchmarks.
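The transformation-invariant clustering idea can be illustrated with a toy aligned distance: compare a sample to a prototype only after searching over transformations. Here circular shifts stand in for the learned image transformations in the paper, and `aligned_dist` is a hypothetical helper.

```python
import numpy as np

def aligned_dist(x, proto):
    """Distance after aligning the prototype to the sample over all
    circular shifts -- a stand-in for predicted image transformations."""
    shifts = [np.roll(proto, s) for s in range(len(proto))]
    return min(np.linalg.norm(x - s) for s in shifts)

# Two shifted copies of one pattern are identical under the aligned
# distance, though plain Euclidean distance would separate them.
pattern = np.array([5.0, 0.0, 0.0, 0.0])
a, b = pattern, np.roll(pattern, 2)
assert aligned_dist(a, b) == 0.0
assert np.linalg.norm(a - b) > 0.0
```

Plugging such an aligned distance into the assignment step of K-means (or the E-step of a Gaussian mixture) gives a clustering that groups samples up to the chosen transformation family.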
arXiv Detail & Related papers (2020-06-19T13:43:08Z) - BasisVAE: Translation-invariant feature-level clustering with
Variational Autoencoders [9.51828574518325]
Variational Autoencoders (VAEs) provide a flexible and scalable framework for non-linear dimensionality reduction.
We show how a collapsed variational inference scheme leads to scalable and efficient inference for BasisVAE.
arXiv Detail & Related papers (2020-03-06T23:10:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.