The problems with using STNs to align CNN feature maps
- URL: http://arxiv.org/abs/2001.05858v1
- Date: Tue, 14 Jan 2020 12:59:56 GMT
- Title: The problems with using STNs to align CNN feature maps
- Authors: Lukas Finnveden, Ylva Jansson, Tony Lindeberg
- Abstract summary: We argue that spatial transformer networks (STNs) do not have the ability to align the feature maps of a transformed image and its original.
We advocate taking advantage of more complex features in deeper layers by instead sharing parameters between the classification and the localisation network.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatial transformer networks (STNs) were designed to enable CNNs to learn
invariance to image transformations. STNs were originally proposed to transform
CNN feature maps as well as input images. This enables the use of more complex
features when predicting transformation parameters. However, since STNs perform
a purely spatial transformation, they do not, in the general case, have the
ability to align the feature maps of a transformed image and its original. We
present a theoretical argument for this and investigate the practical
implications, showing that this inability is coupled with decreased
classification accuracy. We advocate taking advantage of more complex features
in deeper layers by instead sharing parameters between the classification and
the localisation network.
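The core claim — that a purely spatial (inverse) transformation of a feature map computed from a transformed image does not, in general, recover the feature map of the original image — can be illustrated with a minimal numpy sketch (not taken from the paper; the image, filter, and 90° rotation are illustrative choices). A convolution commutes with rotation only up to a rotation of the filter itself, so spatially undoing the rotation aligns the feature maps only when the filter is rotation invariant:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, k):
    # valid-mode 2D cross-correlation in plain numpy
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

img = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))          # a generic, non-invariant filter

feat_orig = conv2d(img, k)               # feature map of the original image
feat_rot = conv2d(np.rot90(img), k)      # feature map of the rotated image
aligned = np.rot90(feat_rot, k=-1)       # spatially "undo" the rotation

# The purely spatial inverse does NOT recover the original feature map:
print(np.allclose(aligned, feat_orig))   # False

# ...unless the filter itself is rotation invariant (isotropic):
k_iso = np.array([[0., 1., 0.], [1., 4., 1.], [0., 1., 0.]])
a2 = np.rot90(conv2d(np.rot90(img), k_iso), k=-1)
print(np.allclose(a2, conv2d(img, k_iso)))  # True
```

This mirrors the paper's theoretical argument: spatially inverting the transformation yields the feature map the original image would produce under a *transformed filter*, which coincides with the original feature map only for transformation-invariant features.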
Related papers
- Variable-size Symmetry-based Graph Fourier Transforms for image compression [65.7352685872625]
We propose a new family of Symmetry-based Graph Fourier Transforms (SBGFTs) of variable sizes within a coding framework.
Our proposed algorithm generates symmetric graphs on the grid by adding specific symmetrical connections between nodes.
Experiments show that SBGFTs outperform the primary transforms integrated in the explicit Multiple Transform Selection.
arXiv Detail & Related papers (2024-11-24T13:00:44Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Entropy Transformer Networks: A Learning Approach via Tangent Bundle Data Manifold [8.893886200299228]
This paper focuses on an accurate and fast approach for image transformation employed in the design of CNN architectures.
A novel Entropy STN (ESTN) is proposed that interpolates on the data manifold distributions.
Experiments on challenging benchmarks show that the proposed ESTN can improve predictive accuracy over a range of computer vision tasks.
arXiv Detail & Related papers (2023-07-24T04:21:51Z) - Random Padding Data Augmentation [23.70951896315126]
A convolutional neural network (CNN) learns to recognise the same object at different positions in images.
However, the usefulness of the features' spatial information in CNNs has not been well investigated.
We introduce Random Padding, a new type of padding method for training CNNs.
arXiv Detail & Related papers (2023-02-17T04:15:33Z)
- B-cos Networks: Alignment is All We Need for Interpretability [136.27303006772294]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
A B-cos transform induces a single linear transform that faithfully summarises the full model computations.
We show that it can easily be integrated into common models such as VGGs, ResNets, InceptionNets, and DenseNets.
arXiv Detail & Related papers (2022-05-20T16:03:29Z)
- Quantized convolutional neural networks through the lens of partial differential equations [6.88204255655161]
Quantization of Convolutional Neural Networks (CNNs) is a common approach to ease the computational burden involved in the deployment of CNNs.
In this work, we explore ways to improve quantized CNNs using a PDE-based perspective and analysis.
arXiv Detail & Related papers (2021-08-31T22:18:52Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- Volumetric Transformer Networks [88.85542905676712]
We introduce a learnable module, the volumetric transformer network (VTN).
VTN predicts channel-wise warping fields so as to reconfigure intermediate CNN features both spatially and channel-wise.
Our experiments show that VTN consistently boosts the features' representation power and consequently the networks' accuracy on fine-grained image recognition and instance-level image retrieval.
arXiv Detail & Related papers (2020-07-18T14:00:12Z)
- Inability of spatial transformations of CNN feature maps to support invariant recognition [0.0]
We show that spatial transformations of CNN feature maps cannot align the feature maps of a transformed image to match those of its original.
For rotations and reflections, spatially transforming feature maps or filters can enable invariance but only for networks with learnt or hardcoded rotation- or reflection-invariant features.
arXiv Detail & Related papers (2020-04-30T12:12:58Z)
- Understanding when spatial transformer networks do not support invariance, and what to do about it [0.0]
Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations.
We show that STNs do not have the ability to align the feature maps of a transformed image with those of its original.
We investigate alternative STN architectures that make use of complex features.
arXiv Detail & Related papers (2020-04-24T12:20:35Z)
- Computational optimization of convolutional neural networks using separated filters architecture [69.73393478582027]
We consider a convolutional neural network transformation that reduces computational complexity and thus speeds up neural network processing.
The use of convolutional neural networks (CNNs) is the standard approach to image recognition, despite the fact that they can be too computationally demanding.
arXiv Detail & Related papers (2020-02-18T17:42:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.