Convolutional Neural Networks Are Not Invariant to Translation, but They Can Learn to Be
- URL: http://arxiv.org/abs/2110.05861v1
- Date: Tue, 12 Oct 2021 09:51:07 GMT
- Title: Convolutional Neural Networks Are Not Invariant to Translation, but They Can Learn to Be
- Authors: Valerio Biscione, Jeffrey S. Bowers
- Abstract summary: When seeing a new object, humans can immediately recognize it across different retinal locations.
It is commonly believed that Convolutional Neural Networks (CNNs) are architecturally invariant to translation.
We show how pretraining a network on an environment with the right 'latent' characteristics can result in the network learning deep perceptual rules.
- Score: 0.76146285961466
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: When seeing a new object, humans can immediately recognize it across
different retinal locations: the internal object representation is invariant to
translation. It is commonly believed that Convolutional Neural Networks (CNNs)
are architecturally invariant to translation thanks to the convolution and/or
pooling operations they are endowed with. In fact, several studies have found
that these networks systematically fail to recognise new objects on untrained
locations. In this work, we test a wide variety of CNN architectures, showing
that, apart from DenseNet-121, none of the models tested was architecturally
invariant to translation. Nevertheless, all of them could learn to be invariant
to translation. We show that this can be achieved by pretraining on ImageNet,
and sometimes even with much simpler data sets in which all the items are
fully translated across the input canvas. At the same time, this invariance can
be disrupted by further training due to catastrophic forgetting/interference.
These experiments show how pretraining a network on an environment with the
right `latent' characteristics (a more naturalistic environment) can result in
the network learning deep perceptual rules which would dramatically improve
subsequent generalization.
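The testing and pretraining protocols described above can be made concrete with a small simulation. The snippet below is a minimal sketch, not the authors' actual setup: the toy "objects", canvas size, model, and training budget are illustrative assumptions. A conv + fully-connected classifier is trained either with every object at one fixed canvas location or with objects fully translated across the canvas, and is then evaluated at a location never seen during fixed-location training.

```python
# Sketch only: toy objects and a small conv+FC model stand in for the
# paper's stimuli and architectures (illustrative assumptions).
import torch
import torch.nn as nn

torch.manual_seed(0)
CANVAS, PATCH, N_CLASSES = 56, 14, 5
objects = torch.rand(N_CLASSES, 1, PATCH, PATCH)   # one fixed toy patch per class

def place(labels, top, left):
    """Paste each label's patch onto a blank canvas at (top, left)."""
    x = torch.zeros(len(labels), 1, CANVAS, CANVAS)
    for i, y in enumerate(labels):
        x[i, :, top:top + PATCH, left:left + PATCH] = objects[y]
    return x

def make_model():
    # Typical conv + fully connected head: not architecturally invariant.
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(32 * 14 * 14, N_CLASSES),
    )

def train(model, fully_translated, steps=300):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        y = torch.randint(0, N_CLASSES, (32,))
        if fully_translated:          # a new random location for every batch
            top, left = torch.randint(0, CANVAS - PATCH + 1, (2,)).tolist()
        else:                         # always the top-left corner
            top, left = 0, 0
        loss = nn.functional.cross_entropy(model(place(y, top, left)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

@torch.no_grad()
def accuracy(model, top, left):
    y = torch.randint(0, N_CLASSES, (512,))
    return (model(place(y, top, left)).argmax(1) == y).float().mean().item()

for fully_translated in (False, True):
    model = make_model()
    train(model, fully_translated)
    regime = "fully translated" if fully_translated else "fixed location "
    print(f"{regime} training | trained corner: {accuracy(model, 0, 0):.2f}"
          f" | opposite corner: {accuracy(model, CANVAS - PATCH, CANVAS - PATCH):.2f}")
```

In this toy setting, fixed-location training typically yields accuracy near chance at the opposite corner, whereas training with full translation gives comparable accuracy at both corners, mirroring the abstract's claim that translation invariance is learned from the training environment rather than guaranteed by the architecture.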
Related papers
- Metric as Transform: Exploring beyond Affine Transform for Interpretable Neural Network [2.7195102129095003]
We find that dot-product neurons, with their global influence, are less interpretable than the local influence of Euclidean distance.
We develop an interpretable, local, dictionary-based neural network and use it to understand and reject adversarial examples.
arXiv Detail & Related papers (2024-10-21T16:22:19Z)
- Latent Space Translation via Semantic Alignment [29.2401314068038]
We show how representations learned from different neural modules can be translated between different pre-trained networks.
Our method directly estimates a transformation between two given latent spaces, thereby enabling effective stitching of encoders and decoders without additional training.
Notably, we show how it is possible to zero-shot stitch text encoders and vision decoders, or vice-versa, yielding surprisingly good classification performance in this multimodal setting.
arXiv Detail & Related papers (2023-11-01T17:12:00Z)
- Unveiling Invariances via Neural Network Pruning [44.47186380630998]
Invariance describes transformations that do not alter the data's underlying semantics.
Modern networks are handcrafted to handle well-known invariances.
We propose a framework to learn novel network architectures that capture data-dependent invariances via pruning.
arXiv Detail & Related papers (2023-09-15T05:38:33Z)
- Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing [82.67716657524251]
We present a counterfactual framework that allows us to study the robustness of neural networks with respect to naturalistic variations.
Our method allows for a fair comparison of the robustness of recently released, state-of-the-art Convolutional Neural Networks and Vision Transformers.
arXiv Detail & Related papers (2022-11-29T18:59:23Z)
- Fair Interpretable Learning via Correction Vectors [68.29997072804537]
We propose a new framework for fair representation learning centered around the learning of "correction vectors".
The corrections are then simply added to the original features and can therefore be analyzed as an explicit penalty or bonus for each feature.
We show experimentally that a fair representation learning problem constrained in such a way does not impact performance.
arXiv Detail & Related papers (2022-01-17T10:59:33Z)
- Revisiting Transformation Invariant Geometric Deep Learning: Are Initial Representations All You Need? [80.86819657126041]
We show that transformation-invariant and distance-preserving initial representations are sufficient to achieve transformation invariance.
Specifically, we realize transformation-invariant and distance-preserving initial point representations by modifying multi-dimensional scaling.
We prove that the proposed TinvNN strictly guarantees transformation invariance, while being general and flexible enough to be combined with existing neural networks.
arXiv Detail & Related papers (2021-12-23T03:52:33Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- What Does CNN Shift Invariance Look Like? A Visualization Study [87.79405274610681]
Feature extraction with convolutional neural networks (CNNs) is a popular method to represent images for machine learning tasks.
We focus on measuring and visualizing the shift invariance of extracted features from popular off-the-shelf CNN models.
We conclude that features extracted from popular networks are not globally invariant, and that biases and artifacts exist within this variance (a minimal sketch of such a measurement is given after this list).
arXiv Detail & Related papers (2020-11-09T01:16:30Z)
- Learning Translation Invariance in CNNs [1.52292571922932]
We show how, even though CNNs are not 'architecturally invariant' to translation, they can indeed 'learn' to be invariant to translation.
We investigated how this pretraining affected the internal network representations.
These experiments show how pretraining a network on an environment with the right 'latent' characteristics can result in the network learning deep perceptual rules.
arXiv Detail & Related papers (2020-11-06T09:39:27Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
- Localized convolutional neural networks for geospatial wind forecasting [0.0]
Convolutional Neural Networks (CNNs) possess desirable properties for many kinds of spatial data.
In this work, we propose localized convolutional neural networks that enable CNNs to learn local features in addition to the global ones.
They can be added to any convolutional layer, are easily trained end-to-end, introduce minimal additional complexity, and let CNNs retain most of their benefits to the extent that they are needed.
arXiv Detail & Related papers (2020-05-12T17:14:49Z)
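As referenced in the "What Does CNN Shift Invariance Look Like?" entry above, the core measurement is simple to sketch. The snippet below is an illustrative assumption rather than that paper's exact protocol: it compares a CNN's penultimate features for an image and shifted copies of it via cosine similarity. For brevity it uses a randomly initialised torchvision resnet18 and a random tensor as a stand-in image; swapping in pretrained weights and real photographs gives the off-the-shelf measurement the entry describes.

```python
# Sketch of a shift-invariance measurement (model, image, and shift range
# are illustrative assumptions, not the cited paper's protocol).
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18()      # randomly initialised for brevity;
model.fc = torch.nn.Identity()             # load pretrained weights to probe a
model.eval()                               # real off-the-shelf feature extractor

img = torch.rand(1, 3, 224, 224)           # stand-in for a real photograph

@torch.no_grad()
def features(x):
    return F.normalize(model(x), dim=1)    # unit-norm penultimate features

reference = features(img)
for shift in range(0, 33, 8):              # horizontal shifts of 0..32 pixels
    shifted = torch.roll(img, shifts=shift, dims=3)
    cosine = (reference * features(shifted)).sum().item()
    print(f"shift={shift:2d}px  cosine similarity to unshifted features={cosine:.3f}")
```

Because torch.roll shifts the image circularly, the content is preserved exactly, so any drop in similarity reflects the feature extractor's sensitivity to position rather than to missing content.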