Learning Translation Invariance in CNNs
- URL: http://arxiv.org/abs/2011.11757v1
- Date: Fri, 6 Nov 2020 09:39:27 GMT
- Title: Learning Translation Invariance in CNNs
- Authors: Valerio Biscione, Jeffrey Bowers
- Abstract summary: We show how, even though CNNs are not 'architecturally invariant' to translation, they can indeed 'learn' to be invariant to translation.
We investigated how this pretraining affected the internal network representations.
These experiments show how pretraining a network on an environment with the right 'latent' characteristics can result in the network learning deep perceptual rules.
- Score: 1.52292571922932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When seeing a new object, humans can immediately recognize it across
different retinal locations: we say that the internal object representation is
invariant to translation. It is commonly believed that Convolutional Neural
Networks (CNNs) are architecturally invariant to translation thanks to the
convolution and/or pooling operations they are endowed with. In fact, several
works have found that these networks systematically fail to recognise new
objects on untrained locations. In this work we show how, even though CNNs are
not 'architecturally invariant' to translation, they can indeed 'learn' to be
invariant to translation. We verified that this can be achieved by pretraining
on ImageNet, and we found that it is also possible with much simpler datasets
in which the items are fully translated across the input canvas. We
investigated how this pretraining affected the internal network
representations, finding that the invariance was almost always acquired, even
though it was sometimes disrupted by further training due to catastrophic
forgetting/interference. These experiments show how pretraining a network on an
environment with the right 'latent' characteristics (a more naturalistic
environment) can result in the network learning deep perceptual rules which
would dramatically improve subsequent generalization.
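The method is straightforward to sketch: paste each item onto a larger canvas at varying locations (fully translated during pretraining, a single fixed location otherwise) and test whether the network's internal representation of the item changes as it moves. The PyTorch code below is a minimal illustrative reconstruction, not the authors' code; the small architecture, the 64-pixel canvas, and the cosine-similarity probe are assumptions.

```python
# Minimal sketch (not the authors' code): probe a CNN's translation tolerance
# by placing the same item at different canvas locations and comparing the
# resulting internal representations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32, n_classes)

    def embed(self, x):
        # Global average pooling gives a fixed-size internal representation.
        return self.features(x).mean(dim=(2, 3))

    def forward(self, x):
        return self.head(self.embed(x))

def place_on_canvas(item, canvas_size=64, top=0, left=0):
    """Paste a small item (channels x height x width) onto a blank canvas."""
    c, h, w = item.shape
    canvas = torch.zeros(c, canvas_size, canvas_size)
    canvas[:, top:top + h, left:left + w] = item
    return canvas

def translation_tolerance(model, item, canvas_size=64, step=8):
    """Cosine similarity between the representation of the item placed at the
    top-left corner and at every other sampled (top, left) offset."""
    model.eval()
    with torch.no_grad():
        ref = model.embed(place_on_canvas(item, canvas_size).unsqueeze(0))
        sims = []
        for top in range(0, canvas_size - item.shape[1] + 1, step):
            for left in range(0, canvas_size - item.shape[2] + 1, step):
                shifted = place_on_canvas(item, canvas_size, top, left)
                z = model.embed(shifted.unsqueeze(0))
                sims.append(F.cosine_similarity(ref, z).item())
    return sims  # values near 1.0 indicate a translation-tolerant representation
```

Pretraining the same model on items pasted at random offsets approximates the 'fully translated' environment described in the abstract; similarities near 1.0 across offsets indicate a translation-tolerant representation, whereas a sharp drop away from the trained location corresponds to the failure at untrained locations that the abstract reports.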
Related papers
- Metric as Transform: Exploring beyond Affine Transform for Interpretable Neural Network [2.7195102129095003]
We find that dot-product neurons, which exert global influence, are less interpretable than Euclidean-distance neurons, whose influence is local.
We develop an interpretable neural network based on local dictionaries and use it to understand and reject adversarial examples.
arXiv Detail & Related papers (2024-10-21T16:22:19Z)
- Latent Space Translation via Semantic Alignment [29.2401314068038]
We show how representations learned from different neural modules can be translated between different pre-trained networks.
Our method directly estimates a transformation between two given latent spaces, thereby enabling effective stitching of encoders and decoders without additional training (an illustrative alignment sketch follows the related-papers list).
Notably, we show how it is possible to zero-shot stitch text encoders and vision decoders, or vice-versa, yielding surprisingly good classification performance in this multimodal setting.
arXiv Detail & Related papers (2023-11-01T17:12:00Z)
- Unveiling Invariances via Neural Network Pruning [44.47186380630998]
Invariance describes transformations that do not alter data's underlying semantics.
Modern networks are handcrafted to handle well-known invariances.
We propose a framework to learn novel network architectures that capture data-dependent invariances via pruning.
arXiv Detail & Related papers (2023-09-15T05:38:33Z)
- Fair Interpretable Learning via Correction Vectors [68.29997072804537]
We propose a new framework for fair representation learning centered around the learning of "correction vectors".
The corrections are then simply added to the original features, and can therefore be analyzed as an explicit penalty or bonus applied to each feature.
We show experimentally that a fair representation learning problem constrained in such a way does not impact performance.
arXiv Detail & Related papers (2022-01-17T10:59:33Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Convolutional Neural Networks Are Not Invariant to Translation, but They Can Learn to Be [0.76146285961466]
When seeing a new object, humans can immediately recognize it across different retinal locations.
It is commonly believed that Convolutional Neural Networks (CNNs) are architecturally invariant to translation.
We show how pretraining a network on an environment with the right 'latent' characteristics can result in the network learning deep perceptual rules.
arXiv Detail & Related papers (2021-10-12T09:51:07Z)
- Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z)
- Learn to Differ: Sim2Real Small Defection Segmentation Network [8.488353860049898]
Small defection segmentation approaches are trained in specific settings and tend to be limited by fixed context.
We propose the network SSDS that learns a way of distinguishing small defections between two images regardless of the context.
arXiv Detail & Related papers (2021-03-07T08:25:56Z)
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the outcome of the transformation is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
- Hold me tight! Influence of discriminative features on deep network boundaries [63.627760598441796]
We propose a new perspective that relates dataset features to the distance of samples to the decision boundary.
This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets.
arXiv Detail & Related papers (2020-02-15T09:29:36Z)
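The latent-space translation entry above reports that a single transformation estimated between two latent spaces suffices to stitch pre-trained encoders and decoders. As a hedged illustration of that general idea, not necessarily the paper's exact procedure, an orthogonal Procrustes fit on a few paired anchor samples is one standard way to estimate such a map; the function names and the equal-dimensionality assumption are mine.

```python
# Hedged sketch: align two latent spaces with an orthogonal map estimated from
# paired anchors, so that codes from encoder A can be consumed by decoder B.
# Assumes both spaces have the same dimensionality.
import torch

def fit_orthogonal_map(z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
    """Solve min_T ||z_a @ T - z_b|| over orthogonal T (Procrustes problem).

    z_a, z_b: (n_anchors, dim) codes of the same anchor inputs produced by
    two different pre-trained encoders.
    """
    u, _, vt = torch.linalg.svd(z_a.T @ z_b)
    return u @ vt

def translate_codes(z_a: torch.Tensor, transform: torch.Tensor) -> torch.Tensor:
    """Map codes from space A into space B so a decoder trained on B can use them."""
    return z_a @ transform
```

Given encoders enc_a and enc_b applied to the same anchor inputs and a decoder dec_b, stitching then amounts to dec_b(translate_codes(enc_a(x), T)) with T fitted once and no additional training, matching the zero-shot stitching behaviour the entry describes.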