Global Pooling, More than Meets the Eye: Position Information is Encoded
Channel-Wise in CNNs
- URL: http://arxiv.org/abs/2108.07884v1
- Date: Tue, 17 Aug 2021 21:27:30 GMT
- Authors: Md Amirul Islam, Matthew Kowal, Sen Jia, Konstantinos G. Derpanis and
Neil D. B. Bruce
- Abstract summary: We demonstrate that positional information is encoded based on the ordering of the channel dimensions, while semantic information is largely not.
We show the real-world impact of these findings by applying them to two applications.
- Score: 32.81128493853064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we challenge the common assumption that collapsing the spatial
dimensions of a 3D (spatial-channel) tensor in a convolutional neural network
(CNN) into a vector via global pooling removes all spatial information.
Specifically, we demonstrate that positional information is encoded based on
the ordering of the channel dimensions, while semantic information is largely
not. Following this demonstration, we show the real-world impact of these
findings by applying them to two applications. First, we propose a simple yet
effective data augmentation strategy and loss function which improves the
translation invariance of a CNN's output. Second, we propose a method to
efficiently determine which channels in the latent representation are
responsible for (i) encoding overall position information or (ii)
region-specific positions. We first show that semantic segmentation has a
significant reliance on the overall position channels to make predictions. We
then show for the first time that it is possible to perform a 'region-specific'
attack, and degrade a network's performance in a particular part of the input.
We believe our findings and demonstrated applications will benefit research
areas concerned with understanding the characteristics of CNNs.
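The abstract's central claim can be illustrated with a minimal sketch (assuming, purely for illustration, idealized channels that each fire on one spatial region; real CNN channels are learned and far noisier): global average pooling collapses the spatial dimensions, yet the channel index of the pooled vector can still reveal position.

```python
import numpy as np

def gap(feat):
    """Global average pooling: collapse a (C, H, W) tensor to a (C,) vector."""
    return feat.mean(axis=(1, 2))

# Hypothetical feature map: 4 channels over an 8x8 spatial grid.
# Assume each channel has specialized to one quadrant (illustrative only).
feat = np.zeros((4, 8, 8))

# An "object" in the top-left quadrant activates channel 0 only.
feat[0, :4, :4] = 1.0

pooled = gap(feat)
# Spatial dims are gone, but the active channel's index still tells us
# which quadrant fired: channel 0 -> top-left.
print(pooled)           # channel 0 has mean 0.25, all others 0
print(pooled.argmax())  # 0
```

The sketch only shows that pooling does not, by itself, destroy position cues stored in channel order; the paper's contribution is demonstrating that trained CNNs actually encode position this way.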
Related papers
- DAS: A Deformable Attention to Capture Salient Information in CNNs [2.321323878201932]
Self-attention can improve a model's access to global information but increases computational overhead.
We present a fast and simple fully convolutional method called DAS that helps focus attention on relevant information.
arXiv Detail & Related papers (2023-11-20T18:49:58Z)
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z)
- SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
Despeckling is an important problem in remote sensing, as speckle degrades SAR images.
Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods.
This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field.
We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
arXiv Detail & Related papers (2022-05-31T15:55:37Z)
- Residual Moment Loss for Medical Image Segmentation [56.72261489147506]
Location information is proven to benefit the deep learning models on capturing the manifold structure of target objects.
Most existing methods encode the location information in an implicit way, for the network to learn.
We propose a novel loss function, namely residual moment (RM) loss, to explicitly embed the location information of segmentation targets.
arXiv Detail & Related papers (2021-06-27T09:31:49Z)
- Position, Padding and Predictions: A Deeper Look at Position Information in CNNs [30.583407443282365]
We show that a surprising degree of absolute position information is encoded in commonly used CNNs.
We show that zero padding drives CNNs to encode position information in their internal representations, while a lack of padding precludes position encoding.
This gives rise to deeper questions about the role of position information in CNNs.
arXiv Detail & Related papers (2021-01-28T23:40:32Z)
- Weakly-Supervised Action Localization and Action Recognition using Global-Local Attention of 3D CNN [4.924442315857227]
3D Convolutional Neural Network (3D CNN) captures spatial and temporal information on 3D data such as video sequences.
We propose two approaches to improve the visual explanations and classification in 3D CNN.
arXiv Detail & Related papers (2020-12-17T12:29:16Z)
- Channel-wise Knowledge Distillation for Dense Prediction [73.99057249472735]
We propose to align features channel-wise between the student and teacher networks.
We consistently achieve superior performance on three benchmarks with various network structures.
arXiv Detail & Related papers (2020-11-26T12:00:38Z)
- Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics [74.6968179473212]
This paper proposes a novel pretext task to address the self-supervised learning problem.
We compute a series of spatio-temporal statistical summaries, such as the spatial location and dominant direction of the largest motion.
A neural network is built and trained to yield the statistical summaries given the video frames as inputs.
arXiv Detail & Related papers (2020-08-31T08:31:56Z)
- Localized convolutional neural networks for geospatial wind forecasting [0.0]
Convolutional Neural Networks (CNNs) possess desirable properties for many kinds of spatial data.
In this work, we propose localized convolutional neural networks that enable CNNs to learn local features in addition to the global ones.
They can be added to any convolutional layers, easily end-to-end trained, introduce minimal additional complexity, and let CNNs retain most of their benefits to the extent that they are needed.
arXiv Detail & Related papers (2020-05-12T17:14:49Z)
- Learning to Predict Context-adaptive Convolution for Semantic Segmentation [66.27139797427147]
Long-range contextual information is essential for achieving high-performance semantic segmentation.
We propose a Context-adaptive Convolution Network (CaC-Net) to predict a spatially-varying feature weighting vector.
Our CaC-Net achieves superior segmentation performance on three public datasets.
arXiv Detail & Related papers (2020-04-17T13:09:17Z)
- How Much Position Information Do Convolutional Neural Networks Encode? [27.604154992915863]
In contrast to fully connected networks, Convolutional Neural Networks (CNNs) achieve efficiency by learning weights associated with local filters with a finite spatial extent.
In this paper, we test this hypothesis revealing the surprising degree of absolute position information that is encoded in commonly used neural networks.
arXiv Detail & Related papers (2020-01-22T19:44:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.