PadChannel: Improving CNN Performance through Explicit Padding Encoding
- URL: http://arxiv.org/abs/2311.07623v2
- Date: Thu, 16 Nov 2023 17:17:40 GMT
- Title: PadChannel: Improving CNN Performance through Explicit Padding Encoding
- Authors: Juho Kim
- Abstract summary: In convolutional neural networks (CNNs), padding plays a pivotal role in preserving spatial dimensions throughout the layers.
Traditional padding techniques do not explicitly distinguish between the actual image content and the padded regions.
We propose PadChannel, a novel padding method that encodes padding statuses as an additional input channel.
- Score: 40.39759037668144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In convolutional neural networks (CNNs), padding plays a pivotal role in
preserving spatial dimensions throughout the layers. Traditional padding
techniques do not explicitly distinguish between the actual image content and
the padded regions, potentially causing CNNs to incorrectly interpret the
boundary pixels or regions that resemble boundaries. This ambiguity can lead to
suboptimal feature extraction. To address this, we propose PadChannel, a novel
padding method that encodes padding statuses as an additional input channel,
enabling CNNs to easily distinguish genuine pixels from padded ones. By
incorporating PadChannel into several prominent CNN architectures, we observed
small performance improvements and notable reductions in the variances on the
ImageNet-1K image classification task at marginal increases in the
computational cost. The source code is available at
https://github.com/AussieSeaweed/pad-channel
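A minimal sketch of the idea in numpy (hypothetical illustration, not the released implementation linked above): zero-pad the image spatially, then append one extra channel that is 1 over genuine pixels and 0 over the padded border, so downstream convolutions can tell the two apart.

```python
import numpy as np

def pad_channel(image: np.ndarray, pad: int) -> np.ndarray:
    """Zero-pad a (C, H, W) image and append a padding-status channel.

    The appended channel is 1.0 over the original content and 0.0 over
    the padded border, explicitly encoding which pixels are genuine.
    Hypothetical helper written for illustration only.
    """
    c, h, w = image.shape
    # Zero-pad the two spatial dimensions only, leaving channels intact.
    padded = np.pad(image, ((0, 0), (pad, pad), (pad, pad)))
    # Padding-status channel: 1 inside the original extent, 0 on the border.
    mask = np.zeros((1, h + 2 * pad, w + 2 * pad), dtype=image.dtype)
    mask[:, pad:h + pad, pad:w + pad] = 1.0
    return np.concatenate([padded, mask], axis=0)

x = np.ones((3, 4, 4), dtype=np.float32)  # a 3-channel 4x4 "image"
y = pad_channel(x, pad=1)
print(y.shape)  # (4, 6, 6): one extra channel, spatial dims grown by 2*pad
```

A network's first convolution then simply takes C+1 input channels instead of C, which is the "marginal increase in computational cost" the abstract refers to.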
Related papers
- MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos [16.865802182250857]
MotionDeltaCNN is a sparse CNN inference framework that supports moving cameras.
We introduce spherical buffers and padded convolutions to enable seamless fusion of newly unveiled regions and previously processed regions.
Our evaluation shows that we outperform DeltaCNN by up to 90% for moving camera videos.
arXiv Detail & Related papers (2022-10-18T14:23:05Z)
- Localizing Semantic Patches for Accelerating Image Classification [12.250230630124758]
We first pinpoint task-aware regions over the input image by a lightweight patch proposal network called AnchorNet.
We then feed these localized semantic patches with much smaller spatial redundancy into a general classification network.
Our method outperforms SOTA dynamic inference methods with fewer inference costs.
arXiv Detail & Related papers (2022-06-07T15:01:54Z)
- Context-aware Padding for Semantic Segmentation [82.37483350347559]
We propose a context-aware (CA) padding approach to extend the image.
Using context-aware padding, the ResNet-based segmentation model achieves higher mean Intersection-Over-Union than the traditional zero padding.
arXiv Detail & Related papers (2021-09-16T10:33:21Z)
- Shape-Tailored Deep Neural Networks [87.55487474723994]
We present Shape-Tailored Deep Neural Networks (ST-DNN).
ST-DNN extend convolutional networks (CNNs), which aggregate data from fixed-shape (square) neighborhoods, to compute descriptors defined on arbitrarily shaped regions.
We show that ST-DNN are 3-4 orders of magnitude smaller than CNNs used for segmentation.
arXiv Detail & Related papers (2021-02-16T23:32:14Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- Position, Padding and Predictions: A Deeper Look at Position Information in CNNs [30.583407443282365]
We show that a surprising degree of absolute position information is encoded in commonly used CNNs.
We show that zero padding drives CNNs to encode position information in their internal representations, while a lack of padding precludes position encoding.
This gives rise to deeper questions about the role of position information in CNNs.
arXiv Detail & Related papers (2021-01-28T23:40:32Z)
- Shape Defense Against Adversarial Attacks [47.64219291655723]
Humans rely heavily on shape information to recognize objects. In contrast, convolutional neural networks (CNNs) are biased more towards texture.
Here, we explore how shape bias can be incorporated into CNNs to improve their robustness.
Two algorithms are proposed, based on the observation that edges are invariant to moderate imperceptible perturbations.
arXiv Detail & Related papers (2020-08-31T03:23:59Z)
- Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm: a black-box attack against mainstream normally trained and defense models.
On average, we improve the success rate by 9.2% for defense models and 3.7% for normally trained models.
arXiv Detail & Related papers (2020-07-14T01:50:22Z)
- On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location [18.932504899552494]
We show that CNNs can and will exploit the absolute spatial location by learning filters that respond exclusively to particular absolute locations.
Because modern CNN filters have huge receptive fields, these boundary effects operate even far from the image boundary.
arXiv Detail & Related papers (2020-03-16T08:00:06Z)
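Several of the papers above rest on the same observation: zero padding alone injects absolute position information into a CNN. A small numpy demonstration (written for this page, not taken from any of the papers) makes the effect concrete by averaging a perfectly constant image with a zero-padded "same" convolution:

```python
import numpy as np

def conv2d_zero_pad(img: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Naive 'same' 2-D convolution with zero padding (1 channel, stride 1)."""
    p = k.shape[0] // 2
    padded = np.pad(img, p)           # zero padding, as in typical CNNs
    out = np.empty(img.shape, dtype=float)
    kh, kw = k.shape
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + kh, j:j + kw] * k).sum()
    return out

img = np.ones((5, 5))                 # constant input: no content-based cues
kernel = np.ones((3, 3)) / 9.0        # uniform averaging filter
out = conv2d_zero_pad(img, kernel)
# Interior responses stay at ~1.0, but border responses shrink (a corner
# sees only 4 real pixels, so its value is ~4/9). The output alone thus
# reveals each pixel's distance to the boundary, i.e. absolute position.
print(out[2, 2], out[0, 0])
```

Even though the input is featureless, the filter response differs by location, which is exactly the position signal that the padding-aware methods above (including PadChannel) try to surface or control explicitly.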
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.