An Empirical Method to Quantify the Peripheral Performance Degradation
in Deep Networks
- URL: http://arxiv.org/abs/2012.02749v1
- Date: Fri, 4 Dec 2020 18:00:47 GMT
- Title: An Empirical Method to Quantify the Peripheral Performance Degradation
in Deep Networks
- Authors: Calden Wloka and John K. Tsotsos
- Abstract summary: The non-veridical border region introduced by padding convolutional neural network (CNN) kernels compounds with each convolutional layer.
Deeper and deeper networks combined with stride-based down-sampling mean that the propagation of this region can end up covering a non-negligible portion of the image.
Our dataset is constructed by inserting objects into high resolution backgrounds, thereby allowing us to crop sub-images which place target objects at specific locations relative to the image border.
By probing the behaviour of Mask R-CNN across a selection of target locations, we see clear patterns of performance degradation near the image boundary, and in particular in the image corners.
- Score: 18.808132632482103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When applying a convolutional kernel to an image, if the output is to remain
the same size as the input then some form of padding is required around the
image boundary, meaning that for each layer of convolution in a convolutional
neural network (CNN), a strip of pixels equal to the half-width of the kernel
size is produced with a non-veridical representation. Although most CNN kernels
are small to reduce the parameter load of a network, this non-veridical area
compounds with each convolutional layer. The tendency toward deeper and deeper
networks combined with stride-based down-sampling means that the propagation of
this region can end up covering a non-negligible portion of the image. Although
this issue with convolutions has been well acknowledged over the years, the
impact of this degraded peripheral representation on modern network behavior
has not been fully quantified. What are the limits of translation invariance?
Does image padding successfully mitigate the issue, or is performance affected
as an object moves between the image border and center? Using Mask R-CNN as an
experimental model, we design a dataset and methodology to quantify the spatial
dependency of network performance. Our dataset is constructed by inserting
objects into high resolution backgrounds, thereby allowing us to crop
sub-images which place target objects at specific locations relative to the
image border. By probing the behaviour of Mask R-CNN across a selection of
target locations, we see clear patterns of performance degradation near the
image boundary, and in particular in the image corners. Quantifying both the
extent and magnitude of this spatial anisotropy in network performance is
important for the deployment of deep networks into unconstrained and realistic
environments in which the location of objects or regions of interest are not
guaranteed to be well localized within a given image.
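The compounding effect the abstract describes can be made concrete with a little receptive-field arithmetic. The sketch below (our own illustration, not code from the paper) estimates how wide a strip of input pixels ends up influenced by padding after a stack of 'same'-padded convolutions: each layer taints a strip of `kernel_size // 2` activations at its own resolution, and stride-based down-sampling multiplies what that strip covers back at the input resolution. The layer stack used is a hypothetical VGG-like example, not the Mask R-CNN backbone.

```python
# Sketch: estimate the width (in input pixels) of the border strip whose
# activations are influenced by zero-padding, after a stack of conv layers.
def padded_border_width(layers):
    """layers: list of (kernel_size, stride) tuples, applied in order.

    A 'same'-padded conv taints a strip of kernel_size // 2 activations
    at its own resolution; each stride-s down-sampling multiplies the
    input-pixel footprint of subsequent strips by s.
    """
    border = 0  # tainted strip width, measured in input pixels
    jump = 1    # input pixels spanned per activation step at this depth
    for k, s in layers:
        border += (k // 2) * jump
        jump *= s
    return border

# A small VGG-like stack: five 3x3 convs with 2x down-sampling twice.
stack = [(3, 1), (3, 2), (3, 1), (3, 2), (3, 1)]
print(padded_border_width(stack))  # 1 + 1 + 2 + 2 + 4 = 10 input pixels
```

Even this shallow five-layer stack taints a 10-pixel strip on every side; for the much deeper backbones the paper targets, the strip can plausibly cover a non-negligible fraction of a modest-resolution image, which is the spatial anisotropy the authors set out to quantify.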
Related papers
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context.
Results show that the approach achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z) - Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z) - CEC-CNN: A Consecutive Expansion-Contraction Convolutional Network for
Very Small Resolution Medical Image Classification [0.8108972030676009]
We introduce a new CNN architecture which preserves multi-scale features from deep, intermediate, and shallow layers.
Using a dataset of very low-resolution patches from Pancreatic Ductal Adenocarcinoma (PDAC) CT scans, we demonstrate that our network can outperform current state-of-the-art models.
arXiv Detail & Related papers (2022-09-27T20:01:12Z) - Global and Local Alignment Networks for Unpaired Image-to-Image
Translation [170.08142745705575]
The goal of unpaired image-to-image translation is to produce an output image reflecting the target domain's style.
Due to the lack of attention to the content change in existing methods, semantic information from source images suffers from degradation during translation.
We introduce a novel approach, Global and Local Alignment Networks (GLA-Net).
Our method effectively generates sharper and more realistic images than existing approaches.
arXiv Detail & Related papers (2021-11-19T18:01:54Z) - Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z) - AINet: Association Implantation for Superpixel Segmentation [82.21559299694555]
We propose a novel Association Implantation (AI) module to enable the network to explicitly capture the relations between a pixel and its surrounding grids.
Our method not only achieves state-of-the-art performance but also maintains satisfactory inference efficiency.
arXiv Detail & Related papers (2021-01-26T10:40:13Z) - Deformable spatial propagation network for depth completion [2.5306673456895306]
We propose a deformable spatial propagation network (DSPN) to adaptively generate a different receptive field and affinity matrix for each pixel.
It allows the network to obtain information from far fewer but more relevant pixels for propagation.
arXiv Detail & Related papers (2020-07-08T16:39:50Z) - Multi-scale Cloud Detection in Remote Sensing Images using a Dual
Convolutional Neural Network [4.812718493682455]
CNNs have advanced the state of the art in pixel-level classification of remote sensing images.
We propose an architecture of two cascaded CNN model components successively processing undersampled and full resolution images.
We achieve a 16% relative improvement in pixel accuracy over a CNN baseline based on patching.
arXiv Detail & Related papers (2020-06-01T10:27:42Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.