Analyzing the Dependency of ConvNets on Spatial Information
- URL: http://arxiv.org/abs/2002.01827v1
- Date: Wed, 5 Feb 2020 15:22:32 GMT
- Title: Analyzing the Dependency of ConvNets on Spatial Information
- Authors: Yue Fan, Yongqin Xian, Max Maria Losch, Bernt Schiele
- Abstract summary: We propose spatial shuffling and GAP+FC to destroy spatial information during both training and testing phases.
We observe that spatial information can be deleted from later layers with small performance drops.
- Score: 81.93266969255711
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intuitively, image classification should profit from using spatial
information. Recent work, however, suggests that this might be overrated in
standard CNNs. In this paper, we are pushing the envelope and aim to further
investigate the reliance on spatial information. We propose spatial shuffling
and GAP+FC to destroy spatial information during both training and testing
phases. Interestingly, we observe that spatial information can be deleted from
later layers with only small performance drops, which indicates that spatial
information at later layers is not necessary for good performance. For example,
the test accuracy of VGG-16 drops by only 0.03% and 2.66% on CIFAR100 when
spatial information is completely removed from the last 30% and 53% of layers, respectively.
Evaluation on several object recognition datasets (CIFAR100, Small-ImageNet,
ImageNet) with a wide range of CNN architectures (VGG16, ResNet50, ResNet152)
shows an overall consistent pattern.
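As a concrete illustration of the two interventions described in the abstract, the following is a minimal PyTorch-style sketch (the framework choice is an assumption; this is not the authors' code): spatial shuffling randomly permutes the spatial positions of an intermediate feature map, and a GAP+FC head replaces later convolutional layers with global average pooling followed by fully-connected layers.

```python
import torch
import torch.nn as nn

def spatial_shuffle(x: torch.Tensor) -> torch.Tensor:
    """Randomly permute the H*W spatial positions of a feature map.

    x: (N, C, H, W). Applying this at a chosen depth, during both training
    and testing, destroys spatial information from that layer onwards.
    """
    n, c, h, w = x.shape
    perm = torch.randperm(h * w, device=x.device)
    return x.flatten(2)[:, :, perm].view(n, c, h, w)

class GAPFCHead(nn.Module):
    """Replace the remaining convolutional layers with global average
    pooling followed by fully-connected layers (the GAP+FC variant)."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.fc1 = nn.Linear(in_channels, in_channels)
        self.fc2 = nn.Linear(in_channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x.mean(dim=(2, 3))  # GAP collapses all spatial positions
        return self.fc2(torch.relu(self.fc1(x)))
```

Exact placement of these operations, the number of FC layers, and whether the permutation is shared across channels are details of the paper not reproduced here.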
Related papers
- Revealing the Utilized Rank of Subspaces of Learning in Neural Networks [3.4133351364625275]
We study how well the learned weights of a neural network utilize the space available to them.
Most learned weights appear to be full rank, and are therefore not amenable to low rank decomposition.
We propose a simple data-driven transformation that projects the weights onto the subspace where the data and the weight interact.
arXiv Detail & Related papers (2024-07-05T18:14:39Z)
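A hedged sketch of the kind of data-driven projection this entry describes, assuming a linear layer with weight `W` and a batch of its inputs `X` (the names and the use of an SVD are illustrative assumptions, not the paper's exact transformation):

```python
import torch

def project_weight_onto_data_subspace(W: torch.Tensor,
                                      X: torch.Tensor,
                                      k: int) -> torch.Tensor:
    """Keep only the components of W acting on the k dominant input
    directions, i.e. the subspace where data and weights interact.

    W: (out_features, in_features) weight matrix (illustrative).
    X: (num_samples, in_features) batch of layer inputs (illustrative).
    """
    _, _, Vh = torch.linalg.svd(X, full_matrices=False)
    V = Vh[:k].T                 # (in_features, k) orthonormal basis
    return W @ V @ V.T           # projection of W onto span(V)
```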
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models [103.52640413616436]
One of the key shortcomings in current text-to-image (T2I) models is their inability to consistently generate images which faithfully follow the spatial relationships specified in the text prompt.
We create SPRIGHT, the first spatially focused, large-scale dataset, by re-captioning 6 million images from 4 widely used vision datasets.
We find that training on images containing a larger number of objects leads to substantial improvements in spatial consistency; fine-tuning on 500 such images yields state-of-the-art results on T2I-CompBench, with a spatial score of 0.2133.
arXiv Detail & Related papers (2024-04-01T15:55:25Z)
- Reducing Effects of Swath Gaps on Unsupervised Machine Learning Models for NASA MODIS Instruments [0.6157382820537718]
NASA Terra and NASA Aqua satellites capture imagery containing swath gaps, which are areas of no data.
With annotated data as supervision, a model can learn to differentiate between the area of focus and the swath gap.
We propose an augmentation technique that largely removes swath gaps so that CNNs can focus on the region of interest.
arXiv Detail & Related papers (2021-06-13T23:50:05Z)
- Wise-SrNet: A Novel Architecture for Enhancing Image Classification by Learning Spatial Resolution of Feature Maps [0.5892638927736115]
One of the main challenges since the advent of convolutional neural networks is how to connect the extracted feature map to the final classification layer.
In this paper, we aim to tackle this problem by replacing the GAP layer with a new architecture called Wise-SrNet.
It is inspired by the depthwise convolution idea and is designed to process spatial resolution without increasing computational cost.
arXiv Detail & Related papers (2021-04-26T00:37:11Z)
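As a rough sketch of the depthwise idea mentioned here (not Wise-SrNet's actual architecture), one can replace GAP with a depthwise convolution whose kernel spans the whole feature map, so spatial aggregation is learned per channel at little extra cost:

```python
import torch.nn as nn

class DepthwiseSpatialHead(nn.Module):
    """Learn a per-channel spatial weighting of the final feature map
    instead of averaging it away; a sketch, not the published design."""

    def __init__(self, channels: int, feat_size: int, num_classes: int):
        super().__init__()
        # One filter per channel (groups=channels) covering the full map.
        self.spatial = nn.Conv2d(channels, channels, kernel_size=feat_size,
                                 groups=channels, bias=False)
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = self.spatial(x)              # (N, C, 1, 1): learned aggregation
        return self.classifier(x.flatten(1))
```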
- Real-time Semantic Segmentation via Spatial-detail Guided Context Propagation [49.70144583431999]
We propose the spatial-detail guided context propagation network (SGCPNet) for achieving real-time semantic segmentation.
It uses the spatial details of shallow layers to guide the propagation of the low-resolution global contexts, in which the lost spatial information can be effectively reconstructed.
It achieves 69.5% mIoU segmentation accuracy, while its speed reaches 178.5 FPS on 768x1536 images on a GeForce GTX 1080 Ti GPU card.
arXiv Detail & Related papers (2020-05-22T07:07:26Z)
- Spatially Attentive Output Layer for Image Classification [19.61612493183965]
Most convolutional neural networks (CNNs) for image classification use a global average pooling (GAP) followed by a fully-connected (FC) layer for output logits.
We propose a novel spatial output layer on top of the existing convolutional feature maps to explicitly exploit the location-specific output information.
arXiv Detail & Related papers (2020-04-16T10:11:38Z)
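One way to read this entry (a hedged sketch, not the paper's exact layer): produce class logits at every spatial location and aggregate them with a learned attention map instead of pooling the features first.

```python
import torch.nn as nn

class SpatialOutputLayer(nn.Module):
    """Per-location class logits combined by a learned spatial attention
    map; illustrative of the idea, not the published architecture."""

    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.logits = nn.Conv2d(channels, num_classes, kernel_size=1)
        self.attention = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        per_location = self.logits(x)                        # (N, K, H, W)
        weights = self.attention(x).flatten(2).softmax(-1)   # (N, 1, H*W)
        return (per_location.flatten(2) * weights).sum(-1)   # (N, K)
```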
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
- R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.