Pooling Revisited: Your Receptive Field is Suboptimal
- URL: http://arxiv.org/abs/2205.15254v1
- Date: Mon, 30 May 2022 17:03:40 GMT
- Title: Pooling Revisited: Your Receptive Field is Suboptimal
- Authors: Dong-Hwan Jang, Sanghyeok Chu, Joonhyuk Kim, Bohyung Han
- Abstract summary: The size and shape of the receptive field determine how the network aggregates local information.
We propose a simple yet effective Dynamically Optimized Pooling operation, referred to as DynOPool.
Our experiments show that the models equipped with the proposed learnable resizing module outperform the baseline networks on multiple datasets in image classification and semantic segmentation.
- Score: 35.11562214480459
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The size and shape of the receptive field determine how the network
aggregates local information and affect the overall performance of a model
considerably. Many components in a neural network, such as kernel sizes and
strides for convolution and pooling operations, influence the configuration of
a receptive field. However, they still rely on hyperparameters, and the
receptive fields of existing models result in suboptimal shapes and sizes.
Hence, we propose a simple yet effective Dynamically Optimized Pooling
operation, referred to as DynOPool, which optimizes the scale factors of
feature maps end-to-end by learning the desirable size and shape of its
receptive field in each layer. Any kind of resizing modules in a deep neural
network can be replaced by the operations with DynOPool at a minimal cost.
Also, DynOPool controls the complexity of a model by introducing an additional
loss term that constrains computational cost. Our experiments show that the
models equipped with the proposed learnable resizing module outperform the
baseline networks on multiple datasets in image classification and semantic
segmentation.
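As a rough illustration of the idea described above (a sketch, not the authors' implementation), the module below resamples a feature map with per-axis scale factors that are themselves learnable parameters; because the bilinear sampling grid is built differentiably from those scales, they can be trained end-to-end with the rest of the network, and a simple penalty on the product of the scales stands in for the complexity-constraining loss term mentioned in the abstract. The names `LearnableResize`, `init_scale`, and `cost_penalty` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableResize(nn.Module):
    """Hypothetical sketch of a learnable resizing module in the spirit of
    DynOPool: per-axis scale factors are nn.Parameters, and the bilinear
    sampling grid depends on them differentiably, so gradients reach them."""

    def __init__(self, init_scale=(0.5, 0.5)):
        super().__init__()
        # store log-scales so the effective scales stay positive during training
        self.log_scale = nn.Parameter(torch.log(torch.tensor(init_scale)))

    def forward(self, x):
        n, _, h, w = x.shape
        scale_h, scale_w = self.log_scale.exp()
        # output resolution is rounded (non-differentiable), but the grid
        # coordinates below vary smoothly with the learned scales
        out_h = max(int((scale_h * h).round().item()), 1)
        out_w = max(int((scale_w * w).round().item()), 1)
        ys = (torch.arange(out_h, dtype=x.dtype, device=x.device) + 0.5) / scale_h
        xs = (torch.arange(out_w, dtype=x.dtype, device=x.device) + 0.5) / scale_w
        ys = ys / h * 2 - 1  # normalize to [-1, 1] for grid_sample
        xs = xs / w * 2 - 1
        gy, gx = torch.meshgrid(ys, xs, indexing="ij")
        grid = torch.stack((gx, gy), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
        return F.grid_sample(x, grid, align_corners=False)

    def cost_penalty(self):
        # stand-in for the complexity-constraining loss term: the output area
        # (and hence downstream compute) scales with this product
        return self.log_scale.exp().prod()
```

In use, a fixed stride-2 pooling layer would be swapped for `LearnableResize(init_scale=(0.5, 0.5))` and a term like `lambda * module.cost_penalty()` added to the task loss; how DynOPool actually parameterizes and optimizes the scales is detailed in the paper itself.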
Related papers
- MorphPool: Efficient Non-linear Pooling & Unpooling in CNNs [9.656707333320037]
Pooling is essentially an operation from the field of Mathematical Morphology, with max pooling as a limited special case.
In addition to pooling operations, encoder-decoder networks used for pixel-level predictions also require unpooling.
Extensive experimentation on two tasks and three large-scale datasets shows that morphological pooling and unpooling lead to improved predictive performance at much reduced parameter counts.
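As a small illustration of the morphology connection described above (not MorphPool's actual implementation), the hypothetical `DilationPool2d` below performs dilation-style pooling with a learnable per-channel structuring element; with the structuring element fixed at zero it reduces to ordinary max pooling, the "limited special case" mentioned in the summary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilationPool2d(nn.Module):
    """Hypothetical sketch of morphological (dilation-style) pooling:
    out(p) = max over the k x k window of (x(q) + s(q)), with a learnable
    per-channel structuring element s. Setting s = 0 recovers max pooling."""

    def __init__(self, channels, kernel_size=2, stride=2):
        super().__init__()
        self.k, self.stride = kernel_size, stride
        self.struct = nn.Parameter(torch.zeros(channels, kernel_size * kernel_size))

    def forward(self, x):
        n, c, h, w = x.shape
        # extract k x k windows: (n, c * k*k, L), where L is the number of windows
        patches = F.unfold(x, self.k, stride=self.stride)
        patches = patches.view(n, c, self.k * self.k, -1)
        # add the structuring element before taking the window maximum
        out = (patches + self.struct.view(1, c, -1, 1)).amax(dim=2)
        out_h = (h - self.k) // self.stride + 1
        out_w = (w - self.k) // self.stride + 1
        return out.view(n, c, out_h, out_w)
```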
arXiv Detail & Related papers (2022-11-25T11:25:20Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network, that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling [82.08631594071656]
Pooling layers are essential building blocks of Convolutional Neural Networks (CNNs).
We propose an adaptive and exponentially weighted pooling method named adaPool.
We demonstrate how adaPool improves the preservation of detail through a range of tasks including image and video classification and object detection.
arXiv Detail & Related papers (2021-11-01T08:50:37Z)
- PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.
The approach is further validated on the recent transformer-based image recognition model ViT, showing a consistent efficiency gain.
arXiv Detail & Related papers (2021-09-15T01:10:30Z)
- Refining activation downsampling with SoftPool [74.1840492087968]
Convolutional Neural Networks (CNNs) use pooling to decrease the size of activation maps.
We propose SoftPool: a fast and efficient method for exponentially weighted activation downsampling.
We show that SoftPool can retain more information in the reduced activation maps.
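As a rough sketch of what exponentially weighted downsampling can look like (assumed from the description above, not the paper's own code), each value in a pooling window is weighted by its softmax share before summation; the hypothetical helper below expresses this with two average-pooling calls, since the common window-size factors cancel.

```python
import torch
import torch.nn.functional as F

def soft_pool2d(x, kernel_size=2, stride=2):
    """Hypothetical sketch of exponentially weighted downsampling: within each
    window, weight activations by softmax(a) and sum, i.e.
    sum_i exp(a_i) * a_i / sum_j exp(a_j)."""
    e = x.exp()
    # avg_pool of (e * x) divided by avg_pool of e equals the softmax-weighted
    # sum over each window, because the 1/(k*k) factors cancel
    return F.avg_pool2d(e * x, kernel_size, stride) / F.avg_pool2d(e, kernel_size, stride)

# example: halve the spatial resolution of a feature map
feat = torch.randn(1, 16, 32, 32)
out = soft_pool2d(feat)  # -> shape (1, 16, 16, 16)
```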
arXiv Detail & Related papers (2021-01-02T12:09:49Z)
- Scaling Wide Residual Networks for Panoptic Segmentation [29.303735643858026]
Wide Residual Networks (Wide-ResNets) are a shallow but wide model variant of the Residual Networks (ResNets).
We revisit its architecture design for the recent challenging panoptic segmentation task, which aims to unify semantic segmentation and instance segmentation.
We demonstrate that such a simple scaling scheme, coupled with grid search, identifies several SWideRNets that significantly advance state-of-the-art performance on panoptic segmentation datasets in both the fast model regime and strong model regime.
arXiv Detail & Related papers (2020-11-23T19:14:11Z)
- Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- Split-Merge Pooling [36.2980225204665]
Split-Merge pooling is introduced to preserve spatial information without subsampling.
We evaluate our approach for dense semantic segmentation of large image sizes taken from the Cityscapes and GTA-5 datasets.
arXiv Detail & Related papers (2020-06-13T23:20:30Z)
- Multi Layer Neural Networks as Replacement for Pooling Operations [13.481518628796692]
We show that one perceptron can already be used effectively as a pooling operation without increasing the complexity of the model.
We compare our approach to tensor convolution with strides as a pooling operation and show that our approach is both effective and reduces complexity.
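A minimal sketch of the idea summarized above (assumed from the description, not the paper's code): extract each k x k window per channel and feed its values to a single linear unit followed by a nonlinearity, instead of taking the max or mean. The class name `PerceptronPool2d` and the choice of sigmoid are hypothetical placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptronPool2d(nn.Module):
    """Hypothetical sketch of pooling with a single perceptron: the k*k values
    of each window (per channel) pass through one linear unit plus a
    nonlinearity, rather than a max or average."""

    def __init__(self, kernel_size=2, stride=2):
        super().__init__()
        self.k, self.stride = kernel_size, stride
        self.fc = nn.Linear(kernel_size * kernel_size, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        patches = F.unfold(x, self.k, stride=self.stride)      # (n, c*k*k, L)
        patches = patches.view(n, c, self.k * self.k, -1)      # (n, c, k*k, L)
        out = torch.sigmoid(self.fc(patches.transpose(2, 3)))  # (n, c, L, 1)
        out_h = (h - self.k) // self.stride + 1
        out_w = (w - self.k) // self.stride + 1
        return out.squeeze(-1).view(n, c, out_h, out_w)
```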
arXiv Detail & Related papers (2020-06-12T07:08:38Z)