Wise-SrNet: A Novel Architecture for Enhancing Image Classification by
Learning Spatial Resolution of Feature Maps
- URL: http://arxiv.org/abs/2104.12294v3
- Date: Mon, 11 Mar 2024 14:38:54 GMT
- Title: Wise-SrNet: A Novel Architecture for Enhancing Image Classification by
Learning Spatial Resolution of Feature Maps
- Authors: Mohammad Rahimzadeh, AmirAli Askari, Soroush Parvin, Elnaz Safi,
Mohammad Reza Mohammadi
- Abstract summary: One of the main challenges since the advancement of convolutional neural networks is how to connect the extracted feature map to the final classification layer.
In this paper, we aim to tackle this problem by replacing the GAP layer with a new architecture called Wise-SrNet.
It is inspired by the depthwise convolutional idea and is designed for processing spatial resolution while not increasing computational cost.
- Score: 0.5892638927736115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the main challenges since the advancement of convolutional neural
networks is how to connect the extracted feature map to the final
classification layer. VGG models used two sets of fully connected layers for
the classification part of their architectures, which significantly increased
the number of models' weights. ResNet and the next deep convolutional models
used the Global Average Pooling (GAP) layer to compress the feature map and
feed it to the classification layer. Although using the GAP layer reduces the
computational cost, it also discards the spatial resolution of the feature
map, which reduces learning efficiency. In this paper, we aim to
tackle this problem by replacing the GAP layer with a new architecture called
Wise-SrNet. It is inspired by the depthwise convolutional idea and is designed
for processing spatial resolution while not increasing computational cost. We
have evaluated our method using three different datasets: Intel Image
Classification Challenge, MIT Indoors Scenes, and a part of the ImageNet
dataset. We investigated the implementation of our architecture on several
models of the Inception, ResNet, and DenseNet families. Applying our
architecture has revealed a significant effect on increasing convergence speed
and accuracy. Our experiments on images with 224x224 resolution increased the
Top-1 accuracy by 2% to 8% across different datasets and models. Running our
models on 512x512 resolution images of the MIT Indoors Scenes dataset improved
the Top-1 accuracy by 3% to 26%. We also demonstrate the GAP layer's
disadvantage when the input images are large and the number of classes is not
small. In this circumstance, our proposed architecture substantially improves
classification results. The code
is shared at https://github.com/mr7495/image-classification-spatial.
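The abstract describes the key design (a depthwise-convolution-style head that reduces each channel of the feature map with learned spatial weights instead of a plain average) but not its code. The following is a minimal sketch of that idea in Keras, not the authors' implementation from the linked repository: the kernel size, head layout, backbone, and class count (67, as in MIT Indoors Scenes) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code from the linked repository):
# a depthwise-convolution head that learns a spatial weighting per channel,
# contrasted with the standard GAP head it is meant to replace.
import tensorflow as tf
from tensorflow.keras import layers, models


def gap_head(feature_map, num_classes):
    # Baseline head: GAP averages each channel, discarding spatial layout.
    x = layers.GlobalAveragePooling2D()(feature_map)
    return layers.Dense(num_classes, activation="softmax")(x)


def learned_spatial_head(feature_map, num_classes):
    # Depthwise-convolution head in the spirit of Wise-SrNet (layout assumed):
    # a single depthwise kernel spanning the whole feature map (7x7 for a
    # 224x224 ResNet50 input) reduces each channel to one learned spatial
    # combination instead of a plain average, adding only H*W*C weights.
    h, w = feature_map.shape[1], feature_map.shape[2]
    x = layers.DepthwiseConv2D(kernel_size=(h, w), use_bias=False)(feature_map)
    x = layers.Flatten()(x)
    return layers.Dense(num_classes, activation="softmax")(x)


# Example: attach the head to a ResNet50 backbone (67 classes, e.g. MIT Indoors).
backbone = tf.keras.applications.ResNet50(include_top=False,
                                          input_shape=(224, 224, 3))
outputs = learned_spatial_head(backbone.output, num_classes=67)
model = models.Model(backbone.input, outputs)
model.summary()
```

For a 7x7x2048 ResNet50 feature map this head adds roughly 100K weights, far fewer than a VGG-style fully connected head, while keeping a per-channel view of the spatial layout that GAP throws away.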
Related papers
- Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as Parameter-Inverted Image Pyramid Networks (PIIP).
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid.
PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z)
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA on both sets and then, after validating the quality of the generated images, use the trained GANs as one of the augmentation approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
- LR-Net: A Block-based Convolutional Neural Network for Low-Resolution Image Classification [0.0]
We develop a novel image classification architecture, composed of blocks that are designed to learn both low level and global features from noisy and low-resolution images.
Our design of the blocks was heavily influenced by Residual Connections and Inception modules in order to increase performance and reduce parameter sizes.
We have performed in-depth tests that demonstrate the presented architecture is faster and more accurate than existing cutting-edge convolutional neural networks.
arXiv Detail & Related papers (2022-07-19T20:01:11Z)
- Deep Learning Based Automated COVID-19 Classification from Computed Tomography Images [0.0]
The paper presents a Convolutional Neural Network (CNN) model for image classification, aiming to increase predictive performance for COVID-19 diagnosis.
This work proposes a less complex solution based on classifying 2D CT scan slices directly from their pixels with a 2D CNN model.
Despite the simplicity in architecture, the proposed model showed improved quantitative results exceeding state-of-the-art on the same dataset of images.
arXiv Detail & Related papers (2021-11-22T13:35:10Z)
- Convolutional Neural Networks from Image Markers [62.997667081978825]
Feature Learning from Image Markers (FLIM) was recently proposed to estimate convolutional filters, with no backpropagation, from strokes drawn by a user on very few images.
This paper extends FLIM for fully connected layers and demonstrates it on different image classification problems.
The results show that FLIM-based convolutional neural networks can outperform the same architecture trained from scratch by backpropagation.
arXiv Detail & Related papers (2020-12-15T22:58:23Z)
- DenserNet: Weakly Supervised Visual Localization Using Multi-scale Feature Aggregation [7.2531609092488445]
First, we develop a convolutional neural network architecture which aggregates feature maps at different semantic levels for image representations.
Second, our model is trained end-to-end without pixel-level annotation other than positive and negative GPS-tagged image pairs.
Third, our method is computationally efficient as our architecture has shared features and parameters during computation.
arXiv Detail & Related papers (2020-12-04T02:16:47Z)
- KiU-Net: Overcomplete Convolutional Architectures for Biomedical Image and Volumetric Segmentation [71.79090083883403]
"Traditional" encoder-decoder based approaches perform poorly in detecting smaller structures and are unable to segment boundary regions precisely.
We propose KiU-Net which has two branches: (1) an overcomplete convolutional network Kite-Net which learns to capture fine details and accurate edges of the input, and (2) U-Net which learns high level features.
The proposed method achieves a better performance as compared to all the recent methods with an additional benefit of fewer parameters and faster convergence.
arXiv Detail & Related papers (2020-10-04T19:23:33Z)
- Multi-scale Attention U-Net (MsAUNet): A Modified U-Net Architecture for Scene Segmentation [1.713291434132985]
We propose a novel multi-scale attention network for scene segmentation by using contextual information from an image.
This network can map local features with their global counterparts with improved accuracy and emphasize discriminative image regions.
We have evaluated our model on two standard datasets named PascalVOC2012 and ADE20k.
arXiv Detail & Related papers (2020-09-15T08:03:41Z)
- Road Segmentation for Remote Sensing Images using Adversarial Spatial Pyramid Networks [28.32775611169636]
We introduce a new model to apply structured domain adaptation for synthetic image generation and road segmentation.
A novel scale-wise architecture is introduced to learn from the multi-level feature maps and improve the semantics of the features.
Our model achieves state-of-the-art 78.86 IOU on the Massachusetts dataset with 14.89M parameters and 86.78B FLOPs, with 4x fewer FLOPs but higher accuracy (+3.47% IOU).
arXiv Detail & Related papers (2020-08-10T11:00:19Z)
- When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.