Studying inductive biases in image classification task
- URL: http://arxiv.org/abs/2210.17141v1
- Date: Mon, 31 Oct 2022 08:43:26 GMT
- Title: Studying inductive biases in image classification task
- Authors: Nana Arizumi
- Abstract summary: Self-attention (SA) structures have locally independent filters and can use large kernels, in contrast to the previously popular convolutional neural networks (CNNs).
We show that context awareness was the crucial property; however, large local information was not necessary to construct CA parameters.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, self-attention (SA) structures have become popular in
computer vision. They have locally independent filters and can use large
kernels, in contrast to the previously popular convolutional neural networks
(CNNs). The success of CNNs was attributed to the hard-coded inductive biases of locality and
spatial invariance. However, recent studies have shown that inductive biases in
CNNs are too restrictive. On the other hand, the relative position encodings,
similar to depthwise (DW) convolution, are necessary for the local SA networks,
which indicates that the SA structures are not entirely spatially variant.
Hence, we would like to determine which part of inductive biases contributes to
the success of the local SA structures. To do so, we introduced context-aware
decomposed attention (CADA), which decomposes attention maps into multiple
trainable base kernels and accumulates them using context-aware (CA)
parameters. This way, we could identify the link between the CNNs and SA
networks. We conducted ablation studies using ResNet50 on the
ImageNet classification task. DW convolution can use a large locality
without increasing computational cost compared to CNNs, but the accuracy
saturates with larger kernels. CADA follows this characteristic of locality. We
showed that context awareness was the crucial property; however, large local
information was not necessary to construct CA parameters. Although the absence
of spatial invariance makes training difficult, more relaxed spatial invariance
gave better accuracy than strict spatial invariance. Also, additional strong
spatial invariance through relative position encoding was preferable. We
extended these experiments to filters for downsampling and showed that locality
bias is more critical for downsampling, but that the strong locality bias can be
removed by using relaxed spatial invariance.
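As a rough illustration of the decomposition described in the abstract, the NumPy sketch below builds a per-position filter as a context-weighted sum of a few shared base kernels. It is not the paper's implementation: the function name `cada_sketch`, the pointwise projection `w_ctx`, the softmax normalization, and the sharing of base kernels across channels are all assumptions made for illustration only.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cada_sketch(x, base_kernels, w_ctx):
    """Hypothetical context-aware mixture of base kernels.

    x:            (H, W, C) feature map
    base_kernels: (K, k, k) trainable base kernels, k odd (assumed shared across channels)
    w_ctx:        (C, K) pointwise projection producing the CA parameters (assumption)
    """
    H, W, C = x.shape
    K, k, k2 = base_kernels.shape
    if k != k2 or k % 2 == 0:
        raise ValueError("base kernels must be square with odd size")
    pad = k // 2

    # Context-aware (CA) parameters: one mixing weight per position and base kernel.
    # They are computed from the pixel itself only (no large local window).
    ca = softmax(x @ w_ctx, axis=-1)  # (H, W, K)

    # Depthwise-style response of every base kernel (spatially invariant part).
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    responses = np.zeros((K, H, W, C))
    for i in range(k):
        for j in range(k):
            patch = xp[i:i + H, j:j + W, :]  # input shifted by (i - pad, j - pad)
            responses += base_kernels[:, i, j][:, None, None, None] * patch

    # Accumulate base-kernel responses with the position-dependent CA weights.
    return np.einsum("hwk,khwc->hwc", ca, responses)


# Example usage with random inputs (shapes are arbitrary).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))
kernels = rng.normal(size=(3, 5, 5))
w_ctx = rng.normal(size=(4, 3))
print(cada_sketch(x, kernels, w_ctx).shape)  # (8, 8, 4)
```

In this sketch, setting the number of base kernels to one recovers an ordinary spatially invariant depthwise filter, while using more base kernels lets the effective filter vary with position through the CA weights, which is one way to read "relaxed spatial invariance".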
Related papers
- Improved Generalization of Weight Space Networks via Augmentations [53.87011906358727]
Learning in deep weight spaces (DWS) is an emerging research direction, with applications to 2D and 3D neural fields (INRs, NeRFs).
We empirically analyze the reasons for this overfitting and find that a key reason is the lack of diversity in DWS datasets.
To address this, we explore strategies for data augmentation in weight spaces and propose a MixUp method adapted for weight spaces.
arXiv Detail & Related papers (2024-02-06T15:34:44Z) - CPR++: Object Localization via Single Coarse Point Supervision [55.8671776333499]
Coarse point refinement (CPR) is the first attempt to alleviate semantic variance from an algorithmic perspective.
CPR reduces semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.
CPR++ can obtain scale information and further reduce the semantic variance in a global region.
arXiv Detail & Related papers (2024-01-30T17:38:48Z) - What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - Counting Varying Density Crowds Through Density Guided Adaptive Selection CNN and Transformer Estimation [25.050801798414263]
Humans tend to locate and count targets in low-density regions, and to estimate the number in high-density regions.
We propose a CNN and Transformer Adaptive Selection Network (CTASNet) which can adaptively select the appropriate counting branch for different density regions.
arXiv Detail & Related papers (2022-06-21T02:05:41Z) - Rethinking Spatial Invariance of Convolutional Networks for Object Counting [119.83017534355842]
We try to use locally connected Gaussian kernels to replace the original convolution filter to estimate the spatial position in the density map.
Inspired by previous work, we propose a low-rank approximation accompanied by translation invariance to efficiently approximate massive Gaussian convolution.
Our methods significantly outperform other state-of-the-art methods and achieve promising learning of the spatial position of objects.
arXiv Detail & Related papers (2022-06-10T17:51:25Z) - SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
Despeckling is an important problem in remote sensing as speckle degrades SAR images.
Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods.
This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field.
We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
arXiv Detail & Related papers (2022-05-31T15:55:37Z) - Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Localized convolutional neural networks for geospatial wind forecasting [0.0]
Convolutional Neural Networks (CNN) possess positive qualities when it comes to many spatial data.
In this work, we propose localized convolutional neural networks that enable CNNs to learn local features in addition to the global ones.
They can be added to any convolutional layers, easily end-to-end trained, introduce minimal additional complexity, and let CNNs retain most of their benefits to the extent that they are needed.
arXiv Detail & Related papers (2020-05-12T17:14:49Z) - Understanding when spatial transformer networks do not support invariance, and what to do about it [0.0]
Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations.
We show that STNs do not have the ability to align the feature maps of a transformed image with those of its original.
We investigate alternative STN architectures that make use of complex features.
arXiv Detail & Related papers (2020-04-24T12:20:35Z) - Revisiting Saliency Metrics: Farthest-Neighbor Area Under Curve [23.334584322129142]
Saliency detection has been widely studied because it plays an important role in various vision applications.
It is difficult to evaluate saliency systems because each measure has its own bias.
We propose a new saliency metric based on the AUC property, which aims at sampling a more directional negative set for evaluation.
arXiv Detail & Related papers (2020-02-24T20:55:42Z) - Revisiting Spatial Invariance with Low-Rank Local Connectivity [33.430515807834254]
In experiments with small convolutional networks, we find that relaxing spatial invariance improves classification accuracy over both convolution and locally connected layers.
arXiv Detail & Related papers (2020-02-07T18:56:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.