A novel feature-scrambling approach reveals the capacity of
convolutional neural networks to learn spatial relations
- URL: http://arxiv.org/abs/2212.06021v1
- Date: Mon, 12 Dec 2022 16:40:29 GMT
- Title: A novel feature-scrambling approach reveals the capacity of
convolutional neural networks to learn spatial relations
- Authors: Amr Farahat, Felix Effenberger, Martin Vinck
- Abstract summary: Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition.
Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Convolutional neural networks (CNNs) are one of the most successful computer
vision systems to solve object recognition. Furthermore, CNNs have major
applications in understanding the nature of visual representations in the human
brain. Yet it remains poorly understood how CNNs actually make their decisions,
what the nature of their internal representations is, and how their recognition
strategies differ from humans. Specifically, there is a major debate about the
question of whether CNNs primarily rely on surface regularities of objects, or
whether they are capable of exploiting the spatial arrangement of features,
similar to humans. Here, we develop a novel feature-scrambling approach to
explicitly test whether CNNs use the spatial arrangement of features (i.e.
object parts) to classify objects. We combine this approach with a systematic
manipulation of effective receptive field sizes of CNNs as well as minimal
recognizable configurations (MIRCs) analysis. In contrast to much previous
literature, we provide evidence that CNNs are in fact capable of using
relatively long-range spatial relationships for object classification.
Moreover, the extent to which CNNs use spatial relationships depends heavily on
the dataset, e.g. texture vs. sketch. In fact, CNNs even use different
strategies for different classes within heterogeneous datasets (ImageNet),
suggesting CNNs have a continuous spectrum of classification strategies.
Finally, we show that CNNs learn the spatial arrangement of features only up to
an intermediate level of granularity, which suggests that intermediate rather
than global shape features provide the optimal trade-off between sensitivity
and specificity in object classification. These results provide novel insights
into the nature of CNN representations and the extent to which they rely on the
spatial arrangement of features for object classification.
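The core idea of the feature-scrambling probe can be illustrated with a minimal sketch (an illustrative reconstruction, not the authors' released code; the grid size and numpy-based implementation are assumptions): patches of an image are permuted so that local texture is preserved while the global spatial arrangement of features is destroyed.

```python
# Hedged sketch of a patch-scrambling probe: split an image (H, W, C) into a
# grid x grid array of patches and permute them. Local appearance survives;
# the spatial arrangement of parts does not.
import numpy as np

def scramble_patches(image, grid=4, rng=None):
    """Return a copy of `image` with its grid x grid patches randomly permuted."""
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[0] // grid, image.shape[1] // grid
    patches = [image[i*h:(i+1)*h, j*w:(j+1)*w].copy()
               for i in range(grid) for j in range(grid)]
    order = rng.permutation(len(patches))
    out = image.copy()
    for idx, k in enumerate(order):
        i, j = divmod(idx, grid)
        out[i*h:(i+1)*h, j*w:(j+1)*w] = patches[k]
    return out

# Same pixels, rearranged: a classifier that exploits spatial arrangement
# should lose accuracy as the grid gets finer; a purely texture-driven one
# should be largely unaffected.
img = np.arange(8*8*3, dtype=np.float32).reshape(8, 8, 3)
scrambled = scramble_patches(img, grid=4)
assert scrambled.shape == img.shape
assert np.allclose(np.sort(scrambled.ravel()), np.sort(img.ravel()))
```

Comparing accuracy on intact versus scrambled inputs across grid granularities is one way to operationalize the question the abstract raises.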
Related papers
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - Wider Vision: Enriching Convolutional Neural Networks via Alignment to
External Knowledge Bases [0.3867363075280543]
We aim to explain and expand CNN models via the mirroring or alignment of the CNN to an external knowledge base.
This will allow us to give a semantic context or label for each visual feature.
Our results show that in the aligned embedding space, nodes from the knowledge graph are close to the CNN feature nodes that have similar meanings.
arXiv Detail & Related papers (2021-02-22T16:00:03Z) - The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z) - Exploring the Interchangeability of CNN Embedding Spaces [0.5735035463793008]
We map between 10 image-classification CNNs and between 4 facial-recognition CNNs.
For CNNs trained to the same classes and sharing a common backend-logit architecture, a linear mapping may always be calculated directly from the backend layer weights.
The implications are far-reaching, suggesting an underlying commonality between representations learned by networks designed and trained for a common task.
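The kind of linear map between embedding spaces that this paper describes can be sketched numerically (a hedged illustration on synthetic data; the paper computes such a map analytically from the backend layer weights, whereas this sketch only assumes paired embeddings from the two networks):

```python
# Hedged sketch: fit a linear map W so that embeddings from network A,
# right-multiplied by W, approximate the paired embeddings from network B.
import numpy as np

def fit_linear_map(emb_a, emb_b):
    """Least-squares W such that emb_a @ W approximates emb_b."""
    W, *_ = np.linalg.lstsq(emb_a, emb_b, rcond=None)
    return W

# Synthetic stand-ins for two networks' embeddings of the same 200 inputs.
rng = np.random.default_rng(1)
true_W = rng.normal(size=(16, 16))
emb_a = rng.normal(size=(200, 16))
emb_b = emb_a @ true_W  # by construction, exactly linearly related
W = fit_linear_map(emb_a, emb_b)
assert np.allclose(emb_a @ W, emb_b)
```

If real embedding spaces are (approximately) interchangeable in this sense, the residual of such a fit quantifies how much representational structure the two networks share.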
arXiv Detail & Related papers (2020-10-05T20:32:40Z) - IAUnet: Global Context-Aware Feature Learning for Person
Re-Identification [106.50534744965955]
The IAU block enables the features to incorporate global spatial, temporal, and channel context.
It is lightweight, end-to-end trainable, and can be easily plugged into existing CNNs to form IAUnet.
Experiments show that IAUnet performs favorably against the state of the art on both image and video reID tasks.
arXiv Detail & Related papers (2020-09-02T13:07:10Z) - Decoding CNN based Object Classifier Using Visualization [6.666597301197889]
We visualize what type of features are extracted in different convolution layers of CNN.
Visualizing heat map of activation helps us to understand how CNN classifies and localizes different objects in image.
arXiv Detail & Related papers (2020-07-15T05:01:27Z) - Teaching CNNs to mimic Human Visual Cognitive Process & regularise
Texture-Shape bias [18.003188982585737]
Recent experiments in computer vision identify texture bias as the primary reason for the strong results of models employing Convolutional Neural Networks (CNNs).
It is believed that the cost function forces the CNN to take a greedy approach and develop a proclivity for local information like texture to increase accuracy, thus failing to explore any global statistics.
We propose CognitiveCNN, a new intuitive architecture inspired by feature integration theory in psychology, which utilises human-interpretable features like shape, texture, and edges to reconstruct and classify the image.
arXiv Detail & Related papers (2020-06-25T22:32:54Z) - A Systematic Evaluation: Fine-Grained CNN vs. Traditional CNN
Classifiers [54.996358399108566]
We investigate the performance of landmark general CNN classifiers, which have presented top-notch results on large-scale classification datasets.
We compare them against state-of-the-art fine-grained classifiers.
We present an extensive evaluation on six datasets to determine whether fine-grained classifiers are able to improve over the general baselines.
arXiv Detail & Related papers (2020-03-24T23:49:14Z) - Hold me tight! Influence of discriminative features on deep network
boundaries [63.627760598441796]
We propose a new perspective that relates dataset features to the distance of samples to the decision boundary.
This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets.
arXiv Detail & Related papers (2020-02-15T09:29:36Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional
Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.