Do DNNs trained on Natural Images acquire Gestalt Properties?
- URL: http://arxiv.org/abs/2203.07302v1
- Date: Mon, 14 Mar 2022 17:06:11 GMT
- Title: Do DNNs trained on Natural Images acquire Gestalt Properties?
- Authors: Valerio Biscione, Jeffrey S. Bowers
- Abstract summary: Deep Neural Networks (DNNs) trained on natural images have been proposed as compelling models of human vision.
We compared human and DNN responses in discrimination judgments.
We found that networks trained on natural images exhibited sensitivity to shapes at the last stage of classification.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Under some circumstances, humans tend to perceive individual elements as a
group or "whole". This has been widely investigated for more than a century by
the school of Gestalt Psychology, which formulated several laws of perceptual
grouping. Recently, Deep Neural Networks (DNNs) trained on natural images have
been proposed as compelling models of human vision based on reports that they
learn internal representations similar to the primate ventral visual stream and
show similar patterns of errors in object classification tasks. That is, DNNs
often perform well on brain and behavioral benchmarks. Here we compared human
and DNN responses in discrimination judgments that assess a range of Gestalt
organization principles (Pomerantz et al., 1977; Pomerantz and Portillo, 2011).
Amongst the DNNs tested we selected models that perform well on the Brain-Score
benchmark (Schrimpf et al., 2018). We found that networks trained on natural
images exhibited sensitivity to shapes at the last stage of classification,
which in some cases matched human responses. When shape familiarity was
controlled for (by using dot patterns that would not resemble shapes), we found
the networks were insensitive to the standard Gestalt principles of proximity,
orientation, and linearity, which have been shown to have a strong and robust
effect on humans. This shows that models that perform well on behavioral and
brain benchmarks nevertheless miss fundamental principles of human vision.
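For intuition about what such a comparison involves, here is a minimal sketch, assuming a standard ImageNet-pretrained network, of how one might measure a configural-superiority effect in a DNN's final-stage features. It is an illustration, not the authors' published method: the choice of ResNet-50, the stimulus filenames, and the cosine-distance readout are all assumptions.

```python
# A minimal sketch (not the paper's published code) of probing a pretrained
# network for configural superiority in the style of Pomerantz's odd-one-out
# paradigm. The idea: if adding an identical context element to two stimuli
# makes the odd one stand out MORE in the network's last-stage features, the
# network shows a Gestalt-like grouping effect. Model choice and file names
# are illustrative assumptions.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.eval()

# Drop the classification head to read out the last stage before the logits.
features = torch.nn.Sequential(*list(model.children())[:-1])

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(path: str) -> torch.Tensor:
    """Pooled last-stage feature vector for one stimulus image."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return features(img).flatten()

def distance(a: torch.Tensor, b: torch.Tensor) -> float:
    """Cosine distance between two feature vectors."""
    return 1.0 - torch.nn.functional.cosine_similarity(a, b, dim=0).item()

# Base pair: two stimuli differing only in the target feature (e.g. a left-
# vs right-leaning diagonal). Composite pair: the same stimuli with an
# identical context element added to both (hypothetical file names).
base = distance(embed("base_odd.png"), embed("base_standard.png"))
composite = distance(embed("composite_odd.png"), embed("composite_standard.png"))

# Positive index: the context makes the difference MORE separable in feature
# space, analogous to configural superiority in humans.
print(f"configural superiority index: {composite - base:+.4f}")
```

In Pomerantz's paradigm, adding an identical context element to both items makes the odd item easier for humans to spot; a network with human-like grouping should therefore yield a positive index, whereas the abstract reports that, once shape familiarity is controlled, such Gestalt effects are largely absent.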
Related papers
- Evaluating Multiview Object Consistency in Humans and Image Models [68.36073530804296]
We leverage an experimental design from the cognitive sciences which requires zero-shot visual inferences about object shape.
We collect 35K trials of behavioral data from over 500 participants.
We then evaluate the performance of common vision models.
arXiv Detail & Related papers (2024-09-09T17:59:13Z)
- Dimensions underlying the representational alignment of deep neural networks with humans [3.1668470116181817]
We propose a generic framework for yielding comparable representations in humans and deep neural networks (DNNs).
Applying this framework to humans and a DNN model of natural images revealed a low-dimensional DNN embedding of both visual and semantic dimensions.
In contrast to humans, DNNs exhibited a clear dominance of visual over semantic features, indicating divergent strategies for representing images.
arXiv Detail & Related papers (2024-06-27T11:14:14Z) - Divergences in Color Perception between Deep Neural Networks and Humans [3.0315685825606633]
We develop experiments for evaluating the perceptual coherence of color embeddings in deep neural networks (DNNs)
We assess how well these algorithms predict human color similarity judgments collected via an online survey.
We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition.
arXiv Detail & Related papers (2023-09-11T20:26:40Z)
- Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories.
We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of the publicly available models that predicted human generalisation.
In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z)
- Guiding Visual Attention in Deep Convolutional Neural Networks Based on Human Eye Movements [0.0]
Deep Convolutional Neural Networks (DCNNs) were originally inspired by principles of biological vision.
Recent advances in deep learning seem to decrease this similarity.
We investigate a purely data-driven approach to obtain useful models.
arXiv Detail & Related papers (2022-06-21T17:59:23Z)
- Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models of their functional analogue in the brain, the ventral stream of the visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
- MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images [60.56518548286836]
To generate realistic cloth deformations from novel input poses, watertight meshes or dense full-body scans are usually needed as inputs.
We propose an approach that can quickly generate realistic clothed human avatars, represented as controllable neural SDFs, given only monocular depth images.
arXiv Detail & Related papers (2021-06-22T17:30:12Z)
- Gaze Perception in Humans and CNN-Based Model [66.89451296340809]
We compare how a CNN (convolutional neural network) based model of gaze and humans infer the locus of attention in images of real-world scenes.
We show that compared to the model, humans' estimates of the locus of attention are more influenced by the context of the scene.
arXiv Detail & Related papers (2021-04-17T04:52:46Z)
- Assessing The Importance Of Colours For CNNs In Object Recognition [70.70151719764021]
Convolutional neural networks (CNNs) have been shown to exhibit conflicting properties.
We demonstrate that CNNs often rely heavily on colour information while making a prediction.
We evaluate a model trained with congruent images on congruent, greyscale, and incongruent images.
arXiv Detail & Related papers (2020-12-12T22:55:06Z)
- Fooling the primate brain with minimal, targeted image manipulation [67.78919304747498]
We propose an array of methods for creating minimal, targeted image perturbations that lead to changes in both neuronal activity and perception as reflected in behavior.
Our work shares the same goal as adversarial attacks, namely the manipulation of images with minimal, targeted noise that leads ANN models to misclassify them.
arXiv Detail & Related papers (2020-11-11T08:30:54Z)
- Seeing eye-to-eye? A comparison of object recognition performance in humans and deep convolutional neural networks under image manipulation [0.0]
This study provides a behavioral comparison of core visual object recognition performance between humans and feedforward neural networks.
Analyses of accuracy revealed that humans not only outperform DCNNs on all conditions, but also display significantly greater robustness towards shape and most notably color alterations.
arXiv Detail & Related papers (2020-07-13T10:26:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.