Seeing eye-to-eye? A comparison of object recognition performance in
humans and deep convolutional neural networks under image manipulation
- URL: http://arxiv.org/abs/2007.06294v2
- Date: Sun, 13 Dec 2020 11:08:45 GMT
- Title: Seeing eye-to-eye? A comparison of object recognition performance in
humans and deep convolutional neural networks under image manipulation
- Authors: Leonard E. van Dyck and Walter R. Gruber
- Abstract summary: This study provides a behavioral comparison of visual core object recognition performance between humans and feedforward neural networks.
Analyses of accuracy revealed that humans not only outperform DCNNs on all conditions, but also display significantly greater robustness towards shape and, most notably, color alterations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep convolutional neural networks (DCNNs) have reached human
benchmark performance in object recognition for some time now. On that account,
computational neuroscience and machine learning have begun to attribute
numerous similarities and differences to artificial and biological vision. This
study provides a behavioral comparison of visual core object recognition
performance between humans and feedforward neural networks in a classification
learning paradigm on an ImageNet data set. For this purpose, human participants
(n = 65) competed in an online experiment against different feedforward DCNNs.
The approach, modeled on a typical learning process for seven different monkey
categories, included a training and validation phase with natural examples, as
well as a testing phase with novel, previously unseen shape and color
manipulations. Analyses of accuracy revealed that humans not only outperform
DCNNs on all conditions, but also display significantly greater robustness
towards shape and, most notably, color alterations. Furthermore, a close
examination of behavioral patterns corroborates these findings by revealing
independent classification errors between the two groups. The obtained results
show that humans contrast strongly with artificial feedforward architectures
when it comes to visual core object recognition of manipulated images. In
general, these findings are in line with a growing body of literature that
points to recurrence as a crucial factor for adequate generalization abilities.
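The testing phase described above is easy to mock up. Below is a minimal
sketch (not the authors' implementation) that scores a pretrained feedforward
DCNN under natural, grayscale, and hue-shifted conditions; the ResNet-50
backbone, the monkey_test_images/ folder, and the FOLDER_TO_IMAGENET mapping
are illustrative assumptions that would need to match the actual categories.

```python
import torch
from torchvision import models, transforms, datasets
from torchvision.transforms import functional as TF

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.eval().to(device)

# Hypothetical mapping from the seven monkey folders (ImageFolder order)
# to ImageNet-1k class indices; substitute the indices of your categories.
FOLDER_TO_IMAGENET = {0: 370, 1: 371, 2: 372, 3: 373, 4: 376, 5: 378, 6: 382}

base = [transforms.Resize(256), transforms.CenterCrop(224)]
norm = [transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]

conditions = {
    "natural":   transforms.Compose(base + norm),
    "grayscale": transforms.Compose(base + [transforms.Grayscale(3)] + norm),
    # A fixed hue rotation as a stand-in for the paper's color alterations.
    "hue_shift": transforms.Compose(
        base + [transforms.Lambda(lambda im: TF.adjust_hue(im, 0.4))] + norm),
}

@torch.no_grad()
def accuracy(folder, tfm):
    ds = datasets.ImageFolder(folder, transform=tfm)  # hypothetical test set
    loader = torch.utils.data.DataLoader(ds, batch_size=32)
    correct = total = 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(1).cpu()
        target = torch.tensor([FOLDER_TO_IMAGENET[int(t)] for t in y])
        correct += (pred == target).sum().item()
        total += len(y)
    return correct / total

for name, tfm in conditions.items():
    print(name, accuracy("monkey_test_images/", tfm))
```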
Related papers
- Comparing supervised learning dynamics: Deep neural networks match human data efficiency but show a generalisation lag [3.0333265803394993]
Recent research has seen many behavioral comparisons between humans and deep neural networks (DNNs) in the domain of image classification.
Here we report a detailed investigation of the learning dynamics in human observers and various classic and state-of-the-art DNNs.
Across the whole learning process, we evaluate and compare how well the learned representations generalize to previously unseen test data.
arXiv Detail & Related papers (2024-02-14T16:47:20Z)
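A sketch of how such learning dynamics might be recorded, assuming a generic
PyTorch model and placeholder data loaders: tracking both curves per epoch is
what makes a generalisation lag visible (training accuracy rising well before
test accuracy).

```python
import torch
import torch.nn as nn

def epoch_accuracy(model, loader, device):
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            pred = model(x.to(device)).argmax(1).cpu()
            correct += (pred == y).sum().item()
            total += len(y)
    return correct / total

def train_with_dynamics(model, train_loader, test_loader, epochs=20,
                        device="cpu"):
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    history = []
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x.to(device)), y.to(device))
            loss.backward()
            opt.step()
        # The lag shows up when train_acc saturates epochs before test_acc.
        history.append((epoch,
                        epoch_accuracy(model, train_loader, device),
                        epoch_accuracy(model, test_loader, device)))
    return history
```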
- Divergences in Color Perception between Deep Neural Networks and Humans [3.0315685825606633]
We develop experiments for evaluating the perceptual coherence of color embeddings in deep neural networks (DNNs).
We assess how well these algorithms predict human color similarity judgments collected via an online survey.
We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition.
arXiv Detail & Related papers (2023-09-11T20:26:40Z)
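One plausible reading of this comparison, sketched below with placeholder
data: correlate a network's embedding distances over color pairs with human
dissimilarity ratings via Spearman's rank correlation. The identity
"embedding" stands in for a real DNN layer.

```python
import numpy as np
from scipy.stats import spearmanr

def embedding_distances(embed, color_pairs):
    """embed maps an RGB triple to a vector; color_pairs is a list of (c1, c2)."""
    return np.array([np.linalg.norm(embed(c1) - embed(c2))
                     for c1, c2 in color_pairs])

# Hypothetical data: three color pairs with human dissimilarity ratings.
color_pairs = [((255, 0, 0), (250, 10, 5)),    # red vs. near-red
               ((255, 0, 0), (255, 165, 0)),   # red vs. orange
               ((255, 0, 0), (0, 0, 255))]     # red vs. blue
human_dissimilarity = np.array([0.05, 0.40, 0.95])

dnn_dist = embedding_distances(lambda c: np.asarray(c, float), color_pairs)
rho, p = spearmanr(dnn_dist, human_dissimilarity)
print(f"Spearman rho = {rho:.2f}")
```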
- Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer Learning of Facial Expression Recognition [62.997667081978825]
We propose a biologically-inspired mechanism for transfer learning in facial expression recognition.
Our proposed architecture provides an explanation for how the human brain might innately recognize facial expressions on varying head shapes.
Our model achieves a classification accuracy of 92.15% on the FERG dataset with extreme data efficiency.
arXiv Detail & Related papers (2023-04-05T09:06:30Z)
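A rough sketch of the norm-referenced idea as described: code an expression
as the deviation of its features from a domain-specific norm face, and
transfer to a new head shape by swapping in only that domain's norm. The
feature representation and matching rule here are assumptions, not the
paper's architecture.

```python
import numpy as np

class NormReferencedClassifier:
    def fit(self, neutral_feats, expr_feats, expr_labels):
        # Norm = mean neutral face of the training domain.
        self.norm = neutral_feats.mean(axis=0)
        # One reference direction per expression, averaged over examples.
        self.directions = {}
        for label in set(expr_labels):
            d = expr_feats[np.array(expr_labels) == label] - self.norm
            d = d.mean(axis=0)
            self.directions[label] = d / np.linalg.norm(d)

    def predict(self, feats, norm=None):
        # Transfer: swap in the new domain's norm; reuse the directions.
        norm = self.norm if norm is None else norm
        dev = feats - norm
        labels = list(self.directions)
        dirs = np.stack([self.directions[l] for l in labels])
        scores = dev @ dirs.T  # match each deviation to each direction
        return [labels[i] for i in scores.argmax(axis=1)]
```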
- Connecting metrics for shape-texture knowledge in computer vision [1.7785095623975342]
Deep neural networks remain brittle, misclassifying images under many changes that leave human judgments unaffected.
Part of this difference in behavior may be explained by the types of features humans and deep neural networks use in vision tasks.
arXiv Detail & Related papers (2023-01-25T14:37:42Z)
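One standard metric connecting shape/texture knowledge to behavior is the
cue-conflict shape bias, sketched below with a hypothetical toy example: on
images whose shape says one class and whose texture says another, count which
cue the model follows.

```python
# shape_bias = shape-consistent / (shape-consistent + texture-consistent)
def shape_bias(predictions, shape_labels, texture_labels):
    shape_hits = texture_hits = 0
    for pred, s, t in zip(predictions, shape_labels, texture_labels):
        if pred == s:
            shape_hits += 1
        elif pred == t:
            texture_hits += 1
        # Predictions matching neither cue are ignored by this metric.
    decided = shape_hits + texture_hits
    return shape_hits / decided if decided else float("nan")

# Toy example: a cat-shaped image with elephant texture, and so on.
preds  = ["cat", "elephant", "dog"]
shapes = ["cat", "cat", "dog"]
tex    = ["elephant", "elephant", "bear"]
print(shape_bias(preds, shapes, tex))  # 2/3 of decided answers follow shape
```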
- Guiding Visual Attention in Deep Convolutional Neural Networks Based on Human Eye Movements [0.0]
Deep Convolutional Neural Networks (DCNNs) were originally inspired by principles of biological vision.
Recent advances in deep learning seem to decrease this similarity.
We investigate a purely data-driven approach that uses human eye movement data to guide visual attention in DCNNs.
arXiv Detail & Related papers (2022-06-21T17:59:23Z)
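A minimal sketch of one way human fixations could guide a DCNN, assuming a
torchvision ResNet-18 and a precomputed fixation density map: reweight an
intermediate feature map by the (resized) fixation map. The gated layer and
the soft-gating constant are illustrative choices, not the paper's method.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def forward_with_fixation_bias(x, fixation_map):
    """x: (B,3,224,224); fixation_map: (B,1,H,W) density in [0,1]."""
    feats = model.conv1(x)
    feats = model.maxpool(model.relu(model.bn1(feats)))
    feats = model.layer2(model.layer1(feats))          # (B,128,28,28)
    gate = F.interpolate(fixation_map, size=feats.shape[-2:], mode="bilinear")
    feats = feats * (0.5 + gate)   # soft gating keeps some signal everywhere
    feats = model.layer4(model.layer3(feats))
    pooled = model.avgpool(feats).flatten(1)
    return model.fc(pooled)

# Usage with a dummy image and a centered square fixation blob:
x = torch.randn(1, 3, 224, 224)
fix = torch.zeros(1, 1, 224, 224)
fix[..., 80:144, 80:144] = 1.0
logits = forward_with_fixation_bias(x, fix)
```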
- Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for their functional analogue in the brain, the ventral stream of visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
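Dimensionality expansion/reduction of this kind is commonly quantified with
the participation ratio, sketched below; it applies equally to CNN layer
activations and recorded cortical responses, which is what makes the
comparison concrete. The random-data usage line is purely illustrative.

```python
import numpy as np

def participation_ratio(responses):
    """responses: (n_stimuli, n_units) activations or firing rates.
    PR = (sum lambda_i)^2 / sum lambda_i^2 over covariance eigenvalues;
    it ranges from 1 (one dominant dimension) to n (fully isotropic)."""
    centered = responses - responses.mean(axis=0)
    sv = np.linalg.svd(centered, compute_uv=False)
    eig = sv ** 2 / (len(responses) - 1)   # covariance eigenvalues
    return eig.sum() ** 2 / (eig ** 2).sum()

# Applying this layer by layer traces an expansion/reduction profile.
print(participation_ratio(np.random.randn(500, 100)))  # high: isotropic noise
```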
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
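A hedged sketch of how the emergence of convolution-like structure could be
quantified after training: measure the spatial spread of each fully-connected
unit's weight mass over the input image. The measure itself is an
illustrative choice, not the paper's analysis.

```python
import numpy as np

def receptive_field_spread(weights, img_side):
    """weights: (n_hidden, img_side*img_side). Returns the mean spatial
    standard deviation of each unit's absolute weights; small values mean
    local, convolution-like receptive fields."""
    ys, xs = np.mgrid[0:img_side, 0:img_side]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    spreads = []
    for w in np.abs(weights):
        p = w / w.sum()                     # weight mass as a distribution
        mean = p @ coords                   # center of mass (y, x)
        var = p @ ((coords - mean) ** 2).sum(axis=1)
        spreads.append(np.sqrt(var))
    return float(np.mean(spreads))
```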
- Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z)
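The swapping augmentation might look like the following sketch, assuming
batches of (neural, behavior, animal, action) samples; the data layout and
the action-matching rule are placeholder assumptions, not the paper's
pipeline.

```python
import random

def swap_across_animals(batch):
    """batch: list of dicts with keys 'neural', 'behavior', 'animal', 'action'."""
    by_action = {}
    for sample in batch:
        by_action.setdefault(sample["action"], []).append(sample)
    swapped = []
    for sample in batch:
        # Re-pair with behavior from a *different* animal, same action,
        # so the encoder cannot rely on animal identity.
        candidates = [s for s in by_action[sample["action"]]
                      if s["animal"] != sample["animal"]]
        partner = random.choice(candidates) if candidates else sample
        swapped.append({"neural": sample["neural"],
                        "behavior": partner["behavior"],  # swapped pairing
                        "animal": sample["animal"],
                        "action": sample["action"]})
    return swapped
```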
- Comparing object recognition in humans and deep convolutional neural networks -- An eye tracking study [7.222232547612573]
Deep convolutional neural networks (DCNNs) and the ventral visual pathway share vast architectural and functional similarities.
We demonstrate a comparison of human observers (N = 45) and three feedforward DCNNs through eye tracking and saliency maps.
A DCNN with biologically plausible receptive field sizes, called vNet, shows higher agreement with human viewing behavior than a standard ResNet architecture.
arXiv Detail & Related papers (2021-07-30T23:32:05Z)
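Agreement between model saliency and human fixations is often scored with
the correlation coefficient (CC), a standard saliency metric sketched below
on placeholder maps; the paper's exact comparison may differ.

```python
import numpy as np

def correlation_coefficient(saliency, fixation_density):
    """Pearson correlation between two same-size 2D maps."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    f = (fixation_density - fixation_density.mean()) / (fixation_density.std() + 1e-8)
    return float((s * f).mean())

# Hypothetical usage: a model saliency map (e.g., input-gradient magnitude)
# against a Gaussian-blurred human fixation map.
model_saliency = np.random.rand(224, 224)
human_fixmap = np.random.rand(224, 224)
print(correlation_coefficient(model_saliency, human_fixmap))
```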
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory (LSTM) units, as well as inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
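A compact sketch of the CNN-LSTM variant described above: a 2D-CNN backbone
encodes each frame, an LSTM integrates over time, and a linear head regresses
continuous emotion. The ResNet-18 backbone, hidden size, and two-dimensional
valence/arousal output are assumptions, not the authors' exact model.

```python
import torch
import torch.nn as nn
from torchvision import models

class CnnLstmRegressor(nn.Module):
    def __init__(self, hidden=256, outputs=2):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        backbone.fc = nn.Identity()             # 512-d frame features
        self.backbone = backbone
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, outputs)  # e.g., valence + arousal

    def forward(self, clips):                   # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.lstm(feats)
        return self.head(seq)                   # per-frame predictions

model = CnnLstmRegressor()
out = model(torch.randn(2, 8, 3, 224, 224))     # -> (2, 8, 2)
```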
- Fooling the primate brain with minimal, targeted image manipulation [67.78919304747498]
We propose an array of methods for creating minimal, targeted image perturbations that lead to changes in both neuronal activity and perception as reflected in behavior.
Our work shares the same goal as adversarial attacks, namely the manipulation of images with minimal, targeted noise that leads ANN models to misclassify them.
arXiv Detail & Related papers (2020-11-11T08:30:54Z)
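The shared machinery with adversarial attacks can be sketched as a targeted
perturbation kept small by an L2 penalty. The model, target class, and
hyperparameters below are placeholders, and this is not the paper's procedure
for driving neuronal responses.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
for p in model.parameters():
    p.requires_grad_(False)

def minimal_targeted_perturbation(image, target_class, steps=100,
                                  lr=0.01, lam=0.1):
    """Gradient descent on a perturbation delta: push the model toward
    target_class while the L2 penalty keeps the change minimal."""
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        opt.zero_grad()
        logits = model((image + delta).clamp(0, 1))
        loss = F.cross_entropy(logits, target) + lam * delta.norm()
        loss.backward()
        opt.step()
    return delta.detach()

x = torch.rand(1, 3, 224, 224)        # stand-in image in [0, 1]
delta = minimal_targeted_perturbation(x, target_class=283)  # hypothetical target
```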
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.