Divergences in Color Perception between Deep Neural Networks and Humans
- URL: http://arxiv.org/abs/2309.05809v1
- Date: Mon, 11 Sep 2023 20:26:40 GMT
- Title: Divergences in Color Perception between Deep Neural Networks and Humans
- Authors: Ethan O. Nadler, Elise Darragh-Ford, Bhargav Srinivasa Desikan,
Christian Conaway, Mark Chu, Tasker Hull, Douglas Guilbeault
- Abstract summary: We develop experiments for evaluating the perceptual coherence of color embeddings in deep neural networks (DNNs).
We assess how well these algorithms predict human color similarity judgments collected via an online survey.
We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition.
- Score: 3.0315685825606633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are increasingly proposed as models of human
vision, bolstered by their impressive performance on image classification and
object recognition tasks. Yet, the extent to which DNNs capture fundamental
aspects of human vision such as color perception remains unclear. Here, we
develop novel experiments for evaluating the perceptual coherence of color
embeddings in DNNs, and we assess how well these algorithms predict human color
similarity judgments collected via an online survey. We find that
state-of-the-art DNN architectures, including convolutional neural networks
and vision transformers, provide color similarity judgments that strikingly
diverge from human color judgments of (i) images with controlled color
properties, (ii) images generated from online searches, and (iii) real-world
images from the canonical CIFAR-10 dataset. We compare DNN performance against
an interpretable and cognitively plausible model of color perception based on
wavelet decomposition, inspired by foundational theories in computational
neuroscience. While one deep learning model (a convolutional DNN trained on
a style transfer task) captures some aspects of human color perception, our
wavelet algorithm provides more coherent color embeddings that better predict
human color judgments compared to all DNNs we examine. These results hold when
altering the high-level visual task used to train similar DNN architectures
(e.g., image classification versus image segmentation), as well as when
examining the color embeddings of different layers in a given DNN architecture.
These findings break new ground in the effort to analyze the perceptual
representations of machine learning algorithms and to improve their ability to
serve as cognitively plausible models of human vision. Implications for machine
learning, human perception, and embodied cognition are discussed.
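As a rough illustration of the kind of interpretable baseline the abstract describes, the sketch below builds a color embedding from per-channel wavelet subband energies and compares embeddings with cosine similarity. This is a hedged approximation: the specific transform (one-level Haar), color space (raw RGB), and pooling (RMS energy per subband) are assumptions for illustration, not the authors' exact algorithm.

```python
import numpy as np

def haar_subbands(ch):
    """One level of a 2-D Haar transform: approximation + 3 detail subbands.
    Assumes even height and width."""
    a, b = ch[0::2, :], ch[1::2, :]
    lo_r, hi_r = (a + b) / 2.0, (a - b) / 2.0          # row-wise low/high pass
    def cols(x):
        l, r = x[:, 0::2], x[:, 1::2]
        return (l + r) / 2.0, (l - r) / 2.0            # column-wise low/high pass
    ll, lh = cols(lo_r)
    hl, hh = cols(hi_r)
    return ll, lh, hl, hh

def color_embedding(img):
    """Per-channel subband RMS energies as a small, interpretable color descriptor."""
    feats = []
    for c in range(img.shape[2]):
        for band in haar_subbands(img[:, :, c].astype(float)):
            feats.append(np.sqrt(np.mean(band ** 2)))
    return np.array(feats)

def cosine_similarity(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

On solid-color patches this descriptor separates hues cleanly (all detail bands vanish, so only the per-channel approximation energies differ), which is the kind of coherence the paper's experiments with controlled color images probe.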
Related papers
- Dimensions underlying the representational alignment of deep neural networks with humans [3.1668470116181817]
We propose a generic framework for yielding comparable representations in humans and deep neural networks (DNNs)
Applying this framework to humans and a DNN model of natural images revealed a low-dimensional DNN embedding of both visual and semantic dimensions.
In contrast to humans, DNNs exhibited a clear dominance of visual over semantic features, indicating divergent strategies for representing images.
arXiv Detail & Related papers (2024-06-27T11:14:14Z)
- Color Equivariant Convolutional Networks [50.655443383582124]
CNNs struggle if there is data imbalance between color variations introduced by accidental recording conditions.
We propose Color Equivariant Convolutions (CEConvs), a novel deep learning building block that enables shape feature sharing across the color spectrum.
We demonstrate the benefits of CEConvs in terms of downstream performance on various tasks and improved robustness to color changes, including train-test distribution shifts.
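A minimal sketch of the weight-sharing idea behind color-equivariant convolution: apply one shape kernel to cyclic RGB channel permutations of the input, so a color rotation of the input becomes a permutation of the output stack. The channel-permutation group and the naive convolution loop here are simplifying assumptions, not the CEConvs implementation.

```python
import numpy as np

def rotate_channels(img, k):
    """Cyclically permute RGB channels; a crude stand-in for hue rotation."""
    return np.roll(img, k, axis=2)

def ce_conv(img, kernel):
    """Apply one 3x3x3 kernel (valid padding, no stride) to every channel
    rotation of the input and stack the responses, so the same shape filter
    is reused across the color spectrum."""
    h, w, _ = img.shape
    out = np.zeros((3, h - 2, w - 2))
    for k in range(3):
        x = rotate_channels(img, k)
        for i in range(h - 2):
            for j in range(w - 2):
                out[k, i, j] = np.sum(x[i:i+3, j:j+3, :] * kernel)
    return out
```

The equivariance property this buys: rotating the input's channels permutes the slices of the output, so downstream layers can share shape features across color variations instead of learning each variant separately.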
arXiv Detail & Related papers (2023-10-30T09:18:49Z)
- Harmonizing the object recognition strategies of deep neural networks with humans [10.495114898741205]
We show that state-of-the-art deep neural networks (DNNs) are becoming less aligned with humans as their accuracy improves.
Our work represents the first demonstration that the scaling laws that are guiding the design of DNNs today have also produced worse models of human vision.
arXiv Detail & Related papers (2022-11-08T20:03:49Z)
- A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, a process that shapes several human cognitive functions.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z)
- Impact of Colour Variation on Robustness of Deep Neural Networks [0.0]
Deep neural networks (DNNs) have shown state-of-the-art performance for computer vision applications like image classification, segmentation and object detection.
Recent advances have shown their vulnerability to manual digital perturbations in the input data, namely adversarial attacks.
In this work, we propose a color-variation dataset built by distorting the RGB colors of a subset of ImageNet images with 27 different combinations.
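One plausible way to enumerate such a color-variation scheme is shown below: three fixed offsets per channel yield 3³ = 27 (r, g, b) shift combinations. The offset values and the additive-shift distortion are assumptions for illustration; the paper may use a different perturbation scheme.

```python
import numpy as np
from itertools import product

def distort_rgb(img, shifts):
    """Add a fixed per-channel offset to an RGB image and clip to [0, 255]."""
    out = img.astype(float) + np.array(shifts)
    return np.clip(out, 0.0, 255.0).astype(np.uint8)

# Hypothetical offsets; 3 choices per channel -> 27 distortion variants.
OFFSETS = [-64, 0, 64]
COMBINATIONS = list(product(OFFSETS, repeat=3))
```

Sweeping a model over all 27 variants of each image then gives a simple robustness curve: accuracy as a function of how far the distorted colors sit from the training distribution.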
arXiv Detail & Related papers (2022-09-02T08:16:04Z)
- Learning to Structure an Image with Few Colors and Beyond [59.34619548026885]
We propose a color quantization network, ColorCNN, which learns to structure an image in limited color spaces by minimizing the classification loss.
We introduce ColorCNN+, which supports multiple color space size configurations, and addresses the previous issues of poor recognition accuracy and undesirable visual fidelity under large color spaces.
For potential applications, we show that ColorCNNs can be used as image compression methods for network recognition.
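To make the "limited color space" idea concrete, here is a fixed-palette quantizer that maps every pixel to its nearest palette color. ColorCNN learns this mapping end-to-end against a classification loss; this nearest-neighbor version is only a hand-rolled stand-in for the structure it produces.

```python
import numpy as np

def quantize_to_palette(img, palette):
    """Map every pixel to its nearest palette color (Euclidean distance in RGB).

    img: (H, W, 3) array; palette: (K, 3) float array of allowed colors.
    Returns an (H, W, 3) image containing only palette colors.
    """
    pixels = img.reshape(-1, 3).astype(float)                      # (N, 3)
    d = np.linalg.norm(pixels[:, None, :] - palette[None, :, :], axis=2)  # (N, K)
    return palette[np.argmin(d, axis=1)].reshape(img.shape)
```

With K colors the quantized image needs only log2(K) bits per pixel plus the palette, which is the compression angle the summary mentions; the learned version trades visual fidelity for whatever structure best preserves network recognition.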
arXiv Detail & Related papers (2022-08-17T17:59:15Z)
- Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models of their functional analogue in the brain, the ventral stream of visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
- Comparing object recognition in humans and deep convolutional neural networks -- An eye tracking study [7.222232547612573]
Deep convolutional neural networks (DCNNs) and the ventral visual pathway share vast architectural and functional similarities.
We demonstrate a comparison of human observers (N = 45) and three feedforward DCNNs through eye tracking and saliency maps.
A DCNN with biologically plausible receptive field sizes, called vNet, shows higher agreement with human viewing behavior than a standard ResNet architecture.
arXiv Detail & Related papers (2021-07-30T23:32:05Z)
- Assessing The Importance Of Colours For CNNs In Object Recognition [70.70151719764021]
Convolutional neural networks (CNNs) have been shown to exhibit conflicting properties.
We demonstrate that CNNs often rely heavily on colour information while making a prediction.
We evaluate a model trained with congruent images on congruent, greyscale, and incongruent images.
arXiv Detail & Related papers (2020-12-12T22:55:06Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory (LSTM) units, as well as inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Seeing eye-to-eye? A comparison of object recognition performance in humans and deep convolutional neural networks under image manipulation [0.0]
This study aims towards a behavioral comparison of visual core object recognition performance between humans and feedforward neural networks.
Analyses of accuracy revealed that humans not only outperform DCNNs on all conditions, but also display significantly greater robustness to shape and, most notably, color alterations.
arXiv Detail & Related papers (2020-07-13T10:26:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.