Comparing Facial Expression Recognition in Humans and Machines: Using
CAM, GradCAM, and Extremal Perturbation
- URL: http://arxiv.org/abs/2110.04481v1
- Date: Sat, 9 Oct 2021 06:54:41 GMT
- Title: Comparing Facial Expression Recognition in Humans and Machines: Using
CAM, GradCAM, and Extremal Perturbation
- Authors: Serin Park, Christian Wallraven
- Abstract summary: Facial expression recognition (FER) is a topic attracting significant research in both psychology and machine learning.
We compared the recognition performance and attention patterns of humans and machines during a two-alternative forced-choice FER task.
- Score: 5.025654873456756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial expression recognition (FER) is a topic attracting significant
research in both psychology and machine learning with a wide range of
applications. Despite a wealth of research on human FER and considerable
progress in computational FER made possible by deep neural networks (DNNs),
comparatively little work has examined the degree to which DNN performance is
comparable to that of humans. In this work, we compared the recognition
performance and attention patterns of humans and machines during a
two-alternative forced-choice FER task. Here, human attention was gathered
through click data that progressively uncovered a face, whereas model attention
was obtained using three different popular techniques from explainable AI: CAM,
GradCAM and Extremal Perturbation. In both cases, performance was gathered as
percent correct. For this task, we found that humans significantly
outperformed machines. In terms of attention patterns, we found that Extremal
Perturbation had the best overall fit with the human attention map during the
task.
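To make the model-attention pipeline concrete, the following is a minimal sketch of how a Grad-CAM map can be computed and scored against a human attention map. The paper does not include code, so the backbone (a torchvision ResNet-18), the hooked layer, and Pearson correlation as the fit metric are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Backbone is an assumption; the paper's exact architecture is not specified here.
model = models.resnet18(weights="IMAGENET1K_V1").eval()

feats, grads = {}, {}

def save_feats(module, inputs, output):
    feats["a"] = output.detach()

def save_grads(module, grad_input, grad_output):
    grads["a"] = grad_output[0].detach()

# Grad-CAM needs the activations and gradients of a late convolutional layer.
model.layer4.register_forward_hook(save_feats)
model.layer4.register_full_backward_hook(save_grads)

def gradcam(x, class_idx):
    """Return an [H, W] Grad-CAM map in [0, 1] for a [1, 3, H, W] input."""
    logits = model(x)
    model.zero_grad()
    logits[0, class_idx].backward()
    # Channel weights = globally average-pooled gradients (Selvaraju et al., 2017).
    weights = grads["a"].mean(dim=(2, 3), keepdim=True)            # [1, C, 1, 1]
    cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))  # [1, 1, h, w]
    cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear",
                        align_corners=False)[0, 0]
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

def attention_fit(model_map, human_map):
    """Pearson correlation between two attention maps (assumed fit metric)."""
    m = model_map.flatten() - model_map.mean()
    h = human_map.flatten() - human_map.mean()
    return float((m @ h) / (m.norm() * h.norm() + 1e-8))

# Usage: fit = attention_fit(gradcam(image, predicted_class), human_click_map)
```

CAM is the special case of this recipe for networks that end in global average pooling followed by a single linear layer, while Extremal Perturbation instead optimizes a smooth mask of fixed area that maximally preserves or suppresses the class score; a reference implementation of the latter is available in the TorchRay library.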
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning approach for robots to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- Do humans and machines have the same eyes? Human-machine perceptual differences on image classification [8.474744196892722]
Trained computer vision models are assumed to solve vision tasks by imitating human behavior learned from training labels.
Our study first quantifies and analyzes the statistical distributions of mistakes from the two sources.
We empirically demonstrate a post-hoc human-machine collaboration that outperforms humans or machines alone.
arXiv Detail & Related papers (2023-04-18T05:09:07Z)
- Extreme Image Transformations Affect Humans and Machines Differently [0.0]
Some recent artificial neural networks (ANNs) claim to model aspects of primate neural and human performance data.
We introduce a set of novel image transforms inspired by neurophysiological findings and evaluate humans and ANNs on an object recognition task.
We show that machines perform better than humans on certain transforms but struggle to match human performance on others that humans find easy.
arXiv Detail & Related papers (2022-11-30T18:12:53Z)
- I am Only Happy When There is Light: The Impact of Environmental Changes on Affective Facial Expressions Recognition [65.69256728493015]
We study the impact of different image conditions on the recognition of arousal from human facial expressions.
Our results show how the interpretation of human affective states can shift greatly in either the positive or the negative direction depending on image conditions.
arXiv Detail & Related papers (2022-10-28T16:28:26Z)
- Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z)
- Partial success in closing the gap between human and machine vision [30.78663978510427]
A few years ago, the first CNN surpassed human performance on ImageNet.
Here we ask: Are we making progress in closing the gap between human and machine vision?
We tested human observers on a broad range of out-of-distribution (OOD) datasets.
arXiv Detail & Related papers (2021-06-14T13:23:35Z)
- Gaze Perception in Humans and CNN-Based Model [66.89451296340809]
We compare how a CNN (convolutional neural network) based model of gaze and humans infer the locus of attention in images of real-world scenes.
We show that compared to the model, humans' estimates of the locus of attention are more influenced by the context of the scene.
arXiv Detail & Related papers (2021-04-17T04:52:46Z)
- Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy (a generic fine-tuning sketch appears after this list).
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
- Dissonance Between Human and Machine Understanding [16.32018730049208]
We present a large-scale crowdsourcing study that reveals and quantifies the dissonance between human and machine understanding.
Our findings have important implications for human-machine collaboration, given that a long-term goal in artificial intelligence is to make machines capable of learning and reasoning like humans.
arXiv Detail & Related papers (2021-01-18T21:45:35Z)
- Human vs. supervised machine learning: Who learns patterns faster? [0.0]
This study examines how learning performance differs between humans and machines when training data is limited.
We have designed an experiment in which 44 humans and three different machine learning algorithms identify patterns in labeled training data and have to label instances according to the patterns they find.
arXiv Detail & Related papers (2020-11-30T13:39:26Z)
- Learning to Augment Expressions for Few-shot Fine-grained Facial Expression Recognition [98.83578105374535]
We present a novel Fine-grained Facial Expression Database - F2ED.
It includes more than 200k images with 54 facial expressions from 119 persons.
Considering that uneven data distribution and a lack of samples are common in real-world scenarios, we evaluate several few-shot expression-learning tasks.
We propose a unified task-driven framework, the Compositional Generative Adversarial Network (Comp-GAN), which learns to synthesize facial images.
arXiv Detail & Related papers (2020-01-17T03:26:32Z)
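As a companion to the hand-segmentation entry above, here is a minimal sketch of the standard torchvision recipe for fine-tuning Mask R-CNN to a two-class (background + hand) task. The dataset, training loop, and head sizes are assumptions; this is not that paper's released code.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Start from COCO-pretrained weights and replace the task-specific heads.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

num_classes = 2  # background + hand

# Replace the box-classification head with one sized for our classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Replace the mask-prediction head likewise (256 hidden channels is the
# torchvision default, kept here as an assumption).
in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)

# Training then proceeds as usual: in train mode the model returns a loss dict,
# e.g. losses = model(images, targets); sum(losses.values()).backward()
```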