Humans Beat Deep Networks at Recognizing Objects in Unusual Poses, Given
Enough Time
- URL: http://arxiv.org/abs/2402.03973v1
- Date: Tue, 6 Feb 2024 13:06:14 GMT
- Title: Humans Beat Deep Networks at Recognizing Objects in Unusual Poses, Given
Enough Time
- Authors: Netta Ollikka, Amro Abbas, Andrea Perin, Markku Kilpeläinen, Stéphane Deny
- Abstract summary: Humans excel at recognizing objects in unusual poses, in contrast with state-of-the-art pretrained networks.
As we limit image exposure time, human performance degrades to the level of deep networks.
Even time-limited humans are dissimilar to feed-forward deep networks.
- Score: 1.6874375111244329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning is closing the gap with humans on several object recognition
benchmarks. Here we investigate this gap in the context of challenging images
where objects are seen from unusual viewpoints. We find that humans excel at
recognizing objects in unusual poses, in contrast with state-of-the-art
pretrained networks (EfficientNet, SWAG, ViT, SWIN, BEiT, ConvNext) which are
systematically brittle in this condition. Remarkably, as we limit image
exposure time, human performance degrades to the level of deep networks,
suggesting that additional mental processes (requiring additional time) take
place when humans identify objects in unusual poses. Finally, our analysis of
error patterns of humans vs. networks reveals that even time-limited humans are
dissimilar to feed-forward deep networks. We conclude that more work is needed
to bring computer vision systems to the level of robustness of the human visual
system. Understanding the nature of the mental processes taking place during
extra viewing time may be key to attaining such robustness.
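The abstract's comparison of error patterns between humans and networks can be quantified with a kappa-style error-consistency score, a standard measure in the human-vs-DNN comparison literature. The sketch below is illustrative and not necessarily the authors' exact analysis:

```python
import numpy as np

def error_consistency(correct_a, correct_b):
    """Kappa-style error consistency between two observers.

    correct_a, correct_b: boolean arrays, True where the observer
    classified the trial correctly. Returns 0 for independent errors
    and 1 for identical error patterns.
    """
    a = np.asarray(correct_a, dtype=bool)
    b = np.asarray(correct_b, dtype=bool)
    c_obs = np.mean(a == b)                      # observed trial-by-trial agreement
    p_a, p_b = a.mean(), b.mean()                # marginal accuracies
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)    # agreement expected by chance
    if c_exp == 1.0:
        return 1.0
    return (c_obs - c_exp) / (1 - c_exp)
```

Two observers with equal accuracy but independently placed errors score near zero, so the measure separates "same accuracy" from "same mistakes", which is exactly the distinction the abstract draws between time-limited humans and feed-forward networks.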
Related papers
- Degraded Polygons Raise Fundamental Questions of Neural Network Perception [0.0]
We revisit the task of recovering images under degradation, first introduced over 30 years ago in the Recognition-by-Components theory of human vision.
We implement the Automated Shape Recoverability Test for rapidly generating large-scale datasets of perimeter-degraded regular polygons.
We find that neural networks' behavior on this simple task conflicts with human behavior.
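As an illustration of the kind of stimulus generation this line of work involves, here is a minimal numpy sketch that samples points along a regular polygon's perimeter and erases a contiguous fraction of them. The function and its parameters are hypothetical stand-ins, not the paper's actual Automated Shape Recoverability Test:

```python
import numpy as np

def degraded_polygon(n_sides, n_points=360, keep=0.7, rng=None):
    """Sample points on a unit regular polygon's perimeter, then keep
    only a contiguous fraction of them (erasing the rest), starting at
    a random position along the perimeter."""
    rng = np.random.default_rng() if rng is None else rng
    # vertices of a regular polygon inscribed in the unit circle
    angles = 2 * np.pi * np.arange(n_sides + 1) / n_sides
    verts = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # interpolate points uniformly along each edge
    t = np.linspace(0, 1, n_points // n_sides, endpoint=False)
    pts = []
    for a, b in zip(verts[:-1], verts[1:]):
        pts.append(a + t[:, None] * (b - a))
    pts = np.concatenate(pts)
    # keep a contiguous run of perimeter points (wrapping around)
    n_keep = int(keep * len(pts))
    start = rng.integers(len(pts))
    idx = (start + np.arange(n_keep)) % len(pts)
    return pts[idx]
```

Varying `keep` sweeps the stimulus from a complete outline to a barely recoverable fragment, which is the degradation axis along which human and network behavior can be compared.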
arXiv Detail & Related papers (2023-06-08T06:02:39Z)
- Scene-aware Egocentric 3D Human Pose Estimation [72.57527706631964]
Egocentric 3D human pose estimation with a single head-mounted fisheye camera has recently attracted attention due to its numerous applications in virtual and augmented reality.
Existing methods still struggle in challenging poses where the human body is highly occluded or is closely interacting with the scene.
We propose a scene-aware egocentric pose estimation method that guides the prediction of the egocentric pose with scene constraints.
arXiv Detail & Related papers (2022-12-20T21:35:39Z)
- A Brief Survey on Person Recognition at a Distance [46.47338660858037]
Person recognition at a distance entails recognizing the identity of an individual appearing in images or videos collected by long-range imaging systems such as drones or surveillance cameras.
Despite recent advances in deep convolutional neural networks (DCNNs), this remains challenging.
arXiv Detail & Related papers (2022-12-17T22:15:10Z)
- Robustness of Humans and Machines on Object Recognition with Extreme Image Transformations [0.0]
We introduce a novel set of image transforms and evaluate humans and networks on an object recognition task.
We found that performance for several common networks decreases quickly, while humans are able to recognize the objects with high accuracy.
arXiv Detail & Related papers (2022-05-09T17:15:54Z)
- Ultrafast Image Categorization in Biology and Neural Models [0.0]
We re-trained the standard VGG-16 CNN on two independent tasks that are ecologically relevant to humans.
We show that re-training the network achieves a human-like level of performance, comparable to that reported in psychophysical tasks.
arXiv Detail & Related papers (2022-05-07T11:19:40Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
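The weight-inflation trick mentioned above (building a 3D-CNN from a pre-trained 2D-CNN) can be sketched in a few lines. This numpy version shows only the kernel transformation, using the common (out_ch, in_ch, kH, kW) weight layout rather than any specific framework's API:

```python
import numpy as np

def inflate_kernel(w2d, time_dim=3):
    """Inflate 2D conv weights (out_ch, in_ch, kH, kW) into 3D weights
    (out_ch, in_ch, time_dim, kH, kW): repeat the kernel along the new
    temporal axis and scale by 1/time_dim, so a temporally constant
    input produces the same response as the original 2D convolution."""
    return np.repeat(w2d[:, :, None, :, :], time_dim, axis=2) / time_dim
```

The 1/time_dim scaling preserves the filter's total weight, which is what lets the inflated network start from the 2D model's behavior before fine-tuning on video.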
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Self-Supervised Viewpoint Learning From Image Collections [116.56304441362994]
We propose a novel learning framework which incorporates an analysis-by-synthesis paradigm to reconstruct images in a viewpoint aware manner.
We show that our approach performs competitively to fully-supervised approaches for several object categories like human faces, cars, buses, and trains.
arXiv Detail & Related papers (2020-04-03T22:01:41Z)
- TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for Real-time Video Facial Expression Recognition [93.0013343535411]
This study explores a novel deep time windowed convolutional neural network design (TimeConvNets) for the purpose of real-time video facial expression recognition.
We show that TimeConvNets can better capture the transient nuances of facial expressions and boost classification accuracy while maintaining a low inference time.
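The time-windowed framing underlying such a design amounts to slicing a video into overlapping clips before they reach the network. A minimal numpy sketch, with window and stride values that are illustrative rather than the paper's:

```python
import numpy as np

def time_windows(frames, window=8, stride=4):
    """Slice a video array (T, H, W, C) into overlapping temporal
    windows of shape (n_windows, window, H, W, C) -- the clip-level
    input a time-windowed CNN would consume."""
    n_frames = frames.shape[0]
    starts = np.arange(0, n_frames - window + 1, stride)
    return np.stack([frames[s:s + window] for s in starts])
```

Overlapping windows are what let the classifier see the transient onset and offset of an expression rather than isolated frames.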
arXiv Detail & Related papers (2020-03-03T20:58:52Z)
- Learning Depth With Very Sparse Supervision [57.911425589947314]
This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment.
We train a specialized global-local network architecture with what would be available to a robot interacting with the environment.
Experiments on several datasets show that, when ground truth is available even for just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches.
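Training with ground truth at only a handful of pixels comes down to computing the depth loss under a sparsity mask. A minimal sketch (mask-based L1 error; an assumed loss for illustration, not necessarily the paper's):

```python
import numpy as np

def sparse_depth_loss(pred, gt, mask):
    """Mean absolute depth error over only the supervised pixels.

    pred, gt: (H, W) depth maps; mask: (H, W) bool, True where ground
    truth is available (possibly a single pixel)."""
    mask = np.asarray(mask, dtype=bool)
    return np.abs(pred - gt)[mask].mean()
```

Because the loss is evaluated only where `mask` is True, gradients flow from as little as one annotated pixel per image, which is the regime the paper's "just one of the image pixels" result refers to.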
arXiv Detail & Related papers (2020-03-02T10:44:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.