Perception Over Time: Temporal Dynamics for Robust Image Understanding
- URL: http://arxiv.org/abs/2203.06254v1
- Date: Fri, 11 Mar 2022 21:11:59 GMT
- Title: Perception Over Time: Temporal Dynamics for Robust Image Understanding
- Authors: Maryam Daniali, Edward Kim
- Abstract summary: Deep learning surpasses human-level performance in narrow and specific vision tasks.
Human visual perception is orders of magnitude more robust to changes in the input stimulus.
We introduce a novel method of incorporating temporal dynamics into static image understanding.
- Score: 5.584060970507506
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While deep learning surpasses human-level performance in narrow and specific
vision tasks, it is fragile and over-confident in classification. For example,
minor transformations in perspective, illumination, or object deformation in
the image space can result in drastically different labels, a fragility made
especially apparent by adversarial perturbations. On the other hand, human
visual perception is orders of magnitude more robust to changes in the input
stimulus. Unfortunately, we are far from fully understanding and
integrating the underlying mechanisms that result in such robust perception. In
this work, we introduce a novel method of incorporating temporal dynamics into
static image understanding. We describe a neuro-inspired method that decomposes
a single image into a series of coarse-to-fine images that simulates how
biological vision integrates information over time. Next, we demonstrate how
our novel visual perception framework can utilize this information "over time"
using a biologically plausible algorithm with recurrent units, significantly
improving its accuracy and robustness over standard CNNs. We also
compare our proposed approach with state-of-the-art models and explicitly
quantify our adversarial robustness properties through multiple ablation
studies. Our quantitative and qualitative results convincingly demonstrate
exciting and transformative improvements over the standard computer vision and
deep learning architectures used today.
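The pipeline the abstract describes, decomposing one static image into a coarse-to-fine sequence and then integrating that sequence with recurrent units, can be sketched minimally. Below, a separable box blur stands in for whatever smoothing the authors use, and a leaky integrator stands in for their recurrent units; both are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def box_blur(img, k):
    """Separable box blur; a stand-in for the paper's (unspecified) smoothing."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    kernel = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    out = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, out)
    return out

def coarse_to_fine(img, kernel_sizes=(9, 5, 3, 1)):
    """Decompose a single image into a coarse-to-fine sequence (blurriest first).
    A kernel size of 1 leaves the final frame identical to the input."""
    return [box_blur(img, k) for k in kernel_sizes]

def recurrent_readout(frames, alpha=0.5):
    """Leaky integrator over the frame sequence; a toy stand-in for the
    biologically plausible recurrent units mentioned in the abstract."""
    state = np.zeros_like(frames[0])
    for f in frames:
        state = (1.0 - alpha) * state + alpha * f
    return state

rng = np.random.default_rng(0)
img = rng.random((16, 16))
frames = coarse_to_fine(img)         # coarse (blurred) -> fine (original)
percept = recurrent_readout(frames)  # accumulated "percept over time"
```

In the paper, a recurrent classifier would read out from this temporally integrated signal; here the integrator simply accumulates evidence across scales.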
Related papers
- Connectivity-Inspired Network for Context-Aware Recognition [1.049712834719005]
We focus on the effect of incorporating circuit motifs found in biological brains to address visual recognition.
Our convolutional architecture is inspired by the connectivity of human cortical and subcortical streams.
We present a new plug-and-play module to model context awareness.
arXiv Detail & Related papers (2024-09-06T15:42:10Z)
- Degraded Polygons Raise Fundamental Questions of Neural Network Perception [5.423100066629618]
We revisit the task of recovering images under degradation, first introduced over 30 years ago in the Recognition-by-Components theory of human vision.
We implement the Automated Shape Recoverability Test for rapidly generating large-scale datasets of perimeter-degraded regular polygons.
We find that neural networks' behavior on this simple task conflicts with human behavior.
arXiv Detail & Related papers (2023-06-08T06:02:39Z)
- Understanding Self-Predictive Learning for Reinforcement Learning [61.62067048348786]
We study the learning dynamics of self-predictive learning for reinforcement learning.
We propose a novel self-predictive algorithm that learns two representations simultaneously.
arXiv Detail & Related papers (2022-12-06T20:43:37Z)
- Reconstruction-guided attention improves the robustness and shape processing of neural networks [5.156484100374057]
We build an iterative encoder-decoder network that generates an object reconstruction and uses it as top-down attentional feedback.
Our model shows strong generalization performance against various image perturbations.
Our study shows that modeling reconstruction-based feedback endows AI systems with a powerful attention mechanism.
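The reconstruction-as-attention idea above can be illustrated with a toy linear autoencoder: the decoder's reconstruction is turned into a gain map that re-gates the original input on each pass. The shapes, tied weights, and gating rule here are all assumptions for illustration, not the cited model.

```python
import numpy as np

def reconstruction_attention(x, W_enc, W_dec, steps=3):
    """Iterative encode/decode loop in which the reconstruction acts as
    top-down attention, gating the original input on the next pass.
    Toy linear model; the cited paper uses a full encoder-decoder network."""
    attended = x
    recon = np.zeros_like(x)
    for _ in range(steps):
        z = W_enc @ attended                                  # bottom-up encoding
        recon = W_dec @ z                                     # top-down reconstruction
        attn = np.abs(recon) / (np.abs(recon).max() + 1e-8)   # gain map in [0, 1]
        attended = x * attn                                   # re-attend to the input
    return attended, recon

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W_enc = 0.1 * rng.standard_normal((4, 8))
W_dec = W_enc.T  # tied weights: an extra simplifying assumption
attended, recon = reconstruction_attention(x, W_enc, W_dec)
```

Because the gain map never exceeds one, the attended input is always a dimension-wise attenuation of the original, which is the sense in which the feedback acts as attention rather than as a new signal.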
arXiv Detail & Related papers (2022-09-27T18:32:22Z)
- A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, a process that underpins several human cognitive functions.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z)
- Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises [7.689542442882423]
We designed a dual-stream vision model inspired by the human brain.
This model features retina-like input layers and includes two streams: one determining the next point of focus (the fixation), while the other interprets the visuals surrounding the fixation.
We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness.
arXiv Detail & Related papers (2022-06-15T03:44:42Z)
- Multimodal perception for dexterous manipulation [14.314776558032166]
We propose a cross-modal sensory data generation framework for the translation between vision and touch.
We propose a spatio-temporal attention model for tactile texture recognition, which takes both spatial features and the time dimension into consideration.
arXiv Detail & Related papers (2021-12-28T21:20:26Z)
- Fast Training of Neural Lumigraph Representations using Meta Learning [109.92233234681319]
We develop a new neural rendering approach with the goal of quickly learning a high-quality representation which can also be rendered in real-time.
Our approach, MetaNLR++, accomplishes this by using a unique combination of a neural shape representation and 2D CNN-based image feature extraction, aggregation, and re-projection.
We show that MetaNLR++ achieves similar or better photorealistic novel view synthesis results in a fraction of the time that competing methods require.
arXiv Detail & Related papers (2021-06-28T18:55:50Z)
- Causal Navigation by Continuous-time Neural Networks [108.84958284162857]
We propose a theoretical and experimental framework for learning causal representations using continuous-time neural networks.
We evaluate our method in the context of visual-control learning of drones over a series of complex tasks.
arXiv Detail & Related papers (2021-06-15T17:45:32Z)
- Learning Temporal Dynamics from Cycles in Narrated Video [85.89096034281694]
We propose a self-supervised solution to the problem of learning to model how the world changes as time elapses.
Our model learns modality-agnostic functions to predict forward and backward in time, which must undo each other when composed.
We apply the learned dynamics model without further training to various tasks, such as predicting future action and temporally ordering sets of images.
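The forward/backward cycle constraint above lends itself to a tiny sketch: compose the two predictors and penalize any failure to return to the starting state. The dynamics below are a made-up toy, not the paper's learned, modality-agnostic functions.

```python
import numpy as np

def cycle_consistency_loss(x, forward, backward):
    """Predicting forward in time and then backward should undo itself;
    the mean squared deviation from the starting state is the cycle loss."""
    x_next = forward(x)
    x_back = backward(x_next)
    return float(np.mean((x_back - x) ** 2))

# Toy dynamics: backward exactly undoes forward, so the cycle loss is zero.
forward = lambda s: s + 1.0
backward = lambda s: s - 1.0
x = np.arange(5, dtype=float)
loss = cycle_consistency_loss(x, forward, backward)  # 0.0 for exact inverses
```

Training would minimize this loss over observed video; any pair of functions that are not mutual inverses is penalized, which is the self-supervisory signal.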
arXiv Detail & Related papers (2021-01-07T02:41:32Z)
- Limited-angle tomographic reconstruction of dense layered objects by dynamical machine learning [68.9515120904028]
Limited-angle tomography of strongly scattering quasi-transparent objects is a challenging, highly ill-posed problem.
Regularizing priors are necessary to reduce artifacts by improving the condition of such problems.
We devised a recurrent neural network (RNN) architecture with a novel split-convolutional gated recurrent unit (SC-GRU) as the building block.
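The SC-GRU building block is not detailed in this summary; as a rough stand-in, here is a standard GRU update in NumPy. The weight names and dimensions are assumptions, biases are omitted, and the split-convolutional variant from the paper is not reproduced.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h, W, U):
    """One standard GRU update. W/U hold the update (z), reset (r), and
    candidate (h) weight matrices; biases are omitted for brevity."""
    z = sigmoid(W["z"] @ x + U["z"] @ h)            # update gate
    r = sigmoid(W["r"] @ x + U["r"] @ h)            # reset gate
    h_cand = np.tanh(W["h"] @ x + U["h"] @ (r * h))
    return (1.0 - z) * h + z * h_cand               # convex mix keeps |h| < 1

rng = np.random.default_rng(0)
din, dh = 3, 4
W = {k: 0.5 * rng.standard_normal((dh, din)) for k in "zrh"}
U = {k: 0.5 * rng.standard_normal((dh, dh)) for k in "zrh"}
h = np.zeros(dh)
for t in range(5):                                  # run over a short sequence
    h = gru_step(rng.standard_normal(din), h, W, U)
```

The gated convex combination is what lets such units carry state across iterations of the reconstruction, which is presumably why a GRU-style block was chosen for this ill-posed problem.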
arXiv Detail & Related papers (2020-07-21T11:48:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.