Visual Attention Emerges from Recurrent Sparse Reconstruction
- URL: http://arxiv.org/abs/2204.10962v1
- Date: Sat, 23 Apr 2022 00:35:02 GMT
- Title: Visual Attention Emerges from Recurrent Sparse Reconstruction
- Authors: Baifeng Shi, Yale Song, Neel Joshi, Trevor Darrell, Xin Wang
- Abstract summary: We present a new attention formulation built on two prominent features of the human visual attention mechanism: recurrency and sparsity.
We show that self-attention is a special case of VARS with a single-step optimization and no sparsity constraint.
VARS can be readily used as a replacement for self-attention in popular vision transformers, consistently improving their robustness across various benchmarks.
- Score: 82.78753751860603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual attention helps achieve robust perception under noise, corruption, and distribution shifts in human vision, which are areas where modern neural networks still fall short. We present VARS, Visual Attention from Recurrent Sparse reconstruction, a new attention formulation built on two prominent features of the human visual attention mechanism: recurrency and sparsity. Related features are grouped together via recurrent connections between neurons, with salient objects emerging via sparse regularization. VARS adopts an attractor network with recurrent connections that converges toward a stable pattern over time. Network layers are represented as ordinary differential equations (ODEs), formulating attention as a recurrent attractor network that equivalently optimizes the sparse reconstruction of input using a dictionary of "templates" encoding underlying patterns of data. We show that self-attention is a special case of VARS with a single-step optimization and no sparsity constraint. VARS can be readily used as a replacement for self-attention in popular vision transformers, consistently improving their robustness across various benchmarks. Code is released on GitHub (https://github.com/bfshi/VARS).
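To make the formulation concrete: per the abstract, attention is cast as recovering sparse codes z for input tokens x over a dictionary of "templates" D, i.e. minimizing ||x - zD||^2 + lambda*||z||_1, with the recurrent attractor dynamics corresponding to the iterations of that optimization. Below is a minimal PyTorch sketch of this idea using unrolled ISTA (iterative shrinkage-thresholding) steps; the class name, hyperparameters, and the choice of solver are illustrative assumptions, not the authors' released implementation (see the GitHub repository for that). With n_steps=1 and sparsity=0, the update reduces to the single-step, unconstrained special case that the abstract relates to self-attention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentSparseAttention(nn.Module):
    """Illustrative sketch: attention as recurrent sparse reconstruction.

    Tokens are reconstructed from a learned dictionary of "templates" by
    unrolling ISTA (iterative shrinkage-thresholding) steps on the
    l1-regularized objective ||x - z D||^2 + lambda * ||z||_1.
    Names and hyperparameters are hypothetical, not the authors' code.
    """

    def __init__(self, dim: int, n_templates: int = 64,
                 n_steps: int = 4, sparsity: float = 0.1,
                 step_size: float = 0.1):
        super().__init__()
        # D: dictionary of templates encoding underlying patterns of data.
        self.templates = nn.Parameter(torch.randn(n_templates, dim) / dim ** 0.5)
        self.n_steps = n_steps      # recurrent iterations (attractor steps)
        self.sparsity = sparsity    # lambda of the l1 penalty
        self.step_size = step_size  # ISTA gradient step size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); codes z: (batch, tokens, n_templates)
        D = self.templates
        z = x.new_zeros(x.shape[0], x.shape[1], D.shape[0])
        for _ in range(self.n_steps):
            # Gradient step on the reconstruction term ||x - z D||^2.
            residual = x - z @ D
            z = z + self.step_size * (residual @ D.t())
            # Soft-thresholding enforces the l1 sparsity constraint;
            # with sparsity=0 this step is the identity (no constraint).
            z = torch.sign(z) * F.relu(z.abs() - self.step_size * self.sparsity)
        # Output the (sparse) reconstruction of the input tokens.
        return z @ D

# Usage on a ViT-style token grid, e.g. 14x14 patches of dimension 192.
tokens = torch.randn(2, 196, 192)
out = RecurrentSparseAttention(dim=192)(tokens)
assert out.shape == tokens.shape
```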
Related papers
- The Dynamic Net Architecture: Learning Robust and Holistic Visual Representations Through Self-Organizing Networks [3.9848584845601014]
We present a novel intelligent-system architecture called "Dynamic Net Architecture" (DNA).
DNA relies on recurrence-stabilized networks, which the authors discuss in application to vision.
arXiv Detail & Related papers (2024-07-08T06:22:10Z)
- A Primal-Dual Framework for Transformers and Neural Networks [52.814467832108875]
Self-attention is key to the remarkable success of transformers in sequence modeling tasks.
We show that self-attention corresponds to the support vector expansion derived from a support vector regression problem.
We propose two new attentions: Batch Normalized Attention (Attention-BN) and Attention with Scaled Head (Attention-SH).
arXiv Detail & Related papers (2024-06-19T19:11:22Z)
- Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
- Singular Value Representation: A New Graph Perspective On Neural Networks [0.0]
We introduce the Singular Value Representation (SVR), a new method to represent the internal state of neural networks.
We derive a precise statistical framework to discriminate meaningful connections between spectral neurons for fully connected and convolutional layers.
arXiv Detail & Related papers (2023-02-16T10:10:31Z)
- Optimized Symbolic Interval Propagation for Neural Network Verification [1.8047694351309207]
We present DPNeurifyFV, a novel branch-and-bound solver for ReLU networks with low-dimensional input spaces.
We evaluate our approach on the airborne collision avoidance networks ACAS Xu and demonstrate runtime improvements compared to state-of-the-art tools.
arXiv Detail & Related papers (2022-12-15T14:15:29Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Reconstruction-guided attention improves the robustness and shape processing of neural networks [5.156484100374057]
We build an iterative encoder-decoder network that generates an object reconstruction and uses it as top-down attentional feedback (a minimal sketch of this loop appears after this list).
Our model shows strong generalization performance against various image perturbations.
Our study shows that modeling reconstruction-based feedback endows AI systems with a powerful attention mechanism.
arXiv Detail & Related papers (2022-09-27T18:32:22Z)
- Improving Neural Predictivity in the Visual Cortex with Gated Recurrent Connections [0.0]
We aim to shift the focus to architectures that incorporate lateral recurrent connections, a ubiquitous feature of the ventral visual stream, to devise adaptive receptive fields.
In order to increase the robustness of our approach and the biological fidelity of the activations, we employ specific data augmentation techniques.
arXiv Detail & Related papers (2022-03-22T17:27:22Z)
- Relational Self-Attention: What's Missing in Attention for Video Understanding [52.38780998425556]
We introduce a relational feature transform, dubbed relational self-attention (RSA).
Our experiments and ablation studies show that the RSA network substantially outperforms convolution and self-attention counterparts.
arXiv Detail & Related papers (2021-11-02T15:36:11Z)
- Robust Person Re-Identification through Contextual Mutual Boosting [77.1976737965566]
We propose the Contextual Mutual Boosting Network (CMBN), which localizes pedestrians and recalibrates features by effectively exploiting contextual information and statistical inference.
Experiments on the benchmarks demonstrate the superiority of the architecture compared to the state-of-the-art.
arXiv Detail & Related papers (2020-09-16T06:33:35Z)
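As referenced in the "Reconstruction-guided attention" entry above, here is a minimal sketch of that paper's idea: an encoder-decoder reconstructs the input, and the reconstruction is turned into a top-down attention map that gates the next encoding pass. The layer choices and the error-to-attention mapping are assumptions for illustration, not the paper's actual model.

```python
import torch
import torch.nn as nn

class ReconstructionGuidedAttention(nn.Module):
    """Illustrative sketch: an object reconstruction serves as top-down
    attentional feedback. All layers and the gating rule are hypothetical."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Conv2d(dim, 3, 3, padding=1)  # reconstructs the input

    def forward(self, x: torch.Tensor, n_iters: int = 3):
        attn = torch.ones_like(x[:, :1])        # start with uniform attention
        feats, recon = None, None
        for _ in range(n_iters):
            feats = self.encoder(x * attn)      # encode the attended input
            recon = self.decoder(feats)         # top-down object reconstruction
            # Regions the reconstruction explains well receive more attention
            # on the next pass; poorly explained regions are suppressed.
            err = (x - recon).pow(2).mean(dim=1, keepdim=True)
            attn = torch.sigmoid(-err)
        return feats, recon, attn

# Usage on a batch of RGB images.
imgs = torch.randn(2, 3, 32, 32)
feats, recon, attn = ReconstructionGuidedAttention()(imgs)
```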