Visual Attention Emerges from Recurrent Sparse Reconstruction
- URL: http://arxiv.org/abs/2204.10962v1
- Date: Sat, 23 Apr 2022 00:35:02 GMT
- Title: Visual Attention Emerges from Recurrent Sparse Reconstruction
- Authors: Baifeng Shi, Yale Song, Neel Joshi, Trevor Darrell, Xin Wang
- Abstract summary: We present a new attention formulation built on two prominent features of the human visual attention mechanism: recurrency and sparsity.
We show that self-attention is a special case of VARS with a single-step optimization and no sparsity constraint.
VARS can be readily used as a replacement for self-attention in popular vision transformers, consistently improving their robustness across various benchmarks.
- Score: 82.78753751860603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual attention helps achieve robust perception under noise, corruption, and distribution shifts in human vision, which are areas where modern neural networks still fall short. We present VARS, Visual Attention from Recurrent Sparse reconstruction, a new attention formulation built on two prominent features of the human visual attention mechanism: recurrency and sparsity. Related features are grouped together via recurrent connections between neurons, with salient objects emerging via sparse regularization. VARS adopts an attractor network with recurrent connections that converges toward a stable pattern over time. Network layers are represented as ordinary differential equations (ODEs), formulating attention as a recurrent attractor network that equivalently optimizes the sparse reconstruction of input using a dictionary of "templates" encoding underlying patterns of data. We show that self-attention is a special case of VARS with a single-step optimization and no sparsity constraint. VARS can be readily used as a replacement for self-attention in popular vision transformers, consistently improving their robustness across various benchmarks. Code is released on GitHub (https://github.com/bfshi/VARS).
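To make the formulation concrete: per the abstract, attention is cast as recovering sparse codes z for input tokens x over a dictionary of "templates" D, i.e. minimizing ||x - zD||^2 + lambda*||z||_1, with the recurrent attractor dynamics corresponding to the iterations of that optimization. Below is a minimal PyTorch sketch of this idea using unrolled ISTA (iterative shrinkage-thresholding) steps; the class name, hyperparameters, and the choice of solver are illustrative assumptions, not the authors' released implementation (see the GitHub repository for that). With n_steps=1 and sparsity=0, the update reduces to the single-step, unconstrained special case that the abstract relates to self-attention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentSparseAttention(nn.Module):
    """Illustrative sketch: attention as recurrent sparse reconstruction.

    Tokens are reconstructed from a learned dictionary of "templates" by
    unrolling ISTA (iterative shrinkage-thresholding) steps on the
    l1-regularized objective ||x - z D||^2 + lambda * ||z||_1.
    Names and hyperparameters are hypothetical, not the authors' code.
    """

    def __init__(self, dim: int, n_templates: int = 64,
                 n_steps: int = 4, sparsity: float = 0.1,
                 step_size: float = 0.1):
        super().__init__()
        # D: dictionary of templates encoding underlying patterns of data.
        self.templates = nn.Parameter(torch.randn(n_templates, dim) / dim ** 0.5)
        self.n_steps = n_steps      # recurrent iterations (attractor steps)
        self.sparsity = sparsity    # lambda of the l1 penalty
        self.step_size = step_size  # ISTA gradient step size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); codes z: (batch, tokens, n_templates)
        D = self.templates
        z = x.new_zeros(x.shape[0], x.shape[1], D.shape[0])
        for _ in range(self.n_steps):
            # Gradient step on the reconstruction term ||x - z D||^2.
            residual = x - z @ D
            z = z + self.step_size * (residual @ D.t())
            # Soft-thresholding enforces the l1 sparsity constraint;
            # with sparsity=0 this step is the identity (no constraint).
            z = torch.sign(z) * F.relu(z.abs() - self.step_size * self.sparsity)
        # Output the (sparse) reconstruction of the input tokens.
        return z @ D

# Usage on a ViT-style token grid, e.g. 14x14 patches of dimension 192.
tokens = torch.randn(2, 196, 192)
out = RecurrentSparseAttention(dim=192)(tokens)
assert out.shape == tokens.shape
```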
Related papers
- The Dynamic Net Architecture: Learning Robust and Holistic Visual Representations Through Self-Organizing Networks [3.9848584845601014]
We present a novel intelligent-system architecture called "Dynamic Net Architecture" (DNA).
DNA relies on recurrence-stabilized networks, which the authors discuss in application to vision.
arXiv Detail & Related papers (2024-07-08T06:22:10Z)
- A Primal-Dual Framework for Transformers and Neural Networks [52.814467832108875]
Self-attention is key to the remarkable success of transformers in sequence modeling tasks.
We show that self-attention corresponds to the support vector expansion derived from a support vector regression problem.
We propose two new attentions: Batch Normalized Attention (Attention-BN) and Attention with Scaled Head (Attention-SH).
arXiv Detail & Related papers (2024-06-19T19:11:22Z)
- Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
- Singular Value Representation: A New Graph Perspective On Neural Networks [0.0]
We introduce the Singular Value Representation (SVR), a new method to represent the internal state of neural networks.
We derive a precise statistical framework to discriminate meaningful connections between spectral neurons for fully connected and convolutional layers.
arXiv Detail & Related papers (2023-02-16T10:10:31Z)
- Optimized Symbolic Interval Propagation for Neural Network Verification [1.8047694351309207]
We present DPNeurifyFV, a novel branch-and-bound solver for ReLU networks with low-dimensional input spaces.
We evaluate our approach on the airborne collision avoidance networks ACAS Xu and demonstrate runtime improvements compared to state-of-the-art tools.
arXiv Detail & Related papers (2022-12-15T14:15:29Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Reconstruction-guided attention improves the robustness and shape processing of neural networks [5.156484100374057]
We build an iterative encoder-decoder network that generates an object reconstruction and uses it as top-down attentional feedback (a minimal sketch of this loop appears after this list).
Our model shows strong generalization performance against various image perturbations.
Our study shows that modeling reconstruction-based feedback endows AI systems with a powerful attention mechanism.
arXiv Detail & Related papers (2022-09-27T18:32:22Z)
- Improving Neural Predictivity in the Visual Cortex with Gated Recurrent Connections [0.0]
We aim to shift the focus to architectures that incorporate lateral recurrent connections, a ubiquitous feature of the ventral visual stream, to devise adaptive receptive fields.
In order to increase the robustness of our approach and the biological fidelity of the activations, we employ specific data augmentation techniques.
arXiv Detail & Related papers (2022-03-22T17:27:22Z)
- Relational Self-Attention: What's Missing in Attention for Video Understanding [52.38780998425556]
We introduce a relational feature transform, dubbed relational self-attention (RSA).
Our experiments and ablation studies show that the RSA network substantially outperforms convolution and self-attention counterparts.
arXiv Detail & Related papers (2021-11-02T15:36:11Z)
- Robust Person Re-Identification through Contextual Mutual Boosting [77.1976737965566]
We propose the Contextual Mutual Boosting Network (CMBN), which localizes pedestrians and recalibrates features by effectively exploiting contextual information and statistical inference.
Experiments on the benchmarks demonstrate the superiority of the architecture compared to the state-of-the-art.
arXiv Detail & Related papers (2020-09-16T06:33:35Z)
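As referenced in the "Reconstruction-guided attention" entry above, here is a minimal sketch of that paper's idea: an encoder-decoder reconstructs the input, and the reconstruction is turned into a top-down attention map that gates the next encoding pass. The layer choices and the error-to-attention mapping are assumptions for illustration, not the paper's actual model.

```python
import torch
import torch.nn as nn

class ReconstructionGuidedAttention(nn.Module):
    """Illustrative sketch: an object reconstruction serves as top-down
    attentional feedback. All layers and the gating rule are hypothetical."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Conv2d(dim, 3, 3, padding=1)  # reconstructs the input

    def forward(self, x: torch.Tensor, n_iters: int = 3):
        attn = torch.ones_like(x[:, :1])        # start with uniform attention
        feats, recon = None, None
        for _ in range(n_iters):
            feats = self.encoder(x * attn)      # encode the attended input
            recon = self.decoder(feats)         # top-down object reconstruction
            # Regions the reconstruction explains well receive more attention
            # on the next pass; poorly explained regions are suppressed.
            err = (x - recon).pow(2).mean(dim=1, keepdim=True)
            attn = torch.sigmoid(-err)
        return feats, recon, attn

# Usage on a batch of RGB images.
imgs = torch.randn(2, 3, 32, 32)
feats, recon, attn = ReconstructionGuidedAttention()(imgs)
```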