Gaussian Constrained Attention Network for Scene Text Recognition
- URL: http://arxiv.org/abs/2010.09169v1
- Date: Mon, 19 Oct 2020 01:55:30 GMT
- Title: Gaussian Constrained Attention Network for Scene Text Recognition
- Authors: Zhi Qiao, Xugong Qin, Yu Zhou, Fei Yang, Weiping Wang
- Abstract summary: We argue that the existing attention mechanism faces the problem of attention diffusion, in which the model may not focus on a certain character area.
We propose a 2D attention-based method integrated with a novel Gaussian Constrained Refinement Module.
In this way, the attention weights become more concentrated and the attention-based recognition network achieves better performance.
- Score: 16.485898019983797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene text recognition has been a hot topic in computer vision. Recent
methods adopt the attention mechanism for sequence prediction and achieve
convincing results. However, we argue that the existing attention mechanism
faces the problem of attention diffusion, in which the model may not focus on a
certain character area. In this paper, we propose the Gaussian Constrained
Attention Network to deal with this problem. It is a 2D attention-based method
integrated with a novel Gaussian Constrained Refinement Module, which predicts
an additional Gaussian mask to refine the attention weights. Rather than simply
adding extra supervision on the attention weights, our method introduces an
explicit refinement. In this way, the attention weights become more
concentrated and the attention-based recognition network achieves better
performance. The proposed Gaussian Constrained Refinement Module is flexible
and can be applied directly to existing attention-based methods. Experiments on
several benchmark datasets demonstrate the effectiveness of our method. Our
code is available at https://github.com/Pay20Y/GCAN.
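The abstract describes the refinement only at a high level. As a rough sketch of the idea (not the authors' implementation, which is in the linked repository), the PyTorch snippet below multiplies a 2D attention map by a Gaussian mask whose center and scale are assumed to be predicted by the refinement module; the function names, the normalized coordinate grid, and the multiplicative re-normalization are illustrative assumptions.

```python
import torch

def gaussian_mask(mu_x, mu_y, sigma_x, sigma_y, height, width):
    """Build a 2D Gaussian mask from predicted centers and scales.

    mu_*, sigma_* have shape (batch,); the returned mask has shape
    (batch, height, width) with values in (0, 1].
    """
    ys = torch.linspace(0, 1, height).view(1, height, 1)
    xs = torch.linspace(0, 1, width).view(1, 1, width)
    mu_x, mu_y = mu_x.view(-1, 1, 1), mu_y.view(-1, 1, 1)
    sigma_x, sigma_y = sigma_x.view(-1, 1, 1), sigma_y.view(-1, 1, 1)
    return torch.exp(-((xs - mu_x) ** 2 / (2 * sigma_x ** 2)
                       + (ys - mu_y) ** 2 / (2 * sigma_y ** 2)))

def refine_attention(attn, mu_x, mu_y, sigma_x, sigma_y):
    """Concentrate a diffuse attention map (batch, height, width) by
    masking it with the Gaussian and re-normalizing."""
    b, h, w = attn.shape
    refined = attn * gaussian_mask(mu_x, mu_y, sigma_x, sigma_y, h, w)
    return refined / refined.sum(dim=(1, 2), keepdim=True).clamp_min(1e-8)
```

Under this sketch, weights far from the predicted character center are exponentially down-weighted before re-normalization, which is one concrete way an explicit refinement can counteract attention diffusion.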
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement [68.31147013783387]
We observe that the attention mechanism is vulnerable to patch-based adversarial attacks.
In this paper, we propose a Robust Attention Mechanism (RAM) to improve the robustness of the semantic segmentation model.
arXiv Detail & Related papers (2024-01-03T13:58:35Z)
- RFAConv: Innovating Spatial Attention and Standard Convolutional Operation [7.2646541547165056]
We propose a novel attention mechanism called Receptive-Field Attention (RFA).
RFA not only focuses on the receptive-field spatial feature but also provides effective attention weights for large-size convolutional kernels.
It adds a nearly negligible amount of computational cost and parameters, while significantly improving network performance.
arXiv Detail & Related papers (2023-04-06T16:21:56Z)
- Where to Look: A Unified Attention Model for Visual Recognition with Reinforcement Learning [5.247711598719703]
We propose to unify the top-down and bottom-up attention together for recurrent visual attention.
Our model exploits image pyramids and Q-learning to select regions of interest in the top-down attention mechanism.
We train our model in an end-to-end reinforcement learning framework, and evaluate our method on visual classification tasks.
arXiv Detail & Related papers (2021-11-13T18:44:50Z)
- Alignment Attention by Matching Key and Query Distributions [48.93793773929006]
This paper introduces alignment attention that explicitly encourages self-attention to match the distributions of the key and query within each head.
It is simple to convert any model with self-attention, including pre-trained ones, to the proposed alignment attention.
On a variety of language understanding tasks, we show the effectiveness of our method in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
arXiv Detail & Related papers (2021-10-25T00:54:57Z)
- An Attention Module for Convolutional Neural Networks [5.333582981327498]
We propose an attention module for convolutional neural networks by developing an AW-convolution.
Experiments on several datasets for image classification and object detection tasks show the effectiveness of our proposed attention module.
arXiv Detail & Related papers (2021-08-18T15:36:18Z)
- Bayesian Attention Belief Networks [59.183311769616466]
Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks.
This paper introduces Bayesian attention belief networks, which construct a decoder network by modeling unnormalized attention weights.
We show that our method outperforms deterministic attention and state-of-the-art attention in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
arXiv Detail & Related papers (2021-06-09T17:46:22Z)
- SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Visualizing the attention maps of a pre-trained model is one direct way to understand the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
arXiv Detail & Related papers (2021-02-25T14:13:44Z)
- Unlocking Pixels for Reinforcement Learning via Implicit Attention [61.666538764049854]
We make use of new efficient attention algorithms, recently shown to be highly effective for Transformers.
This allows our attention-based controllers to scale to larger visual inputs and facilitates the use of smaller patches.
In addition, we propose a new efficient algorithm approximating softmax attention with what we call hybrid random features.
arXiv Detail & Related papers (2021-02-08T17:00:26Z)