Convolutional Rectangular Attention Module
- URL: http://arxiv.org/abs/2503.10875v1
- Date: Thu, 13 Mar 2025 20:41:36 GMT
- Title: Convolutional Rectangular Attention Module
- Authors: Hai-Vy Nguyen, Fabrice Gamboa, Sixin Zhang, Reda Chhaibi, Serge Gratton, Thierry Giaccone
- Abstract summary: We introduce a novel spatial attention module that can be integrated into any convolutional network. This module guides the model to pay attention to the most discriminative part of an image.
- Score: 3.3975558777609915
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we introduce a novel spatial attention module that can be integrated into any convolutional network. This module guides the model to pay attention to the most discriminative part of an image, enabling better performance through end-to-end training. In standard approaches, a spatial attention map is generated in a position-wise fashion. We observe that this results in very irregular boundaries, which can make it difficult to generalize to new samples. In our method, the attention region is constrained to be rectangular. This rectangle is parametrized by only 5 parameters, allowing for better stability and generalization to new samples. In our experiments, our method systematically outperforms the position-wise counterpart, thus providing a novel and useful spatial attention mechanism for convolutional models. Besides, our module also provides interpretability concerning the "where to look" question, as it helps identify the part of the input on which the model focuses to produce its prediction.
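The abstract does not spell out the five rectangle parameters. A natural choice is a center, two half-extents, and a rotation angle, with sigmoids yielding a differentiable (hence trainable) mask. The sketch below is our own illustration of that idea, not the paper's implementation; the function name, the parametrization, and the sigmoid sharpness are all assumptions:

```python
import numpy as np

def rectangular_attention_mask(h, w, cx, cy, rw, rh, theta, sharpness=10.0):
    """Soft rectangular mask over an h x w grid.

    The five parameters are an assumed parametrization: center (cx, cy),
    half-extents (rw, rh), and rotation theta. The paper only states that
    the rectangle has 5 parameters.
    """
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # Coordinates relative to the rectangle center, rotated into its frame.
    dx, dy = xs - cx, ys - cy
    u = np.cos(theta) * dx + np.sin(theta) * dy
    v = -np.sin(theta) * dx + np.cos(theta) * dy

    def soft_box(t, r):
        # ~1 inside [-r, r], smoothly decaying to 0 outside.
        sig = lambda z: 1.0 / (1.0 + np.exp(-z))
        return sig(sharpness * (t + r)) * sig(sharpness * (r - t))

    return soft_box(u, rw) * soft_box(v, rh)

mask = rectangular_attention_mask(32, 32, cx=16.0, cy=16.0, rw=6.0, rh=4.0, theta=0.0)
# feature_map * mask[None, :, :] would gate a CNN feature map spatially.
```

Because every operation is smooth, the five rectangle parameters could be produced by a small sub-network and trained end-to-end, which is the mechanism the abstract describes.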
Related papers
- Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes [19.987151025364067]
This paper presents a new semi-supervised method for training a reliable crowd counting model.
We foster the model's intrinsic 'subitizing' capability, which allows it to accurately estimate the count in regions.
Our method achieves the state-of-the-art performance, surpassing previous approaches by a large margin on challenging benchmarks.
arXiv Detail & Related papers (2023-10-16T12:42:43Z)
- Exploring the Space of Key-Value-Query Models with Intention [8.585795909956726]
Two key components of Attention are the structure of its input (which consists of keys, values and queries) and the computations by which these three are combined.
We refer to this space as Keys-Values-Queries (KVQ) Space.
Our goal is to determine whether there are any other stackable models in KVQ Space that Attention cannot efficiently approximate.
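For reference, the standard way keys, values, and queries are combined is scaled dot-product attention; a minimal NumPy sketch of that baseline (our own illustration, not code from the paper):

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention: the standard combination of
    queries, keys, and values that the KVQ framing generalizes."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Numerically stable row-wise softmax over the keys.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4))   # 3 queries of dimension 4
k = rng.normal(size=(5, 4))   # 5 keys of dimension 4
v = rng.normal(size=(5, 2))   # 5 values of dimension 2
out = attention(q, k, v)      # shape (3, 2)
```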
arXiv Detail & Related papers (2023-05-17T13:25:57Z)
- Unsupervised Deep Learning Meets Chan-Vese Model [77.24463525356566]
We propose an unsupervised image segmentation approach that integrates the Chan-Vese (CV) model with deep neural networks.
Our basic idea is to apply a deep neural network that maps the image into a latent space to alleviate the violation of the piecewise constant assumption in image space.
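As background, the classical two-phase Chan-Vese model scores a segmentation by how well each region is described by a single constant. A minimal sketch of that data term (the function name and the omission of the length regularizer are our simplifications; this illustrates the classical model, not the paper's network):

```python
import numpy as np

def chan_vese_energy(image, mask, lam1=1.0, lam2=1.0):
    """Two-phase piecewise-constant Chan-Vese data term.

    `mask` is a boolean foreground mask; the regularizing length term
    is omitted for brevity.
    """
    inside = image[mask]
    outside = image[~mask]
    c1 = inside.mean() if inside.size else 0.0
    c2 = outside.mean() if outside.size else 0.0
    # Violations of the piecewise-constant assumption raise the energy.
    return lam1 * ((inside - c1) ** 2).sum() + lam2 * ((outside - c2) ** 2).sum()
```

When an image violates the piecewise-constant assumption, this energy cannot be driven to zero in image space, which motivates the paper's idea of evaluating the model in a learned latent space instead.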
arXiv Detail & Related papers (2022-04-14T13:23:57Z)
- Rethinking the Zigzag Flattening for Image Reading [48.976491898131265]
We investigate the Hilbert fractal flattening (HF) as another method for sequence ordering in computer vision.
The HF has proven to be superior to other curves in maintaining spatial locality.
It can be easily plugged into most deep neural networks (DNNs).
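To illustrate the spatial locality a Hilbert ordering preserves, the standard distance-to-coordinate conversion for a Hilbert curve can be sketched as follows (our own illustration, not the paper's code; unlike a zigzag scan, consecutive indices always map to grid-adjacent pixels):

```python
def hilbert_d2xy(order, d):
    """Map a distance d along a Hilbert curve of the given order to an
    (x, y) grid coordinate on a (2**order) x (2**order) grid.

    Classic iterative algorithm; each scale inspects two bits of d and
    rotates/reflects the quadrant accordingly.
    """
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Reflect within the current quadrant.
                x, y = s - 1 - x, s - 1 - y
            # Rotate by swapping axes.
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

order_1 = [hilbert_d2xy(1, d) for d in range(4)]  # (0,0), (0,1), (1,1), (1,0)
```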
arXiv Detail & Related papers (2022-02-21T13:53:04Z)
- Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
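The generic saving that low-rank constraints buy can be seen in a toy example: applying a rank-r n x n matrix as two thin factors costs O(nr) per matrix-vector product instead of O(n^2). This is our own illustration of the general principle, not the paper's structured models:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 512, 16
U = rng.normal(size=(n, r)) / np.sqrt(r)   # thin left factor
V = rng.normal(size=(r, n)) / np.sqrt(n)   # thin right factor
x = rng.normal(size=n)

full = (U @ V) @ x   # O(n^2) if the dense n x n matrix were materialized
fast = U @ (V @ x)   # O(n r): the low-rank route, same result

assert np.allclose(full, fast)
```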
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
- Style Curriculum Learning for Robust Medical Image Segmentation [62.02435329931057]
Deep segmentation models often degrade due to distribution shifts in image intensities between the training and test data sets.
We propose a novel framework to ensure robust segmentation in the presence of such distribution shifts.
arXiv Detail & Related papers (2021-08-01T08:56:24Z)
- Bayesian Attention Modules [65.52970388117923]
We propose a scalable version of attention that is easy to implement and optimize.
Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
arXiv Detail & Related papers (2020-10-20T20:30:55Z)
- S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic architectures capable of simultaneously exploiting both modular and temporal structure.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z)
- Neural Subdivision [58.97214948753937]
This paper introduces Neural Subdivision, a novel framework for data-driven coarse-to-fine geometry modeling.
We optimize for the same set of network weights across all local mesh patches, thus providing an architecture that is not constrained to a specific input mesh, fixed genus, or category.
We demonstrate that even when trained on a single high-resolution mesh our method generates reasonable subdivisions for novel shapes.
arXiv Detail & Related papers (2020-05-04T20:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.