Assessing the Impact of Attention and Self-Attention Mechanisms on the
Classification of Skin Lesions
- URL: http://arxiv.org/abs/2112.12748v1
- Date: Thu, 23 Dec 2021 18:02:48 GMT
- Title: Assessing the Impact of Attention and Self-Attention Mechanisms on the
Classification of Skin Lesions
- Authors: Rafael Pedro and Arlindo L. Oliveira
- Abstract summary: We focus on two forms of attention mechanisms: attention modules and self-attention.
Attention modules are used to reweight the features of each layer input tensor.
Self-Attention, originally proposed in the area of Natural Language Processing, makes it possible to relate all the items in an input sequence.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Attention mechanisms have raised significant interest in the research
community, since they promise significant improvements in the performance of
neural network architectures. However, in any specific problem, we still lack a
principled way to choose specific mechanisms and hyper-parameters that lead to
guaranteed improvements. More recently, self-attention has been proposed and
widely used in transformer-like architectures, leading to significant
breakthroughs in some applications. In this work we focus on two forms of
attention mechanisms: attention modules and self-attention. Attention modules
are used to reweight the features of each layer input tensor. Different modules
have different ways to perform this reweighting in fully connected or
convolutional layers. The attention models studied are completely modular and,
in this work, they are used with the popular ResNet architecture.
Self-Attention, originally proposed in the area of Natural Language Processing,
makes it possible to relate all the items in an input sequence. Self-Attention
is becoming increasingly popular in Computer Vision, where it is sometimes
combined with convolutional layers, although some recent architectures do away
entirely with convolutions. In this work, we study and perform an objective
comparison of a number of different attention mechanisms in a specific computer
vision task, the classification of samples in the widely used Skin Cancer MNIST
dataset. The results show that attention modules do sometimes improve the
performance of convolutional neural network architectures, but also that this
improvement, although noticeable and statistically significant, is not
consistent in different settings. The results obtained with self-attention
mechanisms, on the other hand, show consistent and significant improvements,
leading to the best results even in architectures with a reduced number of
parameters.
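As a concrete illustration of the two families compared above, the sketch below pairs a squeeze-and-excitation style channel-reweighting module (one common form of the attention modules that reweight a layer's input tensor) with a single-head spatial self-attention layer, in PyTorch. The class names, reduction factor, and residual wiring are illustrative assumptions, not the exact modules evaluated in the paper.

```python
# Illustrative only: generic stand-ins for the two mechanism families,
# not the specific modules evaluated in the paper.
import torch
import torch.nn as nn


class ChannelReweighting(nn.Module):
    """Squeeze-and-excitation style attention module: reweights the channels
    of a layer's input tensor with a small gating sub-network."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # (B, C, H, W) -> (B, C, 1, 1)
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),                                  # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)                            # reweight the input features


class SelfAttention2d(nn.Module):
    """Single-head self-attention over spatial positions, so every position
    in the feature map can relate to every other position."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, kernel_size=1)
        self.key = nn.Conv2d(channels, channels, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.scale = channels ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)       # (B, HW, C)
        k = self.key(x).flatten(2)                         # (B, C, HW)
        v = self.value(x).flatten(2).transpose(1, 2)       # (B, HW, C)
        attn = torch.softmax(q @ k * self.scale, dim=-1)   # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                     # residual connection


if __name__ == "__main__":
    x = torch.randn(2, 64, 28, 28)                         # toy ResNet-stage feature map
    print(ChannelReweighting(64)(x).shape)                 # torch.Size([2, 64, 28, 28])
    print(SelfAttention2d(64)(x).shape)                    # torch.Size([2, 64, 28, 28])
```

Both modules preserve the input shape, so either can be dropped after a convolutional stage of a ResNet-style backbone without further changes.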
Related papers
- A Primal-Dual Framework for Transformers and Neural Networks [52.814467832108875]
Self-attention is key to the remarkable success of transformers in sequence modeling tasks.
We show that self-attention corresponds to the support vector expansion derived from a support vector regression problem.
We propose two new attentions: Batch Normalized Attention (Attention-BN) and Attention with Scaled Head (Attention-SH).
arXiv Detail & Related papers (2024-06-19T19:11:22Z)
- When Medical Imaging Met Self-Attention: A Love Story That Didn't Quite Work Out [8.113092414596679]
We extend two widely adopted convolutional architectures with different self-attention variants on two different medical datasets.
We observe no significant improvement in balanced accuracy over fully convolutional models.
We also find that important features, such as dermoscopic structures in skin lesion images, are still not learned by employing self-attention.
arXiv Detail & Related papers (2024-04-18T16:18:41Z)
- Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis [93.0013343535411]
We propose a novel type of analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim)
We show that adding STAC modules to ResNet style architectures can result in up to a 1.6% increase in top-1 accuracy.
Results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module resulting in competitive performance.
arXiv Detail & Related papers (2023-06-16T18:29:26Z)
- Convolution-enhanced Evolving Attention Networks [41.684265133316096]
Evolving Attention-enhanced Dilated Convolutional (EA-DC-) Transformer outperforms state-of-the-art models significantly.
This is the first work that explicitly models the layer-wise evolution of attention maps.
arXiv Detail & Related papers (2022-12-16T08:14:04Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Parameter-Free Average Attention Improves Convolutional Neural Network Performance (Almost) Free of Charge [0.0]
We introduce a parameter-free attention mechanism called PfAAM, a simple yet effective module.
PfAAM can be plugged into various convolutional neural network architectures with little computational overhead and without affecting model size.
This demonstrates its wide applicability as a general easy-to-use module for computer vision tasks.
arXiv Detail & Related papers (2022-10-14T13:56:43Z)
- Switchable Self-attention Module [3.8992324495848356]
We propose a self-attention module, SEM.
Based on the input information of the attention module and alternative attention operators, SEM can automatically select and integrate attention operators to compute attention maps.
The effectiveness of SEM is demonstrated by extensive experiments on widely used benchmark datasets and popular self-attention networks.
arXiv Detail & Related papers (2022-09-13T01:19:38Z)
- Learning Target-aware Representation for Visual Tracking via Informative Interactions [49.552877881662475]
We introduce a novel backbone architecture to improve the target-perception ability of feature representations for tracking.
The proposed GIM module and InBN mechanism are general and applicable to different backbone types including CNN and Transformer.
arXiv Detail & Related papers (2022-01-07T16:22:27Z)
- Perceiver: General Perception with Iterative Attention [85.65927856589613]
We introduce the Perceiver - a model that builds upon Transformers.
We show that this architecture performs competitively or beyond strong, specialized models on classification tasks.
It also surpasses state-of-the-art results for all modalities in AudioSet.
arXiv Detail & Related papers (2021-03-04T18:20:50Z)
- Attention that does not Explain Away [54.42960937271612]
Models based on the Transformer architecture have achieved better accuracy than those based on competing architectures for a large set of tasks.
A unique feature of the Transformer is its universal application of a self-attention mechanism, which allows for free information flow at arbitrary distances.
We propose a doubly-normalized attention scheme that is simple to implement and provides theoretical guarantees for avoiding the "explaining away" effect.
arXiv Detail & Related papers (2020-09-29T21:05:39Z)
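The entry above names a doubly-normalized attention scheme but not its form. As a rough illustration of what doubly normalizing an attention matrix can look like, the sketch below first normalizes the attention logits over queries (so every key receives some attention mass) and then renormalizes over keys; this column-then-row formulation is an assumption made here for illustration and may differ from the exact scheme in that paper.

```python
# Hypothetical column-then-row doubly-normalized attention; illustrative only,
# not the exact formulation of "Attention that does not Explain Away".
import torch


def doubly_normalized_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                                eps: float = 1e-9) -> torch.Tensor:
    """q: (B, n_queries, d); k, v: (B, n_keys, d)."""
    logits = q @ k.transpose(1, 2) / q.shape[-1] ** 0.5  # (B, n_queries, n_keys)
    # 1) normalize over queries: each key distributes one unit of attention
    col = torch.softmax(logits, dim=1)
    # 2) renormalize over keys so each query's weights sum to one
    row = col / (col.sum(dim=-1, keepdim=True) + eps)
    return row @ v                                       # (B, n_queries, d)
```

In this sketch, step 1 forces every key's column to sum to one before the row renormalization, so no key can be entirely ignored by the attention map, which is the intuition behind avoiding the "explaining away" effect.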
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.