DAS: A Deformable Attention to Capture Salient Information in CNNs
- URL: http://arxiv.org/abs/2311.12091v1
- Date: Mon, 20 Nov 2023 18:49:58 GMT
- Title: DAS: A Deformable Attention to Capture Salient Information in CNNs
- Authors: Farzad Salajegheh, Nader Asadi, Soroush Saryazdi, Sudhir Mudur
- Abstract summary: Self-attention can improve a model's access to global information but increases computational overhead.
We present a fast and simple fully convolutional method called DAS that helps focus attention on relevant information.
- Score: 2.321323878201932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional Neural Networks (CNNs) excel in local spatial pattern
recognition. For many vision tasks, such as object recognition and
segmentation, salient information is also present outside the CNN's kernel
boundaries. However, CNNs struggle to capture such relevant information due
to their confined receptive fields. Self-attention can improve a model's access
to global information but increases computational overhead. We present a fast
and simple fully convolutional method called DAS that helps focus attention on
relevant information. It uses deformable convolutions to locate pertinent
image regions and separable convolutions for efficiency. DAS plugs
into existing CNNs and propagates relevant information using a gating
mechanism. Compared to the O(n^2) computational complexity of transformer-style
attention, DAS is O(n). Our claim is that DAS's ability to pay increased
attention to relevant features results in performance improvements when added
to popular CNNs for Image Classification and Object Detection. For example, DAS
yields an improvement on Stanford Dogs (4.47%), ImageNet (1.91%), and COCO AP
(3.3%) with base ResNet50 backbone. This outperforms other CNN attention
mechanisms while using similar or fewer FLOPs. Our code will be publicly
available.
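As a rough illustration of the gating idea described in the abstract, the sketch below uses a depthwise-separable convolution to produce a per-pixel attention map that rescales the input features. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the deformable-convolution sampling step (which the paper uses to locate pertinent regions) is omitted, and all function names and tensor shapes here are illustrative.

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernel, pw_weight):
    """Depthwise 3x3 convolution followed by a pointwise (1x1) convolution.

    x:         (C, H, W) input feature map
    dw_kernel: (C, 3, 3) one 3x3 filter per input channel
    pw_weight: (C_out, C) 1x1 channel-mixing weights
    """
    C, H, W = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # zero-pad spatial dims
    dw = np.zeros_like(x)
    for c in range(C):                 # depthwise: each channel filtered independently
        for i in range(H):
            for j in range(W):
                dw[c, i, j] = np.sum(padded[c, i:i + 3, j:j + 3] * dw_kernel[c])
    # pointwise: mix channels with a 1x1 conv, i.e. a matrix product over pixels
    return (pw_weight @ dw.reshape(C, -1)).reshape(-1, H, W)

def das_style_gate(x, dw_kernel, pw_weight):
    """DAS-style gating sketch: a separable conv yields a single-channel
    attention map in (0, 1) that rescales every input channel."""
    logits = depthwise_separable_conv(x, dw_kernel, pw_weight)  # (1, H, W)
    gate = 1.0 / (1.0 + np.exp(-logits))                        # sigmoid gate
    return x * gate                                             # broadcast over channels

# Illustrative shapes: 4 channels, 8x8 spatial resolution.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
dw = rng.standard_normal((4, 3, 3)) * 0.1
pw = rng.standard_normal((1, 4)) * 0.1
y = das_style_gate(x, dw, pw)
```

Because the gate is computed by convolutions alone, its cost grows linearly with the number of pixels, matching the O(n) claim; the paper's full method additionally learns per-location sampling offsets via deformable convolution before this gating step.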
Related papers
- OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve.
We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap.
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
- SECNN: Squeeze-and-Excitation Convolutional Neural Network for Sentence Classification [0.0]
Convolutional neural networks (CNNs) can extract n-gram features through convolutional filters.
We propose a Squeeze-and-Excitation Convolutional neural Network (SECNN) for sentence classification.
arXiv Detail & Related papers (2023-12-11T03:26:36Z)
- A novel feature-scrambling approach reveals the capacity of convolutional neural networks to learn spatial relations [0.0]
Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition.
Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans.
arXiv Detail & Related papers (2022-12-12T16:40:29Z)
- Learning to ignore: rethinking attention in CNNs [87.01305532842878]
We propose to reformulate the attention mechanism in CNNs to learn to ignore instead of learning to attend.
Specifically, we propose to explicitly learn irrelevant information in the scene and suppress it in the produced representation.
arXiv Detail & Related papers (2021-11-10T13:47:37Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network nor modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- 3D CNNs with Adaptive Temporal Feature Resolutions [83.43776851586351]
The Similarity Guided Sampling (SGS) module can be plugged into any existing 3D CNN architecture.
SGS empowers 3D CNNs by learning the similarity of temporal features and grouping similar features together.
Our evaluations show that the proposed module improves the state-of-the-art by reducing the computational cost (GFLOPs) by half while preserving or even improving the accuracy.
arXiv Detail & Related papers (2020-11-17T14:34:05Z)
- A CNN-based Feature Space for Semi-supervised Incremental Learning in Assisted Living Applications [2.1485350418225244]
We propose using the feature space that results from the training dataset to automatically label problematic images.
The resulting semi-supervised incremental learning process improves classification accuracy on new instances by 40%.
arXiv Detail & Related papers (2020-11-11T12:31:48Z)
- Decoding CNN based Object Classifier Using Visualization [6.666597301197889]
We visualize what type of features are extracted in different convolution layers of CNN.
Visualizing the heat map of activations helps us understand how a CNN classifies and localizes different objects in an image.
arXiv Detail & Related papers (2020-07-15T05:01:27Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embedding of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.