Dense Prediction with Attentive Feature Aggregation
- URL: http://arxiv.org/abs/2111.00770v1
- Date: Mon, 1 Nov 2021 08:44:45 GMT
- Title: Dense Prediction with Attentive Feature Aggregation
- Authors: Yung-Hsu Yang, Thomas E. Huang, Samuel Rota Bulò, Peter Kontschieder, Fisher Yu
- Abstract summary: We introduce Attentive Feature Aggregation (AFA) to fuse different network layers with more expressive non-linear operations.
AFA exploits both spatial and channel attention to compute a weighted average of the layer activations.
Our experiments show consistent and significant improvements on challenging semantic segmentation benchmarks.
- Score: 26.205279570906473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Aggregating information from features across different layers is an essential
operation for dense prediction models. Despite its limited expressiveness,
feature concatenation dominates the choice of aggregation operations. In this
paper, we introduce Attentive Feature Aggregation (AFA) to fuse different
network layers with more expressive non-linear operations. AFA exploits both
spatial and channel attention to compute a weighted average of the layer
activations. Inspired by neural volume rendering, we extend AFA with
Scale-Space Rendering (SSR) to perform late fusion of multi-scale predictions.
AFA is applicable to a wide range of existing network designs. Our experiments
show consistent and significant improvements on challenging semantic
segmentation benchmarks, including Cityscapes, BDD100K, and Mapillary Vistas,
at negligible computational and parameter overhead. In particular, AFA improves
the performance of the Deep Layer Aggregation (DLA) model by nearly 6% mIoU on
Cityscapes. Our experimental analyses show that AFA learns to progressively
refine segmentation maps and to improve boundary details, leading to new
state-of-the-art results on boundary detection benchmarks on BSDS500 and
NYUDv2. Code and video resources are available at http://vis.xyz/pub/dla-afa.
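As an illustration of the attention-weighted averaging described in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. The function name attentive_fuse, the projection parameters w_spatial and w_channel, and the choice to multiply the spatial and channel attention into a single gate are all assumptions made for this example; the actual AFA modules are learned and may be structured differently.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function mapping logits to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def attentive_fuse(shallow, deep, w_spatial, w_channel):
    """Fuse two (C, H, W) feature maps with an attention-weighted average.

    shallow, deep : arrays of shape (C, H, W) from two network layers.
    w_spatial     : (C,) hypothetical projection giving one logit per pixel.
    w_channel     : (C, C) hypothetical projection giving one logit per channel.
    """
    # Spatial attention: one weight per pixel, computed from the shallow map.
    spatial_logits = np.tensordot(w_spatial, shallow, axes=([0], [0]))  # (H, W)
    a_spatial = sigmoid(spatial_logits)[None, :, :]                     # (1, H, W)

    # Channel attention: one weight per channel, computed from the
    # globally average-pooled deep map.
    pooled = deep.mean(axis=(1, 2))                                     # (C,)
    a_channel = sigmoid(w_channel @ pooled)[:, None, None]              # (C, 1, 1)

    # Assumption: combine both attentions into a single gate in (0, 1).
    gate = a_spatial * a_channel                                        # (C, H, W)

    # Convex combination: every output value lies between the two inputs.
    return gate * shallow + (1.0 - gate) * deep
```

Because the gate lies strictly between 0 and 1, the result is a per-element weighted average of the two inputs. This is the property that distinguishes it from plain concatenation or summation.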
Related papers
- High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z) - Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of
Multiclass Defect Segmentation [1.487252325779766]
The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation.
DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to enhance precision in object localization.
Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and a benchmark dataset validate DAU-FI Net's capabilities.
arXiv Detail & Related papers (2023-12-21T17:23:49Z) - DiffSpectralNet : Unveiling the Potential of Diffusion Models for
Hyperspectral Image Classification [6.521187080027966]
We propose a new network called DiffSpectralNet, which combines diffusion and transformer techniques.
First, we use an unsupervised learning framework based on the diffusion model to extract both high-level and low-level spectral-spatial features.
The diffusion method is capable of extracting diverse and meaningful spectral-spatial features, leading to improvement in HSI classification.
arXiv Detail & Related papers (2023-10-29T15:26:37Z) - Physics Inspired Hybrid Attention for SAR Target Recognition [61.01086031364307]
We propose a physics inspired hybrid attention (PIHA) mechanism and the once-for-all (OFA) evaluation protocol to address the issues.
PIHA leverages the high-level semantics of physical information to activate and guide the feature group aware of local semantics of target.
Our method outperforms other state-of-the-art approaches in 12 test scenarios with the same ASC parameters.
arXiv Detail & Related papers (2023-09-27T14:39:41Z) - Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures to improve them, and it achieves a state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Learning Granularity-Aware Convolutional Neural Network for Fine-Grained
Visual Classification [0.0]
We propose a novel Granularity-Aware Convolutional Neural Network (GA-CNN) that progressively explores discriminative features.
GA-CNN does not need bounding boxes/part annotations and can be trained end-to-end.
Our approach achieves state-of-the-art performances on three benchmark datasets.
arXiv Detail & Related papers (2021-03-04T02:18:07Z) - Lightweight Single-Image Super-Resolution Network with Attentive
Auxiliary Feature Learning [73.75457731689858]
We develop a computation efficient yet accurate network based on the proposed attentive auxiliary features (A$^2$F) for SISR.
Experimental results on a large-scale dataset demonstrate the effectiveness of the proposed model against the state-of-the-art (SOTA) SR methods.
arXiv Detail & Related papers (2020-11-13T06:01:46Z) - Semantic Segmentation With Multi Scale Spatial Attention For Self
Driving Cars [2.7317088388886384]
We present a novel neural network using multi scale feature fusion at various scales for accurate and efficient semantic image segmentation.
We use a ResNet-based feature extractor, dilated convolutional layers in the downsampling part, atrous convolutional layers in the upsampling part, and a concat operation to merge them.
A new attention module is proposed to encode more contextual information and enhance the receptive field of the network.
arXiv Detail & Related papers (2020-06-30T20:19:09Z) - Improving Deep Hyperspectral Image Classification Performance with
Spectral Unmixing [3.84448093764973]
We propose an abundance-based multi-HSI classification method.
First, we convert every HSI from the spectral domain to the abundance domain by a dataset-specific autoencoder.
Second, the abundance representations from multiple HSIs are collected to form an enlarged dataset.
arXiv Detail & Related papers (2020-04-01T17:14:05Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)