AASeg: Attention Aware Network for Real Time Semantic Segmentation
- URL: http://arxiv.org/abs/2108.04349v2
- Date: Wed, 11 Aug 2021 06:05:36 GMT
- Title: AASeg: Attention Aware Network for Real Time Semantic Segmentation
- Authors: Abhinav Sagar
- Abstract summary: We present a new network named Attention Aware Network (AASeg) for real time semantic image segmentation.
Our network incorporates and channel information using Spatial Attention (SA) and Channel Attention (CA) Mean modules.
We demonstrate the effectiveness of our method using a comprehensive analysis, quantitative experimental results and ablation study using Cityscapes, ADE20K and Camvid datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a new network named Attention Aware Network (AASeg)
for real time semantic image segmentation. Our network incorporates spatial and
channel information using Spatial Attention (SA) and Channel Attention (CA)
modules respectively. It also uses dense local multi-scale context information
using Multi Scale Context (MSC) module. The feature maps are concatenated
individually to produce the final segmentation map. We demonstrate the
effectiveness of our method using a comprehensive analysis, quantitative
experimental results and ablation study using Cityscapes, ADE20K and Camvid
datasets. Our network performs better than most previous architectures with a
74.4\% Mean IOU on Cityscapes test dataset while running at 202.7 FPS.
Related papers
- Cross-CBAM: A Lightweight network for Scene Segmentation [2.064612766965483]
We present the Cross-CBAM network, a novel lightweight network for real-time semantic segmentation.
In experiments on the Cityscapes dataset and Camvid dataset, we achieve 73.4% mIoU with a speed of 240.9FPS and 77.2% mIoU with a speed of 88.6FPS on NVIDIA GTX 1080Ti.
arXiv Detail & Related papers (2023-06-04T09:03:05Z) - A Self-Training Framework Based on Multi-Scale Attention Fusion for
Weakly Supervised Semantic Segmentation [7.36778096476552]
We propose a self-training method that utilizes fused multi-scale class-aware attention maps.
We collect information from attention maps of different scales and obtain multi-scale attention maps.
We then apply denoising and reactivation strategies to enhance the potential regions and reduce noisy areas.
arXiv Detail & Related papers (2023-05-10T02:16:12Z) - De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z) - Deep Multi-Branch Aggregation Network for Real-Time Semantic
Segmentation in Street Scenes [32.54045305607654]
Many state-of-the-art real-time semantic segmentation methods tend to sacrifice some spatial details or contextual information for fast inference.
We propose a novel Deep Multi-branch Aggregation Network (called DMA-Net) based on the encoder-decoder structure to perform real-time semantic segmentation in street scenes.
Our proposed DMA-Net respectively obtains 77.0% and 73.6% mean Intersection over Union (mIoU) at the inference speed of 46.7 FPS and 119.8 FPS by only using a single NVIDIA GTX 1080Ti GPU.
arXiv Detail & Related papers (2022-03-08T12:07:32Z) - Residual Moment Loss for Medical Image Segmentation [56.72261489147506]
Location information is proven to benefit the deep learning models on capturing the manifold structure of target objects.
Most existing methods encode the location information in an implicit way, for the network to learn.
We propose a novel loss function, namely residual moment (RM) loss, to explicitly embed the location information of segmentation targets.
arXiv Detail & Related papers (2021-06-27T09:31:49Z) - Prototypical Cross-Attention Networks for Multiple Object Tracking and
Segmentation [95.74244714914052]
Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes.
We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich-temporal information online.
PCAN outperforms current video instance tracking and segmentation competition winners on Youtube-VIS and BDD100K datasets.
arXiv Detail & Related papers (2021-06-22T17:57:24Z) - CTNet: Context-based Tandem Network for Semantic Segmentation [77.4337867789772]
This work proposes a novel Context-based Tandem Network (CTNet) by interactively exploring the spatial contextual information and the channel contextual information.
To further improve the performance of the learned representations for semantic segmentation, the results of the two context modules are adaptively integrated.
arXiv Detail & Related papers (2021-04-20T07:33:11Z) - STEP: Segmenting and Tracking Every Pixel [107.23184053133636]
We present a new benchmark: Segmenting and Tracking Every Pixel (STEP)
Our work is the first that targets this task in a real-world setting that requires dense interpretation in both spatial and temporal domains.
For measuring the performance, we propose a novel evaluation metric and Tracking Quality (STQ)
arXiv Detail & Related papers (2021-02-23T18:43:02Z) - Semantic Segmentation With Multi Scale Spatial Attention For Self
Driving Cars [2.7317088388886384]
We present a novel neural network using multi scale feature fusion at various scales for accurate and efficient semantic image segmentation.
We used ResNet based feature extractor, dilated convolutional layers in downsampling part, atrous convolutional layers in the upsampling part and used concat operation to merge them.
A new attention module is proposed to encode more contextual information and enhance the receptive field of the network.
arXiv Detail & Related papers (2020-06-30T20:19:09Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves the accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) with the inference speed of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.