Distance Guided Channel Weighting for Semantic Segmentation
- URL: http://arxiv.org/abs/2004.12679v4
- Date: Fri, 13 May 2022 08:55:09 GMT
- Title: Distance Guided Channel Weighting for Semantic Segmentation
- Authors: Xuanyi Liu, Lanyun Zhu, Shiping Zhu, Li Luo
- Abstract summary: We introduce the Distance Guided Channel Weighting (DGCW) Module.
The DGCW module is constructed in a pixel-wise context extraction manner.
We propose the Distance Guided Channel Weighting Network (DGCWNet)
- Score: 4.10724123131976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works have achieved great success in improving the performance of
multiple computer vision tasks by capturing features with a high channel number
utilizing deep neural networks. However, many channels of extracted features
are not discriminative and contain a lot of redundant information. In this
paper, we address this issue by introducing the Distance Guided Channel
Weighting (DGCW) Module. The DGCW module is constructed in a pixel-wise context
extraction manner, which enhances the discriminativeness of features by
weighting different channels of each pixel's feature vector when modeling its
relationship with other pixels. It makes full use of the highly discriminative
information while ignoring the low-discriminative information contained in
feature maps, and also captures long-range dependencies. Furthermore, by
incorporating the DGCW module with a baseline segmentation network, we propose
the Distance Guided Channel Weighting Network (DGCWNet). We conduct extensive
experiments to demonstrate the effectiveness of DGCWNet. In particular, it
achieves 81.6% mIoU on Cityscapes with only fine annotated data for training,
and also gains satisfactory performance on two other semantic segmentation
datasets, i.e., Pascal Context and ADE20K. Code will be available soon at
https://github.com/LanyunZhu/DGCWNet.
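The abstract does not give the DGCW module's exact formulation, but the idea it describes, weighting the channels of each pixel's feature vector and then modeling that pixel's relationship with all others, can be sketched in a few lines. The following minimal numpy sketch is an illustration under that reading, not the paper's implementation; `dgcw_sketch` and `gate_w` are hypothetical names.

```python
import numpy as np

def dgcw_sketch(feat, gate_w):
    """Illustrative pixel-wise channel weighting (NOT the paper's exact method).

    feat:   (N, C) feature vectors, one per pixel (N = H * W)
    gate_w: (C, C) weights of a hypothetical learned gating layer
    """
    # Per-pixel channel weights: a sigmoid gate computed from the pixel's own feature,
    # emphasizing discriminative channels and suppressing redundant ones.
    gates = 1.0 / (1.0 + np.exp(-(feat @ gate_w)))   # (N, C), values in (0, 1)
    weighted = feat * gates                          # channel-weighted features

    # Long-range dependencies: pairwise affinity between the weighted pixel features,
    # normalized row-wise with a softmax.
    affinity = weighted @ weighted.T                 # (N, N)
    attn = np.exp(affinity - affinity.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)

    # Each pixel aggregates context from every other pixel.
    return attn @ feat                               # (N, C)
```

Computing the gate from each pixel's own feature keeps the weighting pixel-wise, which is the property the abstract emphasizes over a single global per-channel weight.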
Related papers
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut [62.63481844384229]
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.
In this paper, we use a diffusion UNet encoder as a foundation vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method.
Our work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks.
arXiv Detail & Related papers (2024-06-05T01:32:31Z)
- Efficient Multi-Scale Attention Module with Cross-Spatial Learning [4.046170185945849]
A novel efficient multi-scale attention (EMA) module is proposed.
We focus on retaining per-channel information while decreasing the computational overhead.
We conduct extensive ablation studies and experiments on image classification and object detection tasks.
arXiv Detail & Related papers (2023-05-23T00:35:47Z)
- DPANet: Dual Pooling Attention Network for Semantic Segmentation [0.0]
We propose a lightweight and flexible neural network named Dual Pooling Attention Network (DPANet).
The first component is the spatial pooling attention module, which formulates a simple yet powerful method for densely extracting contextual characteristics.
The second component is the channel pooling attention module, which aims to strip out redundant channel information, construct relationships among all channels, and selectively heighten the semantic information of different channels.
arXiv Detail & Related papers (2022-10-11T13:29:33Z)
- EAA-Net: Rethinking the Autoencoder Architecture with Intra-class Features for Medical Image Segmentation [4.777011444412729]
We propose a lightweight end-to-end segmentation framework based on multi-task learning, termed Edge Attention autoencoder Network (EAA-Net).
Our approach not only utilizes the segmentation network to obtain inter-class features, but also applies the reconstruction network to extract intra-class features among the foregrounds.
Experimental results show that our method performs well in medical image segmentation tasks.
arXiv Detail & Related papers (2022-08-19T07:42:55Z)
- Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z)
- Channel-wise Knowledge Distillation for Dense Prediction [73.99057249472735]
We propose to align features channel-wise between the student and teacher networks.
We consistently achieve superior performance on three benchmarks with various network structures.
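One common way to align features channel-wise between a student and a teacher is to normalize each channel's spatial activations into a distribution and compare the distributions with KL divergence. The numpy sketch below illustrates that general idea; it is not necessarily the cited paper's exact loss, and `channelwise_kd_loss` is a hypothetical name.

```python
import numpy as np

def channelwise_kd_loss(student, teacher, tau=1.0):
    """Sketch of a channel-wise alignment loss (an illustration of the idea,
    not necessarily the cited paper's exact formulation).

    student, teacher: (C, H*W) dense feature maps, flattened over space
    tau: softmax temperature
    """
    def channel_softmax(x):
        # Turn each channel's spatial activations into a probability distribution.
        e = np.exp((x - x.max(axis=1, keepdims=True)) / tau)
        return e / e.sum(axis=1, keepdims=True)

    p_t = channel_softmax(teacher)   # teacher's per-channel distributions
    p_s = channel_softmax(student)   # student's per-channel distributions
    # KL(teacher || student) per channel, averaged over channels.
    kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(axis=1)
    return kl.mean()
```

The loss is zero when the two feature maps match channel by channel and grows as the student's per-channel spatial distributions drift from the teacher's.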
arXiv Detail & Related papers (2020-11-26T12:00:38Z)
- Multi-Attention-Network for Semantic Segmentation of Fine Resolution Remote Sensing Images [10.835342317692884]
The accuracy of semantic segmentation in remote sensing images has been increased significantly by deep convolutional neural networks.
This paper proposes a Multi-Attention-Network (MANet) to address these issues.
A novel attention mechanism of kernel attention with linear complexity is proposed to alleviate the large computational demand in attention.
arXiv Detail & Related papers (2020-09-03T09:08:02Z)
- MACU-Net for Semantic Segmentation of Fine-Resolution Remotely Sensed Images [11.047174552053626]
MACU-Net is a multi-scale skip connected and asymmetric-convolution-based U-Net for fine-resolution remotely sensed images.
Our design has the following advantages: (1) The multi-scale skip connections combine and realign semantic features contained in both low-level and high-level feature maps; (2) the asymmetric convolution block strengthens the feature representation and feature extraction capability of a standard convolution layer.
Experiments conducted on two remotely sensed datasets demonstrate that the proposed MACU-Net transcends the U-Net, U-NetPPL, U-Net 3+, amongst other benchmark approaches.
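An asymmetric convolution block of the kind this entry describes typically strengthens a standard square convolution by adding parallel 1×k and k×1 branches and summing their outputs. The numpy sketch below illustrates that idea under this assumption; it is not the MACU-Net implementation, and the function names are hypothetical.

```python
import numpy as np

def conv2d(x, k):
    """'Same'-padded 2-D cross-correlation of a single-channel map x with kernel k."""
    kh, kw = k.shape
    h, w = x.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (xp[i:i + kh, j:j + kw] * k).sum()
    return out

def asymmetric_conv_block(x, k_sq, k_1xk, k_kx1):
    """Sum of a square conv branch and two asymmetric (1xk and kx1) branches,
    strengthening the horizontal and vertical 'skeleton' of the square kernel."""
    return conv2d(x, k_sq) + conv2d(x, k_1xk) + conv2d(x, k_kx1)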
arXiv Detail & Related papers (2020-07-26T08:56:47Z) - GSTO: Gated Scale-Transfer Operation for Multi-Scale Feature Learning in
Pixel Labeling [92.90448357454274]
We propose the Gated Scale-Transfer Operation (GSTO) to properly transit spatial-supervised features to another scale.
By plugging GSTO into HRNet, we get a more powerful backbone for pixel labeling.
Experiment results demonstrate that GSTO can also significantly boost the performance of multi-scale feature aggregation modules.
arXiv Detail & Related papers (2020-05-27T13:46:58Z) - ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture that applies channel-wise attention to different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
arXiv Detail & Related papers (2020-04-19T20:40:31Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.