EfficientFCN: Holistically-guided Decoding for Semantic Segmentation
- URL: http://arxiv.org/abs/2008.10487v2
- Date: Thu, 27 Aug 2020 02:42:38 GMT
- Title: EfficientFCN: Holistically-guided Decoding for Semantic Segmentation
- Authors: Jianbo Liu, Junjun He, Jiawei Zhang, Jimmy S. Ren, Hongsheng Li
- Abstract summary: State-of-the-art semantic segmentation algorithms are mostly based on dilated Fully Convolutional Networks (dilatedFCN).
We propose the EfficientFCN, whose backbone is a common ImageNet pre-trained network without any dilated convolution.
Such a framework achieves comparable or even better performance than state-of-the-art methods with only 1/3 of the computational cost.
- Score: 49.27021844132522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Both performance and efficiency are important to semantic segmentation.
State-of-the-art semantic segmentation algorithms are mostly based on dilated
Fully Convolutional Networks (dilatedFCN), which adopt dilated convolutions in
the backbone networks to extract high-resolution feature maps for achieving
high segmentation performance. However, because many convolution
operations are conducted on the high-resolution feature maps, such
dilatedFCN-based methods result in large computational complexity and memory
consumption. To balance the performance and efficiency, there also exist
encoder-decoder structures that gradually recover the spatial information by
combining multi-level feature maps from the encoder. However, the performance
of existing encoder-decoder methods is far from comparable with that of
dilatedFCN-based methods. In this paper, we propose the EfficientFCN, whose
backbone is a common ImageNet pre-trained network without any dilated
convolution. A holistically-guided decoder is introduced to obtain the
high-resolution semantic-rich feature maps via the multi-scale features from
the encoder. The decoding task is converted into novel codebook generation and
codeword assembly tasks, which take advantage of the high-level and low-level
features from the encoder. Such a framework achieves comparable or even better
performance than state-of-the-art methods with only 1/3 of the computational
cost. Extensive experiments on PASCAL Context, PASCAL VOC, and ADE20K validate the
effectiveness of the proposed EfficientFCN.
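The abstract does not spell out how codebook generation and codeword assembly work. The sketch below is one plausible reading, offered as an assumption rather than the authors' implementation: a small set of holistic codewords is formed as softmax-weighted averages of low-resolution, high-level feature vectors, and each high-resolution position then rebuilds its feature as a linear combination of those codewords. The function names, the random inputs, and all sizes are hypothetical; in a trained network the weighting scores and assembly coefficients would come from learned layers, not random numbers.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate_codebook(base, weight_logits):
    """Build one codeword per row of weight_logits (hypothetical helper).

    base:          P feature vectors (length C) at low-resolution positions.
    weight_logits: n lists of P scores; each list selects one codeword.
    Returns n codewords, each a softmax-weighted average of the base vectors.
    """
    channels = len(base[0])
    codebook = []
    for logits in weight_logits:
        weights = softmax(logits)
        codebook.append([
            sum(weights[p] * base[p][c] for p in range(len(base)))
            for c in range(channels)
        ])
    return codebook


def assemble(codebook, coeffs):
    """Rebuild high-resolution features from codewords (hypothetical helper).

    coeffs: Q coefficient vectors (length n), one per high-resolution position.
    Returns Q feature vectors of length C.
    """
    channels = len(codebook[0])
    return [
        [sum(a[i] * codebook[i][c] for i in range(len(codebook)))
         for c in range(channels)]
        for a in coeffs
    ]


# Assumed toy sizes: 2x2 low-res grid, 8x8 high-res grid, 3 channels, 2 codewords.
random.seed(0)
low_res_positions, high_res_positions, channels, num_codewords = 4, 64, 3, 2
base = [[random.random() for _ in range(channels)] for _ in range(low_res_positions)]
logits = [[random.random() for _ in range(low_res_positions)] for _ in range(num_codewords)]
coeffs = [[random.random() for _ in range(num_codewords)] for _ in range(high_res_positions)]

codebook = generate_codebook(base, logits)
features = assemble(codebook, coeffs)
print(len(codebook), len(features), len(features[0]))
```

Under this reading, the expensive global context is summarized once at low resolution, and the per-position work at high resolution reduces to a small weighted sum over the codewords.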
Related papers
- LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention [0.0]
We propose a projection-based semantic segmentation network called LENet with an encoder-decoder structure for LiDAR-based semantic segmentation.
The encoder is composed of a novel multi-scale convolutional attention (MSCA) module with varying receptive field sizes to capture features.
We show that our proposed method is lighter, more efficient, and robust compared to state-of-the-art semantic segmentation methods.
arXiv Detail & Related papers (2023-01-11T02:51:38Z)
- UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed.
The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features.
Our evaluations on five benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
- Attention guided global enhancement and local refinement network for semantic segmentation [5.881350024099048]
A lightweight semantic segmentation network is developed using the encoder-decoder architecture.
A Global Enhancement Method is proposed to aggregate global information from high-level feature maps.
A Local Refinement Module is developed by utilizing the decoder features as the semantic guidance.
The two methods are integrated into a Context Fusion Block, and based on that, a novel Attention guided Global enhancement and Local refinement Network (AGLN) is elaborately designed.
arXiv Detail & Related papers (2022-04-09T02:32:24Z)
- Towards Deep and Efficient: A Deep Siamese Self-Attention Fully Efficient Convolutional Network for Change Detection in VHR Images [28.36808011351123]
We present a very deep and efficient CD network, entitled EffCDNet.
In EffCDNet, an efficient convolution consisting of depth-wise convolution and group convolution with a channel shuffle mechanism is introduced.
On two challenging CD datasets, our approach outperforms other SOTA FCN-based methods.
arXiv Detail & Related papers (2021-08-18T14:02:38Z)
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
arXiv Detail & Related papers (2021-07-30T04:50:56Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for Real-Time Semantic Segmentation [27.232578592161673]
We devise a novel lightweight network using a multi-scale context fusion scheme (MSCFNet).
The proposed MSCFNet contains only 1.15M parameters, achieves 71.9% Mean IoU and can run at over 50 FPS on a single Titan XP GPU configuration.
arXiv Detail & Related papers (2021-03-24T08:28:26Z)
- A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves the accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) with the inference speed of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.