Dynamic Neural Representational Decoders for High-Resolution Semantic
Segmentation
- URL: http://arxiv.org/abs/2107.14428v1
- Date: Fri, 30 Jul 2021 04:50:56 GMT
- Title: Dynamic Neural Representational Decoders for High-Resolution Semantic
Segmentation
- Authors: Bowen Zhang, Yifan Liu, Zhi Tian, Chunhua Shen
- Abstract summary: We propose a novel decoder, termed dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
- Score: 98.05643473345474
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Semantic segmentation requires per-pixel prediction for a given image.
Typically, the output resolution of a segmentation network is severely reduced
due to the downsampling operations in the CNN backbone. Most previous methods
employ upsampling decoders to recover the spatial resolution. Various decoders
were designed in the literature. Here, we propose a novel decoder, termed
dynamic neural representational decoder (NRD), which is simple yet
significantly more efficient. As each location on the encoder's output
corresponds to a local patch of the semantic labels, in this work, we represent
these local patches of labels with compact neural networks. This neural
representation enables our decoder to leverage the smoothness prior in the
semantic label space, and thus makes our decoder more efficient. Furthermore,
these neural representations are dynamically generated and conditioned on the
outputs of the encoder networks. The desired semantic labels can be efficiently
decoded from the neural representations, resulting in high-resolution semantic
segmentation predictions. We empirically show that our proposed decoder can
outperform the decoder in DeepLabV3+ with only 30% of the computational
complexity, and achieve competitive performance with methods using dilated
encoders with only 15% of the computation. Experiments on the Cityscapes, ADE20K, and PASCAL
Context datasets demonstrate the effectiveness and efficiency of our proposed
method.
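The core idea in the abstract (each encoder output location dynamically generates a compact network that represents its local patch of labels) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture. The linear weight generator `w_gen`, the two-layer coordinate MLP, the patch size, and the function name `decode_patches` are all assumptions chosen only to make the mechanism concrete.

```python
import numpy as np

def decode_patches(features, w_gen, patch=8, hidden=4, num_classes=3):
    """Decode (H, W, C) encoder features into (H*patch, W*patch, num_classes) logits.

    Hypothetical sketch: each location's feature vector is mapped linearly to
    the weights of a tiny two-layer MLP, which is then evaluated at every
    (x, y) coordinate inside that location's patch of labels.
    """
    H, W, C = features.shape
    n1 = 2 * hidden            # first layer: 2 coordinates -> hidden units
    n2 = hidden * num_classes  # second layer: hidden units -> class logits
    assert w_gen.shape == (C, n1 + n2)

    # Dynamically generate per-location MLP weights from the encoder output.
    params = features @ w_gen                                # (H, W, n1 + n2)
    w1 = params[..., :n1].reshape(H, W, 2, hidden)
    w2 = params[..., n1:].reshape(H, W, hidden, num_classes)

    # Normalised coordinates in [-1, 1], shared by every patch.
    t = np.linspace(-1.0, 1.0, patch)
    yy, xx = np.meshgrid(t, t, indexing="ij")
    coords = np.stack([xx, yy], axis=-1).reshape(-1, 2)      # (patch*patch, 2)

    # Evaluate each location's MLP on all coordinates of its patch.
    h = np.maximum(np.einsum("pc,hwck->hwpk", coords, w1), 0.0)  # ReLU
    logits = np.einsum("hwpk,hwkn->hwpn", h, w2)             # (H, W, patch*patch, classes)

    # Tile the decoded patches into one full-resolution logit map.
    logits = logits.reshape(H, W, patch, patch, num_classes)
    return logits.transpose(0, 2, 1, 3, 4).reshape(H * patch, W * patch, num_classes)

rng = np.random.default_rng(0)
features = rng.normal(size=(2, 3, 5))           # toy encoder output
w_gen = rng.normal(size=(5, 2 * 4 + 4 * 3))     # toy weight generator
out = decode_patches(features, w_gen)
```

In this sketch a 2x3 feature map becomes a 16x24 logit map. Because each patch is produced by a small continuous function of the coordinates, neighbouring labels inside a patch vary smoothly, which is one way to read the abstract's "smoothness prior".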
Related papers
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
- Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Atrous Residual Interconnected Encoder to Attention Decoder Framework for Vertebrae Segmentation via 3D Volumetric CT Images [1.8146155083014204]
This paper proposes a novel algorithm for automated vertebrae segmentation via 3D volumetric spine CT images.
The proposed model is based on an encoder-to-decoder structure and uses layer normalization to optimize mini-batch training performance.
The experimental results show that our model achieves competitive performance compared with other state-of-the-art medical semantic segmentation methods.
arXiv Detail & Related papers (2021-04-08T12:09:16Z)
- Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic Image Segmentation [56.44853893149365]
Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers.
We propose a new architecture based on a decoder which uses a set of shallow networks for capturing more information content.
In order to further improve the architecture we introduce a weight function which aims to re-balance classes to increase the attention of the networks to under-represented objects.
arXiv Detail & Related papers (2020-07-19T18:44:34Z)