Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic
Image Segmentation
- URL: http://arxiv.org/abs/2007.09746v1
- Date: Sun, 19 Jul 2020 18:44:34 GMT
- Title: Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic
Image Segmentation
- Authors: Gabriel L. Oliveira, Senthil Yogamani, Wolfram Burgard and Thomas Brox
- Abstract summary: Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers.
We propose a new architecture based on a decoder which uses a set of shallow networks for capturing more information content.
In order to further improve the architecture we introduce a weight function which aims to re-balance classes to increase the attention of the networks to under-represented objects.
- Score: 56.44853893149365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single encoder-decoder methodologies for semantic segmentation are reaching
their peak in terms of segmentation quality and efficiency per number of
layers. To address these limitations, we propose a new architecture based on a
decoder which uses a set of shallow networks for capturing more information
content. The new decoder has a new topology of skip connections, namely
backward and stacked residual connections. In order to further improve the
architecture we introduce a weight function which aims to re-balance classes to
increase the attention of the networks to under-represented objects. We carried
out an extensive set of experiments that yielded state-of-the-art results for
the CamVid, Gatech and Freiburg Forest datasets. Moreover, to further prove the
effectiveness of our decoder, we conducted a set of experiments studying the
impact of our decoder on state-of-the-art segmentation techniques.
Additionally, we present a set of experiments augmenting semantic segmentation
with optical flow information, showing that motion cues can boost purely
image-based semantic segmentation approaches.
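The class re-balancing idea from the abstract can be illustrated with a simple weighting scheme. The sketch below uses median-frequency balancing, a common choice for up-weighting under-represented classes; the paper's actual weight function is not given here, so the function name and formula are illustrative assumptions, not the authors' method.

```python
from statistics import median

def class_balance_weights(label_counts):
    """Weight each class by median_freq / class_freq, so rare classes
    receive weights > 1 and frequent classes receive weights < 1.
    These weights would then scale the per-class loss terms."""
    total = sum(label_counts)
    freqs = [c / total for c in label_counts]
    med = median(freqs)
    return [med / f for f in freqs]

# Example: three classes, the last one heavily under-represented.
w = class_balance_weights([9000, 900, 100])
```

Plugging such weights into a per-class loss (e.g. a weighted cross-entropy) increases the gradient contribution of rare classes, which matches the abstract's stated goal of increasing the network's attention to under-represented objects.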
Related papers
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut [62.63481844384229]
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.
In this paper, we use a diffusion UNet encoder as a foundation vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method.
Our work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks.
arXiv Detail & Related papers (2024-06-05T01:32:31Z)
- Triple-View Knowledge Distillation for Semi-Supervised Semantic Segmentation [54.23510028456082]
We propose a Triple-view Knowledge Distillation framework, termed TriKD, for semi-supervised semantic segmentation.
The framework includes the triple-view encoder and the dual-frequency decoder.
arXiv Detail & Related papers (2023-09-22T01:02:21Z)
- Transfer Learning for Segmentation Problems: Choose the Right Encoder and Skip the Decoder [0.0]
It is common practice to reuse models initially trained on different data to increase downstream task performance.
In this work, we investigate the impact of transfer learning for segmentation problems, which are pixel-wise classification problems.
We find that transfer learning the decoder does not help downstream segmentation tasks, while transfer learning the encoder is truly beneficial.
arXiv Detail & Related papers (2022-07-29T07:02:05Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed the dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
arXiv Detail & Related papers (2021-07-30T04:50:56Z)
- Fractal Pyramid Networks [3.7384509727711923]
We propose a new network architecture, Fractal Pyramid Networks (PFNs), for pixel-wise prediction tasks.
PFNs hold multiple information processing pathways and encode the information to multiple separate small-channel features.
Our models can compete with or outperform state-of-the-art methods on the KITTI dataset with far fewer parameters.
arXiv Detail & Related papers (2021-06-28T13:15:30Z)
- Transformer Meets DCFAM: A Novel Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images [6.171417925832851]
We introduce the Swin Transformer as the backbone to fully extract the context information.
We also design a novel decoder named densely connected feature aggregation module (DCFAM) to restore the resolution and generate the segmentation map.
arXiv Detail & Related papers (2021-04-25T11:34:22Z)
- Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding [59.48857453699463]
In sequence-to-sequence learning, the decoder relies on the attention mechanism to efficiently extract information from the encoder.
Recent work has proposed to use representations from different encoder layers for diversified levels of information.
We propose layer-wise multi-view decoding: for each decoder layer, the representations from the last encoder layer, which serve as a global view, are supplemented with those from the other encoder layers to form a stereoscopic view of the source sequences.
arXiv Detail & Related papers (2020-05-16T20:00:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.