Rethinking Skip Connections in Encoder-decoder Networks for Monocular
Depth Estimation
- URL: http://arxiv.org/abs/2208.13441v1
- Date: Mon, 29 Aug 2022 09:20:53 GMT
- Title: Rethinking Skip Connections in Encoder-decoder Networks for Monocular
Depth Estimation
- Authors: Zhitong Lai, Haichao Sun, Rui Tian, Nannan Ding, Zhiguo Wu, Yanjie
Wang
- Abstract summary: We propose a full skip connection network (FSCN) for the monocular depth estimation task.
In addition, to fuse features within skip connections more closely, we present an adaptive concatenation module (ACM).
- Score: 4.364863910305258
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Skip connections are fundamental units in encoder-decoder networks, which are
able to improve the feature propagation of the neural networks. However, most
methods with skip connections only connect features of the same resolution
in the encoder and the decoder, which ignores the information lost in the
encoder as the layers go deeper. To make use of the information in the
features of shallower encoder layers that would otherwise be lost, we propose a full skip connection
network (FSCN) for the monocular depth estimation task. In addition, to fuse
features within skip connections more closely, we present an adaptive
concatenation module (ACM). Furthermore, we conduct extensive experiments on
outdoor and indoor datasets (i.e., the KITTI dataset and the NYU Depth V2
dataset), and FSCN achieves state-of-the-art results.
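The idea of a full skip connection can be illustrated with a small sketch. The abstract does not say how features are resized or how the ACM weights them, so everything below is an assumption made for illustration: average pooling for resizing, a toy three-level encoder pyramid, and one scalar weight per source level as a hypothetical stand-in for adaptive fusion. The function names (`avg_pool`, `resize_to`, `full_skip_concat`, `make_feature`) are invented here and are not from the paper.

```python
# Illustrative sketch only: the resizing method, shapes, and weights are
# assumptions, not details taken from the paper.

def avg_pool(channel, factor):
    """Downsample one 2D channel by averaging factor x factor blocks."""
    h, w = len(channel), len(channel[0])
    return [
        [
            sum(channel[i + di][j + dj]
                for di in range(factor) for dj in range(factor))
            / (factor * factor)
            for j in range(0, w, factor)
        ]
        for i in range(0, h, factor)
    ]


def resize_to(feature, target_h):
    """Pool every channel of a (C, H, W) feature to target_h rows (square maps assumed)."""
    factor = len(feature[0]) // target_h
    if factor == 1:
        return feature
    return [avg_pool(ch, factor) for ch in feature]


def full_skip_concat(encoder_feats, level, weights):
    """Collect encoder features from `level` and every shallower level,
    resize them to the resolution of `level`, scale each source by a scalar
    weight (hypothetical stand-in for adaptive fusion), and concatenate
    along the channel axis."""
    target_h = len(encoder_feats[level][0])
    fused = []
    for src in range(level + 1):
        for ch in resize_to(encoder_feats[src], target_h):
            fused.append([[weights[src] * v for v in row] for row in ch])
    return fused


def make_feature(channels, size, value):
    """Constant-valued (C, size, size) feature map for the toy example."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


# Toy encoder pyramid: resolution halves and channels double at each level.
encoder_feats = [
    make_feature(2, 8, 1.0),  # shallowest level, highest resolution
    make_feature(4, 4, 2.0),
    make_feature(8, 2, 3.0),  # deepest level, lowest resolution
]

same_res_only = encoder_feats[2]  # conventional same-resolution skip
full = full_skip_concat(encoder_feats, level=2, weights=[0.5, 1.0, 1.0])

print(len(same_res_only), len(full))   # channel counts: 8 vs 14
print(len(full[0]), len(full[0][0]))   # spatial size of the fused map: 2 2
```

In this toy setup a conventional skip connection passes only the 8 same-resolution channels to the decoder, while the full variant passes 14 channels that also carry pooled features from the two shallower levels. In a real network the scalar weights would be learned rather than fixed.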
Related papers
- LKASeg: Remote-Sensing Image Semantic Segmentation with Large Kernel Attention and Full-Scale Skip Connections [27.473573286685063]
We propose a remote-sensing image semantic segmentation network named LKASeg.
LKASeg combines Large Kernel Attention (LSKA) and Full-Scale Skip Connections (FSC).
On the ISPRS Vaihingen dataset, the mF1 and mIoU scores reached 90.33% and 82.77%, respectively.
arXiv Detail & Related papers (2024-10-14T12:25:48Z)
- Joint Channel Estimation and Feedback with Masked Token Transformers in
Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- SoftPool++: An Encoder-Decoder Network for Point Cloud Completion [93.54286830844134]
We propose a novel convolutional operator for the task of point cloud completion.
The proposed operator does not require any max-pooling or voxelization operation.
We show that our approach achieves state-of-the-art performance in shape completion at low and high resolutions.
arXiv Detail & Related papers (2022-05-08T15:31:36Z)
- HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable
Neural Audio Coding [25.51661602383911]
An autoencoder-based decoder employs quantization to turn its bottleneck layer activation into bitstrings.
To circumvent this issue, we employ additional skip connections between the corresponding pair of encoder-decoder layers.
We empirically verify that the proposed hyper-autoencoded architecture improves audio quality compared to an ordinary autoencoder baseline.
arXiv Detail & Related papers (2021-07-22T17:57:53Z)
- Neural Distributed Source Coding [59.630059301226474]
We present a framework for lossy DSC that is agnostic to the correlation structure and can scale to high dimensions.
We evaluate our method on multiple datasets and show that it can handle complex correlations and achieve state-of-the-art PSNR.
arXiv Detail & Related papers (2021-06-05T04:50:43Z)
- Latent Code-Based Fusion: A Volterra Neural Network Approach [21.25021807184103]
We propose a deep structure encoder using the recently introduced Volterra Neural Networks (VNNs).
We show that the proposed approach demonstrates much-improved sample complexity over a CNN-based auto-encoder, with robust classification performance.
arXiv Detail & Related papers (2021-04-10T18:29:01Z)
- Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic
Image Segmentation [56.44853893149365]
Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers.
We propose a new architecture based on a decoder which uses a set of shallow networks for capturing more information content.
To further improve the architecture, we introduce a weight function that re-balances classes to increase the networks' attention to under-represented objects.
arXiv Detail & Related papers (2020-07-19T18:44:34Z)
- Suppress and Balance: A Simple Gated Network for Salient Object
Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once.
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
arXiv Detail & Related papers (2020-07-16T02:00:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.