PlutoNet: An Efficient Polyp Segmentation Network with Modified Partial
Decoder and Decoder Consistency Training
- URL: http://arxiv.org/abs/2204.03652v4
- Date: Sat, 18 Mar 2023 16:25:15 GMT
- Title: PlutoNet: An Efficient Polyp Segmentation Network with Modified Partial
Decoder and Decoder Consistency Training
- Authors: Tugberk Erol and Duygu Sarikaya
- Abstract summary: We propose PlutoNet for polyp segmentation which requires only 2,626,537 parameters, less than 10% of the parameters required by its counterparts.
We train the modified partial decoder and the auxiliary decoder with a combined loss to enforce consistency, which helps improve the encoder's representations.
We perform ablation studies and extensive experiments which show that PlutoNet performs significantly better than the state-of-the-art models.
- Score: 0.40611352512781856
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep learning models are used to minimize the number of polyps that go
unnoticed by experts and to accurately segment the detected polyps during
interventions. Although state-of-the-art models have been proposed, it remains a
challenge to define representations that generalize well and that mediate
between capturing low-level features and higher-level semantic details
without being redundant. Another challenge with these models is that they
require too many parameters, which can pose a problem for real-time
applications. To address these problems, we propose PlutoNet for polyp
segmentation, which requires only 2,626,537 parameters, less than 10% of the
parameters required by its counterparts. With PlutoNet, we propose a novel
decoder consistency training approach that consists of a shared encoder,
a modified partial decoder, which combines the partial decoder with
full-scale connections to capture salient features at different scales
without redundancy, and an auxiliary decoder, which focuses on
higher-level relevant semantic features. We train the modified partial decoder
and the auxiliary decoder with a combined loss to enforce consistency, which
helps improve the encoder's representations. This way we are able to reduce
uncertainty and false positive rates. We perform ablation studies and extensive
experiments which show that PlutoNet performs significantly better than
state-of-the-art models, particularly on unseen datasets and datasets across
different domains.
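
The abstract describes a shared encoder whose features feed both a modified partial decoder and an auxiliary decoder, trained jointly with a combined loss that enforces consistency between the two predictions. The following is a minimal, hypothetical sketch of that training setup; the module names, channel sizes, loss terms, and weighting are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of decoder consistency training: a shared encoder feeds
# a main (modified partial) decoder and an auxiliary decoder, and a combined
# loss supervises both and ties their predictions together.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Stand-in shared encoder producing features at three scales."""
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, x):
        f1 = self.stage1(x)   # 1/2 resolution, low-level
        f2 = self.stage2(f1)  # 1/4 resolution
        f3 = self.stage3(f2)  # 1/8 resolution, high-level semantics
        return f1, f2, f3

class ModifiedPartialDecoderSketch(nn.Module):
    """Aggregates features from all scales (partial decoder + full-scale links)."""
    def __init__(self):
        super().__init__()
        self.fuse = nn.Conv2d(16 + 32 + 64, 32, 3, padding=1)
        self.head = nn.Conv2d(32, 1, 1)

    def forward(self, f1, f2, f3):
        size = f1.shape[-2:]
        f2u = F.interpolate(f2, size=size, mode="bilinear", align_corners=False)
        f3u = F.interpolate(f3, size=size, mode="bilinear", align_corners=False)
        fused = F.relu(self.fuse(torch.cat([f1, f2u, f3u], dim=1)))
        return self.head(fused)  # main segmentation logits

class AuxiliaryDecoderSketch(nn.Module):
    """Decodes only the high-level semantic features."""
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(64, 1, 1)

    def forward(self, f3, out_size):
        logits = self.head(f3)
        return F.interpolate(logits, size=out_size, mode="bilinear", align_corners=False)

def combined_loss(main_logits, aux_logits, target, consistency_weight=0.5):
    """Supervised loss on both decoders plus a consistency term between them
    (the exact terms and weight are assumptions)."""
    seg_main = F.binary_cross_entropy_with_logits(main_logits, target)
    seg_aux = F.binary_cross_entropy_with_logits(aux_logits, target)
    consistency = F.mse_loss(torch.sigmoid(main_logits), torch.sigmoid(aux_logits))
    return seg_main + seg_aux + consistency_weight * consistency

# One illustrative training step on random data.
encoder, main_dec, aux_dec = TinyEncoder(), ModifiedPartialDecoderSketch(), AuxiliaryDecoderSketch()
x = torch.randn(2, 3, 128, 128)
y = torch.randint(0, 2, (2, 1, 64, 64)).float()  # mask at the main decoder's output size
f1, f2, f3 = encoder(x)
main_out = main_dec(f1, f2, f3)
aux_out = aux_dec(f3, main_out.shape[-2:])
loss = combined_loss(main_out, aux_out, y)
loss.backward()
```

Because both decoders share the encoder and are supervised jointly, gradients from the consistency term flow back into the encoder, which is the mechanism the abstract credits for improving its representations and reducing false positives.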
Related papers
- Pre-training Point Cloud Compact Model with Partial-aware Reconstruction [51.403810709250024]
We present a pre-trained Point cloud Compact Model with Partial-aware Reconstruction, named Point-CPR.
Our model exhibits strong performance across various tasks, especially surpassing the leading MPM-based model PointGPT-B with only 2% of its parameters.
arXiv Detail & Related papers (2024-07-12T15:18:14Z) - Global Context Aggregation Network for Lightweight Saliency Detection of
Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects, built on an encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z) - Variational Autoencoding of Dental Point Clouds [10.137124603866036]
This paper introduces the FDI 16 dataset, an extensive collection of tooth meshes and point clouds.
We present a novel approach: Variational FoldingNet (VF-Net), a fully probabilistic variational autoencoder for point clouds.
arXiv Detail & Related papers (2023-07-20T14:18:44Z) - Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z) - Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z) - Multifidelity data fusion in convolutional encoder/decoder networks [0.0]
We analyze the regression accuracy of convolutional neural networks assembled from encoders, decoders and skip connections.
We demonstrate their accuracy when trained on a few high-fidelity and many low-fidelity data.
arXiv Detail & Related papers (2022-05-10T21:51:22Z) - SoftPool++: An Encoder-Decoder Network for Point Cloud Completion [93.54286830844134]
We propose a novel convolutional operator for the task of point cloud completion.
The proposed operator does not require any max-pooling or voxelization operation.
We show that our approach achieves state-of-the-art performance in shape completion at low and high resolutions.
arXiv Detail & Related papers (2022-05-08T15:31:36Z) - Attention W-Net: Improved Skip Connections for better Representations [5.027571997864707]
We propose Attention W-Net, a new U-Net based architecture for retinal vessel segmentation.
We observe an AUC and F1-Score of 0.8407 and 0.9833 - a sizeable improvement over its LadderNet backbone.
arXiv Detail & Related papers (2021-10-17T12:44:36Z) - CarNet: A Lightweight and Efficient Encoder-Decoder Architecture for
High-quality Road Crack Detection [21.468229247797627]
We present a lightweight encoder-decoder architecture, CarNet, for efficient and high-quality crack detection.
In particular, we propose that the ideal encoder should exhibit an olive-type distribution of the number of convolutional layers across different stages.
In the decoder, we introduce a lightweight up-sampling feature pyramid module to learn rich hierarchical features for crack detection.
arXiv Detail & Related papers (2021-09-13T05:01:34Z) - Suppress and Balance: A Simple Gated Network for Salient Object
Detection [89.88222217065858]
We propose a simple gated network (GateNet) to solve both issues at once.
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
In addition, we adopt the atrous spatial pyramid pooling based on the proposed "Fold" operation (Fold-ASPP) to accurately localize salient objects of various scales.
arXiv Detail & Related papers (2020-07-16T02:00:53Z)