Transfer Learning for Segmentation Problems: Choose the Right Encoder
and Skip the Decoder
- URL: http://arxiv.org/abs/2207.14508v1
- Date: Fri, 29 Jul 2022 07:02:05 GMT
- Title: Transfer Learning for Segmentation Problems: Choose the Right Encoder
and Skip the Decoder
- Authors: Jonas Dippel, Matthias Lenga, Thomas Goerttler, Klaus Obermayer, Johannes Höhne
- Abstract summary: It is common practice to reuse models initially trained on different data to increase downstream task performance.
In this work, we investigate the impact of transfer learning for segmentation problems, i.e., pixel-wise classification problems.
We find that transfer learning the decoder does not help downstream segmentation tasks, while transfer learning the encoder is truly beneficial.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is common practice to reuse models initially trained on different data to
increase downstream task performance. Especially in the computer vision domain,
ImageNet-pretrained weights have been successfully used for various tasks. In
this work, we investigate the impact of transfer learning for segmentation
problems, i.e., pixel-wise classification problems that can be tackled with
encoder-decoder architectures. We find that transfer learning the decoder does
not help downstream segmentation tasks, while transfer learning the encoder is
truly beneficial. We demonstrate that pretrained weights for a decoder may
yield faster convergence, but they do not improve the overall model performance
as one can obtain equivalent results with randomly initialized decoders.
However, we show that it is more effective to reuse encoder weights trained on
a segmentation or reconstruction task than reusing encoder weights trained on
classification tasks. This finding implies that using ImageNet-pretrained
encoders for downstream segmentation problems is suboptimal. We also propose a
contrastive self-supervised approach with multiple self-reconstruction tasks,
which provides encoders that are suitable for transfer learning in segmentation
problems in the absence of segmentation labels.
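As a concrete illustration of the paper's recommendation, the following is a minimal sketch (not the authors' code) of initializing only the encoder of a segmentation model from pretrained weights while leaving the decoder randomly initialized; the TinyUNet module and transfer_encoder_only helper are hypothetical stand-ins:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Illustrative encoder-decoder segmentation model (not the paper's)."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, num_classes, 2, stride=2),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def transfer_encoder_only(target: nn.Module, pretrained_state: dict) -> None:
    """Copy only 'encoder.*' weights; the decoder keeps its random init."""
    encoder_state = {k: v for k, v in pretrained_state.items()
                     if k.startswith("encoder.")}
    # strict=False: the decoder keys are intentionally missing.
    target.load_state_dict(encoder_state, strict=False)

# Pretrain on a segmentation/reconstruction source task (per the paper,
# preferable to classification pretraining), then reuse only the encoder.
source = TinyUNet()                      # stands in for a pretrained model
downstream = TinyUNet(num_classes=5)
transfer_encoder_only(downstream, source.state_dict())
```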
Related papers
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut [62.63481844384229]
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.
In this paper, we use a diffusion UNet encoder as a foundation vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method.
Our work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks.
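As a rough, hedged sketch of the core mechanism described above, one normalized-cut bipartition over dense features can be computed from the Fiedler vector of the normalized graph Laplacian; here random features stand in for the diffusion UNet encoder features the method actually uses, and the median split is a simplification:

```python
import numpy as np

def ncut_bipartition(feats: np.ndarray) -> np.ndarray:
    """Split feature nodes in two via the normalized Laplacian's Fiedler vector."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    sim = feats @ feats.T / (norms * norms.T + 1e-8)    # cosine similarity
    w = np.clip(sim, 0.0, None)                         # non-negative affinities
    d_inv_sqrt = np.diag(1.0 / np.sqrt(w.sum(axis=1) + 1e-8))
    lap = np.eye(len(w)) - d_inv_sqrt @ w @ d_inv_sqrt  # normalized Laplacian
    _, vecs = np.linalg.eigh(lap)
    fiedler = vecs[:, 1]                                # second-smallest eigenvector
    return (fiedler > np.median(fiedler)).astype(int)   # median split: a simplification

rng = np.random.default_rng(0)
feats = rng.normal(size=(64, 16))     # stand-in for per-pixel UNet encoder features
labels = ncut_bipartition(feats)      # recurse on each side for more segments
```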
arXiv Detail & Related papers (2024-06-05T01:32:31Z)
- Regress Before Construct: Regress Autoencoder for Point Cloud Self-supervised Learning [18.10704604275133]
Masked Autoencoders (MAE) have demonstrated promising performance in self-supervised learning for 2D and 3D computer vision.
We propose Point Regress AutoEncoder (Point-RAE), a new scheme for regressive autoencoders for point cloud self-supervised learning.
Our approach is efficient during pre-training and generalizes well on various downstream tasks.
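A toy sketch of the masked-regression idea, under the assumption (from the summary) that the model regresses representations of masked point patches rather than raw points; all module sizes, the frozen target network, and the pooled prediction are illustrative simplifications:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_patches, pts, dim = 16, 32, 64
patch_embed = nn.Linear(3 * pts, dim)           # 32 xyz points per patch
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, 4, batch_first=True), num_layers=2)
regressor = nn.Linear(dim, dim)                 # predicts masked-patch features
target_net = nn.Linear(3 * pts, dim)            # stand-in target encoder (frozen)
for p in target_net.parameters():
    p.requires_grad = False

patches = torch.randn(1, num_patches, 3 * pts)  # grouped point cloud patches
mask = torch.zeros(num_patches, dtype=torch.bool)
mask[: int(0.6 * num_patches)] = True           # fixed 60% mask for the demo

visible = encoder(patch_embed(patches[:, ~mask]))    # encode visible patches only
pred = regressor(visible).mean(dim=1, keepdim=True)  # crude pooled prediction
target = target_net(patches[:, mask]).mean(dim=1, keepdim=True)
loss = F.mse_loss(pred, target)                      # regress features, not points
loss.backward()
```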
arXiv Detail & Related papers (2023-09-25T17:23:33Z)
- AWEncoder: Adversarial Watermarking Pre-trained Encoders in Contrastive Learning [18.90841192412555]
We introduce AWEncoder, an adversarial method for watermarking the pre-trained encoder in contrastive learning.
The proposed method shows strong effectiveness and robustness across different contrastive learning algorithms and downstream tasks.
arXiv Detail & Related papers (2022-08-08T07:23:37Z)
- Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging, as a large number of weights must be fine-tuned; as a result, the encoders forget information about previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
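A compact sketch of one way such a discrete key-value bottleneck can look, assuming (beyond the summary) nearest-key lookup and task-tuned values; the KeyValueBottleneck class and all sizes are illustrative:

```python
import torch
import torch.nn as nn

class KeyValueBottleneck(nn.Module):
    """Snap features to the nearest learnable key; pass its paired value on."""
    def __init__(self, num_codes: int = 128, key_dim: int = 32, value_dim: int = 16):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_codes, key_dim))
        self.values = nn.Parameter(torch.randn(num_codes, value_dim))

    def forward(self, z: torch.Tensor):
        idx = torch.cdist(z, self.keys).argmin(dim=1)  # discrete code selection
        return self.values[idx], idx                   # the head sees only values

bottleneck = KeyValueBottleneck()
z = torch.randn(4, 32)            # stand-in for frozen pre-trained encoder output
v, codes = bottleneck(z)
# Task-specific tuning can then update only `values`, keeping the encoder and
# keys frozen, which is one way a discrete bottleneck can limit forgetting.
```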
arXiv Detail & Related papers (2022-07-22T17:52:30Z)
- Watermarking Pre-trained Encoders in Contrastive Learning [9.23485246108653]
The pre-trained encoders are an important intellectual property that needs to be carefully protected.
It is challenging to migrate existing watermarking techniques from classification tasks to the contrastive learning scenario.
We introduce a task-agnostic loss function to effectively embed into the encoder a backdoor as the watermark.
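A hedged sketch of embedding a backdoor as a watermark with a task-agnostic loss, as the summary describes: triggered inputs are pulled toward a secret target embedding while clean outputs stay close to the original encoder. The trigger pattern, loss weights, and toy encoder are all illustrative choices:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
original = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
original.load_state_dict(encoder.state_dict())
for p in original.parameters():
    p.requires_grad = False               # frozen reference copy

trigger = torch.zeros(3, 32, 32)
trigger[:, :4, :4] = 1.0                  # secret corner-patch trigger
target_emb = torch.randn(128)             # secret key embedding

x = torch.randn(8, 3, 32, 32)
x_trig = torch.clamp(x + trigger, -3.0, 3.0)

wm_loss = F.mse_loss(encoder(x_trig), target_emb.expand(8, -1))  # plant backdoor
keep_loss = F.mse_loss(encoder(x), original(x))                  # preserve utility
(wm_loss + 0.5 * keep_loss).backward()
# Verification: check whether a suspect encoder maps triggered inputs close to
# target_emb (e.g., via cosine similarity), independent of any downstream task.
```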
arXiv Detail & Related papers (2022-01-20T15:14:31Z)
- Adversarial Neural Networks for Error Correcting Codes [76.70040964453638]
We introduce a general framework to boost the performance and applicability of machine learning (ML) models.
We propose to combine ML decoders with a competing discriminator network that tries to distinguish between codewords and noisy words.
Our framework is game-theoretic and motivated by generative adversarial networks (GANs).
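A toy sketch of this adversarial setup: a discriminator learns to separate codewords from noisy channel words and provides a training signal to an ML decoder. The random "codewords", network sizes, and joint backward pass (rather than alternating updates) are simplifications:

```python
import torch
import torch.nn as nn

n = 7                                              # codeword length
disc = nn.Sequential(nn.Linear(n, 32), nn.ReLU(), nn.Linear(32, 1))
decoder = nn.Sequential(nn.Linear(n, 32), nn.ReLU(),
                        nn.Linear(32, n), nn.Sigmoid())

codewords = torch.randint(0, 2, (16, n)).float()   # stand-in for valid codewords
noisy = codewords + 0.3 * torch.randn_like(codewords)

bce = nn.BCEWithLogitsLoss()
# Discriminator: real codewords -> 1, decoder outputs -> 0.
d_loss = (bce(disc(codewords), torch.ones(16, 1))
          + bce(disc(decoder(noisy).detach()), torch.zeros(16, 1)))
# Decoder: fool the discriminator so its outputs look like valid codewords.
g_loss = bce(disc(decoder(noisy)), torch.ones(16, 1))
(d_loss + g_loss).backward()   # in practice, the two are optimized alternately
```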
arXiv Detail & Related papers (2021-12-21T19:14:44Z)
- A manifold learning perspective on representation learning: Learning decoder and representations without an encoder [0.0]
Autoencoders are commonly used in representation learning.
Inspired by manifold learning, we show that the decoder can be trained on its own by learning the representations of the training samples.
Our approach of training the decoder alone facilitates representation learning even on small data sets.
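A minimal sketch of this encoder-free setup, assuming a latent-optimization scheme in which every training sample owns a learnable code that is optimized jointly with the decoder; sizes and the Adam schedule are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_samples, latent_dim, data_dim = 100, 8, 64
data = torch.randn(n_samples, data_dim)            # a small training set
codes = nn.Parameter(0.1 * torch.randn(n_samples, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                        nn.Linear(32, data_dim))

opt = torch.optim.Adam([codes, *decoder.parameters()], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = F.mse_loss(decoder(codes), data)        # reconstruct without any encoder
    loss.backward()
    opt.step()
# `codes` now act as learned representations obtained with the decoder alone.
```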
arXiv Detail & Related papers (2021-08-31T15:08:50Z)
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed the dynamic neural representational decoder (NRD).
As each location of the encoder's output corresponds to a local patch of the semantic labels, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
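A rough sketch of this "neural representation" decoding, assuming a hypernetwork-style controller that predicts, per encoder location, the weights of a tiny MLP rendering an s x s patch of class logits from patch coordinates; all shapes are illustrative and the real NRD decoder is more elaborate:

```python
import torch
import torch.nn as nn

C, num_classes, s, hidden = 64, 3, 4, 8
# Tiny per-location MLP: (2 -> hidden -> num_classes), flattened parameter count.
n_params = (2 * hidden + hidden) + (hidden * num_classes + num_classes)
controller = nn.Conv2d(C, n_params, kernel_size=1)

feat = torch.randn(1, C, 8, 8)                     # encoder output (low resolution)
theta = controller(feat).permute(0, 2, 3, 1).reshape(-1, n_params)

# Normalized (x, y) coordinates of the s*s patch cells, shared by all locations.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, s),
                        torch.linspace(-1, 1, s), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(1, s * s, 2)

o1, o2, o3 = 2 * hidden, 3 * hidden, 3 * hidden + hidden * num_classes
w1 = theta[:, :o1].reshape(-1, 2, hidden)
b1 = theta[:, o1:o2].unsqueeze(1)
w2 = theta[:, o2:o3].reshape(-1, hidden, num_classes)
b2 = theta[:, o3:].unsqueeze(1)

h = torch.relu(coords @ w1 + b1)                   # (locations, s*s, hidden)
logits = h @ w2 + b2                               # (locations, s*s, num_classes)
# Each location's s x s logit patch is then tiled back to full resolution.
```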
arXiv Detail & Related papers (2021-07-30T04:50:56Z)
- Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic Image Segmentation [56.44853893149365]
Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per layer.
We propose a new architecture based on a decoder that uses a set of shallow networks to capture more information content.
To further improve the architecture, we introduce a weight function that re-balances classes, increasing the networks' attention to under-represented objects.
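One common form such a re-balancing weight function can take is inverse class frequency in a weighted cross-entropy; the exact weighting in the paper may differ, and the toy label maps below are illustrative:

```python
import torch
import torch.nn as nn

num_classes = 4
labels = torch.randint(0, num_classes, (2, 16, 16))          # toy label maps
freq = torch.bincount(labels.flatten(), minlength=num_classes).float()
weights = freq.sum() / (num_classes * (freq + 1.0))          # inverse frequency
criterion = nn.CrossEntropyLoss(weight=weights)              # rare classes weigh more

logits = torch.randn(2, num_classes, 16, 16, requires_grad=True)
loss = criterion(logits, labels)
loss.backward()
```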
arXiv Detail & Related papers (2020-07-19T18:44:34Z)
- Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding [59.48857453699463]
In sequence-to-sequence learning, the decoder relies on the attention mechanism to efficiently extract information from the encoder.
Recent work has proposed to use representations from different encoder layers for diversified levels of information.
We propose layer-wise multi-view decoding: each decoder layer attends to the representations from the last encoder layer, which serve as a global view, supplemented with those from other encoder layers for a stereoscopic view of the source sequences.
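A schematic sketch of one decoder layer under this multi-view scheme: queries attend to the last encoder layer (global view) and to an earlier layer, and the two contexts are fused; the simple averaging fusion and all sizes are illustrative simplifications:

```python
import torch
import torch.nn as nn

d, heads, src_len, tgt_len = 64, 4, 10, 7
attn_global = nn.MultiheadAttention(d, heads, batch_first=True)
attn_aux = nn.MultiheadAttention(d, heads, batch_first=True)

enc_last = torch.randn(1, src_len, d)      # last encoder layer: the global view
enc_mid = torch.randn(1, src_len, d)       # an earlier encoder layer
dec_hidden = torch.randn(1, tgt_len, d)    # queries from one decoder layer

ctx_global, _ = attn_global(dec_hidden, enc_last, enc_last)
ctx_aux, _ = attn_aux(dec_hidden, enc_mid, enc_mid)
fused = 0.5 * (ctx_global + ctx_aux)       # "stereoscopic" view of the source
```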
arXiv Detail & Related papers (2020-05-16T20:00:39Z)