DAFormer: Improving Network Architectures and Training Strategies for
Domain-Adaptive Semantic Segmentation
- URL: http://arxiv.org/abs/2111.14887v1
- Date: Mon, 29 Nov 2021 19:00:46 GMT
- Title: DAFormer: Improving Network Architectures and Training Strategies for
Domain-Adaptive Semantic Segmentation
- Authors: Lukas Hoyer, Dengxin Dai, Luc Van Gool
- Abstract summary: We study the unsupervised domain adaptation (UDA) process.
We propose a novel UDA method, DAFormer, based on the benchmark results.
DAFormer significantly improves the state-of-the-art performance by 10.8 mIoU for GTA->Cityscapes and 5.4 mIoU for Synthia->Cityscapes.
- Score: 99.88539409432916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As acquiring pixel-wise annotations of real-world images for semantic
segmentation is a costly process, a model can instead be trained with more
accessible synthetic data and adapted to real images without requiring their
annotations. This process is studied in unsupervised domain adaptation (UDA).
Even though a large number of methods propose new adaptation strategies, they
are mostly based on outdated network architectures. As the influence of recent
network architectures has not been systematically studied, we first benchmark
different network architectures for UDA and then propose a novel UDA method,
DAFormer, based on the benchmark results. The DAFormer network consists of a
Transformer encoder and a multi-level context-aware feature fusion decoder. It
is enabled by three simple but crucial training strategies to stabilize the
training and to avoid overfitting DAFormer to the source domain: While the Rare
Class Sampling on the source domain improves the quality of pseudo-labels by
mitigating the confirmation bias of self-training towards common classes, the
Thing-Class ImageNet Feature Distance and a learning rate warmup promote
feature transfer from ImageNet pretraining. DAFormer significantly improves the
state-of-the-art performance by 10.8 mIoU for GTA->Cityscapes and 5.4 mIoU for
Synthia->Cityscapes and enables learning even difficult classes such as train,
bus, and truck well. The implementation is available at
https://github.com/lhoyer/DAFormer.
Related papers
- Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning [12.5354658533836]
Humans possess remarkable ability to accurately classify new, unseen images after being exposed to only a few examples.
For artificial neural network models, determining the most relevant features for distinguishing between two images with limited samples presents a challenge.
We propose an intra-task mutual attention method for few-shot learning, that involves splitting the support and query samples into patches.
arXiv Detail & Related papers (2024-05-06T02:02:57Z) - Domain Adaptive and Generalizable Network Architectures and Training
Strategies for Semantic Image Segmentation [108.33885637197614]
Unsupervised domain adaptation (UDA) and domain generalization (DG) enable machine learning models trained on a source domain to perform well on unlabeled or unseen target domains.
We propose HRDA, a multi-resolution framework for UDA&DG, that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention.
arXiv Detail & Related papers (2023-04-26T15:18:45Z) - Novel transfer learning schemes based on Siamese networks and synthetic
data [6.883906273999368]
Transfer learning schemes based on deep networks offer state-of-the-art technologies in computer vision.
Such applications are currently restricted to application domains where suitable deepnetwork models are readily available.
We propose a novel transfer learning scheme which expands a recently introduced Twin-VAE architecture.
arXiv Detail & Related papers (2022-11-21T09:48:21Z) - Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets.
We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes.
We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
arXiv Detail & Related papers (2022-11-16T21:55:05Z) - Adaptive Convolutional Dictionary Network for CT Metal Artifact
Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z) - ProCST: Boosting Semantic Segmentation using Progressive Cyclic
Style-Transfer [38.03127458140549]
We propose a novel two-stage framework for improving domain adaptation techniques.
In the first step, we progressively train a multi-scale neural network to perform an initial transfer between the source data to the target data.
This new data has a reduced domain gap from the desired target domain, and the applied UDA approach further closes the gap.
arXiv Detail & Related papers (2022-04-25T18:01:05Z) - DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic
Segmentation [97.74059510314554]
Unsupervised domain adaptation (UDA) for semantic segmentation aims to adapt a segmentation model trained on the labeled source domain to the unlabeled target domain.
Existing methods try to learn domain invariant features while suffering from large domain gaps.
We propose a novel Dual Soft-Paste (DSP) method in this paper.
arXiv Detail & Related papers (2021-07-20T16:22:40Z) - PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency
Training [4.336877104987131]
Unsupervised domain adaptation is a promising technique for semantic segmentation.
We present a novel framework for unsupervised domain adaptation based on the notion of target-domain consistency training.
Our approach is simpler, easier to implement, and more memory-efficient during training.
arXiv Detail & Related papers (2021-05-17T19:36:28Z) - Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural
Architecture Search [60.965024145243596]
One-shot weight sharing methods have recently drawn great attention in neural architecture search due to high efficiency and competitive performance.
To alleviate this problem, we present a simple yet effective architecture distillation method.
We introduce the concept of prioritized path, which refers to the architecture candidates exhibiting superior performance during training.
Since the prioritized paths are changed on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop.
arXiv Detail & Related papers (2020-10-29T17:55:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.