TransAdapt: A Transformative Framework for Online Test Time Adaptive
Semantic Segmentation
- URL: http://arxiv.org/abs/2302.14611v1
- Date: Fri, 24 Feb 2023 01:45:29 GMT
- Title: TransAdapt: A Transformative Framework for Online Test Time Adaptive
Semantic Segmentation
- Authors: Debasmit Das, Shubhankar Borse, Hyojin Park, Kambiz Azarian, Hong Cai,
Risheek Garrepalli, Fatih Porikli
- Abstract summary: Test-time adaptive (TTA) semantic segmentation adapts a source pre-trained image semantic segmentation model to unlabeled batches of target domain test images.
We propose TransAdapt, a framework that uses a transformer and input transformations to improve segmentation performance.
- Score: 43.31250660146429
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Test-time adaptive (TTA) semantic segmentation adapts a source pre-trained
image semantic segmentation model to unlabeled batches of target domain test
images. This differs from real-world settings, where samples arrive one by one in an
online fashion. To tackle online settings, we propose TransAdapt, a framework that
uses a transformer and input transformations to improve segmentation performance.
Specifically, we pre-train a transformer-based module on a segmentation network
that transforms unsupervised segmentation output to a more reliable supervised
output, without requiring test-time online training. To also facilitate
test-time adaptation, we propose an unsupervised loss based on the transformed
input that encourages the model to be invariant to photometric perturbations
and equivariant to geometric perturbations. Overall, our framework produces
higher quality segmentation masks with up to 17.6% and 2.8% mIOU improvement
over no-adaptation and competitive baselines, respectively.
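
The invariance/equivariance idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the per-pixel linear "model", the brightness and horizontal-flip perturbations, the mean-squared-error distance, and all function names below are assumptions chosen only to make the two consistency terms concrete.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def make_pixelwise_model(weights):
    # Hypothetical stand-in for a segmentation network:
    # a per-pixel linear classifier, (H, W, 3) -> (H, W, C) probabilities.
    def model(image):
        return softmax(image @ weights)
    return model

def adjust_brightness(image, factor=1.3):
    # Photometric perturbation (assumed): scale intensities, keep them in [0, 1].
    return np.clip(image * factor, 0.0, 1.0)

def hflip(array):
    # Geometric perturbation (assumed): horizontal flip along the width axis.
    return array[:, ::-1]

def consistency_losses(model, image, photometric, geometric):
    pred = model(image)
    # Invariance: a photometric change should leave the prediction unchanged.
    invariance = np.mean((model(photometric(image)) - pred) ** 2)
    # Equivariance: a geometric change should move the prediction the same way.
    equivariance = np.mean((model(geometric(image)) - geometric(pred)) ** 2)
    return invariance, equivariance

rng = np.random.default_rng(0)
image = rng.uniform(size=(8, 8, 3))
model = make_pixelwise_model(rng.normal(size=(3, 4)))
inv_loss, equi_loss = consistency_losses(model, image, adjust_brightness, hflip)
print(f"invariance loss: {inv_loss:.6f}, equivariance loss: {equi_loss:.6f}")
```

Because the toy model classifies each pixel independently, it is already flip-equivariant, so only the invariance term is non-zero here. A real network with spatial context would generally incur both terms, and both could be minimized on unlabeled test images.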
Related papers
- Investigating Shift Equivalence of Convolutional Neural Networks in
Industrial Defect Segmentation [3.843350895842836]
In industrial defect segmentation tasks, output consistency (also referred to as equivalence) of the model is often overlooked.
A novel pair of down/upsampling layers called component attention polyphase sampling (CAPS) is proposed as a replacement for the conventional sampling layers in CNNs.
The experimental results on the micro surface defect (MSD) dataset and four real-world industrial defect datasets demonstrate that the proposed method exhibits higher equivalence and segmentation performance.
arXiv Detail & Related papers (2023-09-29T00:04:47Z)
- AdaptiveClick: Clicks-aware Transformer with Adaptive Focal Loss for Interactive Image Segmentation [51.82915587228898]
We introduce AdaptiveClick, a transformer-based, mask-adaptive segmentation framework for Interactive Image Segmentation (IIS).
The key ingredient of our method is the Click-Aware Mask-adaptive transformer Decoder (CAMD), which enhances the interaction between click and image features.
With a plain ViT backbone, extensive experimental results on nine datasets demonstrate the superiority of AdaptiveClick compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-07T13:47:35Z)
- DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation [56.514462874501675]
We propose a dynamic sparse attention based Transformer model to achieve fine-level matching with favorable efficiency.
The heart of our approach is a novel dynamic-attention unit, dedicated to covering the variation on the optimal number of tokens one position should focus on.
Experiments on three applications, pose-guided person image generation, edge-based face synthesis, and undistorted image style transfer, demonstrate that DynaST achieves superior performance in local details.
arXiv Detail & Related papers (2022-07-13T11:12:03Z)
- UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer
via Hierarchical Mask Calibration [49.16591283724376]
We design UniDAformer, a unified domain adaptive panoptic segmentation transformer that is simple but can achieve domain adaptive instance segmentation and semantic segmentation simultaneously within a single network.
UniDAformer introduces Hierarchical Mask Calibration (HMC), which rectifies inaccurate predictions at the level of regions, superpixels and annotated pixels via online self-training on the fly.
It has three unique features: 1) it enables unified domain adaptive panoptic adaptation; 2) it mitigates false predictions and improves domain adaptive panoptic segmentation effectively; 3) it is end-to-end trainable with a much simpler training and inference pipeline.
arXiv Detail & Related papers (2022-06-30T07:32:23Z)
- Style Mixing and Patchwise Prototypical Matching for One-Shot
Unsupervised Domain Adaptive Semantic Segmentation [21.01132797297286]
In one-shot unsupervised domain adaptation, segmentors only see one unlabeled target image during training.
We propose a new OSUDA method that can effectively relieve such computational burden.
Our method achieves new state-of-the-art performance on two commonly used benchmarks for domain adaptive semantic segmentation.
arXiv Detail & Related papers (2021-12-09T02:47:46Z)
- Self-supervised Augmentation Consistency for Adapting Semantic
Segmentation [56.91850268635183]
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate.
We employ standard data augmentation techniques (photometric noise, flipping and scaling) and ensure consistency of the semantic predictions.
We achieve significant improvements of the state-of-the-art segmentation accuracy after adaptation, consistent both across different choices of the backbone architecture and adaptation scenarios.
arXiv Detail & Related papers (2021-04-30T21:32:40Z)
- Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks [75.69896269357005]
Mixup is the latest data augmentation technique that linearly interpolates input examples and the corresponding labels.
In this paper, we explore how to apply mixup to natural language processing tasks.
We incorporate mixup to transformer-based pre-trained architecture, named "mixup-transformer", for a wide range of NLP tasks.
arXiv Detail & Related papers (2020-10-05T23:37:30Z)
- Probabilistic Spatial Transformer Networks [0.6999740786886537]
We propose a probabilistic extension that estimates a stochastic transformation rather than a deterministic one.
We show that these two properties lead to improved classification performance, robustness and model calibration.
We further demonstrate that the approach generalizes to non-visual domains by improving model performance on time-series data.
arXiv Detail & Related papers (2020-04-07T18:22:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.