Intra-Source Style Augmentation for Improved Domain Generalization
- URL: http://arxiv.org/abs/2210.10175v2
- Date: Mon, 29 May 2023 07:19:55 GMT
- Title: Intra-Source Style Augmentation for Improved Domain Generalization
- Authors: Yumeng Li, Dan Zhang, Margret Keuper, Anna Khoreva
- Abstract summary: We propose an intra-source style augmentation (ISSA) method to improve domain generalization in semantic segmentation.
ISSA is model-agnostic and straightforwardly applicable with CNNs and Transformers.
It is also complementary to other domain generalization techniques, e.g., it improves the recent state-of-the-art solution RobustNet by 3% mIoU in Cityscapes to Dark Zürich.
- Score: 21.591831983223997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalization with respect to domain shifts, as they frequently appear
in applications such as autonomous driving, is one of the remaining big
challenges for deep learning models. Therefore, we propose an intra-source
style augmentation (ISSA) method to improve domain generalization in semantic
segmentation. Our method is based on a novel masked noise encoder for StyleGAN2
inversion. The model learns to faithfully reconstruct the image preserving its
semantic layout through noise prediction. Random masking of the estimated noise
enables the style mixing capability of our model, i.e., it allows altering the
global appearance without affecting the semantic layout of an image. Using the
proposed masked noise encoder to randomize style and content combinations in
the training set, ISSA effectively increases the diversity of training data and
reduces spurious correlation. As a result, we achieve up to 12.4% mIoU
improvements on driving-scene semantic segmentation under different types of
data shifts, i.e., changing geographic locations, adverse weather conditions,
and day to night. ISSA is model-agnostic and straightforwardly applicable with
CNNs and Transformers. It is also complementary to other domain generalization
techniques, e.g., it improves the recent state-of-the-art solution RobustNet by
3% mIoU in Cityscapes to Dark Zürich.
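As a rough illustration of the style-mixing step described above, the sketch below assumes a trained masked noise encoder that returns a latent code plus per-layer noise maps, and a StyleGAN2 synthesis network that accepts both. The function and argument names are hypothetical; the real ISSA pipeline may differ, e.g., in where and how the noise masking is applied.

```python
import torch

def issa_style_mix(encoder, generator, content_img, style_img, mask_ratio=0.5):
    """Sketch of intra-source style mixing with a masked noise encoder.

    `encoder` and `generator` are hypothetical handles to a trained
    noise-predicting encoder and a StyleGAN2 synthesis network; the actual
    ISSA interfaces may differ.
    """
    # Encode both images into latent (style) codes and per-layer noise maps (layout).
    w_content, noise_content = encoder(content_img)
    w_style, _ = encoder(style_img)

    # Randomly mask part of the estimated noise, so the generator does not
    # encode global appearance in the noise maps.
    masked_noise = []
    for n in noise_content:
        keep = (torch.rand_like(n) > mask_ratio).float()
        masked_noise.append(n * keep)

    # Combine the style latent of one image with the (masked) layout noise of
    # the other to synthesize a new style/content combination for training.
    return generator(w_style, noise=masked_noise)
```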
Related papers
- MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation [53.24011398381715]
We introduce a Plug-and-Play module for data augmentation called MoreStyle.
MoreStyle diversifies image styles by relaxing low-frequency constraints in Fourier space.
With the help of adversarial learning, MoreStyle pinpoints the most intricate style combinations within latent features.
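The exact MoreStyle module is not reproduced here, but the general idea of diversifying styles by perturbing low-frequency components in Fourier space can be sketched as follows (amplitude roughly carries style, phase carries content; all parameter names are illustrative):

```python
import torch

def lowfreq_style_perturb(img, beta=0.1, strength=0.5):
    """Generic Fourier-domain style perturbation (illustrative, not the exact
    MoreStyle module): randomly rescale low-frequency amplitude components,
    which carry most of the image 'style', while keeping the phase (content).
    """
    fft = torch.fft.fft2(img, dim=(-2, -1))
    amp, pha = torch.abs(fft), torch.angle(fft)

    # Center the low frequencies, perturb a small window around the center,
    # then undo the shift.
    amp = torch.fft.fftshift(amp, dim=(-2, -1))
    h, w = img.shape[-2:]
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    window = amp[..., ch - bh:ch + bh, cw - bw:cw + bw]
    amp[..., ch - bh:ch + bh, cw - bw:cw + bw] = window * (
        1.0 + strength * (2 * torch.rand_like(window) - 1))
    amp = torch.fft.ifftshift(amp, dim=(-2, -1))

    # Recombine the perturbed amplitude with the original phase and invert the FFT.
    out = torch.fft.ifft2(amp * torch.exp(1j * pha), dim=(-2, -1))
    return out.real
```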
arXiv Detail & Related papers (2024-03-18T11:38:47Z)
- Improving the Transferability of Adversarial Examples with Arbitrary Style Transfer [32.644062141738246]
A style transfer network can alter the distribution of low-level visual features in an image while preserving semantic content for humans.
We propose a novel attack method named Style Transfer Method (STM) that utilizes a proposed arbitrary style transfer network to transform the images into different domains.
Our proposed method can significantly improve the adversarial transferability on either normally trained models or adversarially trained models.
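As a hedged sketch of how style-transferred copies can be folded into a gradient-based attack (not the exact STM formulation; `style_net` is a hypothetical handle to an arbitrary style transfer network, and the real method additionally mixes styled and original images):

```python
import torch

def style_augmented_attack(model, style_net, x, y, eps=8/255, steps=10,
                           n_styles=4, decay=1.0):
    """Momentum iterative attack whose gradients are averaged over several
    style-transferred copies, so the perturbation does not overfit the source
    model's low-level feature statistics. Illustrative only."""
    alpha = eps / steps
    x_adv, momentum = x.clone().detach(), torch.zeros_like(x)
    loss_fn = torch.nn.CrossEntropyLoss()

    for _ in range(steps):
        x_adv.requires_grad_(True)
        # Average the loss gradient over randomly styled copies of the input.
        grad = torch.zeros_like(x_adv)
        for _ in range(n_styles):
            loss = loss_fn(model(style_net(x_adv)), y)
            grad += torch.autograd.grad(loss, x_adv)[0] / n_styles
        momentum = decay * momentum + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
        x_adv = (x_adv + alpha * momentum.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```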
arXiv Detail & Related papers (2023-08-21T09:58:13Z)
- Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations [61.132408427908175]
Zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
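A minimal sketch of what "finding semantic variations of the target text in CLIP space" could look like is given below; the objective shown (stay close to the target feature while keeping the variations mutually diverse) is an illustrative stand-in, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def semantic_variations(target_text_feat, n_vars=8, steps=100, lr=0.01, min_sim=0.85):
    """Learn small perturbations of a CLIP text feature (shape: (dim,)) that
    remain semantically close to the target while staying distinct from each
    other, yielding several adaptation directions instead of one. Sketch only."""
    d = target_text_feat.shape[-1]
    deltas = 0.01 * torch.randn(n_vars, d)
    deltas.requires_grad_(True)
    opt = torch.optim.Adam([deltas], lr=lr)
    anchor = F.normalize(target_text_feat, dim=-1)

    for _ in range(steps):
        feats = F.normalize(target_text_feat + deltas, dim=-1)
        # Keep each variation close to the target text meaning ...
        loss_anchor = F.relu(min_sim - feats @ anchor).mean()
        # ... while pushing the variations apart from one another.
        pair_sim = feats @ feats.T
        loss_div = (pair_sim - torch.eye(n_vars)).pow(2).mean()
        opt.zero_grad()
        (loss_anchor + loss_div).backward()
        opt.step()
    return F.normalize(target_text_feat + deltas.detach(), dim=-1)
```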
arXiv Detail & Related papers (2023-08-21T08:12:28Z)
- Intra- & Extra-Source Exemplar-Based Style Synthesis for Improved Domain Generalization [21.591831983223997]
We propose an exemplar-based style synthesis pipeline to improve domain generalization in semantic segmentation.
Our method is based on a novel masked noise encoder for StyleGAN2 inversion.
We achieve up to 12.4% mIoU improvements on driving-scene semantic segmentation under different types of data shifts.
arXiv Detail & Related papers (2023-07-02T19:56:43Z)
- Learning Content-enhanced Mask Transformer for Domain Generalized Urban-Scene Segmentation [28.165600284392042]
Domain-generalized urban-scene semantic segmentation (USSS) aims to learn generalized semantic predictions across diverse urban-scene styles.
Existing approaches typically rely on convolutional neural networks (CNNs) to learn the content of urban scenes.
We propose a Content-enhanced Mask TransFormer (CMFormer) for domain-generalized USSS.
arXiv Detail & Related papers (2023-07-01T15:48:33Z)
- Condition-Invariant Semantic Segmentation [77.10045325743644]
We implement Condition-Invariant Semantic Segmentation (CISS) on the current state-of-the-art domain adaptation architecture.
Our method achieves the second-best performance on the normal-to-adverse Cityscapes→ACDC benchmark.
CISS is shown to generalize well to domains unseen during training, such as BDD100K-night and ACDC-night.
arXiv Detail & Related papers (2023-05-27T03:05:07Z)
- Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models [60.63556257324894]
A key desired property of image generative models is the ability to disentangle different attributes.
We propose a simple, light-weight image editing algorithm where the mixing weights of the two text embeddings are optimized for style matching and content preservation.
Experiments show that the proposed method can modify a wide range of attributes, outperforming diffusion-model-based image-editing algorithms.
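A minimal sketch of optimizing the mixing weights of two text embeddings is shown below, assuming hypothetical `generate`, `clip_style_loss`, and `content_loss` handles; the paper's actual losses and sampling procedure differ in detail:

```python
import torch

def optimize_mixing_weights(generate, clip_style_loss, content_loss,
                            emb_src, emb_tgt, steps=50, lr=0.05):
    """Learn per-token weights interpolating between the source and target
    prompt embeddings (shape: (num_tokens, dim)) so the generated image matches
    the target style while preserving the source content. Illustrative only."""
    lam = torch.zeros(emb_src.shape[0], 1, requires_grad=True)  # one weight per token
    opt = torch.optim.Adam([lam], lr=lr)

    for _ in range(steps):
        w = torch.sigmoid(lam)
        emb_mix = (1 - w) * emb_src + w * emb_tgt
        img = generate(emb_mix)                # differentiable sampling assumed
        loss = clip_style_loss(img) + content_loss(img)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(lam).detach()
```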
arXiv Detail & Related papers (2022-12-16T19:58:52Z)
- DGSS: Domain Generalized Semantic Segmentation using Iterative Style Mining and Latent Representation Alignment [38.05196030226661]
Current state-of-the-art (SoTA) methods propose different mechanisms to bridge the domain gap, but they still perform poorly in low-illumination conditions.
We propose a two-step framework wherein we first identify an adversarial style that maximizes the domain gap between stylized and source images.
We then propose a style mixing mechanism wherein the same objects from different styles are mixed to construct a new training image.
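The style-mixing step can be illustrated with a small sketch: given two differently stylized renderings of the same scene (e.g., the source image and its adversarially stylized version) and the segmentation label, pixels of selected classes are taken from one style and the rest from the other. This is an illustrative reading of the mechanism, not the authors' exact implementation:

```python
import torch

def classwise_style_mix(img_a, img_b, label, classes_from_b):
    """Compose one training image from two styles of the same scene:
    img_a, img_b: (C, H, W) tensors showing the same content in two styles,
    label: (H, W) integer segmentation map shared by both images,
    classes_from_b: class ids whose pixels are taken from style B."""
    mask = torch.zeros_like(label, dtype=torch.bool)
    for c in classes_from_b:
        mask |= (label == c)
    mask = mask.unsqueeze(0).float()  # (1, H, W), broadcast over channels
    return img_b * mask + img_a * (1 - mask)
```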
arXiv Detail & Related papers (2022-02-26T13:54:57Z)
- MixStyle Neural Networks for Domain Generalization and Adaptation [122.36901703868321]
MixStyle is a plug-and-play module that can improve domain generalization performance without the need to collect more data or increase model capacity.
Our experiments show that MixStyle can significantly boost out-of-distribution generalization performance across a wide range of tasks including image recognition, instance retrieval and reinforcement learning.
arXiv Detail & Related papers (2021-07-05T14:29:19Z)
- Domain Generalization with MixStyle [120.52367818581608]
Domain generalization aims to learn, from a set of source domains, a model that generalizes to any unseen domain.
Our method, termed MixStyle, is motivated by the observation that visual domain is closely related to image style.
MixStyle fits into mini-batch training perfectly and is extremely easy to implement.
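The core operation is simple enough to sketch directly: instance-level feature statistics (channel-wise mean and standard deviation), which act as a proxy for image style, are mixed between randomly paired samples within a mini-batch. In the published method this is applied only during training, with some probability, at a few early layers of the network.

```python
import torch

def mixstyle(x, alpha=0.1, eps=1e-6):
    """MixStyle-like feature-statistics mixing for a feature map x of shape
    (B, C, H, W). Sketch of the published operation; apply during training only."""
    b = x.size(0)
    mu = x.mean(dim=(2, 3), keepdim=True)
    sig = (x.var(dim=(2, 3), keepdim=True) + eps).sqrt()
    x_norm = (x - mu) / sig

    # Mixing coefficients from a Beta distribution and a random pairing of samples.
    lam = torch.distributions.Beta(alpha, alpha).sample((b, 1, 1, 1)).to(x.device)
    perm = torch.randperm(b, device=x.device)

    # Re-normalize each instance with mixed style statistics.
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return x_norm * sig_mix + mu_mix
```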
arXiv Detail & Related papers (2021-04-05T16:58:09Z)
- Domain-invariant Similarity Activation Map Contrastive Learning for Retrieval-based Long-term Visual Localization [30.203072945001136]
In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation.
Then, a novel gradient-weighted similarity activation mapping loss (Grad-SAM) is incorporated for finer localization with high accuracy.
Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMUSeasons dataset.
Our method performs on par with, or even outperforms, state-of-the-art image-based localization baselines at medium and high precision.
arXiv Detail & Related papers (2020-09-16T14:43:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.