Intra- & Extra-Source Exemplar-Based Style Synthesis for Improved Domain
Generalization
- URL: http://arxiv.org/abs/2307.00648v1
- Date: Sun, 2 Jul 2023 19:56:43 GMT
- Title: Intra- & Extra-Source Exemplar-Based Style Synthesis for Improved Domain
Generalization
- Authors: Yumeng Li, Dan Zhang, Margret Keuper, Anna Khoreva
- Abstract summary: We propose an exemplar-based style synthesis pipeline to improve domain generalization in semantic segmentation.
Our method is based on a novel masked noise encoder for StyleGAN2 inversion.
We achieve up to $12.4\%$ mIoU improvements on driving-scene semantic segmentation under different types of data shifts.
- Score: 21.591831983223997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalization with respect to domain shifts, as they frequently appear
in applications such as autonomous driving, is one of the remaining big
challenges for deep learning models. Therefore, we propose an exemplar-based
style synthesis pipeline to improve domain generalization in semantic
segmentation. Our method is based on a novel masked noise encoder for StyleGAN2
inversion. The model learns to faithfully reconstruct the image, preserving its
semantic layout through noise prediction. Using the proposed masked noise
encoder to randomize style and content combinations in the training set, i.e.,
intra-source style augmentation (ISSA), effectively increases the diversity of
training data and reduces spurious correlation. As a result, we achieve up to
$12.4\%$ mIoU improvements on driving-scene semantic segmentation under
different types of data shifts, i.e., changing geographic locations, adverse
weather conditions, and day to night. ISSA is model-agnostic and
straightforwardly applicable with CNNs and Transformers. It is also
complementary to other domain generalization techniques, e.g., it improves the
recent state-of-the-art solution RobustNet by $3\%$ mIoU in Cityscapes to Dark
Z\"urich. In addition, we demonstrate the strong plug-n-play ability of the
proposed style synthesis pipeline, which is readily usable for extra-source
exemplars e.g., web-crawled images, without any retraining or fine-tuning.
Moreover, we study a new use case: building a stylized proxy validation set to
assess a neural network's generalization capability. This application is of
significant practical value for selecting models to be deployed in the
open-world environment. Our code is available at
https://github.com/boschresearch/ISSA.
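To make the pipeline concrete, the following is a minimal sketch of the style-content recombination step described above. The encoder and generator interfaces are assumed placeholders standing in for the masked noise encoder and the StyleGAN2 generator (the actual API lives in the repository linked above); only the data flow of rendering one image's layout with another image's style is illustrated.

```python
import torch

def stylize_with_exemplar(encoder, generator, content_img, style_img):
    """Illustrative exemplar-based stylization (assumed interfaces, not the repo API).

    encoder:   callable mapping an image tensor (B, 3, H, W) to (w_codes, noise_maps),
               where the noise maps are assumed to carry the spatial layout/content.
    generator: callable taking (w_codes, noise_maps) and returning an image tensor.
    """
    with torch.no_grad():
        _, content_noise = encoder(content_img)   # keep the layout/semantics of the content image
        style_w, _ = encoder(style_img)           # take the global style (latent codes) of the exemplar
        # Recombine: the exemplar's style rendered onto the content image's layout.
        stylized = generator(style_w, content_noise)
    return stylized

# Hypothetical usage inside a segmentation training loop:
#   aug_img = stylize_with_exemplar(encoder, generator, img, random_exemplar)
#   loss = seg_loss(model(aug_img), label)  # labels stay valid since the layout is preserved
```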
Related papers
- StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization [85.18995948334592]
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain.
State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data.
We propose StyDeSty, which explicitly accounts for the alignment of the source and pseudo domains in the process of data augmentation.
arXiv Detail & Related papers (2024-06-01T02:41:34Z) - Text-image Alignment for Diffusion-based Perception [12.98777134700767]
Diffusion models are generative models with impressive text-to-image synthesis capabilities.
It is unclear how to use the prompting interface when applying diffusion backbones to vision tasks.
We find that automatically generated captions can improve text-image alignment and significantly enhance a model's cross-attention maps.
arXiv Detail & Related papers (2023-09-29T05:16:41Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Condition-Invariant Semantic Segmentation [77.10045325743644]
We implement Condition-Invariant Semantic Segmentation (CISS) on the current state-of-the-art domain adaptation architecture.
Our method achieves the second-best performance on the normal-to-adverse Cityscapes$\to$ACDC benchmark.
CISS is shown to generalize well to domains unseen during training, such as BDD100K-night and ACDC-night.
arXiv Detail & Related papers (2023-05-27T03:05:07Z) - Intra-Source Style Augmentation for Improved Domain Generalization [21.591831983223997]
We propose an intra-source style augmentation (ISSA) method to improve domain generalization in semantic segmentation.
ISSA is model-agnostic and straightforwardly applicable with CNNs and Transformers.
It is also complementary to other domain generalization techniques, e.g., it improves the recent state-of-the-art solution RobustNet by $3\%$ mIoU in Cityscapes to Dark Zürich.
arXiv Detail & Related papers (2022-10-18T21:33:25Z) - Style Interleaved Learning for Generalizable Person Re-identification [69.03539634477637]
We propose a novel style interleaved learning (IL) framework for DG ReID training.
Unlike conventional learning strategies, IL incorporates two forward propagations and one backward propagation for each iteration.
We show that our model consistently outperforms state-of-the-art methods on large-scale benchmarks for DG ReID.
arXiv Detail & Related papers (2022-07-07T07:41:32Z) - Local Augmentation for Graph Neural Networks [78.48812244668017]
We introduce local augmentation, which enhances node features using their local subgraph structures.
Based on the local augmentation, we further design a novel framework: LA-GNN, which can apply to any GNN models in a plug-and-play manner.
arXiv Detail & Related papers (2021-09-08T18:10:08Z) - MixStyle Neural Networks for Domain Generalization and Adaptation [122.36901703868321]
MixStyle is a plug-and-play module that can improve domain generalization performance without the need to collect more data or increase model capacity.
Our experiments show that MixStyle can significantly boost out-of-distribution generalization performance across a wide range of tasks including image recognition, instance retrieval and reinforcement learning (a minimal sketch of the style-statistics mixing used by MixStyle appears after this list).
arXiv Detail & Related papers (2021-07-05T14:29:19Z) - RobustNet: Improving Domain Generalization in Urban-Scene Segmentation
via Instance Selective Whitening [40.98892593362837]
Enhancing generalization capability of deep neural networks to unseen domains is crucial for safety-critical applications in the real world such as autonomous driving.
This paper proposes a novel instance selective whitening loss to improve the robustness of the segmentation networks for unseen domains.
arXiv Detail & Related papers (2021-03-29T13:19:37Z) - Regularizing Deep Networks with Semantic Data Augmentation [44.53483945155832]
We propose a novel semantic data augmentation algorithm to complement traditional approaches.
The proposed method is inspired by the intriguing property that deep networks are effective in learning linearized features.
We show that the proposed implicit semantic data augmentation (ISDA) algorithm amounts to minimizing a novel robust CE loss.
arXiv Detail & Related papers (2020-07-21T00:32:44Z)