Global and Local Texture Randomization for Synthetic-to-Real Semantic
Segmentation
- URL: http://arxiv.org/abs/2108.02376v2
- Date: Fri, 6 Aug 2021 03:43:21 GMT
- Title: Global and Local Texture Randomization for Synthetic-to-Real Semantic
Segmentation
- Authors: Duo Peng, Yinjie Lei, Lingqiao Liu, Pingping Zhang, and Jun Liu
- Abstract summary: We propose two simple yet effective texture randomization mechanisms, Global Texture Randomization (GTR) and Local Texture Randomization (LTR).
GTR is proposed to randomize the texture of source images into diverse texture styles.
LTR is proposed to generate diverse local regions for partially stylizing the source images.
- Score: 40.556020857447535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation is a crucial image understanding task, where each pixel
of an image is assigned a corresponding label. Since pixel-wise ground-truth
labeling is tedious and labor-intensive, in practical applications, many works
exploit synthetic images to train models for real-world image semantic
segmentation, i.e., Synthetic-to-Real Semantic Segmentation (SRSS). However,
Deep Convolutional Neural Networks (CNNs) trained
on the source synthetic data may not generalize well to the target real-world
data. In this work, we propose two simple yet effective texture randomization
mechanisms, Global Texture Randomization (GTR) and Local Texture Randomization
(LTR), for Domain Generalization based SRSS. GTR is proposed to randomize the
texture of source images into diverse unreal texture styles. It aims to
alleviate the reliance of the network on texture while promoting the learning
of the domain-invariant cues. In addition, we find that texture differences do
not always occur across the entire image and may appear only in some local
areas. Therefore, we further propose an LTR mechanism to generate diverse local regions
for partially stylizing the source images. Finally, we implement a
regularization of Consistency between GTR and LTR (CGL) that aims to harmonize the
two proposed mechanisms during training. Extensive experiments on five publicly
available datasets (i.e., GTA5, SYNTHIA, Cityscapes, BDDS and Mapillary) with
various SRSS settings (i.e., GTA5/SYNTHIA to Cityscapes/BDDS/Mapillary)
demonstrate that the proposed method is superior to the state-of-the-art
methods for domain generalization based SRSS.
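The abstract's three mechanisms can be illustrated with a minimal sketch. Note the assumptions: the paper stylizes images with diverse painting textures via style transfer networks, whereas the sketch below stands in a simple linear blend with a random texture; the function names, the boolean-mask formulation of LTR, and the mean-squared-error form of the consistency term are all hypothetical simplifications, not the authors' implementation.

```python
import numpy as np

def global_texture_randomization(img, texture, alpha=0.5):
    """GTR-style sketch: blend a random texture into the whole image,
    weakening the network's reliance on source-domain texture cues."""
    return (1.0 - alpha) * img + alpha * texture

def local_texture_randomization(img, texture, mask, alpha=0.5):
    """LTR-style sketch: stylize only the regions selected by a boolean
    mask, leaving the rest of the image untouched."""
    styled = (1.0 - alpha) * img + alpha * texture
    return np.where(mask, styled, img)

def consistency_loss(pred_gtr, pred_ltr):
    """CGL-style sketch: penalize disagreement between the segmentation
    predictions obtained from the GTR and LTR views of the same image."""
    return float(np.mean((pred_gtr - pred_ltr) ** 2))

# Toy usage on a random image and texture.
rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))
texture = rng.random((4, 4, 3))
mask = rng.random((4, 4, 3)) > 0.5  # random local regions

img_gtr = global_texture_randomization(img, texture)
img_ltr = local_texture_randomization(img, texture, mask)
loss = consistency_loss(img_gtr, img_ltr)
```

In the paper, the consistency term is computed on network predictions rather than on the stylized inputs directly; the sketch only conveys how the two randomized views of one source image feed a shared regularizer.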
Related papers
- SeD: Semantic-Aware Discriminator for Image Super-Resolution [20.646975821512395]
Generative Adversarial Networks (GANs) have been widely used to recover vivid textures in image super-resolution (SR) tasks.
One discriminator is utilized to enable the SR network to learn the distribution of real-world high-quality images in an adversarial training manner.
We propose the simple and effective Semantic-aware Discriminator (SeD).
SeD encourages the SR network to learn the fine-grained distributions by introducing the semantics of images as a condition.
arXiv Detail & Related papers (2024-02-29T17:38:54Z) - Semantic RGB-D Image Synthesis [22.137419841504908]
Current approaches, however, are uni-modal and cannot cope with multi-modal data.
We introduce semantic RGB-D image synthesis to address this problem.
We propose a generator for multi-modal data that separates modal-independent information of the semantic layout from the modal-dependent information.
arXiv Detail & Related papers (2023-08-22T11:16:24Z) - LD-ZNet: A Latent Diffusion Approach for Text-Based Image Segmentation [10.623430999818925]
We present a technique for segmenting real and AI-generated images using latent diffusion models (LDMs) trained on internet-scale datasets.
We show up to 6% improvement over standard baselines for text-to-image segmentation on natural images.
For AI-generated imagery, we show close to 20% improvement compared to state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-22T06:55:01Z) - Joint Learning of Deep Texture and High-Frequency Features for
Computer-Generated Image Detection [24.098604827919203]
We propose a joint learning strategy with deep texture and high-frequency features for CG image detection.
A semantic segmentation map is generated to guide the affine transformation operation.
The combination of the original image and the high-frequency components of the original and rendered images are fed into a multi-branch neural network equipped with attention mechanisms.
arXiv Detail & Related papers (2022-09-07T17:30:40Z) - Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z) - Low Light Image Enhancement via Global and Local Context Modeling [164.85287246243956]
We introduce a context-aware deep network for low-light image enhancement.
First, it features a global context module that models spatial correlations to find complementary cues over full spatial domain.
Second, it introduces a dense residual block that captures local context with a relatively large receptive field.
arXiv Detail & Related papers (2021-01-04T09:40:54Z) - DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis [80.54273334640285]
We propose a novel one-stage text-to-image backbone that directly synthesizes high-resolution images without entanglements between different generators.
We also propose a novel Target-Aware Discriminator composed of Matching-Aware Gradient Penalty and One-Way Output.
Compared with current state-of-the-art methods, our proposed DF-GAN is simpler but more efficient to synthesize realistic and text-matching images.
arXiv Detail & Related papers (2020-08-13T12:51:17Z) - Component Divide-and-Conquer for Real-World Image Super-Resolution [143.24770911629807]
We present a large-scale Diverse Real-world image Super-Resolution dataset, i.e., DRealSR, as well as a divide-and-conquer Super-Resolution network.
DRealSR establishes a new SR benchmark with diverse real-world degradation processes.
We propose a Component Divide-and-Conquer (CDC) model and a Gradient-Weighted (GW) loss for SR.
arXiv Detail & Related papers (2020-08-05T04:26:26Z) - A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs)
The proposed architecture allows to provide detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics.
arXiv Detail & Related papers (2020-02-28T11:16:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.