Style-Hallucinated Dual Consistency Learning for Domain Generalized
Semantic Segmentation
- URL: http://arxiv.org/abs/2204.02548v1
- Date: Wed, 6 Apr 2022 02:49:06 GMT
- Title: Style-Hallucinated Dual Consistency Learning for Domain Generalized
Semantic Segmentation
- Authors: Yuyang Zhao, Zhun Zhong, Na Zhao, Nicu Sebe, Gim Hee Lee
- Abstract summary: We propose the Style-HAllucinated Dual consistEncy learning (SHADE) framework to handle domain shift.
Our SHADE yields significant improvement and outperforms state-of-the-art methods by 5.07% and 8.35% on the average mIoU of three real-world datasets.
- Score: 117.3856882511919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the task of synthetic-to-real domain generalized
semantic segmentation, which aims to learn a model that is robust to unseen
real-world scenes using only synthetic data. The large domain shift between
synthetic and real-world data, including the limited source environmental
variations and the large distribution gap between synthetic and real-world
data, significantly hinders the model performance on unseen real-world scenes.
In this work, we propose the Style-HAllucinated Dual consistEncy learning
(SHADE) framework to handle such domain shift. Specifically, SHADE is
constructed based on two consistency constraints, Style Consistency (SC) and
Retrospection Consistency (RC). SC enriches the source situations and
encourages the model to learn consistent representation across
style-diversified samples. RC leverages real-world knowledge to prevent the
model from overfitting to synthetic data and thus largely keeps the
representation consistent between the synthetic and real-world models.
Furthermore, we present a novel style hallucination module (SHM) to generate
style-diversified samples that are essential to consistency learning. SHM
selects basis styles from the source distribution, enabling the model to
dynamically generate diverse and realistic samples during training. Experiments
show that our SHADE yields significant improvement and outperforms
state-of-the-art methods by 5.07% and 8.35% on the average mIoU of three
real-world datasets on single- and multi-source settings respectively.
Related papers
- Hybrid Training Approaches for LLMs: Leveraging Real and Synthetic Data to Enhance Model Performance in Domain-Specific Applications [0.0]
This research explores a hybrid approach to fine-tuning large language models (LLMs)
By leveraging a dataset combining transcribed real interactions with high-quality synthetic sessions, we aimed to overcome the limitations of domain-specific real data.
The study evaluated three models: a base foundational model, a model fine-tuned with real data, and a hybrid fine-tuned model.
arXiv Detail & Related papers (2024-10-11T18:16:03Z) - Synthetic location trajectory generation using categorical diffusion
models [50.809683239937584]
Diffusion models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs) which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z) - Steering Language Generation: Harnessing Contrastive Expert Guidance and
Negative Prompting for Coherent and Diverse Synthetic Data Generation [0.0]
Large Language Models (LLMs) hold immense potential to generate synthetic data of high quality and utility.
We introduce contrastive expert guidance, where the difference between the logit distributions of fine-tuned and base language models is emphasised.
We deem this dual-pronged approach to logit reshaping as STEER: Semantic Text Enhancement via Embedding Repositioning.
arXiv Detail & Related papers (2023-08-15T08:49:14Z) - Pre-training Contextualized World Models with In-the-wild Videos for
Reinforcement Learning [54.67880602409801]
In this paper, we study the problem of pre-training world models with abundant in-the-wild videos for efficient learning of visual control tasks.
We introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling.
Our experiments show that in-the-wild video pre-training equipped with ContextWM can significantly improve the sample efficiency of model-based reinforcement learning.
arXiv Detail & Related papers (2023-05-29T14:29:12Z) - HaDR: Applying Domain Randomization for Generating Synthetic Multimodal
Dataset for Hand Instance Segmentation in Cluttered Industrial Environments [0.0]
This study uses domain randomization to generate a synthetic RGB-D dataset for training multimodal instance segmentation models.
We show that our approach enables the models to outperform corresponding models trained on existing state-of-the-art datasets.
arXiv Detail & Related papers (2023-04-12T13:02:08Z) - Domain Adaptation of Synthetic Driving Datasets for Real-World
Autonomous Driving [0.11470070927586014]
Network trained with synthetic data for certain computer vision tasks degrade significantly when tested on real world data.
In this paper, we propose and evaluate novel ways for the betterment of such approaches.
We propose a novel method to efficiently incorporate semantic supervision into this pair selection, which helps in boosting the performance of the model.
arXiv Detail & Related papers (2023-02-08T15:51:54Z) - Style-Hallucinated Dual Consistency Learning: A Unified Framework for
Visual Domain Generalization [113.03189252044773]
We propose a unified framework, Style-HAllucinated Dual consistEncy learning (SHADE), to handle domain shift in various visual tasks.
Our versatile SHADE can significantly enhance the generalization in various visual recognition tasks, including image classification, semantic segmentation and object detection.
arXiv Detail & Related papers (2022-12-18T11:42:51Z) - Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor
Point Clouds [69.64240235315864]
This paper introduces the synthetic-to-real domain generalization setting to this task.
The domain gap between synthetic and real-world point cloud data mainly lies in the different layouts and point patterns.
Experiments on the synthetic-to-real benchmark demonstrate that both CINMix and multi-prototypes can narrow the distribution gap.
arXiv Detail & Related papers (2022-12-09T05:07:43Z) - CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE)
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.