BatStyler: Advancing Multi-category Style Generation for Source-free Domain Generalization
- URL: http://arxiv.org/abs/2501.01109v1
- Date: Thu, 02 Jan 2025 07:14:23 GMT
- Title: BatStyler: Advancing Multi-category Style Generation for Source-free Domain Generalization
- Authors: Xiusheng Xu, Lei Qi, Jingyang Zhou, Xin Geng
- Abstract summary: Source-Free Domain Generalization (SFDG) aims to develop a model that performs well on unseen domains without relying on any source domains.
Research on SFDG focuses on knowledge transfer from multi-modal models and style synthesis based on the joint space of multiple modalities.
We propose BatStyler, a method that improves style synthesis in multi-category scenarios.
- Score: 39.7856695215463
- Abstract: Source-Free Domain Generalization (SFDG) aims to develop a model that performs well on unseen domains without relying on any source domains. However, the implementation remains constrained by the unavailability of training data. Research on SFDG focuses on knowledge transfer from multi-modal models and style synthesis based on the joint space of multiple modalities, thus eliminating the dependency on source domain images. However, existing works primarily target multi-domain, less-category configurations, while their performance on multi-domain, multi-category configurations is relatively poor. In addition, the efficiency of style synthesis deteriorates in multi-category scenarios. Efficiently synthesizing sufficiently diverse data and applying it to multi-category configurations is therefore a direction of greater practical value. In this paper, we propose BatStyler, a method designed to improve style synthesis in multi-category scenarios. BatStyler consists of two modules: Coarse Semantic Generation and Uniform Style Generation. The Coarse Semantic Generation module extracts coarse-grained semantics to prevent the compression of the space available for style-diversity learning in multi-category configurations, while the Uniform Style Generation module provides a template of styles that are uniformly distributed in space and enables parallel training. Extensive experiments demonstrate that our method performs comparably on less-category datasets while surpassing state-of-the-art methods on multi-category datasets.
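As a rough illustration of the Uniform Style Generation idea described above, the sketch below optimizes a bank of learnable style embeddings toward a uniform distribution on the unit hypersphere, updating all of them jointly (in parallel) rather than one after another. This is a minimal sketch grounded only in the abstract: the Gaussian-potential uniformity loss, the sizes, and every name are assumptions, not the authors' implementation, which additionally conditions on coarse-grained semantics from the Coarse Semantic Generation module.

```python
# Hypothetical sketch of "Uniform Style Generation": K learnable style
# embeddings are pushed toward a uniform distribution on the unit
# hypersphere and optimized jointly (in parallel). Loss choice, sizes,
# and names are assumptions based only on the abstract.
import torch
import torch.nn.functional as F

K, D = 80, 512  # number of style vectors; embedding dim (e.g., CLIP text width)
styles = torch.nn.Parameter(torch.randn(K, D) * 0.02)

def uniformity_loss(x: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Gaussian-potential uniformity loss (Wang & Isola, 2020):
    lower when points spread uniformly over the hypersphere."""
    x = F.normalize(x, dim=-1)
    sq_dists = torch.pdist(x, p=2).pow(2)  # pairwise squared distances
    return sq_dists.mul(-t).exp().mean().log()

opt = torch.optim.AdamW([styles], lr=1e-3)
for step in range(500):
    loss = uniformity_loss(styles)  # all K styles updated in one step
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In a PromptStyler-style pipeline, such styles would then be combined with category prompts to train a classifier in the joint vision-language space; joint optimization avoids the sequential style-by-style training that slows down as the number of styles grows.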
Related papers
- DPStyler: Dynamic PromptStyler for Source-Free Domain Generalization [43.67213274161226]
Source-Free Domain Generalization (SFDG) aims to develop a model that works for unseen target domains without relying on any source domain.
Research in SFDG primarily builds upon the existing knowledge of large-scale vision-language models.
We introduce Dynamic PromptStyler (DPStyler), comprising Style Generation and Style Removal modules.
arXiv Detail & Related papers (2024-03-25T12:31:01Z) - DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control [68.14798033899955]
Large, pretrained latent diffusion models (LDMs) have demonstrated an extraordinary ability to generate creative content.
However, are they usable as large-scale data generators, e.g., to improve tasks in the perception stack, like semantic segmentation?
We investigate this question in the context of autonomous driving, and answer it with a resounding "yes".
arXiv Detail & Related papers (2023-12-05T18:34:12Z) - Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations [80.76164484820818]
There is an inescapable long-tailed class-imbalance issue in many real-world classification problems.
We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains.
Built upon a selective balanced sampling strategy, the proposed TALLY achieves this by mixing the semantic representation of one example with the domain-associated nuisances of another (a hedged sketch of this recombination appears after this list).
arXiv Detail & Related papers (2022-10-25T21:54:26Z) - Style Interleaved Learning for Generalizable Person Re-identification [69.03539634477637]
We propose a novel style interleaved learning (IL) framework for DG ReID training.
Unlike conventional learning strategies, IL incorporates two forward propagations and one backward propagation for each iteration; a sketch of this update pattern appears after this list.
We show that our model consistently outperforms state-of-the-art methods on large-scale benchmarks for DG ReID.
arXiv Detail & Related papers (2022-07-07T07:41:32Z) - Low Resource Style Transfer via Domain Adaptive Meta Learning [30.323491061441857]
We propose DAML-ATM (Domain Adaptive Meta-Learning with Adversarial Transfer Model), which consists of two parts: DAML and ATM.
DAML is a domain adaptive meta-learning approach to learn general knowledge in multiple heterogeneous source domains, capable of adapting to new unseen domains with a small amount of data.
We also propose a new unsupervised TST approach, the Adversarial Transfer Model (ATM), which combines a sequence-to-sequence pre-trained language model with adversarial style training for better content preservation and style transfer.
arXiv Detail & Related papers (2022-05-25T03:58:24Z) - Distribution Aligned Multimodal and Multi-Domain Image Stylization [76.74823384524814]
We propose a unified framework for multimodal and multi-domain style transfer.
The key component of our method is a novel style distribution alignment module.
We validate our proposed framework on painting style transfer with a variety of different artistic styles and genres.
arXiv Detail & Related papers (2020-06-02T07:25:53Z) - Unsupervised multi-modal Styled Content Generation [61.040392094140245]
UMMGAN is a novel architecture designed to better model multi-modal distributions in an unsupervised fashion.
We show that UMMGAN effectively disentangles between modes and style, thereby providing an independent degree of control over the generated content.
arXiv Detail & Related papers (2020-01-10T19:36:08Z)
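For the multi-domain long-tailed learning (TALLY) entry above, here is a minimal sketch of the described recombination, assuming an encoder that already returns a disentangled semantic part and a domain-nuisance part per example. The additive fusion and all names are assumptions based only on the summary, not the paper's actual disentanglement.

```python
# Hypothetical sketch of TALLY-style augmentation: pair each example's
# semantic representation with another example's domain-associated
# nuisances to synthesize new training features. The disentangling
# encoder and the additive fusion are assumptions.
import torch

def tally_mix(sem: torch.Tensor, nui: torch.Tensor,
              perm: torch.Tensor) -> torch.Tensor:
    """Fuse each example's semantics with a permuted example's nuisances."""
    return sem + nui[perm]  # assumed additive recombination

# usage with a hypothetical encoder returning (semantic, nuisance) parts:
# sem, nui = encoder(x)                              # shapes (B, D), (B, D)
# x_aug = tally_mix(sem, nui, torch.randperm(sem.size(0)))
```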
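Similarly, for the style interleaved learning entry, the sketch below shows the stated update pattern of two forward propagations and a single backward propagation per iteration. Only that structure comes from the summary; the losses and the `perturb_style` transformation are placeholders.

```python
# Hypothetical sketch of the two-forward / one-backward pattern described
# in the IL summary. `perturb_style` stands in for whatever interleaved
# style transformation the method applies; it is an assumption here.
import torch

def il_step(model, perturb_style, x, y, criterion, opt):
    logits_clean = model(x)                  # forward 1: original styles
    logits_styled = model(perturb_style(x))  # forward 2: interleaved styles
    loss = criterion(logits_clean, y) + criterion(logits_styled, y)
    opt.zero_grad()
    loss.backward()                          # one backward for both forwards
    opt.step()
    return loss.item()
```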
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.