Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection
- URL: http://arxiv.org/abs/2503.09968v1
- Date: Thu, 13 Mar 2025 02:14:10 GMT
- Title: Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection
- Authors: Zihao Zhang, Aming Wu, Yahong Han
- Abstract summary: The task of Single-Domain Generalized Object Detection (Single-DGOD) aims to generalize a detector to multiple unknown domains never seen during training. We propose a new method, Style Evolving along Chain-of-Thought, which progressively integrates and expands style information along the chain of thought.
- Score: 35.35239718038119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the task of Single-Domain Generalized Object Detection (Single-DGOD) has been proposed, aiming to generalize a detector to multiple unknown domains never seen during training. Due to the unavailability of target-domain data, some methods leverage the multimodal capabilities of vision-language models, using textual prompts to estimate cross-domain information and enhance the model's generalization capability. These methods typically use a single textual prompt, often referred to as the one-step prompt method. However, when dealing with complex styles such as the combination of rain and night, we observe that the performance of the one-step prompt method tends to be relatively weak. The reason may be that many scenes incorporate not just a single style but a combination of multiple styles. The one-step prompt method may not effectively synthesize combined information involving various styles. To address this limitation, we propose a new method, i.e., Style Evolving along Chain-of-Thought, which aims to progressively integrate and expand style information along the chain of thought, enabling the continual evolution of styles. Specifically, by progressively refining style descriptions and guiding the diverse evolution of styles, this approach enables more accurate simulation of various style characteristics and helps the model gradually learn and adapt to subtle differences between styles. Additionally, it exposes the model to a broader range of style features with different data distributions, thereby enhancing its generalization capability in unseen domains. The significant performance gains over five adverse-weather scenarios and the Real to Art benchmark demonstrate the superiority of our method.
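To make the idea concrete, the sketch below shows one way the mechanism described in the abstract could look in code: style prompts are evolved step by step along a chain of thought, encoded into style vectors, and injected into image features. The `StubTextEncoder`, the prompt chain, and the AdaIN-style `restyle` function are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (assumptions, not the paper's code): evolve style prompts
# along a chain of thought, encode them, and re-normalize image features.
import torch
import torch.nn as nn

class StubTextEncoder(nn.Module):
    """Stand-in for a vision-language text encoder (e.g., CLIP's text tower)."""
    def __init__(self, vocab: dict, dim: int = 256):
        super().__init__()
        self.vocab = vocab
        self.emb = nn.EmbeddingBag(len(vocab), dim)

    def forward(self, prompt: str) -> torch.Tensor:
        ids = torch.tensor([self.vocab[w] for w in prompt.split() if w in self.vocab])
        return self.emb(ids.unsqueeze(0))  # (1, dim)

def evolve_styles(encoder: nn.Module, chain: list) -> torch.Tensor:
    """Progressively integrate style embeddings along chain-of-thought prompts."""
    style = None
    for prompt in chain:
        step = encoder(prompt)
        style = step if style is None else 0.5 * (style + step)
    return style

def restyle(feat: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
    """AdaIN-like injection: shift/scale per-channel statistics of backbone features."""
    mu = feat.mean(dim=(2, 3), keepdim=True)
    sigma = feat.std(dim=(2, 3), keepdim=True)
    s = torch.tanh(style[..., :feat.size(1)]).view(1, -1, 1, 1)
    return (1.0 + s) * (feat - mu) / (sigma + 1e-5) + s

vocab = {w: i for i, w in enumerate("a scene in at rain night heavy fog".split())}
enc = StubTextEncoder(vocab)
# One-step prompt would stop at the first entry; the chain keeps evolving the style.
chain = ["a scene in rain", "a scene in rain at night", "a scene in heavy rain at night fog"]
style = evolve_styles(enc, chain)
feats = torch.randn(1, 256, 32, 32)   # backbone features for one image
print(restyle(feats, style).shape)    # torch.Size([1, 256, 32, 32])
```

In this reading, each step of the chain adds one style cue (rain, then night, then fog), so the final style vector reflects a combination of styles rather than a single one-step prompt.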
Related papers
- Pluggable Style Representation Learning for Multi-Style Transfer [41.09041735653436]
We develop a style transfer framework by decoupling style modeling from style transfer.
For style modeling, we propose a style representation learning scheme to encode the style information into a compact representation.
For style transferring, we develop a style-aware multi-style transfer network (SaMST) to adapt to diverse styles using pluggable style representations.
arXiv Detail & Related papers (2025-03-26T09:44:40Z) - Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models.
Our method's evaluation across diverse styles and text prompts demonstrates high quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z) - A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature (see the sketch after this list).
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z) - Domain Generalization with Correlated Style Uncertainty [4.844240089234632]
Style augmentation is a strong DG method taking advantage of instance-specific feature statistics.
We introduce Correlated Style Uncertainty (CSU), surpassing the limitations of linear generalization in style statistic space.
Our method's efficacy is established through extensive experimentation on diverse cross-domain computer vision and medical imaging classification tasks.
arXiv Detail & Related papers (2022-12-20T01:59:27Z) - Adversarial Style Augmentation for Domain Generalized Urban-Scene Segmentation [120.96012935286913]
We propose a novel adversarial style augmentation approach, which can generate hard stylized images during training.
Experiments on two synthetic-to-real semantic segmentation benchmarks demonstrate that AdvStyle can significantly improve the model performance on unseen real domains.
arXiv Detail & Related papers (2022-07-11T14:01:25Z) - Style Interleaved Learning for Generalizable Person Re-identification [69.03539634477637]
We propose a novel style interleaved learning (IL) framework for DG ReID training.
Unlike conventional learning strategies, IL incorporates two forward propagations and one backward propagation for each iteration.
We show that our model consistently outperforms state-of-the-art methods on large-scale benchmarks for DG ReID.
arXiv Detail & Related papers (2022-07-07T07:41:32Z) - Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z) - Distribution Aligned Multimodal and Multi-Domain Image Stylization [76.74823384524814]
We propose a unified framework for multimodal and multi-domain style transfer.
The key component of our method is a novel style distribution alignment module.
We validate our proposed framework on painting style transfer with a variety of different artistic styles and genres.
arXiv Detail & Related papers (2020-06-02T07:25:53Z)
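As a side note on the UCAST entry above, the idea of an input-dependent temperature in contrastive learning can be sketched as follows. The small MLP that predicts the per-sample temperature and the InfoNCE-style loss are illustrative assumptions, not UCAST's released implementation.

```python
# Minimal sketch (assumption, not UCAST's code): an InfoNCE-style contrastive
# loss whose temperature is predicted per sample from the anchor feature.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveTempContrastive(nn.Module):
    def __init__(self, dim: int = 128, t_min: float = 0.05, t_max: float = 1.0):
        super().__init__()
        self.temp_net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.t_min, self.t_max = t_min, t_max

    def forward(self, anchors: torch.Tensor, positives: torch.Tensor) -> torch.Tensor:
        a = F.normalize(anchors, dim=1)
        p = F.normalize(positives, dim=1)
        logits = a @ p.t()                                  # (B, B) pairwise similarities
        t = torch.sigmoid(self.temp_net(anchors))           # per-anchor temperature
        t = self.t_min + (self.t_max - self.t_min) * t      # squash into [t_min, t_max]
        labels = torch.arange(a.size(0), device=a.device)   # positives on the diagonal
        return F.cross_entropy(logits / t, labels)

loss_fn = AdaptiveTempContrastive(dim=128)
anchors, positives = torch.randn(8, 128), torch.randn(8, 128)
print(loss_fn(anchors, positives).item())
```

Sharper (lower) temperatures make the loss focus on hard negatives, so letting the temperature depend on the input gives each sample its own hardness weighting.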
This list is automatically generated from the titles and abstracts of the papers on this site.