Related papers: Uncertainty-Aware ControlNet: Bridging Domain Gaps with Synthetic Image Generation

URL: http://arxiv.org/abs/2510.11346v1
Date: Mon, 13 Oct 2025 12:41:28 GMT
Title: Uncertainty-Aware ControlNet: Bridging Domain Gaps with Synthetic Image Generation
Authors: Joshua Niemeijer, Jan Ehrhardt, Heinz Handels, Hristina Uzunova
Abstract summary: This work introduces a method to utilize data from unlabeled domains to train ControlNets. The uncertainty indicates that a given image was not part of the training distribution of a downstream task. The resulting ControlNet allows us to create annotated data with high uncertainty from the target domain.
Score: 0.9509895098323274
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative models are a valuable tool for the controlled creation of high-quality image data. Controlled diffusion models like ControlNet have allowed the creation of labeled distributions. Such synthetic datasets can augment the original training distribution when discriminative models, e.g., for semantic segmentation, are trained. However, this augmentation effect is limited since ControlNets tend to reproduce the original training distribution. This work introduces a method to utilize data from unlabeled domains to train ControlNets by introducing the concept of uncertainty into the control mechanism. The uncertainty indicates that a given image was not part of the training distribution of a downstream task, e.g., segmentation. Thus, two types of control are engaged in the final network: an uncertainty control from an unlabeled dataset and a semantic control from the labeled dataset. The resulting ControlNet allows us to create annotated data with high uncertainty from the target domain, i.e., synthetic data from the unlabeled distribution with labels. In our scenario, we consider retinal OCTs, where typically high-quality Spectralis images are available with given ground truth segmentations, enabling the training of segmentation networks. Recent developments in Home-OCT devices, however, yield retinal OCTs with lower quality and a large domain shift, such that out-of-the-box segmentation networks cannot be applied to this type of data. Synthesizing annotated images from the Home-OCT domain using the proposed approach closes this gap and leads to significantly improved segmentation results without adding any further supervision. The advantage of uncertainty guidance becomes obvious when compared to style transfer: it enables arbitrary domain shifts without any strict learning of an image style. This is also demonstrated in a traffic scene experiment.

Related papers

Mitigating Long-Tail Bias via Prompt-Controlled Diffusion Augmentation [36.94429692322632]
We present a prompt-controlled diffusion augmentation framework that synthesizes paired label-image samples with explicit control of both domain and semantic composition. We show gains concentrated on minority classes and improved Urban and Rural generalization, demonstrating controllable augmentation as a practical mechanism to mitigate long-tail bias in remote-sensing segmentation.
arXiv Detail & Related papers (2026-02-04T16:49:16Z)
Unsupervised Domain Adaptation for 3D LiDAR Semantic Segmentation Using Contrastive Learning and Multi-Model Pseudo Labeling [0.7373617024876725]
Unsupervised contrastive learning at the segment level is used to pre-train a backbone network. A multi-model pseudo-labeling strategy is introduced, utilizing an ensemble of diverse state-of-the-art architectures. Experiments adapting from Semantic KITTI to unlabeled target datasets demonstrate significant improvements in segmentation accuracy.
arXiv Detail & Related papers (2025-07-24T08:21:43Z)
SiamSeg: Self-Training with Contrastive Learning for Unsupervised Domain Adaptation Semantic Segmentation in Remote Sensing [13.549403813487022]
Unsupervised domain adaptation (UDA) enables models to learn from unlabeled target domain data while leveraging labeled source domain data. We propose integrating contrastive learning into UDA, enhancing the model's ability to capture semantic information in the target domain. Our method, SiamSeg, outperforms existing approaches, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-10-17T11:59:39Z)
AdaptDiff: Cross-Modality Domain Adaptation via Weak Conditional Semantic Diffusion for Retinal Vessel Segmentation [10.958821619282748]
We present an unsupervised domain adaptation (UDA) method named AdaptDiff. It enables a retinal vessel segmentation network trained on fundus photography (FP) to produce satisfactory results on unseen modalities. Our results demonstrate a significant improvement in segmentation performance across all unseen datasets.
arXiv Detail & Related papers (2024-10-06T23:04:29Z)
Domain-knowledge Inspired Pseudo Supervision (DIPS) for Unsupervised Image-to-Image Translation Models to Support Cross-Domain Classification [16.4151067682813]
This paper introduces a new method called Domain-knowledge Inspired Pseudo Supervision (DIPS). DIPS uses domain-informed Gaussian Mixture Models to generate pseudo annotations to enable the use of traditional supervised metrics. It proves its effectiveness by outperforming various GAN evaluation metrics, including FID, when selecting the optimal saved checkpoint model.
arXiv Detail & Related papers (2023-03-18T02:42:18Z)
Semantic Image Synthesis via Diffusion Models [174.24523061460704]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks. Recent work on semantic image synthesis mainly follows the de facto GAN-based approaches. We propose a novel framework based on DDPM for semantic image synthesis.
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion [51.11295961195151]
We exploit the characteristics of the foggy image sequence of driving scenes to densify the confident pseudo labels. Based on the two discoveries of local spatial similarity and adjacent temporal correspondence of the sequential image data, we propose a novel Target-Domain driven pseudo label Diffusion scheme. Our scheme helps the adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on two publicly available natural foggy datasets.
arXiv Detail & Related papers (2022-06-10T05:16:50Z)
Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on. We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
Semi-weakly Supervised Contrastive Representation Learning for Retinal Fundus Images [0.2538209532048867]
We propose a semi-weakly supervised contrastive learning framework for representation learning using semi-weakly annotated images. We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets.
arXiv Detail & Related papers (2021-08-04T15:50:09Z)
Semi-Supervised Domain Adaptation with Prototypical Alignment and Consistency Learning [86.6929930921905]
This paper studies how much it can help address domain shifts if we further have a few target samples labeled. To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module which calculates a target prototype for each class from the landmarks. Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
arXiv Detail & Related papers (2021-04-19T08:46:08Z)
A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data [69.091485888121]
Unsupervised domain adaptation assumes that source and target domain data are freely available and usually trained together to reduce the domain gap. We propose a source data-free domain adaptive object detection (SFOD) framework via modeling it into a problem of learning with noisy labels.
arXiv Detail & Related papers (2020-12-10T01:42:35Z)
Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation. We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths. In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.