Mitigating Long-Tail Bias via Prompt-Controlled Diffusion Augmentation
- URL: http://arxiv.org/abs/2602.04749v1
- Date: Wed, 04 Feb 2026 16:49:16 GMT
- Title: Mitigating Long-Tail Bias via Prompt-Controlled Diffusion Augmentation
- Authors: Buddhi Wijenayake, Nichula Wasalathilake, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Vishal M. Patel
- Abstract summary: We present a prompt-controlled diffusion augmentation framework that synthesizes paired label-image samples with explicit control of both domain and semantic composition. We show gains concentrated on minority classes and improved Urban and Rural generalization, demonstrating controllable augmentation as a practical mechanism to mitigate long-tail bias in remote-sensing segmentation.
- Score: 36.94429692322632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation of high-resolution remote-sensing imagery is critical for urban mapping and land-cover monitoring, yet training data typically exhibits severe long-tailed pixel imbalance. In the LoveDA dataset, this challenge is compounded by an explicit Urban/Rural split with distinct appearance and inconsistent class-frequency statistics across domains. We present a prompt-controlled diffusion augmentation framework that synthesizes paired label-image samples with explicit control of both domain and semantic composition. Stage A uses a domain-aware, masked, ratio-conditioned discrete diffusion model to generate layouts that satisfy user-specified class-ratio targets while respecting learned co-occurrence structure. Stage B translates layouts into photorealistic, domain-consistent images using Stable Diffusion with ControlNet guidance. Mixing the resulting ratio- and domain-controlled synthetic pairs with real data yields consistent improvements across multiple segmentation backbones, with gains concentrated on minority classes and improved Urban and Rural generalization, demonstrating controllable augmentation as a practical mechanism to mitigate long-tail bias in remote-sensing segmentation. Source code, pretrained models, and synthetic datasets are available on GitHub: https://github.com/Buddhi19/SyntheticGen.git
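Stage B corresponds to a standard layout-to-image setup; a minimal sketch using Hugging Face diffusers is shown below. The checkpoint names, prompt text, and layout file are illustrative assumptions, not the authors' released models, which are fine-tuned and domain-aware.

```python
# A minimal sketch of a Stage-B-style layout-to-image step: Stable Diffusion
# with ControlNet guidance via Hugging Face diffusers. Checkpoints and the
# prompt template are illustrative assumptions, not the paper's released models.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Assumed: a segmentation-conditioned ControlNet; the paper fine-tunes its own.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-seg", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The Stage-A layout, color-coded per class, serves as the control signal.
layout = load_image("stage_a_layout.png")  # hypothetical output of Stage A

# Domain is injected through the text prompt ("urban" vs. "rural").
image = pipe(
    "high-resolution aerial photograph, urban scene, buildings and roads",
    image=layout,
    num_inference_steps=30,
).images[0]
image.save("synthetic_urban.png")
```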
Related papers
- Uncertainty-Aware ControlNet: Bridging Domain Gaps with Synthetic Image Generation [0.9509895098323274]
This work introduces a method that utilizes data from unlabeled domains to train ControlNets. An uncertainty estimate indicates that a given image was not part of the training distribution of a downstream task. The resulting ControlNet allows the creation of annotated data with high uncertainty from the target domain.
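The paper's exact uncertainty estimator is not described in this summary; as a hedged stand-in, the sketch below uses ensemble disagreement (predictive entropy) to flag images that fall outside a model's training distribution.

```python
# A generic sketch of flagging out-of-distribution images via ensemble
# disagreement; this is an illustrative stand-in, not the paper's estimator.
import torch

@torch.no_grad()
def predictive_entropy(models, x):
    """Mean-softmax entropy over an ensemble; higher = more uncertain."""
    probs = torch.stack([m(x).softmax(dim=1) for m in models]).mean(dim=0)
    return -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # per-pixel entropy

# Images whose mean entropy exceeds a threshold would be treated as
# target-domain candidates for synthetic annotation.
```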
arXiv Detail & Related papers (2025-10-13T12:41:28Z)
- Frequency Domain-Based Diffusion Model for Unpaired Image Dehazing [92.61216319417208]
We propose a novel frequency domain-based diffusion model for fully exploiting the beneficial knowledge in unpaired clear data. Inspired by the strong generative ability shown by Diffusion Models (DMs), we tackle the dehazing task from the perspective of frequency domain reconstruction.
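The core frequency-domain view can be made concrete with torch.fft: an image splits into amplitude and phase, and a model can operate on the haze-sensitive amplitude while preserving phase. This is a generic decomposition, not the paper's full architecture.

```python
# A minimal sketch of the frequency-domain decomposition: split an image into
# amplitude and phase spectra and recombine after processing. Generic, not the
# paper's model.
import torch

def amplitude_phase(img):                      # img: (B, C, H, W)
    spec = torch.fft.fft2(img, dim=(-2, -1))
    return spec.abs(), spec.angle()

def recombine(amplitude, phase):
    spec = torch.polar(amplitude, phase)       # amplitude * exp(i * phase)
    return torch.fft.ifft2(spec, dim=(-2, -1)).real
```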
arXiv Detail & Related papers (2025-07-02T01:22:46Z)
- Style Transfer with Diffusion Models for Synthetic-to-Real Domain Adaptation [3.7051961231919393]
We introduce two novel techniques for semantically consistent style transfer using diffusion models. Experiments using GTA5 as source and Cityscapes/ACDC as target domains show that our approach produces higher quality images with lower FID scores and better content preservation.
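The paper's two techniques are not detailed in this summary; for orientation, the sketch below shows a common diffusion style-transfer baseline (SDEdit-style img2img in diffusers), where a low strength value preserves source content while the prompt pulls appearance toward the real domain.

```python
# A diffusion style-transfer baseline (SDEdit-style img2img), shown as a
# hedged reference point rather than the paper's method. Checkpoint, prompt,
# and file names are illustrative assumptions.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

gta_frame = load_image("gta5_frame.png")  # hypothetical synthetic source image
real_style = pipe(
    "a dashcam photo of a real European street, overcast daylight",
    image=gta_frame,
    strength=0.35,           # low strength = stronger content preservation
    num_inference_steps=40,
).images[0]
```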
arXiv Detail & Related papers (2025-05-22T08:11:10Z)
- Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis [43.481539150288434]
This work introduces a new family of factor graph Diffusion Models (FG-DMs). FG-DMs model the joint distribution of images and conditioning variables, such as semantic, sketch, depth, or normal maps, via a factor graph decomposition.
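One natural reading of that decomposition, written out below (the factor structure itself is task-dependent and assumed here):

```latex
% One natural reading of the FG-DM factorization: an image x and conditioning
% variables c_1, ..., c_K (semantic map, sketch, depth, normals) factor into
% per-variable models, each realized by its own diffusion model; pa(c_k)
% denotes the neighbors of c_k in the assumed factor graph.
p(x, c_1, \dots, c_K) = p(x \mid c_1, \dots, c_K) \prod_{k=1}^{K} p\bigl(c_k \mid \mathrm{pa}(c_k)\bigr)
```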
arXiv Detail & Related papers (2024-10-29T00:54:00Z)
- SiamSeg: Self-Training with Contrastive Learning for Unsupervised Domain Adaptation Semantic Segmentation in Remote Sensing [13.549403813487022]
Unsupervised domain adaptation (UDA) enables models to learn from unlabeled target domain data while leveraging labeled source domain data. We propose integrating contrastive learning into UDA, enhancing the model's ability to capture semantic information in the target domain. Our method, SiamSeg, outperforms existing approaches, achieving state-of-the-art results.
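As a hedged illustration of the contrastive component, the sketch below implements a generic InfoNCE loss of the kind typically plugged into UDA self-training; SiamSeg's exact formulation may differ.

```python
# A generic InfoNCE contrastive loss; illustrative of the component named in
# the abstract, not SiamSeg's exact objective.
import torch
import torch.nn.functional as F

def info_nce(query, positive, negatives, temperature=0.1):
    """query/positive: (B, D); negatives: (N, D). Pull each query toward its
    positive and away from all negatives."""
    query = F.normalize(query, dim=1)
    positive = F.normalize(positive, dim=1)
    negatives = F.normalize(negatives, dim=1)
    l_pos = (query * positive).sum(dim=1, keepdim=True)   # (B, 1)
    l_neg = query @ negatives.t()                         # (B, N)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(len(query), dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, labels)                # positive = class 0
```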
arXiv Detail & Related papers (2024-10-17T11:59:39Z)
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
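One simple way to realize joint image-mask generation, assumed here for illustration, is to stack RGB channels with an encoded label map and diffuse the combined tensor so samples emerge as aligned pairs; SatSynth's actual encoding may differ.

```python
# A minimal sketch of the joint image-mask idea: concatenate RGB with a
# one-hot label map, diffuse the stack, then split samples back into pairs.
# The encoding is an illustrative assumption, not SatSynth's exact scheme.
import torch
import torch.nn.functional as F

def pack_pair(image, mask, num_classes):
    """image: (B, 3, H, W) in [-1, 1]; mask: (B, H, W) integer labels."""
    onehot = F.one_hot(mask, num_classes).permute(0, 3, 1, 2).float()
    return torch.cat([image, onehot * 2 - 1], dim=1)   # (B, 3 + C, H, W)

def unpack_pair(sample, num_classes):
    image, logits = sample[:, :3], sample[:, 3:]
    return image, logits.argmax(dim=1)                 # recover a discrete mask
```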
arXiv Detail & Related papers (2024-03-25T10:30:22Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
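A minimal sketch of the memory-bank MMD term follows, with an RBF kernel and a simplified bank; the bank update policy and kernel bandwidth are assumptions.

```python
# An RBF-kernel squared MMD between source-like features and a memory bank of
# target-specific features. Illustrative of the alignment term; the paper's
# bank maintenance and kernel choices are simplified away here.
import torch

def rbf_mmd(x, y, sigma=1.0):
    """x: (N, D), y: (M, D). Squared MMD with a Gaussian kernel."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# memory_bank: (M, D) running store of target-specific features
# loss = rbf_mmd(source_like_feats, memory_bank)
```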
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion [51.11295961195151]
We exploit the characteristics of the foggy image sequence of driving scenes to densify the confident pseudo labels.
Based on the two discoveries of local spatial similarity and adjacent temporal correspondence of the sequential image data, we propose a novel Target-Domain driven pseudo label Diffusion scheme.
Our scheme helps the adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on two publicly available natural foggy datasets.
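As a toy illustration of the spatial half of the scheme, the sketch below spreads confident pseudo-label mass to nearby pixels; the paper's temporal correspondence across frames is omitted.

```python
# A toy spatial pseudo-label densification: confident pixels seed label mass,
# a box filter spreads it locally, and every pixel takes the label of the
# locally averaged confident mass. Omits the paper's temporal component.
import torch
import torch.nn.functional as F

def diffuse_labels(probs, conf_thresh=0.9, iters=3):
    """probs: (B, C, H, W) softmax output -> (B, H, W) densified labels."""
    conf, _ = probs.max(dim=1, keepdim=True)
    mask = (conf > conf_thresh).float()                 # 1 where confident
    seeds = probs * mask
    for _ in range(iters):                              # simple box-filter spread
        seeds = F.avg_pool2d(seeds, 3, stride=1, padding=1)
        mask = F.avg_pool2d(mask, 3, stride=1, padding=1)
    dense = seeds / mask.clamp_min(1e-6)                # renormalize by coverage
    return dense.argmax(dim=1)
```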
arXiv Detail & Related papers (2022-06-10T05:16:50Z)
- Transferring and Regularizing Prediction for Semantic Segmentation [115.88957139226966]
In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate this problem in model transfer.
We present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion.
Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes).
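The summary leaves the constraint set abstract; as one hedged example of the pattern, the sketch below penalizes prediction disagreement within visually coherent regions (e.g., superpixels), a commonly used intrinsic property but not necessarily RPT's exact constraints.

```python
# One example of an intrinsic-property regularizer: pixels in the same region
# (e.g., a superpixel) should agree in their predictions. Illustrates the
# regularization pattern only; RPT's actual constraint set is richer.
import torch

def region_consistency(probs, regions):
    """probs: (B, C, H, W); regions: (B, H, W) integer region ids.
    Penalize deviation of each pixel from its region's mean prediction."""
    B, C, H, W = probs.shape
    loss = 0.0
    for b in range(B):
        flat = probs[b].reshape(C, -1)                  # (C, H*W)
        ids = regions[b].reshape(-1)
        for r in ids.unique():
            sel = flat[:, ids == r]
            loss = loss + (sel - sel.mean(dim=1, keepdim=True)).pow(2).mean()
    return loss / B
```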
arXiv Detail & Related papers (2020-06-11T16:19:41Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature-matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
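A minimal sketch of the dense dilated-convolution idea: parallel branches with growing dilation rates widen the receptive field without losing resolution. Channel counts and rates here are illustrative, not the paper's.

```python
# A dense dilated-convolution block: parallel 3x3 branches with increasing
# dilation are fused by a 1x1 conv and added back residually. Illustrative of
# the receptive-field idea; not the paper's exact architecture.
import torch
import torch.nn as nn

class DenseDilatedBlock(nn.Module):
    def __init__(self, channels=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        feats = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        return x + self.fuse(feats)     # residual fusion of all dilation rates
```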
arXiv Detail & Related papers (2020-02-07T03:45:25Z)