Related papers: Synthetic Data Matters: Re-training with Geo-typical Synthetic Labels for Building Detection

URL: http://arxiv.org/abs/2507.16657v1
Date: Tue, 22 Jul 2025 14:53:13 GMT
Title: Synthetic Data Matters: Re-training with Geo-typical Synthetic Labels for Building Detection
Authors: Shuang Song, Yang Tang, Rongjun Qin
Abstract summary: We propose re-training models at test time using synthetic data tailored to the target region's city layout. This method generates geo-typical synthetic data that closely replicates the urban structure of a target area. Experiments demonstrate significant performance enhancements, with median improvements of up to 12%, depending on the domain gap.
Score: 13.550020274133866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep learning has significantly advanced building segmentation in remote sensing, yet models struggle to generalize to data from diverse geographic regions due to variations in city layouts and the distribution of building types, sizes and locations. However, annotated data is time-consuming to produce, and the amount needed to capture worldwide diversity may never catch up with the demands of increasingly data-hungry models. Thus, we propose a novel approach: re-training models at test time using synthetic data tailored to the target region's city layout. This method generates geo-typical synthetic data that closely replicates the urban structure of a target area by leveraging geospatial data such as street networks from OpenStreetMap. Using procedural modeling and physics-based rendering, very high-resolution synthetic images are created, incorporating domain randomization in building shapes, materials, and environmental illumination. This enables the generation of virtually unlimited training samples that maintain the essential characteristics of the target environment. To overcome synthetic-to-real domain gaps, our approach integrates geo-typical data into an adversarial domain adaptation framework for building segmentation. Experiments demonstrate significant performance enhancements, with median improvements of up to 12%, depending on the domain gap. This scalable and cost-effective method blends partial geographic knowledge with synthetic imagery, providing a promising solution to the "model collapse" issue in purely synthetic datasets. It offers a practical pathway to improving generalization in remote sensing building segmentation without extensive real-world annotations.

Related papers

Harnessing Rich Multi-Modal Data for Spatial-Temporal Homophily-Embedded Graph Learning Across Domains and Localities [2.5065738436850835]
This research proposes a heterogeneous data pipeline that performs cross-domain data fusion. We aim to address complex urban problems across multiple domains and localities by harnessing the rich information over 50 data sources.
arXiv Detail & Related papers (2025-12-11T23:51:54Z)
A Framework for Low-Effort Training Data Generation for Urban Semantic Segmentation [15.541453405140485]
Synthetic datasets are widely used for training urban scene recognition models, but even highly realistic renderings show a noticeable gap to real imagery. We present a new framework that adapts an off-the-shelf diffusion model to a target domain using only imperfect pseudo-labels. It generates high-fidelity, target-aligned images from semantic maps of any synthetic dataset, including low-effort sources created in hours rather than months.
arXiv Detail & Related papers (2025-10-13T16:12:29Z)
Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data [53.040873127309766]
We propose a token disentanglement process within the transformer architecture, enhancing feature separation and ensuring more effective learning. Our method outperforms existing models on both in-dataset and cross-dataset evaluations.
arXiv Detail & Related papers (2025-09-08T17:58:06Z)
SynthGenNet: a self-supervised approach for test-time generalization using synthetic multi-source domain mixing of street view images [8.23277995673829]
We introduce SynthGenNet, a self-supervised student-teacher architecture to enable robust test-time domain generalization. Our contributions include the novel ClassMix++ algorithm, which blends labeled data from various synthetic sources. We show our model outperforms the state-of-the-art (relying on single source) by achieving 50% Mean Intersection-Over-Union (mIoU) value on real-world datasets.
arXiv Detail & Related papers (2025-09-02T13:08:03Z)
Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis [0.0]
This paper introduces a novel approach that leverages three generative models of varying complexity to synthesize malicious network traffic. Our approach transforms numerical data into text, re-framing data generation as a language modeling task. Our method surpasses state-of-the-art generative models in producing high-fidelity synthetic data.
arXiv Detail & Related papers (2024-11-04T09:51:10Z)
Learning from Synthetic Data for Visual Grounding [55.21937116752679]
We show that SynGround can improve the localization capabilities of off-the-shelf vision-and-language models. Data generated with SynGround improves the pointing game accuracy of pretrained ALBEF and BLIP models by 4.81 and 17.11 absolute percentage points, respectively.
arXiv Detail & Related papers (2024-03-20T17:59:43Z)
SARN: Structurally-Aware Recurrent Network for Spatio-Temporal Disaggregation [8.636014676778682]
Open data is frequently released spatially aggregated, usually to comply with privacy policies. But coarse, heterogeneous aggregations complicate coherent learning and integration for downstream AI/ML systems. We propose an overarching model named Structurally-Aware Recurrent Network (SARN), which integrates structurally-aware spatial attention layers into the Gated Recurrent Unit (GRU) model. For scenarios with limited historical training data, we show that a model pre-trained on one city variable can be fine-tuned for another city variable using only a few hundred samples.
arXiv Detail & Related papers (2023-06-09T21:01:29Z)
Domain Adaptation of Synthetic Driving Datasets for Real-World Autonomous Driving [0.11470070927586014]
Networks trained with synthetic data for certain computer vision tasks degrade significantly when tested on real-world data. In this paper, we propose and evaluate novel ways to improve such approaches. We propose a novel method to efficiently incorporate semantic supervision into this pair selection, which helps in boosting the performance of the model.
arXiv Detail & Related papers (2023-02-08T15:51:54Z)
Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor Point Clouds [69.64240235315864]
This paper introduces the synthetic-to-real domain generalization setting to this task. The domain gap between synthetic and real-world point cloud data mainly lies in the different layouts and point patterns. Experiments on the synthetic-to-real benchmark demonstrate that both CINMix and multi-prototypes can narrow the distribution gap.
arXiv Detail & Related papers (2022-12-09T05:07:43Z)
Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning [112.69497636932955]
Federated learning aims to train models across different clients without the sharing of data for privacy considerations. We study how data heterogeneity affects the representations of the globally aggregated models. We propose FedDecorr, a novel method that can effectively mitigate dimensional collapse in federated learning.
arXiv Detail & Related papers (2022-10-01T09:04:17Z)
TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets. We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
Robust Self-Tuning Data Association for Geo-Referencing Using Lane Markings [44.4879068879732]
This paper presents a complete pipeline for resolving ambiguities during data association. Its core is a robust self-tuning data association that adapts the search area depending on the entropy of the measurements. We evaluate our method on real data from urban and rural scenarios around the city of Karlsruhe in Germany.
arXiv Detail & Related papers (2022-07-28T12:29:39Z)
PetroGAN: A novel GAN-based approach to generate realistic, label-free petrographic datasets [0.0]
We develop a novel deep learning framework based on generative adversarial networks (GANs) to create the first realistic synthetic petrographic dataset. The training dataset consists of 10070 images of rock thin sections both in plane- and cross-polarized light. The algorithm trained for 264 GPU hours and reached a state-of-the-art Fréchet Inception Distance (FID) score of 12.49 for petrographic images.
arXiv Detail & Related papers (2022-04-07T01:55:53Z)
Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation [117.3856882511919]
We propose the Style-HAllucinated Dual consistEncy learning (SHADE) framework to handle domain shift. Our SHADE yields significant improvement and outperforms state-of-the-art methods by 5.07% and 8.35% on the average mIoU of three real-world datasets.
arXiv Detail & Related papers (2022-04-06T02:49:06Z)
CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE). At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales. We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
CrossLoc: Scalable Aerial Localization Assisted by Multimodal Synthetic Data [2.554905387213586]
We present a visual localization system that learns to estimate camera poses in the real world with the help of synthetic data. To mitigate the data scarcity issue, we introduce TOPO-DataGen, a versatile synthetic data generation tool. We also introduce CrossLoc, a cross-modal visual representation learning approach to pose estimation.
arXiv Detail & Related papers (2021-12-16T18:05:48Z)
Methodological Foundation of a Numerical Taxonomy of Urban Form [62.997667081978825]
We present a method for numerical taxonomy of urban form derived from biological systematics. We derive homogeneous urban tissue types and, by determining overall morphological similarity between them, generate a hierarchical classification of urban form. After framing and presenting the method, we test it on two cities - Prague and Amsterdam.
arXiv Detail & Related papers (2021-04-30T12:47:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.