Data Augmentation and Resolution Enhancement using GANs and Diffusion Models for Tree Segmentation
- URL: http://arxiv.org/abs/2505.15077v1
- Date: Wed, 21 May 2025 03:57:10 GMT
- Authors: Alessandro dos Santos Ferreira, Ana Paula Marques Ramos, José Marcato Junior, Wesley Nunes Gonçalves
- Abstract summary: Urban forests play a key role in enhancing environmental quality and supporting biodiversity in cities. Accurately detecting trees is challenging due to complex landscapes and the variability in image resolution caused by different satellite sensors or UAV flight altitudes. We propose a novel pipeline that integrates domain adaptation with GANs and Diffusion models to enhance the quality of low-resolution aerial images.
- Score: 49.13393683126712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Urban forests play a key role in enhancing environmental quality and supporting biodiversity in cities. Mapping and monitoring these green spaces are crucial for urban planning and conservation, yet accurately detecting trees is challenging due to complex landscapes and the variability in image resolution caused by different satellite sensors or UAV flight altitudes. While deep learning architectures have shown promise in addressing these challenges, their effectiveness remains strongly dependent on the availability of large and manually labeled datasets, which are often expensive and difficult to obtain in sufficient quantity. In this work, we propose a novel pipeline that integrates domain adaptation with GANs and Diffusion models to enhance the quality of low-resolution aerial images. Our proposed pipeline enhances low-resolution imagery while preserving semantic content, enabling effective tree segmentation without requiring large volumes of manually annotated data. Leveraging models such as pix2pix, Real-ESRGAN, Latent Diffusion, and Stable Diffusion, we generate realistic and structurally consistent synthetic samples that expand the training dataset and unify scale across domains. This approach not only improves the robustness of segmentation models across different acquisition conditions but also provides a scalable and replicable solution for remote sensing scenarios with scarce annotation resources. Experimental results demonstrated an improvement of over 50% in IoU for low-resolution images, highlighting the effectiveness of our method compared to traditional pipelines.
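The abstract reports its gain in IoU (intersection over union), the standard overlap metric for segmentation masks. As a minimal sketch of how that metric is computed, assuming binary tree/non-tree masks stored as nested lists, the following may help. The function name `iou` and the toy masks are hypothetical and are not taken from the paper; the abstract does not state whether its improvement is relative or absolute.

```python
def iou(pred, target):
    """Intersection over Union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1  # pixel is tree in both masks
            if p or t:
                union += 1  # pixel is tree in at least one mask
    # Convention assumed here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


# Toy 3x3 masks, invented for illustration only.
target = [[1, 1, 0],
          [1, 1, 0],
          [0, 0, 0]]
pred_low = [[1, 0, 0],   # hypothetical prediction on a low-resolution image
            [0, 0, 0],
            [0, 0, 1]]
pred_enh = [[1, 1, 0],   # hypothetical prediction after enhancement
            [1, 0, 0],
            [0, 0, 0]]

print(iou(pred_low, target))  # 1 shared pixel out of 5 in the union
print(iou(pred_enh, target))  # 3 shared pixels out of 4 in the union
```

In practice the masks would usually be NumPy arrays and the metric would be averaged over a test set; the nested-list version is used here only to keep the sketch dependency-free.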
Related papers
- InstaRevive: One-Step Image Enhancement via Dynamic Score Matching [66.97989469865828]
InstaRevive is an image enhancement framework that employs score-based diffusion distillation to harness potent generative capability. Our framework delivers high-quality and visually appealing results across a diverse array of challenging tasks and datasets.
arXiv Detail & Related papers (2025-04-22T01:19:53Z)
- A Diffusion-Based Framework for Terrain-Aware Remote Sensing Image Reconstruction [4.824120664293887]
SatelliteMaker is a diffusion-based method that reconstructs missing data across varying levels of data loss. It uses a Digital Elevation Model (DEM) as a conditioning input and tailored prompts to generate realistic images. A VGG-Adapter module based on Distribution Loss reduces distribution discrepancy and ensures style consistency.
arXiv Detail & Related papers (2025-04-16T14:19:57Z)
- Deep Change Monitoring: A Hyperbolic Representative Learning Framework and a Dataset for Long-term Fine-grained Tree Change Detection [24.830328687562695]
Existing datasets fail to capture continuous fine-grained changes in trees due to low-resolution images and high acquisition costs. We introduce UAVTC, a large-scale, long-term, high-resolution dataset collected using UAVs equipped with cameras. We propose a novel Hyperbolic Siamese Network (HSN) for TC detection, enabling compact and hierarchical representations of dynamic tree changes.
arXiv Detail & Related papers (2025-03-01T22:29:29Z)
- BD-Diff: Generative Diffusion Model for Image Deblurring on Unknown Domains with Blur-Decoupled Learning [55.21345354747609]
BD-Diff is a generative-diffusion-based model designed to enhance deblurring performance on unknown domains. We employ two Q-Formers as separate extractors of structural representations and blur patterns. We introduce a reconstruction task to make the structural features and blur patterns complementary.
arXiv Detail & Related papers (2025-02-03T17:00:40Z)
- SatFlow: Generative model based framework for producing High Resolution Gap Free Remote Sensing Imagery [0.0]
We present SatFlow, a generative model-based framework that fuses low-resolution MODIS imagery and Landsat observations to produce frequent, high-resolution, gap-free surface reflectance imagery. Our model, trained via Conditional Flow Matching, demonstrates better performance in generating imagery with preserved structural and spectral integrity. This capability is crucial for downstream applications such as crop phenology tracking and environmental change detection.
arXiv Detail & Related papers (2025-02-03T06:40:13Z)
- Efficient Unsupervised Domain Adaptation Regression for Spatial-Temporal Sensor Fusion [6.963971634605796]
Low-cost, distributed sensor networks in environmental and biomedical domains have enabled continuous, large-scale health monitoring. These systems often face challenges related to degraded data quality caused by sensor drift, noise, and insufficient calibration. Traditional machine learning methods for sensor fusion and calibration rely on extensive feature engineering. We propose a novel unsupervised domain adaptation (UDA) method tailored for regression tasks.
arXiv Detail & Related papers (2024-11-11T12:20:57Z)
- Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models [48.87160158792048]
We introduce a cascaded amortized latent diffusion model (CA-LDM) that can synthesize high-resolution OCT volumes in a memory-efficient way.
Experiments on a public high-resolution OCT dataset show that our synthetic data have realistic high-resolution and global features, surpassing the capabilities of existing methods.
arXiv Detail & Related papers (2024-05-26T10:58:22Z)
- Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data [66.49494950674402]
We leverage emerging text-to-image generative models in creating large-scale synthetic supervision for the task of damage assessment from aerial images.
We build an efficient and easily scalable pipeline to generate thousands of post-disaster images from low-resource domains.
We validate the strength of our proposed framework under cross-geography domain transfer setting from xBD and SKAI images in both single-source and multi-source settings.
arXiv Detail & Related papers (2024-05-22T16:07:05Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model [0.8747606955991705]
This research introduces a two-stage diffusion model methodology for synthesizing high-resolution satellite images from textual prompts.
The pipeline comprises a Low-Resolution Diffusion Model (LRDM) that generates initial images based on text inputs and a Super-Resolution Diffusion Model (SRDM) that refines these images into high-resolution outputs.
arXiv Detail & Related papers (2023-09-03T09:34:49Z)
- Efficient texture-aware multi-GAN for image inpainting [5.33024001730262]
Recent inpainting methods based on generative adversarial networks (GANs) show remarkable improvements.
We propose a multi-GAN architecture improving both the performance and rendering efficiency.
arXiv Detail & Related papers (2020-09-30T14:58:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.