Related papers: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models

Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models

URL: http://arxiv.org/abs/2512.05139v1
Date: Mon, 01 Dec 2025 06:00:45 GMT
Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models
Authors: Yang Xiang, Jingwen Zhong, Yige Yan, Petros Koutrakis, Eric Garshick, Meredith Franklin,
Abstract summary: We present a transfer-learning generative downscaling framework to reconstruct fine satellite images from coarse scale inputs.<n>Our approach combines a lightweight U-Net transfer encoder with a diffusion-based generative generative model.
Score: 6.277747770841536
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a transfer-learning generative downscaling framework to reconstruct fine resolution satellite images from coarse scale inputs. Our approach combines a lightweight U-Net transfer encoder with a diffusion-based generative model. The simpler U-Net is first pretrained on a long time series of coarse resolution data to learn spatiotemporal representations; its encoder is then frozen and transferred to a larger downscaling model as physically meaningful latent features. Our application uses NASA's MERRA-2 reanalysis as the low resolution source domain (50 km) and the GEOS-5 Nature Run (G5NR) as the high resolution target (7 km). Our study area included a large area in Asia, which was made computationally tractable by splitting into two subregions and four seasons. We conducted domain similarity analysis using Wasserstein distances confirmed minimal distributional shift between MERRA-2 and G5NR, validating the safety of parameter frozen transfer. Across seasonal regional splits, our model achieved excellent performance (R2 = 0.65 to 0.94), outperforming comparison models including deterministic U-Nets, variational autoencoders, and prior transfer learning baselines. Out of data evaluations using semivariograms, ACF/PACF, and lag-based RMSE/R2 demonstrated that the predicted downscaled images preserved physically consistent spatial variability and temporal autocorrelation, enabling stable autoregressive reconstruction beyond the G5NR record. These results show that transfer enhanced diffusion models provide a robust and physically coherent solution for downscaling a long time series of coarse resolution images with limited training periods. This advancement has significant implications for improving environmental exposure assessment and long term environmental monitoring.

Related papers

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders [74.72147962028265]
Representation Autoencoders (RAEs) have shown distinct advantages in diffusion modeling on ImageNet.<n>We investigate whether this framework can scale to large-scale, freeform text-to-image (T2I) generation.
arXiv Detail & Related papers (2026-01-22T18:58:16Z)
ReconMOST: Multi-Layer Sea Temperature Reconstruction with Observations-Guided Diffusion [48.540756751934836]
ReconMOST is a data-driven guided diffusion model framework for multi-layer sea temperature reconstruction.<n>Our method extends ML-based SST reconstruction to a global, multi-layer setting, handling over 92.5% missing data.
arXiv Detail & Related papers (2025-06-12T06:27:22Z)
Data Augmentation and Resolution Enhancement using GANs and Diffusion Models for Tree Segmentation [49.13393683126712]
Urban forests play a key role in enhancing environmental quality and supporting biodiversity in cities.<n> accurately detecting trees is challenging due to complex landscapes and the variability in image resolution caused by different satellite sensors or UAV flight altitudes.<n>We propose a novel pipeline that integrates domain adaptation with GANs and Diffusion models to enhance the quality of low-resolution aerial images.
arXiv Detail & Related papers (2025-05-21T03:57:10Z)
A Statistical Evaluation of Indoor LoRaWAN Environment-Aware Propagation for 6G: MLR, ANOVA, and Residual Distribution Analysis [6.8093214146903875]
This study proposes a two-stage approach to capture and analyze complexities using an extensive dataset of 1,328,334 field measurements.<n>We implement a multiple linear regression model that includes traditional propagation metrics and an extension with proposed environmental variables.<n>Using analysis of variance, we demonstrate that adding these environmental factors can reduce unexplained variance by 42.32 percent.
arXiv Detail & Related papers (2025-04-23T13:19:35Z)
Spatio-Temporal Forecasting of PM2.5 via Spatial-Diffusion guided Encoder-Decoder Architecture [9.955223104442755]
We present a novel S-Temporal Graph Network architecture that specifically captures dependencies to forecast PM2.5 concentration.<n>Our model is based on an encoder-decoder architecture where the decoder parts leverage recurrent units (GRU) augmented with a graph neural network (Transformerv) to account for spatial diffusion.
arXiv Detail & Related papers (2024-12-18T15:18:12Z)
Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling [58.456404022536425]
State of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs. Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative. The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis.
arXiv Detail & Related papers (2023-09-24T19:57:22Z)
GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy. Recent studies find that an attacker can invert the shared gradients and recover sensitive data against an FL system by leveraging pre-trained generative adversarial networks (GAN) as prior knowledge. We propose textbfGradient textbfInversion over textbfFeature textbfDomains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
arXiv Detail & Related papers (2023-08-09T04:34:21Z)
Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion. In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z)
A Generative Deep Learning Approach to Stochastic Downscaling of Precipitation Forecasts [0.5906031288935515]
Generative adversarial networks (GANs) have been demonstrated by the computer vision community to be successful at super-resolution problems. We show that GANs and VAE-GANs can match the statistical properties of state-of-the-art pointwise post-processing methods whilst creating high-resolution, spatially coherent precipitation maps.
arXiv Detail & Related papers (2022-04-05T07:19:42Z)
Vision in adverse weather: Augmentation using CycleGANs with various object detectors for robust perception in autonomous racing [70.16043883381677]
In autonomous racing, the weather can change abruptly, causing significant degradation in perception, resulting in ineffective manoeuvres. In order to improve detection in adverse weather, deep-learning-based models typically require extensive datasets captured in such conditions. We introduce an approach of using synthesised adverse condition datasets in autonomous racing (generated using CycleGAN) to improve the performance of four out of five state-of-the-art detectors.
arXiv Detail & Related papers (2022-01-10T10:02:40Z)
Statistical Downscaling of Temperature Distributions from the Synoptic Scale to the Mesoscale Using Deep Convolutional Neural Networks [0.0]
One of the promising applications is developing a statistical surrogate model that converts the output images of low-resolution dynamic models to high-resolution images. Our study evaluates a surrogate model that downscales synoptic temperature fields to mesoscale temperature fields every 6 hours. If the surrogate models are implemented at short time intervals, they will provide high-resolution weather forecast guidance or environment emergency alerts at low cost.
arXiv Detail & Related papers (2020-07-20T06:24:08Z)
Transferring and Regularizing Prediction for Semantic Segmentation [115.88957139226966]
In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate such problem for model transfer. We present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion. Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to Cityscapes dataset (urban street scenes)
arXiv Detail & Related papers (2020-06-11T16:19:41Z)
An Unsupervised Generative Neural Approach for InSAR Phase Filtering and Coherence Estimation [3.8218584696400484]
We propose "GenInSAR", a CNN-based generative model for joint phase filtering and coherence estimation. GenInSAR's unsupervised training on satellite and simulated noisy InSAR images outperforms other related methods in total residue reduction. Phase, Coherence Root-Mean-Squared-Error and Phase Cosine Error have average improvements of 0.54, 0.07, and 0.05 respectively compared to the related methods.
arXiv Detail & Related papers (2020-01-27T08:50:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.