RSFAKE-1M: A Large-Scale Dataset for Detecting Diffusion-Generated Remote Sensing Forgeries
- URL: http://arxiv.org/abs/2505.23283v1
- Date: Thu, 29 May 2025 09:30:46 GMT
- Title: RSFAKE-1M: A Large-Scale Dataset for Detecting Diffusion-Generated Remote Sensing Forgeries
- Authors: Zhihong Tan, Jiayi Wang, Huiying Shi, Binyuan Huang, Hongchen Wei, Zhenzhong Chen
- Abstract summary: We introduce RSFAKE-1M, a large-scale dataset of 500K forged and 500K real remote sensing images. The fake images are generated by ten diffusion models fine-tuned on remote sensing data. The results reveal that diffusion-based remote sensing forgeries remain challenging for current methods.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Detecting forged remote sensing images is becoming increasingly critical, as such imagery plays a vital role in environmental monitoring, urban planning, and national security. While diffusion models have emerged as the dominant paradigm for image generation, their impact on remote sensing forgery detection remains underexplored. Existing benchmarks primarily target GAN-based forgeries or focus on natural images, limiting progress in this critical domain. To address this gap, we introduce RSFAKE-1M, a large-scale dataset of 500K forged and 500K real remote sensing images. The fake images are generated by ten diffusion models fine-tuned on remote sensing data, covering six generation conditions such as text prompts, structural guidance, and inpainting. This paper presents the construction of RSFAKE-1M along with a comprehensive experimental evaluation using both existing detectors and unified baselines. The results reveal that diffusion-based remote sensing forgeries remain challenging for current methods, and that models trained on RSFAKE-1M exhibit notably improved generalization and robustness. Our findings underscore the importance of RSFAKE-1M as a foundation for developing and evaluating next-generation forgery detection approaches in the remote sensing domain. The dataset and other supplementary materials are available at https://huggingface.co/datasets/TZHSW/RSFAKE/.
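The evaluation described above scores detectors on a balanced split of real and forged images. As a minimal illustrative sketch of that kind of protocol (the function name, scores, and threshold below are invented for illustration and are not from the paper):

```python
# Hypothetical sketch: scoring a binary forgery detector on a balanced
# real/fake split, as in a benchmark like RSFAKE-1M. All numbers here
# are illustrative, not results from the paper.

def balanced_accuracy(real_scores, fake_scores, threshold=0.5):
    """Mean of per-class accuracies for a detector that outputs a
    'fake' probability; scores >= threshold are predicted fake."""
    tnr = sum(s < threshold for s in real_scores) / len(real_scores)
    tpr = sum(s >= threshold for s in fake_scores) / len(fake_scores)
    return 0.5 * (tnr + tpr)

# Toy example: detector scores for 4 real and 4 fake images.
real = [0.1, 0.3, 0.6, 0.2]   # one false positive at 0.6
fake = [0.9, 0.8, 0.4, 0.7]   # one false negative at 0.4
print(balanced_accuracy(real, fake))  # 0.75
```

Balanced accuracy is a natural headline metric here because the dataset is constructed with equal real and fake populations, so per-class error rates carry equal weight.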
Related papers
- A Diffusion-Based Framework for Terrain-Aware Remote Sensing Image Reconstruction [4.824120664293887]
SatelliteMaker is a diffusion-based method that reconstructs missing data across varying levels of data loss. It takes a Digital Elevation Model (DEM) as a conditioning input and uses tailored prompts to generate realistic images. A VGG-Adapter module based on Distribution Loss reduces distribution discrepancy and ensures style consistency.
arXiv Detail & Related papers (2025-04-16T14:19:57Z) - Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing, driven by generative models, pose serious risks. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise-type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors [62.63467652611788]
We introduce SEMI-TRUTHS, featuring 27,600 real images, 223,400 masks, and 1,472,700 AI-augmented images.
Each augmented image is accompanied by metadata for standardized and targeted evaluation of detector robustness.
Our findings suggest that state-of-the-art detectors exhibit varying sensitivities to the types and degrees of perturbations, data distributions, and augmentation methods used.
arXiv Detail & Related papers (2024-11-12T01:17:27Z) - Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior [13.148815217684277]
Large scale factor super-resolution (SR) algorithms are vital for maximizing the utilization of low-resolution (LR) satellite data captured from orbit.
Existing methods confront challenges in recovering SR images with clear textures and correct ground objects.
We introduce a novel framework, the Semantic Guided Diffusion Model (SGDM), designed for large scale factor remote sensing image super-resolution.
arXiv Detail & Related papers (2024-05-11T16:06:16Z) - Remote Diffusion [0.0]
I explored adapting Stable Diffusion v1.5 for generating domain-specific satellite and aerial images in remote sensing.
I used the RSICD dataset to fine-tune a Stable Diffusion model, reaching a training loss of 0.2.
I created a synthetic dataset for a Land Use Land Cover (LULC) classification task, employing prompting techniques with RAG and ChatGPT.
Despite extensive fine-tuning and dataset iterations, results showed subpar image quality and realism, as indicated by high FID scores and domain-expert evaluation.
arXiv Detail & Related papers (2024-05-07T23:44:09Z) - GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable.
Although a number of face forgery datasets are publicly available, the forged faces are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z) - Diffusion Facial Forgery Detection [56.69763252655695]
This paper introduces DiFF, a comprehensive dataset dedicated to face-focused diffusion-generated images.
We conduct extensive experiments on the DiFF dataset via a human test and several representative forgery detection methods.
The results demonstrate that the binary detection accuracy of both human observers and automated detectors often falls below 30%.
arXiv Detail & Related papers (2024-01-29T03:20:19Z) - DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z) - On quantifying and improving realism of images generated with diffusion [50.37578424163951]
We propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image.
IRS is easily usable as a measure to classify a given image as real or fake.
We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), DALL-E 2, Midjourney, and BigGAN.
Our efforts have also led to Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models.
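A scalar realism score like IRS can classify an image as real or fake by simple thresholding. The sketch below illustrates the general pattern only: the five statistics, weights, and threshold are placeholders, not the actual measures defined in the IRS paper.

```python
# Illustrative sketch of threshold-based real/fake classification with a
# scalar realism score. The statistics and weights are placeholders; the
# five measures actually used by IRS are defined in the paper.

def realism_score(stats, weights):
    """Weighted sum of normalized per-image statistics (each in [0, 1])."""
    assert len(stats) == len(weights)
    return sum(s * w for s, w in zip(stats, weights))

def classify(stats, weights, threshold):
    """Images scoring at or above the threshold are called real."""
    return "real" if realism_score(stats, weights) >= threshold else "fake"

# Toy example with five placeholder statistics, equally weighted.
w = [0.2] * 5
print(classify([0.9, 0.8, 0.7, 0.9, 0.8], w, threshold=0.6))  # real
print(classify([0.3, 0.2, 0.4, 0.1, 0.2], w, threshold=0.6))  # fake
```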
arXiv Detail & Related papers (2023-09-26T08:32:55Z) - Manifold Constraint Regularization for Remote Sensing Image Generation [34.68714863219855]
Generative Adversarial Networks (GANs) have shown notable accomplishments in the remote sensing domain.
This paper analyzes the characteristics of remote sensing images and proposes manifold constraint regularization.
arXiv Detail & Related papers (2023-05-31T02:35:41Z) - Recognition of polar lows in Sentinel-1 SAR images with deep learning [5.571369922847262]
We introduce a novel dataset of Sentinel-1 images, each labeled as positive (representing a maritime mesocyclone) or negative (representing a normal sea state).
The dataset is used to train a deep learning model to classify the labeled images.
The model yields an F-1 score of 0.95, indicating that polar lows can be consistently detected from SAR images.
arXiv Detail & Related papers (2022-03-30T15:32:39Z)
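The F-1 score reported above is the harmonic mean of precision and recall for the positive (polar-low) class. A minimal sketch of the computation, using made-up counts rather than the paper's actual confusion matrix:

```python
# F1 = harmonic mean of precision and recall for the positive class.
# The counts below are invented for illustration only.

def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)   # fraction of predicted positives that are correct
    recall = tp / (tp + fn)      # fraction of actual positives that are found
    return 2 * precision * recall / (precision + recall)

# e.g. 95 true positives, 5 false positives, 5 false negatives
print(round(f1_score(95, 5, 5), 2))  # 0.95
```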
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.