AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation
- URL: http://arxiv.org/abs/2411.15497v2
- Date: Tue, 26 Nov 2024 09:54:02 GMT
- Title: AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation
- Authors: Datao Tang, Xiangyong Cao, Xuan Wu, Jialin Li, Jing Yao, Xueru Bai, Deyu Meng
- Abstract summary: Remote sensing image object detection (RSIOD) aims to identify and locate specific objects within satellite or aerial imagery.
There is a scarcity of labeled data in current RSIOD datasets, which significantly limits the performance of current detection algorithms.
This paper proposes a layout-controllable diffusion generative model (i.e., AeroGen) tailored for RSIOD.
- Score: 38.89367726721828
- Abstract: Remote sensing image object detection (RSIOD) aims to identify and locate specific objects within satellite or aerial imagery. However, there is a scarcity of labeled data in current RSIOD datasets, which significantly limits the performance of current detection algorithms. Although existing techniques, e.g., data augmentation and semi-supervised learning, can mitigate this scarcity issue to some extent, they are heavily dependent on high-quality labeled data and perform worse on rare object classes. To address this issue, this paper proposes a layout-controllable diffusion generative model (i.e., AeroGen) tailored for RSIOD. To our knowledge, AeroGen is the first model to support generation conditioned on both horizontal and rotated bounding boxes, thus enabling the generation of high-quality synthetic images that meet specific layout and object category requirements. Additionally, we propose an end-to-end data augmentation framework that integrates a diversity-conditioned generator and a filtering mechanism to enhance both the diversity and quality of generated data. Experimental results demonstrate that the synthetic data produced by our method are of high quality and diversity. Furthermore, the synthetic RSIOD data can significantly improve the detection performance of existing RSIOD models: the mAP on the DIOR, DIOR-R, and HRSC datasets improves by 3.7%, 4.3%, and 2.43%, respectively. The code is available at https://github.com/Sonettoo/AeroGen.
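The abstract distinguishes horizontal from rotated bounding box conditions. As a minimal sketch of that distinction only (this is not AeroGen's code; the function names and the center/size/angle parameterization are assumptions made for illustration), both box types can be reduced to four corner points, which is one common way to describe a layout:

```python
import math


def hbb_to_corners(x_min, y_min, x_max, y_max):
    """Corners of a horizontal (axis-aligned) box, starting at (x_min, y_min)."""
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


def rbb_to_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated box given center, size, and rotation angle.

    Assumed convention (hypothetical, not taken from the paper): the angle is
    in degrees and rotates the box about its center, from the x axis toward
    the y axis.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Corner offsets from the center before rotation, same order as above.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in offsets
    ]
```

Under this convention a rotated box with an angle of 0 degrees yields the same corners as the matching horizontal box, so a horizontal box can be treated as a special case of a rotated one.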
Related papers
- DODA: Diffusion for Object-detection Domain Adaptation in Agriculture [4.549305421261851]
We propose DODA, a data synthesizer that can generate high-quality object detection data for new domains in agriculture.
Specifically, we improve the controllability of layout-to-image through encoding layout as an image, thereby improving the quality of labels.
arXiv Detail & Related papers (2024-03-27T08:16:33Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable.
Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection [41.436817746749384]
Diffusion Model is a scalable data engine for object detection.
DiffusionEngine (DE) provides high-quality detection-oriented training pairs in a single stage.
arXiv Detail & Related papers (2023-09-07T17:55:01Z)
- Generative adversarial networks for data-scarce spectral applications [0.0]
We report on an application of GANs in the domain of synthetic spectral data generation.
We show that CWGANs can act as a surrogate model with improved performance in the low-data regime.
arXiv Detail & Related papers (2023-07-14T16:27:24Z)
- Detecting Anomalies using Generative Adversarial Networks on Images [0.0]
This paper proposes a novel Generative Adversarial Network (GAN) based model for anomaly detection.
It learns normality from normal (non-anomalous) images and uses it to detect whether an input image contains an anomalous or threat object.
Experiments are performed on three datasets, viz. CIFAR-10, MVTec AD (for industrial applications) and SIXray (for X-ray baggage security).
arXiv Detail & Related papers (2022-11-24T21:52:25Z)
- Synthetic Data Supervised Salient Object Detection [40.991558165686136]
We propose a novel yet effective method for SOD, coined SODGAN, which can generate infinite high-quality image-mask pairs.
For the first time, our SODGAN tackles SOD with synthetic data directly generated from the generative model.
Our approach achieves a new SOTA performance in semi/weakly-supervised methods, and even outperforms several fully-supervised SOTA methods.
arXiv Detail & Related papers (2022-10-25T08:36:29Z)
- DAE: Discriminatory Auto-Encoder for multivariate time-series anomaly detection in air transportation [68.8204255655161]
We propose a novel anomaly detection model called Discriminatory Auto-Encoder (DAE).
It uses the baseline of a regular LSTM-based auto-encoder but with several decoders, each getting data of a specific flight phase.
Results show that the DAE achieves better results in both accuracy and speed of detection.
arXiv Detail & Related papers (2021-09-08T14:07:55Z)