Related papers: Gen2Det: Generate to Detect

Gen2Det: Generate to Detect

URL: http://arxiv.org/abs/2312.04566v1
Date: Thu, 7 Dec 2023 18:59:58 GMT
Title: Gen2Det: Generate to Detect
Authors: Saksham Suri, Fanyi Xiao, Animesh Sinha, Sean Chang Culatana, Raghuraman Krishnamoorthi, Chenchen Zhu, Abhinav Shrivastava
Abstract summary: We motivate and present Gen2Det, a simple modular pipeline to create synthetic training data for object detection for free. In addition to the synthetic data, Gen2Det proposes a suite of techniques to best utilize the generated data, including image-level filtering, instance-level filtering, and better training recipe.
Score: 42.13657805295144
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently diffusion models have shown improvement in synthetic image quality as well as better control in generation. We motivate and present Gen2Det, a simple modular pipeline to create synthetic training data for object detection for free by leveraging state-of-the-art grounded image generation methods. Unlike existing works which generate individual object instances, require identifying foreground followed by pasting on other images, we simplify to directly generating scene-centric images. In addition to the synthetic data, Gen2Det also proposes a suite of techniques to best utilize the generated data, including image-level filtering, instance-level filtering, and better training recipe to account for imperfections in the generation. Using Gen2Det, we show healthy improvements on object detection and segmentation tasks under various settings and agnostic to detection methods. In the long-tailed detection setting on LVIS, Gen2Det improves the performance on rare categories by a large margin while also significantly improving the performance on other categories, e.g. we see an improvement of 2.13 Box AP and 1.84 Mask AP over just training on real data on LVIS with Mask R-CNN. In the low-data regime setting on COCO, Gen2Det consistently improves both Box and Mask AP by 2.27 and 1.85 points. In the most general detection setting, Gen2Det still demonstrates robust performance gains, e.g. it improves the Box and Mask AP on COCO by 0.45 and 0.32 points.

Related papers

Practical Manipulation Model for Robust Deepfake Detection [55.2480439325792]
We develop a more real-world degradation model in the area of image super-resolution.<n>We extend the space of pseudo-fakes by using Poisson blending, more diverse masks, generator artifacts, and distractors.<n>We show clear increases of $3.51%$ and $6.21%$ AUC on the DFDC and DFDCP datasets, respectively.
arXiv Detail & Related papers (2025-06-05T15:06:16Z)
Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation [38.89367726721828]
Remote sensing image object detection (RSIOD) aims to identify and locate specific objects within satellite or aerial imagery. There is a scarcity of labeled data in current RSIOD datasets, which significantly limits the performance of current detection algorithms. This paper proposes a layout-controllable diffusion generative model (i.e. AeroGen) tailored for RSIOD.
arXiv Detail & Related papers (2024-11-23T09:04:33Z)
GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning [90.13980177575809]
GenView is a controllable framework that augments the diversity of positive views. We introduce a quality-driven contrastive loss, which assesses the quality of positive pairs. Thanks to the improved positive view quality and the quality-driven contrastive loss, GenView significantly improves self-supervised learning across various tasks.
arXiv Detail & Related papers (2024-03-18T17:41:26Z)
Active Generation for Image Classification [45.93535669217115]
We propose to address the efficiency of image generation by focusing on the specific needs and characteristics of the model. With a central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation.
arXiv Detail & Related papers (2024-03-11T08:45:31Z)
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset [59.445498550159755]
We present a novel paradigm to enhance the ability of object detector, e.g., expanding categories or improving detection performance. We integrate an instance-level grounding head into a pre-trained, generative diffusion model, to augment it with the ability of localising instances in the generated images. We conduct thorough experiments to show that, this enhanced version of diffusion model, termed as InstaGen, can serve as a data synthesizer.
arXiv Detail & Related papers (2024-02-08T18:59:53Z)
Randomize to Generalize: Domain Randomization for Runway FOD Detection [1.4249472316161877]
Tiny Object Detection is challenging due to small size, low resolution, occlusion, background clutter, lighting conditions and small object-to-image ratio. We propose a novel two-stage methodology Synthetic Image Augmentation (SRIA) to enhance generalization capabilities of models encountering 2D datasets. We report that detection accuracy improved from an initial 41% to 92% for OOD test set.
arXiv Detail & Related papers (2023-09-23T05:02:31Z)
GeoDiffusion: Text-Prompted Geometric Control for Object Detection Data Generation [91.01581867841894]
We propose the GeoDiffusion, a simple framework that can flexibly translate various geometric conditions into text prompts. Our GeoDiffusion is able to encode not only the bounding boxes but also extra geometric conditions such as camera views in self-driving scenes.
arXiv Detail & Related papers (2023-06-07T17:17:58Z)
Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data. Data augmentation is a typical methodology used in machine learning when confronted with a limited data set. In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z)
Augment and Criticize: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection [64.65563422852568]
We improve the challenging monocular 3D object detection problem with a general semi-supervised framework. We introduce a novel, simple, yet effective Augment and Criticize' framework that explores abundant informative samples from unlabeled data. The two new detectors, dubbed 3DSeMo_DLE and 3DSeMo_FLEX, achieve state-of-the-art results with remarkable improvements for over 3.5% AP_3D/BEV (Easy) on KITTI.
arXiv Detail & Related papers (2023-03-20T16:28:15Z)
Towards Fine-grained Image Classification with Generative Adversarial Networks and Facial Landmark Detection [0.0]
We use GAN-based data augmentation to generate extra dataset instances. We validated our work by evaluating the accuracy of fine-grained image classification on the recent Vision Transformer (ViT) Model.
arXiv Detail & Related papers (2021-08-28T06:32:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.