Gen2Det: Generate to Detect
- URL: http://arxiv.org/abs/2312.04566v1
- Date: Thu, 7 Dec 2023 18:59:58 GMT
- Title: Gen2Det: Generate to Detect
- Authors: Saksham Suri, Fanyi Xiao, Animesh Sinha, Sean Chang Culatana,
Raghuraman Krishnamoorthi, Chenchen Zhu, Abhinav Shrivastava
- Abstract summary: We motivate and present Gen2Det, a simple modular pipeline to create synthetic training data for object detection for free.
In addition to the synthetic data, Gen2Det proposes a suite of techniques to best utilize the generated data, including image-level filtering, instance-level filtering, and a better training recipe.
- Score: 42.13657805295144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, diffusion models have shown improvements in synthetic image quality
as well as better control over generation. We motivate and present Gen2Det, a
simple modular pipeline to create synthetic training data for object detection
for free by leveraging state-of-the-art grounded image generation methods.
Unlike existing works, which generate individual object instances and require
identifying the foreground and then pasting it onto other images, we simplify to
directly generating scene-centric images. In addition to the synthetic data,
Gen2Det also proposes a suite of techniques to best utilize the generated data,
including image-level filtering, instance-level filtering, and a better training
recipe to account for imperfections in the generation. Using Gen2Det, we show
healthy improvements on object detection and segmentation tasks under various
settings, agnostic to the detection method. In the long-tailed detection
setting on LVIS, Gen2Det improves the performance on rare categories by a large
margin while also significantly improving the performance on other categories;
e.g., we see an improvement of 2.13 Box AP and 1.84 Mask AP over training only
on real data on LVIS with Mask R-CNN. In the low-data regime setting on COCO,
Gen2Det consistently improves Box and Mask AP by 2.27 and 1.85 points,
respectively. In the most general detection setting, Gen2Det still demonstrates
robust performance gains, e.g., it improves the Box and Mask AP on COCO by 0.45
and 0.32 points, respectively.
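The abstract names image-level and instance-level filtering but does not say how either is implemented. The Python sketch below shows one plausible shape for such a two-stage filter and is not the authors' method. Everything in it is an assumption made for illustration: the `SyntheticImage` record, the precomputed `quality_score`, the idea of confirming a generated box against a reference detector's predictions via IoU, and both thresholds.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class SyntheticImage:
    """Hypothetical record for one generated image (not from the paper)."""
    image_id: str
    quality_score: float       # ASSUMPTION: image-level score in [0, 1]
    layout_boxes: List[Box]    # boxes the generator was conditioned on
    detected_boxes: List[Box]  # ASSUMPTION: boxes from a reference detector


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_synthetic(
    images: List[SyntheticImage],
    min_quality: float = 0.5,  # ASSUMPTION: illustrative threshold
    min_iou: float = 0.5,      # ASSUMPTION: illustrative threshold
) -> List[SyntheticImage]:
    """Apply an image-level filter, then an instance-level filter."""
    kept = []
    for img in images:
        # Image-level filter: discard the whole image if it scores too low.
        if img.quality_score < min_quality:
            continue
        # Instance-level filter: keep a box only if some detection overlaps it.
        confirmed = [
            box for box in img.layout_boxes
            if any(iou(box, det) >= min_iou for det in img.detected_boxes)
        ]
        # Images left with no confirmed boxes are dropped in this sketch.
        if confirmed:
            kept.append(
                SyntheticImage(img.image_id, img.quality_score,
                               confirmed, img.detected_boxes)
            )
    return kept
```

Dropping images that end up with no confirmed boxes is a choice made in this sketch; a training pipeline could equally keep them and mark the unconfirmed boxes as ignored regions.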
Related papers
- GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning [90.13980177575809]
GenView is a controllable framework that augments the diversity of positive views.
We introduce a quality-driven contrastive loss, which assesses the quality of positive pairs.
Thanks to the improved positive view quality and the quality-driven contrastive loss, GenView significantly improves self-supervised learning across various tasks.
arXiv Detail & Related papers (2024-03-18T17:41:26Z)
- Active Generation for Image Classification [45.93535669217115]
We propose to address the efficiency of image generation by focusing on the specific needs and characteristics of the model.
With a central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation.
arXiv Detail & Related papers (2024-03-11T08:45:31Z)
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset [59.445498550159755]
We present a novel paradigm to enhance the capabilities of an object detector, e.g., expanding categories or improving detection performance.
We integrate an instance-level grounding head into a pre-trained, generative diffusion model, to augment it with the ability of localising instances in the generated images.
We conduct thorough experiments to show that this enhanced diffusion model, termed InstaGen, can serve as a data synthesizer.
arXiv Detail & Related papers (2024-02-08T18:59:53Z)
- Randomize to Generalize: Domain Randomization for Runway FOD Detection [1.4249472316161877]
Tiny Object Detection is challenging due to small size, low resolution, occlusion, background clutter, lighting conditions and small object-to-image ratio.
We propose a novel two-stage methodology, Synthetic Randomized Image Augmentation (SRIA), to enhance the generalization capabilities of models encountering 2D datasets.
We report that detection accuracy improved from an initial 41% to 92% on the OOD test set.
arXiv Detail & Related papers (2023-09-23T05:02:31Z)
- GeoDiffusion: Text-Prompted Geometric Control for Object Detection Data Generation [91.01581867841894]
We propose GeoDiffusion, a simple framework that can flexibly translate various geometric conditions into text prompts.
Our GeoDiffusion is able to encode not only the bounding boxes but also extra geometric conditions such as camera views in self-driving scenes.
arXiv Detail & Related papers (2023-06-07T17:17:58Z)
- Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z)
- Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images [26.51970603200391]
This paper investigates optimizing the detection head based on sparse convolution.
However, sparse convolution suffers from inadequate integration of contextual information for tiny objects.
We propose a novel global context-enhanced adaptive sparse convolutional network.
arXiv Detail & Related papers (2023-03-25T14:42:50Z)
- Augment and Criticize: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection [64.65563422852568]
We address the challenging monocular 3D object detection problem with a general semi-supervised framework.
We introduce a novel, simple, yet effective 'Augment and Criticize' framework that explores abundant informative samples from unlabeled data.
The two new detectors, dubbed 3DSeMo_DLE and 3DSeMo_FLEX, achieve state-of-the-art results with remarkable improvements of over 3.5% AP_3D/BEV (Easy) on KITTI.
arXiv Detail & Related papers (2023-03-20T16:28:15Z)
- Towards Fine-grained Image Classification with Generative Adversarial Networks and Facial Landmark Detection [0.0]
We use GAN-based data augmentation to generate extra dataset instances.
We validated our work by evaluating the accuracy of fine-grained image classification on the recent Vision Transformer (ViT) Model.
arXiv Detail & Related papers (2021-08-28T06:32:42Z)