Unsupervised data augmentation for object detection
- URL: http://arxiv.org/abs/2104.14965v1
- Date: Fri, 30 Apr 2021 13:02:42 GMT
- Title: Unsupervised data augmentation for object detection
- Authors: Yichen Zhang, Zeyang Song, Wenbo Li
- Abstract summary: We propose a framework making use of Generative Adversarial Networks (GANs) to perform unsupervised data augmentation.
Building on the strong recent performance of YOLOv4, we propose a two-step pipeline that enables us to generate an image in which the object lies at a specified position.
- Score: 13.465808931940595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data augmentation has always been an effective way to mitigate
overfitting when the dataset is small. Many augmentation operations already
exist, such as horizontal flip, random crop, or even Mixup. However, unlike in
image classification, we cannot simply apply these operations to object
detection because the generated images lack the corresponding bounding-box
labels. To address this challenge, we propose a framework that uses Generative
Adversarial Networks (GANs) to perform unsupervised data augmentation.
Specifically, building on the strong recent performance of YOLOv4, we propose a
two-step pipeline that enables us to generate an image in which the object lies
at a specified position. In this way, we achieve the goal of generating an
image together with its bounding-box label.
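The abstract does not spell out the pipeline's implementation, but the core idea, choosing where the object should appear and then synthesizing it there so the bounding box is known by construction, can be illustrated with a minimal sketch. The function and parameter names below are hypothetical, and the GAN generator is stubbed with random noise; this is an assumption-laden illustration, not the authors' pipeline.

```python
# Minimal, hypothetical sketch of "generate an image where the object lies at a
# known position, so the bounding-box label comes for free". The real two-step
# pipeline (and its YOLOv4-based components) is not specified in the abstract;
# sample_target_box / synthesize_object_patch are made-up names, and the
# generator is stubbed with random noise.
import numpy as np


def sample_target_box(img_h, img_w, min_size=32, rng=None):
    """Step 1 (assumed): sample a target bounding box (x1, y1, x2, y2)."""
    rng = rng or np.random.default_rng()
    w = int(rng.integers(min_size, img_w // 2))
    h = int(rng.integers(min_size, img_h // 2))
    x1 = int(rng.integers(0, img_w - w))
    y1 = int(rng.integers(0, img_h - h))
    return x1, y1, x1 + w, y1 + h


def synthesize_object_patch(h, w, class_id, rng=None):
    """Stand-in for a class-conditional GAN generator (this stub ignores class_id)."""
    rng = rng or np.random.default_rng()
    return rng.random((h, w, 3), dtype=np.float32)


def make_pseudo_labeled_sample(background, class_id, rng=None):
    """Step 2 (assumed): place the generated object inside the chosen box.

    Because the box is chosen by us, the detection label is known without any
    manual annotation."""
    img = background.copy()
    x1, y1, x2, y2 = sample_target_box(*img.shape[:2], rng=rng)
    img[y1:y2, x1:x2] = synthesize_object_patch(y2 - y1, x2 - x1, class_id, rng=rng)
    return img, {"bbox": (x1, y1, x2, y2), "class_id": class_id}


if __name__ == "__main__":
    bg = np.zeros((416, 416, 3), dtype=np.float32)  # a common YOLOv4 input size
    image, label = make_pseudo_labeled_sample(bg, class_id=0)
    print(label)  # e.g. {'bbox': (57, 201, 169, 266), 'class_id': 0}
```

Such pseudo-labeled pairs could then be mixed into the training set of a detector such as YOLOv4 alongside the original annotated data.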
Related papers
- Knowledge Combination to Learn Rotated Detection Without Rotated
Annotation [53.439096583978504]
Rotated bounding boxes drastically reduce output ambiguity of elongated objects.
Despite their effectiveness, rotated detectors are not widely employed.
We propose a framework that allows the model to predict precise rotated boxes.
arXiv Detail & Related papers (2023-04-05T03:07:36Z) - High-Quality Entity Segmentation [110.55724145851725]
CropFormer is designed to tackle the intractability of instance-level segmentation on high-resolution images.
It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.
With CropFormer, we achieve a significant AP gain of 1.9 on the challenging entity segmentation task.
arXiv Detail & Related papers (2022-11-10T18:58:22Z) - BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations [89.42397034542189]
We synthesize a large labeled dataset via a generative adversarial network (GAN).
We take image samples from the class-conditional generative model BigGAN trained on ImageNet, and manually annotate 5 images per class, for all 1k classes.
We create a new ImageNet benchmark by labeling an additional set of 8k real images and evaluate segmentation performance in a variety of settings.
arXiv Detail & Related papers (2022-01-12T20:28:34Z) - Bridging the Gap between Events and Frames through Unsupervised Domain
Adaptation [57.22705137545853]
We propose a task transfer method that allows models to be trained directly with labeled images and unlabeled event data.
We leverage the generative event model to split event features into content and motion features.
Our approach unlocks the vast amount of existing image datasets for the training of event-based neural networks.
arXiv Detail & Related papers (2021-09-06T17:31:37Z) - Semantic Segmentation with Generative Models: Semi-Supervised Learning
and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z) - Data Augmentation for Object Detection via Differentiable Neural
Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to this problem include semi-supervised learning, which interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z) - Six-channel Image Representation for Cross-domain Object Detection [17.854940064699985]
Deep learning models are data-driven, and their performance depends heavily on abundant and diverse datasets.
Image-to-image translation techniques are employed to generate fake data of specific scenes for training the models.
We propose to combine the original 3-channel images and their corresponding GAN-generated fake images to form 6-channel representations of the dataset (a rough sketch follows after this list).
arXiv Detail & Related papers (2021-01-03T04:50:03Z) - Object Segmentation Without Labels with Large-Scale Generative Models [43.679717400251924]
The recent rise of unsupervised and self-supervised learning has dramatically reduced the dependency on labeled data.
Large-scale unsupervised models can also perform a more challenging object segmentation task, requiring neither pixel-level nor image-level labeling.
We show that recent unsupervised GANs make it possible to differentiate between foreground and background pixels, providing high-quality saliency masks.
arXiv Detail & Related papers (2020-06-08T23:30:43Z)
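As a rough illustration of the six-channel representation mentioned in the cross-domain detection entry above (the exact channel layout is an assumption), the original RGB image and its GAN-generated counterpart can simply be stacked along the channel axis; the detector's first convolution then takes 6 input channels instead of 3.

```python
# Sketch of a 6-channel input: stack an original RGB image with its GAN-generated
# counterpart along the channel axis. The exact layout in the cited paper is an
# assumption; random arrays stand in for the two images.
import numpy as np

rgb = np.random.random((416, 416, 3)).astype(np.float32)   # original image
fake = np.random.random((416, 416, 3)).astype(np.float32)  # GAN-generated image
six_channel = np.concatenate([rgb, fake], axis=-1)          # shape: (416, 416, 6)
print(six_channel.shape)
```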
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.