Data Augmentation for Object Detection via Differentiable Neural
Rendering
- URL: http://arxiv.org/abs/2103.02852v1
- Date: Thu, 4 Mar 2021 06:31:06 GMT
- Title: Data Augmentation for Object Detection via Differentiable Neural
Rendering
- Authors: Guanghan Ning, Guang Chen, Chaowei Tan, Si Luo, Liefeng Bo, Heng Huang
- Abstract summary: It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
- Score: 71.00447761415388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is challenging to train a robust object detector when annotated data is
scarce. Existing approaches to tackle this problem include semi-supervised
learning that interpolates labeled data from unlabeled data, and self-supervised
learning that exploits signals within unlabeled data via pretext tasks. Without
changing the supervised learning paradigm, we introduce an offline data
augmentation method for object detection, which semantically interpolates the
training data with novel views. Specifically, our proposed system generates
controllable views of training images based on differentiable neural rendering,
together with corresponding bounding box annotations which involve no human
intervention. Firstly, we extract and project pixel-aligned image features into
point clouds while estimating depth maps. We then re-project them with a target
camera pose and render a novel-view 2D image. Objects in the form of keypoints
are marked in point clouds to recover annotations in new views. It is fully
compatible with online data augmentation methods, such as affine transform,
image mixup, etc. Extensive experiments show that our method, as a cost-free
tool to enrich images and labels, can significantly boost the performance of
object detection systems with scarce training data. Code is available at
\url{https://github.com/Guanghan/DANR}.
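The geometric core of the pipeline (back-project pixels into a point cloud using an estimated depth map, then re-project under a target camera pose) can be sketched with a plain pinhole model. This is a minimal illustration, not the paper's actual system: the intrinsics `K`, the constant depth map, and the pose `(R, t)` below are all hypothetical placeholders.

```python
import numpy as np

def backproject(depth, K):
    """Lift a depth map (H, W) into a point cloud (H*W, 3) in camera coordinates."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T        # normalized camera rays
    return rays * depth.reshape(-1, 1)     # scale each ray by its depth

def reproject(points, K, R, t):
    """Project 3D points into a target view given relative rotation R and translation t."""
    cam = points @ R.T + t                 # transform into the target camera frame
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]          # perspective divide -> pixel coordinates

# Hypothetical example: a flat plane 2 units away, camera shifted slightly along x.
K = np.array([[100.0, 0, 32], [0, 100.0, 32], [0, 0, 1]])
depth = np.full((64, 64), 2.0)
pts = backproject(depth, K)
R, t = np.eye(3), np.array([0.1, 0.0, 0.0])
uv_new = reproject(pts, K, R, t)           # pixel locations in the novel view
```

Annotation transfer follows the same path: object keypoints (e.g. box corners) marked in the point cloud go through the same `reproject` call, and the min/max of the projected corners recovers a bounding box in the new view without human intervention.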
Related papers
- CrossDehaze: Scaling Up Image Dehazing with Cross-Data Vision Alignment and Augmentation [47.425906124301775]
Methods based on priors and deep learning have been proposed to address the task of image dehazing.
We propose a novel method of internal and external data augmentation to improve the existing dehazing methodology.
Our approach significantly outperforms other advanced methods in dehazing and produces dehazed images that are closest to real haze-free images.
arXiv Detail & Related papers (2024-07-20T10:00:20Z)
- Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks [64.67735676127208]
Text-to-image diffusion models have shown great potential for benefiting image recognition.
Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images.
We introduce customized solutions by fully exploiting the aforementioned free attention masks.
arXiv Detail & Related papers (2023-08-13T10:07:46Z)
- Robust Object Detection in Remote Sensing Imagery with Noisy and Sparse Geo-Annotations (Full Version) [4.493174773769076]
In this paper, we present a novel approach for training object detectors with extremely noisy and incomplete annotations.
Our method is based on a teacher-student learning framework and a correction module accounting for imprecise and missing annotations.
We demonstrate that our approach improves standard detectors by 37.1% $AP_{50}$ on a noisy real-world remote-sensing dataset.
arXiv Detail & Related papers (2022-10-24T07:25:31Z)
- Learning to Detect Every Thing in an Open World [139.78830329914135]
We propose a simple yet surprisingly powerful data augmentation and training scheme we call Learning to Detect Every Thing (LDET)
To avoid suppressing hidden objects (background objects that are visible but unlabeled), we paste annotated objects onto a background image sampled from a small region of the original image.
LDET leads to significant improvements on many datasets in the open world instance segmentation task.
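The pasting scheme the summary describes can be sketched roughly as follows. This is a loose interpretation, not the authors' implementation; the function name, the tiling strategy, and the 3-channel image assumption are all illustrative.

```python
import numpy as np

def ldet_style_paste(image, boxes, patch_size=16, rng=None):
    """Rough sketch of LDET-style augmentation: build an object-free background
    by tiling a small patch sampled from the original image, then paste the
    annotated object regions back on top. Box labels carry over unchanged."""
    if rng is None:
        rng = np.random.default_rng()
    H, W = image.shape[:2]
    # Sample a small region of the original image as the background texture.
    y0 = int(rng.integers(0, H - patch_size + 1))
    x0 = int(rng.integers(0, W - patch_size + 1))
    patch = image[y0:y0 + patch_size, x0:x0 + patch_size]
    # Tile it over the full canvas, erasing any hidden unlabeled objects.
    reps_y = int(np.ceil(H / patch_size))
    reps_x = int(np.ceil(W / patch_size))
    out = np.tile(patch, (reps_y, reps_x, 1))[:H, :W].copy()
    # Paste the annotated objects back at their original locations.
    for x1, y1, x2, y2 in boxes:
        out[y1:y2, x1:x2] = image[y1:y2, x1:x2]
    return out
```

Because the synthesized background comes from a small, likely object-free region, nothing unlabeled remains for the detector to be wrongly trained to suppress, while every labeled box stays valid.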
arXiv Detail & Related papers (2021-12-03T03:56:06Z)
- AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation [3.6790362352712873]
We propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures.
Our experiments demonstrate that the method is able to represent images in a low-dimensional space.
Unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets.
arXiv Detail & Related papers (2021-06-11T09:02:30Z)
- Recognizing Actions in Videos from Unseen Viewpoints [80.6338404141284]
We show that current convolutional neural network models are unable to recognize actions from camera viewpoints not present in training data.
We introduce a new dataset for unseen view recognition and show the approach's ability to learn viewpoint-invariant representations.
arXiv Detail & Related papers (2021-03-30T17:17:54Z)
- Cross-Model Image Annotation Platform with Active Learning [0.0]
This work presents an end-to-end pipeline tool for object annotation and recognition.
We have developed a modular image annotation platform which seamlessly incorporates assisted image annotation, active learning and model training and evaluation.
The highest accuracy achieved is 74%.
arXiv Detail & Related papers (2020-08-06T01:42:25Z)
- Single Image Cloud Detection via Multi-Image Fusion [23.641624507709274]
A primary challenge in developing algorithms is the cost of collecting annotated training data.
We demonstrate how recent advances in multi-image fusion can be leveraged to bootstrap single image cloud detection.
We collect a large dataset of Sentinel-2 images along with a per-pixel semantic labelling for land cover.
arXiv Detail & Related papers (2020-07-29T22:52:28Z)
- Improving Object Detection with Selective Self-supervised Self-training [62.792445237541145]
We study how to leverage Web images to augment human-curated object detection datasets.
We retrieve Web images by image-to-image search, which incurs less domain shift from the curated data than other search methods.
We propose a novel learning method motivated by two parallel lines of work that explore unlabeled data for image classification.
arXiv Detail & Related papers (2020-07-17T18:05:01Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.