Synthetic data enables faster annotation and robust segmentation for
multi-object grasping in clutter
- URL: http://arxiv.org/abs/2401.13405v1
- Date: Wed, 24 Jan 2024 11:58:30 GMT
- Title: Synthetic data enables faster annotation and robust segmentation for
multi-object grasping in clutter
- Authors: Dongmyoung Lee, Wei Chen, Nicolas Rojas
- Abstract summary: We propose a synthetic data generation method that minimizes human intervention and makes downstream image segmentation algorithms more robust.
Experiments show that the proposed synthetic scene generation can diminish labelling time dramatically.
Pick-and-place experiments demonstrate that segmentation trained on our hybrid dataset (98.9%, 70%) outperforms the real dataset and a publicly available dataset by (6.7%, 18.8%) and (2.8%, 10%) in terms of labelling and grasping success rate, respectively.
- Score: 9.092550803271005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object recognition and object pose estimation in robotic grasping continue to
be significant challenges, since building a labelled dataset can be
time-consuming and financially costly in terms of data collection and annotation. In
this work, we propose a synthetic data generation method that minimizes human
intervention and makes downstream image segmentation algorithms more robust by
combining a generated synthetic dataset with a smaller real-world dataset
(hybrid dataset). Annotation experiments show that the proposed synthetic scene
generation can diminish labelling time dramatically. RGB image segmentation is
trained on the hybrid dataset and combined with depth information to produce
pixel-to-point correspondences for individual segmented objects. The object to
grasp is then determined by the confidence score of the segmentation algorithm.
Pick-and-place experiments demonstrate that segmentation trained on our hybrid
dataset (98.9%, 70%) outperforms the real dataset and a publicly available
dataset by (6.7%, 18.8%) and (2.8%, 10%) in terms of labelling and grasping
success rate, respectively. Supplementary material is available at
https://sites.google.com/view/synthetic-dataset-generation.
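The grasping pipeline described in the abstract (segment objects in RGB, back-project each segmented mask into 3D using the depth image, then grasp the object with the highest segmentation confidence) can be sketched as follows. This is a minimal illustration under a standard pinhole camera model; the function names (`backproject_mask`, `select_grasp_target`) and the intrinsics parameters (`fx`, `fy`, `cx`, `cy`) are assumptions for the sketch, not the paper's actual implementation.

```python
import numpy as np

def backproject_mask(mask, depth, fx, fy, cx, cy):
    """Back-project the depth pixels under a binary mask into 3D camera
    coordinates (one point per masked pixel), using the pinhole model."""
    vs, us = np.nonzero(mask)           # pixel rows (v) and columns (u)
    z = depth[vs, us]                   # depth at each masked pixel
    x = (us - cx) * z / fx              # pinhole back-projection
    y = (vs - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # (N, 3) points for this object

def select_grasp_target(masks, scores, depth, fx, fy, cx, cy):
    """Pick the segmented object with the highest confidence score and
    return its index and 3D points, mirroring the selection rule above."""
    best = int(np.argmax(scores))
    return best, backproject_mask(masks[best], depth, fx, fy, cx, cy)
```

In practice the masks and confidence scores would come from the segmentation network trained on the hybrid dataset, and the depth image from an aligned RGB-D sensor.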
Related papers
- Modified CycleGAN for the synthesization of samples for wheat head
segmentation [0.09999629695552192]
In the absence of an annotated dataset, synthetic data can be used for model development.
We develop a realistic annotated synthetic dataset for wheat head segmentation.
The resulting model achieved a Dice score of 83.4% on an internal dataset and 83.6% on two external Global Wheat Head Detection datasets.
arXiv Detail & Related papers (2024-02-23T06:42:58Z)
- TarGEN: Targeted Data Generation with Large Language Models [51.87504111286201]
TarGEN is a multi-step prompting strategy for generating high-quality synthetic datasets.
We augment TarGEN with a method known as self-correction, empowering LLMs to rectify inaccurately labelled instances.
A comprehensive analysis of the synthetic dataset compared to the original dataset reveals similar or higher levels of dataset complexity and diversity.
arXiv Detail & Related papers (2023-10-27T03:32:17Z)
- Diffusion-based Data Augmentation for Nuclei Image Segmentation [68.28350341833526]
We introduce the first diffusion-based augmentation method for nuclei segmentation.
The idea is to synthesize a large number of labeled images to facilitate training the segmentation model.
The experimental results show that by augmenting 10% labeled real dataset with synthetic samples, one can achieve comparable segmentation results.
arXiv Detail & Related papers (2023-10-22T06:16:16Z)
- DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations.
Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation.
We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
- Bridging the Gap: Enhancing the Utility of Synthetic Data via Post-Processing Techniques [7.967995669387532]
Generative models have emerged as a promising solution for generating synthetic datasets that can replace or augment real-world data.
We propose three novel post-processing techniques to improve the quality and diversity of the synthetic dataset.
Experiments show that Gap Filler (GaFi) effectively reduces the gap to real-data accuracy to errors of 2.03%, 1.78%, and 3.99% on the Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets, respectively.
arXiv Detail & Related papers (2023-05-17T10:50:38Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Delving into High-Quality Synthetic Face Occlusion Segmentation Datasets [83.749895930242]
We propose two techniques for producing high-quality naturalistic synthetic occluded faces.
We empirically show the effectiveness and robustness of both methods, even for unseen occlusions.
We present two high-resolution real-world occluded face datasets with fine-grained annotations, RealOcc and RealOcc-Wild.
arXiv Detail & Related papers (2022-05-12T17:03:57Z)
- DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort [117.41383937100751]
Current deep networks are extremely data-hungry, benefiting from training on large-scale datasets.
We show how the GAN latent code can be decoded to produce a semantic segmentation of the image.
These generated datasets can then be used for training any computer vision architecture just as real datasets are.
arXiv Detail & Related papers (2021-04-13T20:08:29Z)
- Semi-synthesis: A fast way to produce effective datasets for stereo matching [16.602343511350252]
Close-to-real-scene texture rendering is a key factor in boosting stereo matching performance.
We propose semi-synthesis, an effective and fast way to synthesize a large amount of data with close-to-real-scene texture.
With further fine-tuning on the real dataset, we also achieve SOTA performance on Middlebury and competitive results on KITTI and ETH3D datasets.
arXiv Detail & Related papers (2021-01-26T14:34:49Z)
- Mask-based Data Augmentation for Semi-supervised Semantic Segmentation [3.946367634483361]
We propose a new approach for data augmentation, termed ComplexMix, which incorporates aspects of CutMix and ClassMix with improved performance.
The proposed approach has the ability to control the complexity of the augmented data while attempting to be semantically-correct.
Experimental results show that our method yields improvement over state-of-the-art methods on standard datasets for semantic image segmentation.
arXiv Detail & Related papers (2021-01-25T15:09:34Z)
- Self-supervised Robust Object Detectors from Partially Labelled Datasets [3.1669406516464007]
Merging datasets allows us to train one integrated object detector instead of training several.
We propose a training framework to overcome the missing-label challenge of merged datasets.
We evaluate our proposed framework for training Yolo on a simulated merged dataset with a missing rate of approximately 48% using VOC2012 and VOC2007.
arXiv Detail & Related papers (2020-05-23T15:18:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.