GI-NNet & RGI-NNet: Development of Robotic Grasp Pose Models, Trainable
with Large as well as Limited Labelled Training Datasets, under supervised
and semi supervised paradigms
- URL: http://arxiv.org/abs/2107.07452v1
- Date: Thu, 15 Jul 2021 16:55:49 GMT
- Title: GI-NNet & RGI-NNet: Development of Robotic Grasp Pose Models, Trainable
with Large as well as Limited Labelled Training Datasets, under supervised
and semi supervised paradigms
- Authors: Priya Shukla, Nilotpal Pramanik, Deepesh Mehta and G.C. Nandi
- Abstract summary: We use deep learning techniques to help robots learn to generate and execute appropriate grasps quickly.
We developed a Generative Inception Neural Network (GI-NNet) model, capable of generating antipodal robotic grasps on seen as well as unseen objects.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Grasping objects the way humans do remains challenging for COBOTs
seeking efficient, intelligent and optimal grasps. To streamline the process,
we use deep learning techniques to help robots learn to generate and execute
appropriate grasps quickly. We developed a Generative Inception Neural Network
(GI-NNet) model capable of generating antipodal robotic grasps on both seen and
unseen objects. Trained on the Cornell Grasping Dataset (CGD), it attained
98.87% grasp pose accuracy in detecting both regular and irregular shaped
objects from RGB-Depth (RGB-D) images, while requiring only one third of the
trainable network parameters of existing approaches. However, to attain this
level of performance the model requires 90% of the available labelled CGD data
for training, keeping only 10% for testing, which makes it vulnerable to poor
generalization. Furthermore, obtaining a sufficiently large, high-quality
labelled dataset is becoming increasingly difficult as network sizes grow. To
address these issues, we attach our model as a decoder to a semi-supervised
architecture, the Vector Quantized Variational Auto Encoder (VQVAE), which
trains efficiently on both labelled and unlabelled data. The proposed model,
which we name Representation based GI-NNet (RGI-NNet), has been trained on CGD
with labelled-data splits ranging from as little as 10% up to 50%, in each case
together with the latent embedding generated by the VQVAE. The grasp pose
accuracy of RGI-NNet varies from 92.13% to 95.6%, far better than several
existing models trained only on labelled data. For performance verification of
both the GI-NNet and RGI-NNet models, we use an Anukul (Baxter) hardware cobot.
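The core of the RGI-NNet pipeline described above is the vector-quantization step of the VQVAE, which maps continuous encoder features to discrete codebook embeddings that can be learned from labelled and unlabelled images alike. The following is a minimal sketch of that nearest-neighbour quantization step only; the codebook size, embedding dimension, and array shapes are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (illustrative) codebook: K entries of dimension D.
K, D = 64, 16
codebook = rng.normal(size=(K, D))

def quantize(z_e):
    """Map each continuous encoder output vector to its nearest codebook entry.

    z_e: (N, D) array of encoder features.
    Returns the quantized vectors z_q (N, D) and the chosen code indices (N,).
    """
    # Squared Euclidean distance from every feature to every codebook entry.
    d = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    idx = d.argmin(axis=1)      # index of the nearest code for each feature
    z_q = codebook[idx]         # discrete latent embedding fed to the decoder
    return z_q, idx

# Stand-in for a batch of encoder features from grasp images.
z_e = rng.normal(size=(8, D))
z_q, idx = quantize(z_e)
```

In the full model, these quantized embeddings `z_q` would be passed to the GI-NNet decoder to predict antipodal grasp poses, with the codebook itself updated during training.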
Related papers
- GOODAT: Towards Test-time Graph Out-of-Distribution Detection [103.40396427724667]
Graph neural networks (GNNs) have found widespread application in modeling graph data across diverse domains.
Recent studies have explored graph OOD detection, often focusing on training a specific model or modifying the data on top of a well-trained GNN.
This paper introduces a data-centric, unsupervised, and plug-and-play solution that operates independently of training data and modifications of GNN architecture.
arXiv Detail & Related papers (2024-01-10T08:37:39Z) - SeiT++: Masked Token Modeling Improves Storage-efficient Training [36.95646819348317]
Recent advancements in Deep Neural Network (DNN) models have significantly improved performance across computer vision tasks.
However, achieving highly generalizable and high-performing vision models requires expansive datasets, resulting in significant storage requirements.
Recent breakthrough by SeiT proposed the use of Vector-Quantized (VQ) feature vectors (i.e., tokens) as network inputs for vision classification.
In this paper, we extend SeiT by integrating Masked Token Modeling (MTM) for self-supervised pre-training.
arXiv Detail & Related papers (2023-12-15T04:11:34Z) - Exploring the Effectiveness of Dataset Synthesis: An application of
Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data is slightly underperforming compared to a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z) - Post-training Model Quantization Using GANs for Synthetic Data
Generation [57.40733249681334]
We investigate the use of synthetic data as a substitute for the calibration with real data for the quantization method.
We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
arXiv Detail & Related papers (2023-05-10T11:10:09Z) - Data-Free Adversarial Knowledge Distillation for Graph Neural Networks [62.71646916191515]
We propose the first end-to-end framework for data-free adversarial knowledge distillation on graph structured data (DFAD-GNN)
Specifically, DFAD-GNN employs a generative adversarial network with three components: a pre-trained teacher model and a student model act as two discriminators, while a generator derives training graphs used to distill knowledge from the teacher into the student.
Our DFAD-GNN significantly surpasses state-of-the-art data-free baselines in the graph classification task.
arXiv Detail & Related papers (2022-05-08T08:19:40Z) - Self-Supervised Pre-Training for Transformer-Based Person
Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID)
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z) - Development of a robust cascaded architecture for intelligent robot
grasping using limited labelled data [0.0]
In the case of robots, we cannot afford to spend so much time teaching them to grasp objects effectively.
We propose an efficient learning architecture based on VQVAE so that robots can be taught correct grasping with the data available.
We investigate a semi-supervised model that generalizes much better even with a limited labelled dataset.
arXiv Detail & Related papers (2021-11-06T11:01:15Z) - Model Composition: Can Multiple Neural Networks Be Combined into a
Single Network Using Only Unlabeled Data? [6.0945220518329855]
This paper investigates the idea of combining multiple trained neural networks using unlabeled data.
To this end, the proposed method makes use of generation, filtering, and aggregation of reliable pseudo-labels collected from unlabeled data.
Our method supports using an arbitrary number of input models with arbitrary architectures and categories.
arXiv Detail & Related papers (2021-10-20T04:17:25Z) - Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z) - A data-set of piercing needle through deformable objects for Deep
Learning from Demonstrations [0.21096737598952847]
This paper presents a dataset of inserting/piercing a needle with two arms of da Vinci Research Kit in/through soft tissues.
We implement several deep RLfD architectures, including simple feed-forward CNNs and different Recurrent Convolutional Networks (RCNs).
Our study indicates that RCNs improve the model's prediction accuracy, even though the baseline feed-forward CNNs successfully learn the relationship between the visual information and the robot's next-step control actions.
arXiv Detail & Related papers (2020-12-04T08:27:06Z) - Semi-supervised Grasp Detection by Representation Learning in a Vector
Quantized Latent Space [1.3048920509133808]
This paper presents a semi-supervised learning based grasp detection approach.
To the best of our knowledge, this is the first time a Variational AutoEncoder (VAE) has been applied in the domain of robotic grasp detection.
The model performs significantly better than existing approaches that do not use unlabelled images to improve grasp detection.
arXiv Detail & Related papers (2020-01-23T12:47:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.