End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and
Semantic Segmentation from RGB
- URL: http://arxiv.org/abs/2107.05287v1
- Date: Mon, 12 Jul 2021 09:45:13 GMT
- Authors: Stefan Ainetter and Friedrich Fraundorfer
- Abstract summary: We introduce a novel, end-to-end trainable CNN-based architecture to deliver high-quality results for grasp detection suitable for a parallel-plate gripper.
Our proposed network delivers state-of-the-art accuracy on two popular grasp datasets, namely Cornell and Jacquard.
- Score: 13.546424272544805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce a novel, end-to-end trainable CNN-based
architecture to deliver high-quality results for grasp detection suitable for a
parallel-plate gripper, and semantic segmentation. Utilizing this, we propose a
novel refinement module that takes advantage of previously calculated grasp
detection and semantic segmentation and further increases grasp detection
accuracy. Our proposed network delivers state-of-the-art accuracy on two
popular grasp datasets, namely Cornell and Jacquard. As an additional contribution,
we provide a novel dataset extension for the OCID dataset, making it possible
to evaluate grasp detection in highly challenging scenes. Using this dataset,
we show that semantic segmentation can additionally be used to assign grasp
candidates to object classes, which can be used to pick specific objects in the
scene.
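The last point of the abstract, using semantic segmentation to assign grasp candidates to object classes, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each grasp candidate is reduced to a pixel-space centre and a confidence score, and that the segmentation is a per-pixel grid of class labels with 0 as background. The names `Grasp`, `assign_grasps_to_classes`, and `best_grasp_for_class` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Grasp(NamedTuple):
    # Hypothetical grasp candidate: centre in pixel coordinates plus a score.
    x: int
    y: int
    score: float


def assign_grasps_to_classes(grasps, seg_mask, background=0):
    """Group grasp candidates by the class label under each grasp centre."""
    by_class = defaultdict(list)
    height, width = len(seg_mask), len(seg_mask[0])
    for g in grasps:
        if not (0 <= g.y < height and 0 <= g.x < width):
            continue  # centre lies outside the image
        label = seg_mask[g.y][g.x]
        if label != background:
            by_class[label].append(g)
    return dict(by_class)


def best_grasp_for_class(grasps, seg_mask, target_class):
    """Return the highest-scoring grasp on the requested class, or None."""
    candidates = assign_grasps_to_classes(grasps, seg_mask).get(target_class, [])
    return max(candidates, key=lambda g: g.score, default=None)


# Toy 4x4 mask: 0 = background, 1 and 2 are two hypothetical object classes.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
]
grasps = [Grasp(1, 1, 0.6), Grasp(2, 1, 0.9), Grasp(3, 3, 0.8), Grasp(0, 0, 0.95)]
best = best_grasp_for_class(grasps, mask, target_class=1)
```

Looking up only the centre pixel is the simplest possible assignment rule. A more robust variant might take a majority vote over the pixels covered by the grasp rectangle, but that is likewise an assumption and not something the abstract specifies.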
Related papers
- Self-supervised Pre-training for Semantic Segmentation in an Indoor
Scene [8.357801312689622]
We propose RegConsist, a method for self-supervised pre-training of a semantic segmentation model.
We use a variant of contrastive learning to train a DCNN model for predicting semantic segmentation from RGB views in the target environment.
The proposed method outperforms models pre-trained on ImageNet and achieves competitive performance when using models that are trained for exactly the same task but on a different dataset.
arXiv Detail & Related papers (2022-10-04T20:10:14Z)
- Depth-aware Object Segmentation and Grasp Detection for Robotic Picking
Tasks [13.337131101813934]
We present a novel deep neural network architecture for joint class-agnostic object segmentation and grasp detection for robotic picking tasks.
We introduce depth-aware Coordinate Convolution (CoordConv), a method to increase accuracy for point proposal based object instance segmentation.
We evaluate the accuracy of grasp detection and instance segmentation on challenging robotic picking datasets, namely Siléane and OCID_grasp.
arXiv Detail & Related papers (2021-11-22T11:06:33Z)
- Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z)
- DeepSatData: Building large scale datasets of satellite images for
training machine learning models [77.17638664503215]
This report presents design considerations for automatically generating satellite imagery datasets for training machine learning models.
We discuss issues faced from the point of view of deep neural network training and evaluation.
arXiv Detail & Related papers (2021-04-28T15:13:12Z)
- Domain Adaptive Semantic Segmentation with Self-Supervised Depth
Estimation [84.34227665232281]
Domain adaptation for semantic segmentation aims to improve model performance in the presence of a distribution shift between the source and target domains.
We leverage the guidance from self-supervised depth estimation, which is available on both domains, to bridge the domain gap.
We demonstrate the effectiveness of our proposed approach on the benchmark tasks SYNTHIA-to-Cityscapes and GTA-to-Cityscapes.
arXiv Detail & Related papers (2021-04-28T07:47:36Z)
- Joint Object Contour Points and Semantics for Instance Segmentation [1.2117737635879038]
We propose Mask Point R-CNN aiming at promoting the neural network's attention to the object boundary.
Specifically, we innovatively extend the original human keypoint detection task to the contour point detection of any object.
As a consequence, the model will be more sensitive to the edges of the object and can capture more geometric features.
arXiv Detail & Related papers (2020-08-02T11:11:28Z)
- Track Seeding and Labelling with Embedded-space Graph Neural Networks [3.5236955190576693]
The Exa.TrkX project is investigating machine learning approaches to particle track reconstruction.
The most promising of these solutions, graph neural networks (GNN), process the event as a graph that connects track measurements.
We report updates on the state-of-the-art architectures for this task.
arXiv Detail & Related papers (2020-06-30T23:43:28Z)
- FAIRS -- Soft Focus Generator and Attention for Robust Object
Segmentation from Extreme Points [70.65563691392987]
We present a new approach to generate object segmentation from user inputs in the form of extreme points and corrective clicks.
We demonstrate our method's ability to generate high-quality training data as well as its scalability in incorporating extreme points, guiding clicks, and corrective clicks in a principled manner.
arXiv Detail & Related papers (2020-04-04T22:25:47Z)
- Refined Plane Segmentation for Cuboid-Shaped Objects by Leveraging Edge
Detection [63.942632088208505]
We propose a post-processing algorithm to align the segmented plane masks with edges detected in the image.
This allows us to increase the accuracy of state-of-the-art approaches, while limiting ourselves to cuboid-shaped objects.
arXiv Detail & Related papers (2020-03-28T18:51:43Z)
- Gated Path Selection Network for Semantic Segmentation [72.44994579325822]
We develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields.
In GPSNet, we first design a two-dimensional multi-scale network - SuperNet, which densely incorporates features from growing receptive fields.
To dynamically select desirable semantic context, a gate prediction module is further introduced.
arXiv Detail & Related papers (2020-01-19T12:32:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.