Self-Supervised Object-in-Gripper Segmentation from Robotic Motions
- URL: http://arxiv.org/abs/2002.04487v3
- Date: Fri, 6 Nov 2020 10:31:16 GMT
- Title: Self-Supervised Object-in-Gripper Segmentation from Robotic Motions
- Authors: Wout Boerdijk, Martin Sundermeyer, Maximilian Durner and Rudolph Triebel
- Abstract summary: We propose a robust solution for learning to segment unknown objects grasped by a robot.
We exploit motion and temporal cues in RGB video sequences.
Our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data.
- Score: 27.915309216800125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate object segmentation is a crucial task in the context of robotic
manipulation. However, creating sufficient annotated training data for neural
networks is particularly time-consuming and often requires manual labeling. To
this end, we propose a simple, yet robust solution for learning to segment
unknown objects grasped by a robot. Specifically, we exploit motion and
temporal cues in RGB video sequences. Using optical flow estimation we first
learn to predict segmentation masks of our given manipulator. Then, these
annotations are used in combination with motion cues to automatically
distinguish between background, manipulator and unknown, grasped object. In
contrast to existing systems our approach is fully self-supervised and
independent of precise camera calibration, 3D models or potentially imperfect
depth data. We perform a thorough comparison with alternative baselines and
approaches from the literature. The object masks and views are shown to be suitable
training data for segmentation networks that generalize to novel environments
and also allow for watertight 3D reconstruction.
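The labeling step described in the abstract (combining manipulator masks with motion cues to separate background, manipulator and grasped object) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper learns the manipulator masks with a network, whereas the sketch below assumes a dense optical flow field and a manipulator mask are already given and uses a fixed threshold on flow magnitude. The function name `label_from_motion`, the `motion_threshold` parameter and the label constants are hypothetical names chosen for illustration.

```python
import numpy as np

# Hypothetical label ids, chosen for this illustration only.
BACKGROUND, MANIPULATOR, OBJECT = 0, 1, 2


def label_from_motion(flow, manipulator_mask, motion_threshold=1.0):
    """Assign a coarse label to every pixel from motion and a manipulator mask.

    flow: (H, W, 2) array of per-pixel displacements between two frames.
    manipulator_mask: (H, W) boolean array, True where the manipulator is.
    motion_threshold: assumed cutoff (in pixels) separating moving from static.
    """
    magnitude = np.linalg.norm(flow, axis=-1)
    moving = magnitude > motion_threshold

    # Static pixels are treated as background.
    labels = np.full(flow.shape[:2], BACKGROUND, dtype=np.uint8)
    # Moving pixels covered by the manipulator mask belong to the manipulator.
    labels[moving & manipulator_mask] = MANIPULATOR
    # Moving pixels outside the manipulator mask are attributed to the object.
    labels[moving & ~manipulator_mask] = OBJECT
    return labels


# Synthetic example: a 4x4 region moves; its left half is the gripper.
flow = np.zeros((6, 8, 2), dtype=np.float32)
flow[1:5, 2:6] = (3.0, 0.0)
manipulator_mask = np.zeros((6, 8), dtype=bool)
manipulator_mask[1:5, 2:4] = True

labels = label_from_motion(flow, manipulator_mask)
```

A fixed threshold is only a stand-in here; real flow estimates are noisy, which is presumably one reason the paper trains segmentation networks on the resulting masks rather than using them directly.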
Related papers
- Appearance-based Refinement for Object-Centric Motion Segmentation [95.80420062679104]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a simple selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTubeVOS, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos [11.40098981859033]
This work proposes a self-supervised learning system for segmenting rigid objects in RGB images.
The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot.
arXiv Detail & Related papers (2023-04-09T23:13:39Z)
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the map between hierarchical part-level segmentation and mobile parts parameters.
The network predictions yield a large-scale collection of 3D objects with pseudo-labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Learning To Segment Dominant Object Motion From Watching Videos [72.57852930273256]
We envision a simple framework for dominant moving object segmentation that neither requires annotated data to train nor relies on saliency priors or pre-trained optical flow maps.
Inspired by a layered image representation, we introduce a technique to group pixel regions according to their affine parametric motion.
This enables our network to learn segmentation of the dominant foreground object using only RGB image pairs as input for both training and inference.
arXiv Detail & Related papers (2021-11-28T14:51:00Z)
- Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications [57.87136703404356]
Dense Object Nets (DONs) by Florence, Manuelli and Tedrake introduced dense object descriptors as a novel visual object representation for the robotics community.
In this paper we show that given a 3D model of an object, we can generate its descriptor space image, which allows for supervised training of DONs.
We compare the training methods on generating 6D grasps for industrial objects and show that our novel supervised training approach improves the pick-and-place performance in industry-relevant tasks.
arXiv Detail & Related papers (2021-02-16T11:40:12Z)
- Rapid Pose Label Generation through Sparse Representation of Unknown Objects [7.32172860877574]
This work presents an approach for rapidly generating real-world, pose-annotated RGB-D data for unknown objects.
We first source minimalistic labelings of an ordered set of arbitrarily chosen keypoints over a set of RGB-D videos.
By solving an optimization problem, we combine these labels under a world frame to recover a sparse, keypoint-based representation of the object.
arXiv Detail & Related papers (2020-11-07T15:14:03Z)
- "What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences [27.915309216800125]
We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator.
We propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge.
Our method depends neither on any visual registration of a kinematic robot or 3D object models, nor on precise hand-eye calibration or any additional sensor data.
arXiv Detail & Related papers (2020-11-06T10:55:28Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
- Self-supervised Transfer Learning for Instance Segmentation through Physical Interaction [25.956451840257916]
We present a transfer learning approach for robots that learn to segment objects by interacting with their environment in a self-supervised manner.
Our robot pushes unknown objects on a table and uses information from optical flow to create training labels in the form of object masks.
We evaluate our trained network (SelfDeepMask) on a set of real images showing challenging and cluttered scenes with novel objects.
arXiv Detail & Related papers (2020-05-19T14:31:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.