"What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences
- URL: http://arxiv.org/abs/2011.03279v2
- Date: Thu, 17 Jun 2021 09:00:03 GMT
- Title: "What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences
- Authors: Wout Boerdijk, Martin Sundermeyer, Maximilian Durner, Rudolph Triebel
- Abstract summary: We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator.
We propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge.
Our method neither depends on any visual registration of a kinematic robot or 3D object models, nor on precise hand-eye calibration or any additional sensor data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel framework for self-supervised grasped object segmentation
with a robotic manipulator. Our method successively learns an agnostic
foreground segmentation followed by a distinction between manipulator and
object solely by observing the motion between consecutive RGB frames. In
contrast to previous approaches, we propose a single, end-to-end trainable
architecture which jointly incorporates motion cues and semantic knowledge.
Furthermore, while the motion of the manipulator and the object are substantial
cues for our algorithm, we present means to robustly deal with distraction
objects moving in the background, as well as with completely static scenes. Our
method neither depends on any visual registration of a kinematic robot or 3D
object models, nor on precise hand-eye calibration or any additional sensor
data. By extensive experimental evaluation we demonstrate the superiority of
our framework and provide detailed insights on its capability of dealing with
the aforementioned extreme cases of motion. We also show that training a
semantic segmentation network with the automatically labeled data achieves
results on par with manually annotated training data. Code and pretrained models
are available at https://github.com/DLR-RM/DistinctNet.
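As a rough illustration of what a "motion cue between consecutive frames" can look like, the sketch below thresholds the per-pixel intensity difference of two grayscale frames. This is plain frame differencing, not the learned architecture described in the abstract; the function name `motion_mask`, the `threshold` value, and the toy frames are all assumptions made for this example.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def motion_mask(prev: Frame, curr: Frame, threshold: int = 25) -> List[List[int]]:
    """Mark pixels whose intensity changed by more than `threshold` between frames.

    Hypothetical helper for illustration only; the paper learns its
    segmentation with a network rather than thresholding differences.
    """
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have the same shape")
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


# Toy 4x4 frames: a bright 2x2 block moves one pixel to the right.
frame_a = [
    [0, 0, 0, 0],
    [0, 200, 200, 0],
    [0, 200, 200, 0],
    [0, 0, 0, 0],
]
frame_b = [
    [0, 0, 0, 0],
    [0, 0, 200, 200],
    [0, 0, 200, 200],
    [0, 0, 0, 0],
]

mask = motion_mask(frame_a, frame_b)
for row in mask:
    print(row)
```

In this toy case only the leading and trailing edges of the block are flagged, while the column where the block overlaps itself in both frames shows no change. A completely static scene would produce an empty mask, which is one of the extreme cases the abstract says the method is designed to handle.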
Related papers
- Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction.
The experimental results demonstrate that MPI achieves improvements of 10% to 64% over the previous state of the art on real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- DITTO: Demonstration Imitation by Trajectory Transformation [31.930923345163087]
Teaching robots new skills quickly and conveniently is crucial for the broader adoption of robotic systems.
We address the problem of one-shot imitation from a single human demonstration, given by an RGB-D video recording.
We make the code publicly available at http://ditto.cs.uni-freiburg.de.
arXiv Detail & Related papers (2024-03-22T13:46:51Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We rethink this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the map between hierarchical part-level segmentation and mobile parts parameters.
The network predictions yield a large-scale set of 3D objects with pseudo-labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
- Instance Segmentation with Cross-Modal Consistency [13.524441194366544]
We introduce a novel approach to instance segmentation that jointly leverages measurements from multiple sensor modalities.
Our technique applies contrastive learning to points in the scene both across sensor modalities and the temporal domain.
We demonstrate that this formulation encourages the models to learn embeddings that are invariant to viewpoint variations.
arXiv Detail & Related papers (2022-10-14T21:17:19Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- A System for Traded Control Teleoperation of Manipulation Tasks using Intent Prediction from Hand Gestures [20.120263332724438]
This paper presents a teleoperation system that includes robot perception and intent prediction from hand gestures.
The perception module identifies the objects present in the robot workspace, and the intent prediction module predicts which object the user likely wants to grasp.
arXiv Detail & Related papers (2021-07-05T07:37:17Z)
- 3D Registration for Self-Occluded Objects in Context [66.41922513553367]
We introduce the first deep learning framework capable of effectively handling this scenario.
Our method consists of an instance segmentation module followed by a pose estimation one.
It allows us to perform 3D registration in a one-shot manner, without requiring an expensive iterative procedure.
arXiv Detail & Related papers (2020-11-23T08:05:28Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
- A Deep Learning Approach to Object Affordance Segmentation [31.221897360610114]
We design an autoencoder that infers pixel-wise affordance labels in both videos and static images.
Our model surpasses the need for object labels and bounding boxes by using a soft-attention mechanism.
We show that our model achieves competitive results compared to strongly supervised methods on SOR3D-AFF.
arXiv Detail & Related papers (2020-04-18T15:34:41Z)
- Self-Supervised Object-in-Gripper Segmentation from Robotic Motions [27.915309216800125]
We propose a robust solution for learning to segment unknown objects grasped by a robot.
We exploit motion and temporal cues in RGB video sequences.
Our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data.
arXiv Detail & Related papers (2020-02-11T15:44:46Z)