"What's This?" -- Learning to Segment Unknown Objects from Manipulation
Sequences
- URL: http://arxiv.org/abs/2011.03279v2
- Date: Thu, 17 Jun 2021 09:00:03 GMT
- Title: "What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences
- Authors: Wout Boerdijk, Martin Sundermeyer, Maximilian Durner, Rudolph Triebel
- Abstract summary: We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator.
We propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge.
Our method neither depends on any visual registration of a kinematic robot or 3D object models, nor on precise hand-eye calibration or any additional sensor data.
- Score: 27.915309216800125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel framework for self-supervised grasped object segmentation
with a robotic manipulator. Our method successively learns an agnostic
foreground segmentation followed by a distinction between manipulator and
object solely by observing the motion between consecutive RGB frames. In
contrast to previous approaches, we propose a single, end-to-end trainable
architecture which jointly incorporates motion cues and semantic knowledge.
Furthermore, while the motion of the manipulator and the object are substantial
cues for our algorithm, we present means to robustly deal with distracting
objects moving in the background, as well as with completely static scenes. Our
method neither depends on any visual registration of a kinematic robot or 3D
object models, nor on precise hand-eye calibration or any additional sensor
data. Through extensive experimental evaluation, we demonstrate the superiority
of our framework and provide detailed insights into its ability to deal with
the aforementioned extreme cases of motion. We also show that training a
semantic segmentation network with the automatically labeled data achieves
results on par with manually annotated training data. Code and pretrained model
are available at https://github.com/DLR-RM/DistinctNet.
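The core supervisory signal above is the motion between consecutive RGB frames. As a minimal sketch of that cue (not the paper's actual end-to-end network), the following thresholds dense optical flow into a binary foreground mask, the kind of agnostic motion pseudo-label such a pipeline can bootstrap from; the Farneback parameters, `mag_thresh`, and the morphology step are illustrative assumptions.

```python
import cv2
import numpy as np

def motion_pseudo_labels(frames, mag_thresh=1.5):
    """Yield binary foreground masks for consecutive RGB frame pairs.

    Hypothetical sketch: DistinctNet trains a network end-to-end; this
    only illustrates the underlying motion cue via dense optical flow.
    `mag_thresh` (pixels/frame) is an illustrative assumption.
    """
    kernel = np.ones((5, 5), np.uint8)
    prev_gray = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        magnitude = np.linalg.norm(flow, axis=2)
        mask = (magnitude > mag_thresh).astype(np.uint8)
        # Morphological opening suppresses small, spurious motion blobs.
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
        prev_gray = gray
        yield mask
```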
Related papers
- Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking [59.87033229815062]
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered.
Previous research employed interactive perception for manipulating articulated objects, but such open-loop approaches often overlook the interaction dynamics.
We present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds.
arXiv Detail & Related papers (2024-09-24T17:59:56Z)
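As a rough illustration of online axis estimation from segmented 3D point clouds (a generic Kabsch/SVD fit, not necessarily the paper's exact method): given the same articulated part observed at two time steps, the rigid rotation between the clouds yields the articulation axis via its axis-angle form. All names here are hypothetical.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def estimate_rotation_axis(pts_t0, pts_t1):
    """Estimate an articulation axis from two (N, 3) point clouds of the
    same segmented part, observed before and after a small rotation.

    Kabsch algorithm: find R minimizing ||R @ p0 - p1|| over centered
    points, then read the axis off the axis-angle form of R.
    """
    c0, c1 = pts_t0.mean(axis=0), pts_t1.mean(axis=0)
    H = (pts_t0 - c0).T @ (pts_t1 - c1)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    rotvec = Rotation.from_matrix(R).as_rotvec()  # axis * angle
    angle = np.linalg.norm(rotvec)
    axis = rotvec / angle if angle > 1e-8 else np.zeros(3)
    return axis, angle
```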
- Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction (MPI).
The experimental results demonstrate that MPI achieves improvements of 10% to 64% over the previous state of the art on real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
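A minimal sketch of the classic motion-based pseudo-labeling baseline that SeMoLi re-thinks: threshold scene-flow magnitude to find moving points, then cluster them spatially. The thresholds and the use of DBSCAN are assumptions for illustration; SeMoLi replaces this heuristic grouping with a learned, data-driven one.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def pseudo_label_moving_points(points, scene_flow,
                               speed_thresh=0.5, eps=0.8, min_samples=10):
    """Cluster moving LiDAR points into pseudo-instances.

    points: (N, 3) positions; scene_flow: (N, 3) per-point motion vectors.
    Keep points whose flow magnitude exceeds `speed_thresh` (m/frame,
    assumed), then group them spatially. Returns per-point labels with
    -1 for background or noise.
    """
    speed = np.linalg.norm(scene_flow, axis=1)
    moving = speed > speed_thresh
    labels = np.full(len(points), -1, dtype=int)
    if moving.any():
        labels[moving] = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(
            points[moving])
    return labels
```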
- Event-Free Moving Object Segmentation from Moving Ego Vehicle [88.33470650615162]
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving.
Most segmentation methods leverage motion cues obtained from optical flow maps.
We propose to exploit event cameras, which provide rich motion cues without relying on optical flow, for better video understanding.
arXiv Detail & Related papers (2023-04-28T23:43:10Z)
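To make the event-camera motion cue concrete, here is a hypothetical sketch that accumulates raw events into a per-pixel count and thresholds it into a coarse moving-structure mask. It ignores ego-motion compensation, which a moving ego vehicle would require, and the event format and threshold are assumptions; the paper learns segmentation from such cues rather than thresholding them.

```python
import numpy as np

def event_motion_mask(events, height, width, count_thresh=3):
    """Build a coarse moving-structure mask from raw events.

    events: (N, 4) array of (x, y, t, polarity), a common event-camera
    format (assumed here). Pixels that fire many events in a time window
    tend to lie on moving edges; this sketch simply thresholds the
    per-pixel event count, with no ego-motion compensation.
    """
    counts = np.zeros((height, width), dtype=np.int32)
    np.add.at(counts, (events[:, 1].astype(int), events[:, 0].astype(int)), 1)
    return counts >= count_thresh
```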
- Instance Segmentation with Cross-Modal Consistency [13.524441194366544]
We introduce a novel approach to instance segmentation that jointly leverages measurements from multiple sensor modalities.
Our technique applies contrastive learning to points in the scene both across sensor modalities and the temporal domain.
We demonstrate that this formulation encourages the models to learn embeddings that are invariant to viewpoint variations.
arXiv Detail & Related papers (2022-10-14T21:17:19Z)
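A hedged sketch of the contrastive idea described above: a symmetric InfoNCE loss that pulls together embeddings of the same scene point seen by two modalities (and, analogously, across time). The pairing strategy and loss form here are standard assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cross_modal_info_nce(feat_a, feat_b, temperature=0.07):
    """Symmetric InfoNCE over per-point embeddings from two modalities.

    feat_a, feat_b: (N, D) embeddings of the same N scene points, e.g.
    from a camera branch and a LiDAR branch (names illustrative). Row i
    of each tensor is a positive pair; all other rows act as negatives.
    """
    a = F.normalize(feat_a, dim=1)
    b = F.normalize(feat_b, dim=1)
    logits = a @ b.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    # Symmetrize so both modalities are pulled toward each other.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```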
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- A System for Traded Control Teleoperation of Manipulation Tasks using Intent Prediction from Hand Gestures [20.120263332724438]
This paper presents a teleoperation system that includes robot perception and intent prediction from hand gestures.
The perception module identifies the objects present in the robot workspace, and the intent prediction module infers which object the user likely wants to grasp.
arXiv Detail & Related papers (2021-07-05T07:37:17Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
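To pin down the quantity DyStaB minimizes, here is a plain histogram estimate of mutual information between two paired discrete variables, e.g. a flattened segment-label map and a quantized motion map. DyStaB's actual objective is a differentiable variant optimized over segment boundaries; this shows only the underlying measure.

```python
import numpy as np

def mutual_information(labels_a, labels_b, n_bins=16):
    """Histogram estimate of I(A; B) in nats for paired 1-D samples.

    Illustrative only: pass, e.g., a flattened segment-label map as
    labels_a and a quantized optical-flow map as labels_b.
    """
    joint, _, _ = np.histogram2d(labels_a, labels_b, bins=n_bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                                  # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```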
- A Deep Learning Approach to Object Affordance Segmentation [31.221897360610114]
We design an autoencoder that infers pixel-wise affordance labels in both videos and static images.
Our model eliminates the need for object labels and bounding boxes by using a soft-attention mechanism.
We show that our model achieves competitive results compared to strongly supervised methods on SOR3D-AFF.
arXiv Detail & Related papers (2020-04-18T15:34:41Z)
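A minimal sketch of a spatial soft-attention gate of the kind the affordance paper describes: the network predicts a per-pixel weight from its own features and rescales them, so no object labels or boxes are needed. The layer design is an illustrative assumption, not the paper's exact module.

```python
import torch
import torch.nn as nn

class SoftSpatialAttention(nn.Module):
    """Minimal spatial soft-attention gate for an encoder-decoder.

    Predicts a per-pixel weight in [0, 1] from the feature map and
    rescales the features with it, letting the decoder focus on likely
    object regions without labels or boxes. Sizes are illustrative.
    """

    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):                    # feats: (B, C, H, W)
        attn = torch.sigmoid(self.score(feats))  # (B, 1, H, W) soft mask
        return feats * attn, attn
```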
- Self-Supervised Object-in-Gripper Segmentation from Robotic Motions [27.915309216800125]
We propose a robust solution for learning to segment unknown objects grasped by a robot.
We exploit motion and temporal cues in RGB video sequences.
Our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data.
arXiv Detail & Related papers (2020-02-11T15:44:46Z)