Related papers: Self-Supervised Object-in-Gripper Segmentation from Robotic Motions

URL: http://arxiv.org/abs/2002.04487v3
Date: Fri, 6 Nov 2020 10:31:16 GMT
Title: Self-Supervised Object-in-Gripper Segmentation from Robotic Motions
Authors: Wout Boerdijk, Martin Sundermeyer, Maximilian Durner and Rudolph Triebel
Abstract summary: We propose a robust solution for learning to segment unknown objects grasped by a robot. We exploit motion and temporal cues in RGB video sequences. Our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data.
Score: 27.915309216800125
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate object segmentation is a crucial task in the context of robotic manipulation. However, creating sufficient annotated training data for neural networks is particularly time-consuming and often requires manual labeling. To this end, we propose a simple yet robust solution for learning to segment unknown objects grasped by a robot. Specifically, we exploit motion and temporal cues in RGB video sequences. Using optical flow estimation, we first learn to predict segmentation masks of our given manipulator. Then, these annotations are used in combination with motion cues to automatically distinguish between background, manipulator and unknown, grasped object. In contrast to existing systems, our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data. We perform a thorough comparison with alternative baselines and approaches from the literature. The object masks and views are shown to be suitable training data for segmentation networks that generalize to novel environments and also allow for watertight 3D reconstruction.
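The three-way labeling idea in the abstract (background, manipulator, grasped object) can be illustrated with a toy sketch. This is not the authors' pipeline, which trains networks on these cues. It only assumes that a per-pixel optical flow magnitude and a binary manipulator mask are already available, and it applies a simple rule: pixels in the mask are manipulator, remaining pixels that move are object, and everything else is background. The function name `label_pixels`, the label constants, and the `motion_threshold` parameter are hypothetical names chosen for illustration.

```python
# Toy illustration only: the names and the fixed threshold are assumptions,
# not part of the paper's method.
BACKGROUND, MANIPULATOR, OBJECT = 0, 1, 2


def label_pixels(flow_mag, manipulator_mask, motion_threshold=1.0):
    """Assign each pixel a label from flow magnitude and a manipulator mask.

    flow_mag: 2D list of optical flow magnitudes (pixels per frame).
    manipulator_mask: 2D list of booleans, True where the manipulator is.
    motion_threshold: magnitude above which a pixel counts as moving.
    """
    labels = []
    for mag_row, mask_row in zip(flow_mag, manipulator_mask):
        row = []
        for mag, is_manipulator in zip(mag_row, mask_row):
            if is_manipulator:
                row.append(MANIPULATOR)
            elif mag > motion_threshold:
                # Moving, but not part of the manipulator: grasped object.
                row.append(OBJECT)
            else:
                row.append(BACKGROUND)
        labels.append(row)
    return labels


# A 3x3 example: the left column of rows 1-2 is the gripper,
# and the middle column moves along with it.
flow_mag = [
    [0.1, 0.2, 0.0],
    [3.0, 2.5, 0.1],
    [2.8, 2.9, 0.3],
]
manipulator_mask = [
    [False, False, False],
    [True, False, False],
    [True, False, False],
]
labels = label_pixels(flow_mag, manipulator_mask)
```

In this sketch the object label is simply "moving and not manipulator", which is why a reliable manipulator mask has to be obtained first, matching the order of steps in the abstract.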

Related papers

Volumetric Mapping with Panoptic Refinement via Kernel Density Estimation for Mobile Robots [2.8668675011182967]
Mobile robots usually use lightweight networks to segment objects on RGB images and then localize them via depth maps. We address the problem of panoptic segmentation quality in 3D scene reconstruction by refining segmentation errors using non-parametric statistical methods. We map the predicted masks into a depth frame to estimate their distribution via kernel densities. The outliers in depth perception are then rejected without the need for additional parameters.
arXiv Detail & Related papers (2024-12-15T16:46:23Z)
Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking [59.87033229815062]
Articulated object manipulation requires precise object interaction, where the object's axis must be carefully considered. Previous research employed interactive perception for manipulating articulated objects, but such approaches are typically open-loop and often overlook the interaction dynamics. We present a closed-loop pipeline integrating interactive perception with online axis estimation from segmented 3D point clouds.
arXiv Detail & Related papers (2024-09-24T17:59:56Z)
LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes. We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net). The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z)
Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos [11.40098981859033]
This work proposes a self-supervised learning system for segmenting rigid objects in RGB images. The proposed pipeline is trained on unlabeled RGB-D videos of static objects, which can be captured with a camera carried by a mobile robot.
arXiv Detail & Related papers (2023-04-09T23:13:39Z)
Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters. We propose a graph neural network to learn the mapping between hierarchical part-level segmentation and mobile part parameters. The network predictions yield a large number of 3D objects with pseudo-labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds. The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled. The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications [57.87136703404356]
Dense Object Nets (DONs) by Florence, Manuelli and Tedrake introduced dense object descriptors as a novel visual object representation for the robotics community. In this paper we show that given a 3D model of an object, we can generate its descriptor space image, which allows for supervised training of DONs. We compare the training methods on generating 6D grasps for industrial objects and show that our novel supervised training approach improves the pick-and-place performance in industry-relevant tasks.
arXiv Detail & Related papers (2021-02-16T11:40:12Z)
Rapid Pose Label Generation through Sparse Representation of Unknown Objects [7.32172860877574]
This work presents an approach for rapidly generating real-world, pose-annotated RGB-D data for unknown objects. We first source minimalistic labelings of an ordered set of arbitrarily chosen keypoints over a set of RGB-D videos. By solving an optimization problem, we combine these labels under a world frame to recover a sparse, keypoint-based representation of the object.
arXiv Detail & Related papers (2020-11-07T15:14:03Z)
"What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences [27.915309216800125]
We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator. We propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge. Our method neither depends on any visual registration of a kinematic robot or 3D object models, nor on precise hand-eye calibration or any additional sensor data.
arXiv Detail & Related papers (2020-11-06T10:55:28Z)
DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole. Our method first partitions the motion field by minimizing the mutual information between segments. It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
Self-supervised Transfer Learning for Instance Segmentation through Physical Interaction [25.956451840257916]
We present a transfer learning approach for robots that learn to segment objects by interacting with their environment in a self-supervised manner. Our robot pushes unknown objects on a table and uses information from optical flow to create training labels in the form of object masks. We evaluate our trained network (SelfDeepMask) on a set of real images showing challenging and cluttered scenes with novel objects.
arXiv Detail & Related papers (2020-05-19T14:31:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.