Self-Supervised Instance Segmentation by Grasping
- URL: http://arxiv.org/abs/2305.06305v1
- Date: Wed, 10 May 2023 16:51:36 GMT
- Title: Self-Supervised Instance Segmentation by Grasping
- Authors: YuXuan Liu, Xi Chen, Pieter Abbeel
- Abstract summary: We learn a grasp segmentation model to segment the grasped object from before and after grasp images.
Using the segmented objects, we can "cut" objects from their original scenes and "paste" them into new scenes to generate instance supervision.
We show that our grasp segmentation model provides a 5x error reduction when segmenting grasped objects compared with traditional image subtraction approaches.
- Score: 84.2469669256257
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Instance segmentation is a fundamental skill for many robotic applications.
We propose a self-supervised method that uses grasp interactions to collect
segmentation supervision for an instance segmentation model. When a robot
grasps an item, the mask of that grasped item can be inferred from the images
of the scene before and after the grasp. Leveraging this insight, we learn a
grasp segmentation model to segment the grasped object from before and after
grasp images. Such a model can segment grasped objects from thousands of grasp
interactions without costly human annotation. Using the segmented grasped
objects, we can "cut" objects from their original scenes and "paste" them into
new scenes to generate instance supervision. We show that our grasp
segmentation model provides a 5x error reduction when segmenting grasped
objects compared with traditional image subtraction approaches. Combined with
our "cut-and-paste" generation method, instance segmentation models trained
with our method achieve better performance than a model trained with 10x the
amount of labeled data. On a real robotic grasping system, our instance
segmentation model reduces the rate of grasp errors by over 3x compared to an
image subtraction baseline.
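To make the method concrete, below is a minimal sketch of the image-subtraction baseline and of the "cut-and-paste" generation step. The function names, the threshold, and the assumption of aligned RGB NumPy arrays are illustrative; this is not the authors' implementation.

import numpy as np

def subtraction_mask(before, after, threshold=30.0):
    # Naive baseline: attribute every sufficiently changed pixel to the
    # grasped object. Brittle under lighting changes and disturbed
    # neighbors, which is why a learned grasp segmentation model can
    # reduce error roughly 5x over this approach.
    diff = np.abs(before.astype(np.float32) - after.astype(np.float32))
    return diff.max(axis=-1) > threshold  # boolean HxW mask

def cut_and_paste(obj_img, obj_mask, scene, y, x):
    # Paste a segmented object crop into a new scene at (y, x), producing
    # a synthetic training image and its instance mask. Assumes the crop
    # fits inside the scene bounds.
    h, w = obj_mask.shape
    out = scene.copy()
    out[y:y + h, x:x + w][obj_mask] = obj_img[obj_mask]
    inst = np.zeros(scene.shape[:2], dtype=bool)
    inst[y:y + h, x:x + w] = obj_mask
    return out, inst

In the paper's pipeline the learned grasp segmentation model plays the role of subtraction_mask; the cleaner masks it produces are what make the pasted supervision usable.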
Related papers
- UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation [64.01742988773745]
Privacy concerns are growing around the training of large-scale image segmentation models on unauthorized private data.
We exploit the concept of unlearnable examples, generating and adding unlearnable noise to the original images so that they become unusable for model training.
We empirically verify the effectiveness of UnSeg across 6 mainstream image segmentation tasks, 10 widely used datasets, and 7 different network architectures.
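As a rough illustration of the unlearnable-example recipe described above, the sketch below adds a bounded perturbation to an image. Here noise_fn is a hypothetical stand-in for UnSeg's trained universal noise generator, and the epsilon budget is an assumed parameter.

import numpy as np

def make_unlearnable(image, noise_fn, eps=8.0):
    # Perturb an image so it stops providing useful training signal,
    # keeping the change visually small via an L-infinity budget of eps.
    noise = np.clip(noise_fn(image), -eps, eps)
    poisoned = image.astype(np.float32) + noise
    return np.clip(poisoned, 0, 255).astype(np.uint8)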
arXiv Detail & Related papers (2024-10-13T16:34:46Z)
- Synthetic Instance Segmentation from Semantic Image Segmentation Masks [15.477053085267404]
We propose a novel paradigm called Synthetic Instance Segmentation (SISeg).
SISeg obtains instance segmentation results by leveraging the image masks generated by existing semantic segmentation models.
In other words, the proposed model requires no extra manual annotation and no additional computational cost.
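A minimal stand-in for this idea, assuming each connected component of a semantic class is one instance (the actual SISeg pipeline is more sophisticated), could look like this:

import numpy as np
from scipy import ndimage

def instances_from_semantic(sem_mask):
    # sem_mask: HxW integer class ids, with 0 treated as background.
    # Returns a list of (class_id, boolean instance mask) pairs obtained
    # by splitting each class region into its connected components.
    instances = []
    for cls in np.unique(sem_mask):
        if cls == 0:
            continue
        labeled, n = ndimage.label(sem_mask == cls)
        for i in range(1, n + 1):
            instances.append((int(cls), labeled == i))
    return instances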
arXiv Detail & Related papers (2023-08-02T05:13:02Z)
- Towards Open-World Segmentation of Parts [16.056921233445784]
We propose to explore a class-agnostic part segmentation task.
We argue that models trained without part classes can better localize parts and segment them on objects unseen in training.
Our approach yields notable and consistent gains, marking a critical step towards open-world part segmentation.
arXiv Detail & Related papers (2023-05-26T10:34:58Z)
- Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models [6.408114351192012]
We present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions.
We show results on the task of segmenting four different objects (humans, dogs, cars, birds) and a use case scenario in medical image analysis.
arXiv Detail & Related papers (2022-12-29T13:51:54Z)
- A Generalist Framework for Panoptic Segmentation of Images and Videos [61.61453194912186]
We formulate panoptic segmentation as a discrete data generation problem, without relying on task-specific inductive biases.
A diffusion model is proposed to model panoptic masks, with a simple architecture and generic loss function.
Our method is capable of modeling video (in a streaming setting) and thereby learns to track object instances automatically.
arXiv Detail & Related papers (2022-10-12T16:18:25Z)
- Learning with Free Object Segments for Long-Tailed Instance Segmentation [15.563842274862314]
We find that an abundance of instance segments can potentially be obtained freely from object-centric images.
Motivated by these insights, we propose FreeSeg for extracting and leveraging these "free" object segments.
FreeSeg achieves state-of-the-art accuracy for segmenting rare object categories.
arXiv Detail & Related papers (2022-02-22T19:06:16Z)
- The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos [59.12750806239545]
We show that a video has different views of the same scene related by moving components, and the right region segmentation and region flow would allow mutual view synthesis.
Our model starts with two separate pathways: an appearance pathway that outputs feature-based region segmentation for a single image, and a motion pathway that outputs motion features for a pair of images.
By training the model to minimize view synthesis errors based on segment flow, our appearance and motion pathways learn region segmentation and flow estimation automatically without building them up from low-level edges or optical flows respectively.
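A simplified sketch of that view-synthesis objective: warp one frame toward the other with one flow vector per segment and score the reconstruction. The per-segment constant flow and the function name are simplifications assumed for illustration; the paper learns dense segmentation and flow pathways end-to-end.

import numpy as np
from scipy.ndimage import map_coordinates

def view_synthesis_error(frame1, frame2, segments, seg_flows):
    # segments: HxW integer region labels for frame2.
    # seg_flows: {segment_id: (dx, dy)} displacement from frame1 to frame2.
    f1 = frame1.astype(np.float32)
    f2 = frame2.astype(np.float32)
    h, w = segments.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    # Backward warp: each frame2 pixel samples frame1 at its pre-motion spot.
    for seg_id, (dx, dy) in seg_flows.items():
        m = segments == seg_id
        xs[m] -= dx
        ys[m] -= dy
    warped = np.stack([map_coordinates(f1[..., c], [ys, xs], order=1)
                       for c in range(f1.shape[-1])], axis=-1)
    return float(np.mean((warped - f2) ** 2))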
arXiv Detail & Related papers (2021-11-11T18:59:11Z)
- SOLO: A Simple Framework for Instance Segmentation [84.00519148562606]
"instance categories" assigns categories to each pixel within an instance according to the instance's location.
"SOLO" is a simple, direct, and fast framework for instance segmentation with strong performance.
Our approach achieves state-of-the-art results for instance segmentation in terms of both speed and accuracy.
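The location-to-category mapping can be sketched in a few lines. The single grid size and the center-of-mass rule here are assumed simplifications (SOLO uses multiple grid scales and a center-region assignment):

import numpy as np

def instance_category(mask, grid_size=12):
    # Map a non-empty instance mask (HxW boolean) to the index of the SxS
    # grid cell containing its center of mass; pixels of the instance are
    # then trained to predict this location-derived "instance category".
    ys, xs = np.nonzero(mask)
    h, w = mask.shape
    row = int(ys.mean() / h * grid_size)
    col = int(xs.mean() / w * grid_size)
    return row * grid_size + col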
arXiv Detail & Related papers (2021-06-30T09:56:54Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
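For reference, the generic discrete mutual-information estimator that such a partition objective builds on can be computed from a joint histogram; this is the textbook quantity, not DyStaB's exact training loss.

import numpy as np

def mutual_information(labels_a, labels_b):
    # Empirical MI between two non-negative integer labelings of the same
    # pixels (e.g., candidate motion segments vs. quantized flow directions).
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    na, nb = int(a.max()) + 1, int(b.max()) + 1
    joint, _, _ = np.histogram2d(a, b, bins=(na, nb),
                                 range=[[-0.5, na - 0.5], [-0.5, nb - 0.5]])
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px * py)[nz])))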
arXiv Detail & Related papers (2020-08-16T22:05:13Z)