FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments
- URL: http://arxiv.org/abs/2207.03333v1
- Date: Wed, 6 Jul 2022 05:57:24 GMT
- Title: FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments
- Authors: Jishnu Jaykumar P and Yu-Wei Chao and Yu Xiang
- Abstract summary: We introduce the Few-Shot Object Learning (FewSOL) dataset for object recognition with a few images per object.
We captured 336 real-world objects with 9 RGB-D images per object from different views.
The evaluation results show that there is still large room for improvement in few-shot object classification in robotic environments.
- Score: 21.393674766169543
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce the Few-Shot Object Learning (FewSOL) dataset for object
recognition with a few images per object. We captured 336 real-world objects
with 9 RGB-D images per object from different views. Object segmentation masks,
object poses and object attributes are provided. In addition, synthetic images
generated using 330 3D object models are used to augment the dataset. We
investigated (i) few-shot object classification and (ii) joint object
segmentation and few-shot classification with the state-of-the-art methods for
few-shot learning and meta-learning using our dataset. The evaluation results
show that there is still large room for improvement in few-shot object
classification in robotic environments. Our dataset can be used to study a set
of few-shot object recognition problems such as classification, detection and
segmentation, shape reconstruction, pose estimation, keypoint correspondences
and attribute recognition. The dataset and code are available at
https://irvlutd.github.io/FewSOL.
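To make the dataset structure concrete, here is a minimal loading sketch for a FewSOL-style layout (336 objects, 9 RGB-D views each, plus masks, poses, and attributes). The directory layout, file names, and metadata fields are assumptions for illustration, not the actual release format; see the project page above for the real one.

```python
import json
from pathlib import Path

import numpy as np
from PIL import Image

def load_object_views(root, object_name, num_views=9):
    """Load the RGB-D views, masks, and metadata for one object.

    Hypothetical layout for a FewSOL-style dataset; directory and file
    names here are illustrative assumptions, not the actual release format.
    """
    obj_dir = Path(root) / object_name
    views = []
    for i in range(num_views):
        rgb = np.array(Image.open(obj_dir / f"rgb_{i:02d}.png"))
        depth = np.array(Image.open(obj_dir / f"depth_{i:02d}.png"))  # e.g. 16-bit depth in mm
        mask = np.array(Image.open(obj_dir / f"mask_{i:02d}.png")) > 0
        views.append({"rgb": rgb, "depth": depth, "mask": mask})
    # Assumed per-object metadata: a 6D pose per view plus attribute labels.
    meta = json.loads((obj_dir / "meta.json").read_text())
    return views, meta
```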
Related papers
- PickScan: Object discovery and reconstruction from handheld interactions [99.99566882133179]
We develop an interaction-guided and class-agnostic method to reconstruct 3D representations of scenes.
Our main contribution is a novel approach to detecting user-object interactions and extracting the masks of manipulated objects.
Compared to Co-Fusion, the only comparable interaction-based and class-agnostic baseline, our method reduces chamfer distance by 73%.
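For reference, the chamfer distance behind that 73% figure is the symmetric average nearest-neighbor distance between two point clouds; a generic sketch of the metric (not PickScan's evaluation code):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric chamfer distance between point clouds a (N, 3) and b (M, 3)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M) pairwise
    # Average nearest-neighbor distance in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

Under this metric, a 73% reduction means the average nearest-neighbor error shrinks to roughly a quarter of the baseline's.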
arXiv Detail & Related papers (2024-11-17T23:09:08Z) - LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping [9.690844449175948]
We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose yields 99.65% grasp accuracy when using ground-truth grasp candidates.
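Given those three inputs, one simple way to evaluate a candidate pose is the reprojection error of the CAD model across all views; the sketch below illustrates that generic idea (LocaliseBot itself uses differentiable rendering, and all names here are assumptions):

```python
import numpy as np

def reprojection_error(pose, cad_points, views):
    """Mean reprojection error of a candidate object pose across calibrated views.

    pose: (4, 4) object-to-world transform under test.
    cad_points: (N, 3) points sampled from the object's CAD model.
    views: list of (K, cam_from_world, observed_2d) with intrinsics (3, 3),
    extrinsics (4, 4), and matched 2D points (N, 2).
    """
    pts_h = np.c_[cad_points, np.ones(len(cad_points))]   # (N, 4) homogeneous
    world = (pose @ pts_h.T).T                            # object -> world
    err = 0.0
    for K, cam_from_world, observed_2d in views:
        cam = (cam_from_world @ world.T).T[:, :3]         # world -> camera
        proj = (K @ cam.T).T
        proj = proj[:, :2] / proj[:, 2:3]                 # perspective divide
        err += np.linalg.norm(proj - observed_2d, axis=1).mean()
    return err / len(views)
```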
arXiv Detail & Related papers (2023-11-14T14:27:53Z) - DoUnseen: Tuning-Free Class-Adaptive Object Detection of Unseen Objects for Robotic Grasping [1.6317061277457001]
We develop an object detector that requires no fine-tuning and can add any object as a class just by capturing a few images of the object.
We evaluate our class-adaptive object detector on unseen datasets and compare it to a trained Mask R-CNN on those datasets.
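One common way to realize "add a class from a few images with no fine-tuning" is a nearest-neighbor gallery over frozen embeddings; the sketch below illustrates that general idea, not DoUnseen's exact pipeline:

```python
import numpy as np

class GalleryClassifier:
    """Classify detected object crops against a gallery of few-shot examples.

    `embed` is any frozen image-embedding function (e.g. a pretrained
    backbone); it is a stand-in here, not DoUnseen's actual network.
    """
    def __init__(self, embed):
        self.embed = embed
        self.prototypes = {}  # class name -> mean embedding

    def add_class(self, name, images):
        # A new class is added just by embedding its few example images.
        feats = np.stack([self.embed(img) for img in images])
        feats /= np.linalg.norm(feats, axis=1, keepdims=True)
        self.prototypes[name] = feats.mean(axis=0)

    def classify(self, crop):
        q = self.embed(crop)
        q /= np.linalg.norm(q)
        scores = {n: float(q @ p) for n, p in self.prototypes.items()}
        return max(scores, key=scores.get)
```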
arXiv Detail & Related papers (2023-04-06T02:45:39Z) - SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments [67.34330257205525]
In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner.
We present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments.
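Learning a category-agnostic "objectness" score amounts to per-pixel binary classification; a minimal PyTorch-style sketch under that assumption (the backbone and training details are placeholders, not SupeRGB-D's design):

```python
import torch.nn as nn

class ObjectnessHead(nn.Module):
    """Per-pixel objectness head: dense features in, object-vs-background out."""
    def __init__(self, in_channels=64):
        super().__init__()
        self.head = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, feats):       # feats: (B, C, H, W) from any backbone
        return self.head(feats)     # logits: (B, 1, H, W)

# Trained with binary cross-entropy against class-agnostic object masks,
# so it can fire on unseen categories at test time.
criterion = nn.BCEWithLogitsLoss()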
arXiv Detail & Related papers (2022-12-22T17:59:48Z) - MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render-and-compare strategy that can be applied to novel objects.
Second, we introduce a coarse pose estimation approach that leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
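The coarse stage thus amounts to rendering the object at many candidate poses and keeping the one the classifier judges refinable; schematically (the renderer and scorer below are hypothetical stand-ins, not MegaPose's actual API):

```python
import numpy as np

def coarse_pose(render, score_refinable, observed_image, candidate_poses):
    """Pick the candidate pose whose rendering the classifier deems correctable.

    `render(pose)` and `score_refinable(rendering, image)` stand in for
    MegaPose's renderer and trained classifier.
    """
    scores = [score_refinable(render(p), observed_image) for p in candidate_poses]
    return candidate_poses[int(np.argmax(scores))]
```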
arXiv Detail & Related papers (2022-12-13T19:30:03Z) - Automatic dataset generation for specific object detection [6.346581421948067]
We present a method to synthesize object-in-scene images that preserves the objects' detailed features without introducing irrelevant information.
Our results show that object boundaries in the synthesized images blend well with the background.
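Smooth boundary blending of a cut-out object onto a background is typically done with a soft alpha mask; a minimal compositing sketch (a generic technique, not the paper's exact pipeline):

```python
import numpy as np
import cv2

def composite(background, obj_rgb, obj_mask, top_left):
    """Paste obj_rgb onto background with a feathered alpha mask."""
    # Feather the binary mask so object boundaries blend into the background.
    alpha = cv2.GaussianBlur(obj_mask.astype(np.float32), (7, 7), 0)[..., None]
    y, x = top_left
    h, w = obj_rgb.shape[:2]
    roi = background[y:y + h, x:x + w].astype(np.float32)
    background[y:y + h, x:x + w] = (alpha * obj_rgb + (1 - alpha) * roi).astype(np.uint8)
    return background
```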
arXiv Detail & Related papers (2022-07-16T07:44:33Z) - Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
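Replacing one-hot classifier weights with fixed semantic class embeddings turns classification into similarity scoring; a minimal sketch, with the embedding matrix assumed to come from a language model or knowledge graph (not the paper's exact head):

```python
import torch.nn.functional as F

def embedding_logits(visual_feats, class_embeds, temperature=0.07):
    """Contrastive classification: cosine similarity to fixed class embeddings.

    visual_feats: (B, D) detector region features.
    class_embeds: (C, D) precomputed semantic embeddings.
    """
    v = F.normalize(visual_feats, dim=-1)
    c = F.normalize(class_embeds, dim=-1)
    return v @ c.t() / temperature   # (B, C) logits for cross-entropy
```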
arXiv Detail & Related papers (2021-12-21T17:10:21Z) - Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z) - Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation [67.88276573341734]
We propose a new method for unseen object instance segmentation by learning RGB-D feature embeddings from synthetic data.
A metric learning loss function is used to learn pixel-wise feature embeddings.
We further improve the segmentation accuracy with a new two-stage clustering algorithm.
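Such a metric learning objective typically pulls the embeddings of pixels on the same instance toward their mean and pushes different instances' means apart; a simplified sketch under that assumption (not the paper's exact loss):

```python
import torch

def pixel_metric_loss(embeddings, instance_ids, margin=0.5):
    """embeddings: (N, D) per-pixel features; instance_ids: (N,) labels."""
    means, pull = [], 0.0
    for k in instance_ids.unique():
        feats = embeddings[instance_ids == k]           # pixels of one instance
        mu = feats.mean(dim=0)
        means.append(mu)
        pull += ((feats - mu).norm(dim=1) ** 2).mean()  # pull pixels to their mean
    means = torch.stack(means)
    # Push different instance means at least `margin` apart.
    d = torch.cdist(means, means)
    push = torch.clamp(margin - d, min=0).triu(diagonal=1).pow(2).sum()
    return pull / len(means) + push
```

Clustering the resulting embeddings (the paper uses a two-stage algorithm) then yields the instance masks.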
arXiv Detail & Related papers (2020-07-30T00:23:07Z) - Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images [30.240630713652035]
Recent methods for 6D pose estimation of objects assume either textured 3D models or real images that cover the entire range of target poses.
This paper proposes a method, Neural Object Learning (NOL), that creates synthetic images of objects in arbitrary poses by combining only a few observations from cluttered images.
arXiv Detail & Related papers (2020-05-07T19:33:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.