LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping
- URL: http://arxiv.org/abs/2311.08438v1
- Date: Tue, 14 Nov 2023 14:27:53 GMT
- Title: LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping
- Authors: Sujal Vijayaraghavan, Redwan Alqasemi, Rajiv Dubey, Sudeep Sarkar
- Abstract summary: We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates.
- Score: 9.690844449175948
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Robot grasping typically follows five stages: object detection, object
localisation, object pose estimation, grasp pose estimation, and grasp
planning. We focus on object pose estimation. Our approach relies on three
pieces of information: multiple views of the object, the camera's extrinsic
parameters at those viewpoints, and 3D CAD models of objects. The first step
involves a standard deep learning backbone (FCN ResNet) to estimate the object
label, semantic segmentation, and a coarse estimate of the object pose with
respect to the camera. Our novelty lies in a refinement module that starts
from the coarse pose estimate and refines it by optimisation through
differentiable rendering. This is a purely vision-based approach that avoids
the need for other information such as point clouds or depth images. We evaluate
our object pose estimation approach on the ShapeNet dataset and show
improvements over the state of the art. We also show that the estimated object
pose results in 99.65% grasp accuracy with the ground truth grasp candidates on
the Object Clutter Indoor Dataset (OCID) Grasp dataset, as computed using
standard practice.
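The refinement module admits a compact picture: treat the object pose as the only free parameter, render the CAD model into each view with the known camera extrinsics, and descend the gradient of a mask discrepancy loss. Below is a minimal sketch of this idea, assuming PyTorch3D (its soft silhouette renderer and so3_exp_map); the mesh path, camera extrinsics, target masks, coarse pose initialisation, loss, and optimiser settings are all illustrative placeholders, not the paper's actual implementation.

```python
# Minimal sketch: multi-view pose refinement by differentiable rendering.
# Assumes PyTorch3D; all inputs below are hypothetical placeholders.
import torch
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.renderer import (
    FoVPerspectiveCameras, MeshRasterizer, MeshRenderer,
    RasterizationSettings, SoftSilhouetteShader,
)
from pytorch3d.transforms import so3_exp_map

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 3D CAD model of the detected object (path is a placeholder).
mesh = load_objs_as_meshes(["object.obj"], device=device)

V, H = 4, 256                                            # views, image size
R_cam = torch.eye(3, device=device).repeat(V, 1, 1)      # known extrinsics
T_cam = torch.tensor([[0.0, 0.0, 2.5]], device=device).repeat(V, 1)
target_masks = torch.zeros(V, H, H, device=device)       # masks from backbone

cameras = FoVPerspectiveCameras(R=R_cam, T=T_cam, device=device)
raster_settings = RasterizationSettings(
    image_size=H, blur_radius=1e-4, faces_per_pixel=50
)
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(cameras=cameras, raster_settings=raster_settings),
    shader=SoftSilhouetteShader(),
)

# Object pose, initialised from the network's coarse estimate (placeholders).
log_rot = torch.zeros(1, 3, device=device, requires_grad=True)  # axis-angle
trans = torch.zeros(1, 3, device=device, requires_grad=True)

optimiser = torch.optim.Adam([log_rot, trans], lr=1e-2)
for _ in range(200):
    optimiser.zero_grad()
    R_obj = so3_exp_map(log_rot)                        # (1, 3, 3)
    verts = mesh.verts_padded() @ R_obj.transpose(1, 2) + trans
    posed = mesh.update_padded(new_verts_padded=verts).extend(V)
    silhouettes = renderer(posed)[..., 3]               # alpha = soft silhouette
    loss = ((silhouettes - target_masks) ** 2).mean()   # multi-view mask loss
    loss.backward()
    optimiser.step()
```

The soft (blurred) silhouette is what makes the rendering usefully differentiable: hard rasterisation has zero gradient almost everywhere, so a nonzero blur radius and multiple faces per pixel are needed for pose gradients to flow.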
Related papers
- 3D-Aware Hypothesis & Verification for Generalizable Relative Object Pose Estimation [69.73691477825079]
We present a new hypothesis-and-verification framework to tackle the problem of generalizable object pose estimation.
To measure reliability, we introduce a 3D-aware verification that explicitly applies 3D transformations to the 3D object representations learned from the two input images.
arXiv Detail & Related papers (2023-10-05T13:34:07Z)
- ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render & compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner (see the sketch after this list).
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- LocPoseNet: Robust Location Prior for Unseen Object Pose Estimation [69.70498875887611]
LocPoseNet is able to robustly learn a location prior for unseen objects.
Our method outperforms existing works by a large margin on LINEMOD and GenMOP.
arXiv Detail & Related papers (2022-11-29T15:21:34Z)
- OSOP: A Multi-Stage One Shot Object Pose Estimation Framework [35.89334617258322]
We present a novel one-shot method for object detection and 6 DoF pose estimation that does not require training on target objects.
At test time, it takes as input a target image and a textured 3D query model.
We evaluate the method on LineMOD, Occlusion, Homebrewed, YCB-V and TLESS datasets.
arXiv Detail & Related papers (2022-03-29T13:12:00Z)
- 6D Object Pose Estimation using Keypoints and Part Affinity Fields [24.126513851779936]
The task of 6D object pose estimation from RGB images is an important requirement for autonomous service robots to be able to interact with the real world.
We present a two-step pipeline for estimating the 6 DoF translation and orientation of known objects.
arXiv Detail & Related papers (2021-07-05T14:41:19Z)
- DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-scale Consistency [43.09728251735362]
We present a two-step pose estimation framework to attain 6DoF object poses from 2D object bounding-boxes.
In the first step, the framework learns to segment objects from real and synthetic data.
In the second step, we design a dual-scale pose estimation network, namely DSC-PoseNet.
Our method outperforms state-of-the-art models trained on synthetic data by a large margin.
arXiv Detail & Related papers (2021-04-08T10:19:35Z)
- Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications [57.87136703404356]
Dense Object Nets (DONs) by Florence, Manuelli and Tedrake introduced dense object descriptors as a novel visual object representation for the robotics community.
In this paper we show that given a 3D model of an object, we can generate its descriptor space image, which allows for supervised training of DONs.
We compare the training methods on generating 6D grasps for industrial objects and show that our novel supervised training approach improves the pick-and-place performance in industry-relevant tasks.
arXiv Detail & Related papers (2021-02-16T11:40:12Z)
- CosyPose: Consistent multi-view multi-object 6D pose estimation [48.097599674329004]
First, we present a single-view single-object 6D pose estimation method, which we use to generate 6D object pose hypotheses.
Second, we develop a robust method for matching individual 6D object pose hypotheses across different input images.
Third, we develop a method for global scene refinement given multiple object hypotheses and their correspondences across views.
arXiv Detail & Related papers (2020-08-19T14:11:56Z)
- Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
arXiv Detail & Related papers (2020-07-18T22:31:33Z)
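As a companion to the MegaPose entry above, its coarse stage can be read as classifier-scored render-and-compare: render each pose hypothesis, pair it with the observed crop, and keep the candidate the classifier judges most correctable by the refiner. The sketch below is hypothetical plain PyTorch; `render`, `score_net`, and all tensor shapes are invented for illustration and are not the authors' released code.

```python
# Hypothetical sketch of render-and-compare coarse pose scoring (MegaPose-style).
# `render`, `score_net`, and all inputs are invented placeholders.
import torch

def coarse_pose(observed: torch.Tensor,          # (3, H, W) cropped image
                hypotheses: torch.Tensor,        # (N, 4, 4) candidate poses
                render,                          # pose -> (3, H, W) rendering
                score_net: torch.nn.Module) -> torch.Tensor:
    """Return the candidate pose whose rendering the classifier deems most
    correctable by the downstream refiner."""
    scores = []
    with torch.no_grad():
        for pose in hypotheses:
            rendered = render(pose)                        # synthetic view
            pair = torch.cat([observed, rendered], dim=0)  # (6, H, W) pair
            scores.append(score_net(pair.unsqueeze(0)).squeeze())
    best = torch.stack(scores).argmax()
    return hypotheses[best]
```

The downstream refiner iterates the same render-and-compare pairing, but regresses a pose update rather than a scalar score.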
This list is automatically generated from the titles and abstracts of the papers on this site.