FingerSLAM: Closed-loop Unknown Object Localization and Reconstruction
from Visuo-tactile Feedback
- URL: http://arxiv.org/abs/2303.07997v1
- Date: Tue, 14 Mar 2023 15:48:47 GMT
- Title: FingerSLAM: Closed-loop Unknown Object Localization and Reconstruction
from Visuo-tactile Feedback
- Authors: Jialiang Zhao, Maria Bauza, Edward H. Adelson
- Abstract summary: FingerSLAM is a closed-loop factor graph-based pose estimator that combines local tactile sensing at the fingertip and global vision sensing from a wrist-mounted camera.
We demonstrate reliable visuo-tactile pose estimation and shape reconstruction through quantitative and qualitative real-world evaluations on 6 objects that are unseen during training.
- Score: 5.871946269300959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the problem of using visuo-tactile feedback for
6-DoF localization and 3D reconstruction of unknown in-hand objects. We propose
FingerSLAM, a closed-loop factor graph-based pose estimator that combines local
tactile sensing at the fingertip and global vision sensing from a wrist-mounted
camera. FingerSLAM is constructed with two constituent pose estimators: a
multi-pass refined tactile-based pose estimator that captures movements from
detailed local textures, and a single-pass vision-based pose estimator that
predicts from a global view of the object. We also design a loop closure
mechanism that actively matches current vision and tactile images to previously
stored key-frames to reduce accumulated error. FingerSLAM incorporates the two
sensing modalities of tactile and vision, as well as the loop closure mechanism
with a factor graph-based optimization framework. Such a framework produces an
optimized pose estimation solution that is more accurate than the standalone
estimators. The estimated poses are then used to reconstruct the shape of the
unknown object incrementally by stitching the local point clouds recovered from
tactile images. We train our system on real-world data collected with 20
objects. We demonstrate reliable visuo-tactile pose estimation and shape
reconstruction through quantitative and qualitative real-world evaluations on 6
objects that are unseen during training.
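To make the pipeline concrete, here is a minimal sketch of how such a visuo-tactile pose graph could be assembled with the GTSAM factor-graph library. The noise magnitudes, the assumption that the tactile estimator supplies relative motions while the vision estimator supplies global poses, and the stitching helper are illustrative choices, not the authors' implementation.

```python
# Sketch of visuo-tactile pose fusion as a factor graph (GTSAM).
# Assumed, not from the paper: noise magnitudes, and that tactile gives
# relative motions (odometry) while vision gives global pose predictions.
import numpy as np
import gtsam
from gtsam.symbol_shorthand import X

def fuse_poses(tactile_odometry, vision_poses, loop_closures):
    """tactile_odometry: list of gtsam.Pose3 relative motions between steps.
    vision_poses: dict {step: gtsam.Pose3} of global pose predictions.
    loop_closures: list of (i, j, gtsam.Pose3) from re-matched key-frames."""
    graph = gtsam.NonlinearFactorGraph()
    initial = gtsam.Values()

    # Sigmas: 3 rotation (rad), then 3 translation (m).
    # Tactile odometry captures fine local texture motion: tight noise.
    tactile_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.01] * 3 + [0.001] * 3))
    # Vision is coarser but drift-free: looser noise on its unary factors.
    vision_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.05] * 3 + [0.01] * 3))
    loop_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.02] * 3 + [0.002] * 3))

    # Anchor the first pose, then chain tactile between-factors.
    graph.add(gtsam.PriorFactorPose3(X(0), gtsam.Pose3(), vision_noise))
    pose = gtsam.Pose3()
    initial.insert(X(0), pose)
    for i, delta in enumerate(tactile_odometry, start=1):
        graph.add(gtsam.BetweenFactorPose3(X(i - 1), X(i), delta, tactile_noise))
        pose = pose.compose(delta)
        initial.insert(X(i), pose)

    # Global vision predictions enter as unary pose priors.
    for step, world_T_obj in vision_poses.items():
        graph.add(gtsam.PriorFactorPose3(X(step), world_T_obj, vision_noise))

    # Loop closures tie the current frame back to stored key-frames,
    # pulling accumulated drift out of the whole trajectory.
    for i, j, delta in loop_closures:
        graph.add(gtsam.BetweenFactorPose3(X(i), X(j), delta, loop_noise))

    result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
    return [result.atPose3(X(i)) for i in range(len(tactile_odometry) + 1)]

def stitch(local_clouds, poses):
    """Map each (N, 3) tactile point cloud through its optimized pose
    and concatenate, incrementally reconstructing the object surface."""
    world = []
    for cloud, pose in zip(local_clouds, poses):
        T = pose.matrix()  # 4x4 homogeneous transform
        world.append(cloud @ T[:3, :3].T + T[:3, 3])
    return np.vstack(world)
```

Tight tactile between-factors preserve fine local motion, loose vision priors keep the trajectory globally anchored, and loop-closure factors redistribute accumulated drift, which is why a jointly optimized solution can beat either standalone estimator.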
Related papers
- SkelFormer: Markerless 3D Pose and Shape Estimation using Skeletal Transformers [57.46911575980854]
We introduce SkelFormer, a novel markerless motion capture pipeline for multi-view human pose and shape estimation.
Our method first uses off-the-shelf 2D keypoint estimators, pre-trained on large-scale in-the-wild data, to obtain 3D joint positions (a triangulation sketch follows this entry).
Next, we design a regression-based inverse-kinematic skeletal transformer that maps the joint positions to pose and shape representations from heavily noisy observations.
arXiv Detail & Related papers (2024-04-19T04:51:18Z)
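Multi-view 2D keypoints are classically lifted to 3D joints by direct linear transform (DLT) triangulation; a minimal sketch, assuming calibrated cameras (the paper's exact lifting step may differ):

```python
# DLT triangulation: lift matched 2D keypoints from calibrated views to a
# 3D joint position. Projection matrices are assumed given; this is a
# generic sketch, not SkelFormer's implementation.
import numpy as np

def triangulate_joint(projections, keypoints_2d):
    """projections: list of 3x4 camera matrices P = K [R | t].
    keypoints_2d: list of (u, v) detections of the same joint per view."""
    rows = []
    for P, (u, v) in zip(projections, keypoints_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.stack(rows))
    X = Vt[-1]           # null-space direction = least-squares solution
    return X[:3] / X[3]  # dehomogenize
```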
- Semantic Object-level Modeling for Robust Visual Camera Relocalization [14.998133272060695]
We propose a novel method of automatic object-level voxel modeling for accurate ellipsoidal representations of objects.
All of these modules are fully integrated into a visual SLAM system.
arXiv Detail & Related papers (2024-02-10T13:39:44Z)
- LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping [9.690844449175948]
We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates.
arXiv Detail & Related papers (2023-11-14T14:27:53Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Tac2Pose: Tactile Object Pose Estimation from the First Touch [6.321662423735226]
We present Tac2Pose, an object-specific approach to tactile pose estimation from the first touch for known objects.
We simulate the contact shapes that a dense set of object poses would produce on the sensor.
We obtain contact shapes from the sensor with an object-agnostic calibration step that maps RGB tactile observations to binary contact shapes (a matching sketch follows this entry).
arXiv Detail & Related papers (2022-04-25T14:43:48Z)
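The matching idea reads as template retrieval over contact shapes. A minimal sketch, with a plain IoU score and a hypothetical render_contact_mask helper standing in for the paper's learned, simulation-trained matching:

```python
# Pose retrieval from a binary contact mask, Tac2Pose-style in spirit:
# score the observed mask against masks simulated for densely sampled
# candidate poses. IoU and render_contact_mask are illustrative stand-ins.
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boolean contact masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def estimate_pose(observed_mask, candidate_poses, render_contact_mask):
    """observed_mask: boolean HxW mask from the calibrated sensor.
    candidate_poses: iterable of candidate object poses.
    render_contact_mask: pose -> simulated boolean HxW mask (assumed)."""
    scored = [(iou(observed_mask, render_contact_mask(p)), p) for p in candidate_poses]
    return max(scored, key=lambda sp: sp[0])  # (best score, best pose)
```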
- What's in your hands? 3D Reconstruction of Generic Objects in Hands [49.12461675219253]
Our work aims to reconstruct hand-held objects given a single RGB image.
In contrast to prior works that typically assume known 3D templates and reduce the problem to 3D pose estimation, our work reconstructs generic hand-held objects without knowing their 3D templates.
arXiv Detail & Related papers (2022-04-14T17:59:02Z)
- Object Manipulation via Visual Target Localization [64.05939029132394]
Training agents to manipulate objects poses many challenges.
We propose an approach that explores the environment in search of target objects, computes their 3D coordinates once they are located, and then continues to estimate their 3D locations even when the objects are not visible (the 3D-coordinate lookup is sketched after this entry).
Our evaluations show a massive 3x improvement in success rate over a model that has access to the same sensory suite.
arXiv Detail & Related papers (2022-03-15T17:59:01Z)
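Computing a located object's 3D coordinates reduces to back-projecting a detected pixel with depth and camera intrinsics; a generic sketch, not the paper's actual pipeline:

```python
# Back-project a detected target pixel to a world-frame 3D coordinate.
# K, depth, and the camera pose are assumed inputs; generic sketch only.
import numpy as np

def pixel_to_world(u, v, depth, K, cam_to_world):
    """u, v: pixel coordinates; depth: metric depth at that pixel;
    K: 3x3 intrinsics; cam_to_world: 4x4 camera-to-world transform."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    p_cam = ray * depth  # point in the camera frame
    return (cam_to_world @ np.append(p_cam, 1.0))[:3]
```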
- Active 3D Shape Reconstruction from Vision and Touch [66.08432412497443]
Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch.
In 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings.
We introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile signals; and 3) a set of data-driven solutions with either tactile or visuotactile priors to guide the shape exploration (a simple touch-selection heuristic is sketched after this entry).
arXiv Detail & Related papers (2021-07-20T15:56:52Z)
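The paper's exploration policies are learned; to show only the shape of the active-touch loop, here is a simple farthest-point coverage heuristic for picking the next touch, a hand-written stand-in rather than the paper's method:

```python
# Farthest-point heuristic for choosing the next touch location: probe
# the candidate surface point least covered by previous touches. A
# hand-written stand-in for the paper's learned exploration policies.
import numpy as np

def next_touch(candidates, touched):
    """candidates: (M, 3) surface points that could be touched next.
    touched: list of (3,) points already probed."""
    if not touched:
        return candidates[0]
    # Distance from every candidate to its nearest previous touch ...
    dists = np.linalg.norm(
        candidates[:, None, :] - np.asarray(touched)[None, :, :], axis=-1
    ).min(axis=1)
    # ... and pick the candidate farthest from all previous touches.
    return candidates[dists.argmax()]
```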
- End-to-end learning of keypoint detection and matching for relative pose estimation [1.8352113484137624]
We propose a new method for estimating the relative pose between two images.
We jointly learn keypoint detection, description extraction, matching and robust pose estimation.
We demonstrate our method for the task of visual localization of a query image within a database of images with known pose (a classical counterpart is sketched after this entry).
arXiv Detail & Related papers (2021-04-02T15:16:17Z)
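For reference, the classical counterpart of this learned pipeline recovers relative pose from matched keypoints via the essential matrix; a sketch with OpenCV, using ORB features in place of the learned detector and matcher:

```python
# Classical relative-pose recovery from keypoint matches (OpenCV).
# ORB + brute-force matching stand in for the paper's learned components.
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """img1, img2: grayscale images; K: 3x3 intrinsics.
    Returns (R, t) of the second view w.r.t. the first, t up to scale."""
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t
```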
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video (a minimal loss sketch follows this entry).
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
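The core of such sparse supervision is a photometric loss on frames without annotations; a minimal masked L1 version in PyTorch, assuming the rendering of the predicted hand-object state happens upstream:

```python
# Masked photometric-consistency loss: compare a frame rendered from the
# predicted hand-object state against the observed frame, so unannotated
# frames still supply gradient. A minimal sketch, not the paper's loss.
import torch

def photometric_loss(rendered, observed, mask):
    """rendered, observed: (B, 3, H, W) images in [0, 1];
    mask: (B, 1, H, W) foreground mask over rendered pixels."""
    diff = (rendered - observed).abs() * mask
    return diff.sum() / mask.sum().clamp(min=1.0)
```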