Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic
Domains
- URL: http://arxiv.org/abs/2105.14391v1
- Date: Sat, 29 May 2021 23:56:05 GMT
- Title: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic
Domains
- Authors: Bowen Wen, Chaitanya Mitash and Kostas Bekris
- Abstract summary: This work presents se(3)-TrackNet, a data-driven optimization approach for long-term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
A novel neural network architecture appropriately disentangles the feature encoding to help reduce domain shift and employs an effective 3D orientation representation via Lie algebra.
- Score: 6.187780920448869
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tracking the 6D pose of objects in video sequences is important for robot
manipulation. This work presents se(3)-TrackNet, a data-driven optimization
approach for long-term, 6D pose tracking. It aims to identify the optimal
relative pose given the current RGB-D observation and a synthetic image
conditioned on the previous best estimate and the object's model. The key
contribution in this context is a novel neural network architecture, which
appropriately disentangles the feature encoding to help reduce domain shift,
and an effective 3D orientation representation via Lie algebra. Consequently,
even when the network is trained solely with synthetic data, it works
effectively over real images. Comprehensive experiments over multiple
benchmarks show se(3)-TrackNet achieves consistently robust estimates and
outperforms alternatives, even though they have been trained with real images.
The approach runs in real time at 90.9Hz. Code, data and supplementary video
for this project are available at
https://github.com/wenbowen123/iros20-6d-pose-tracking
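To make the Lie-algebra pose update concrete, the sketch below shows how a predicted se(3) increment could be composed with the previous pose estimate via the exponential map. This is a minimal illustration of the general technique, not the authors' code: the twist vector stands in for the network's relative-pose output, and the right-multiplication composition order is an assumption.

```python
# Minimal sketch (not the se(3)-TrackNet implementation): applying a predicted
# se(3) increment to the previous 6D pose estimate via the exponential map.
import numpy as np
from scipy.linalg import expm

def se3_hat(xi):
    """Map a twist xi = (vx, vy, vz, wx, wy, wz) to its 4x4 matrix form in se(3)."""
    v, w = xi[:3], xi[3:]
    return np.array([
        [0.0,  -w[2],  w[1], v[0]],
        [w[2],   0.0, -w[0], v[1]],
        [-w[1],  w[0],  0.0, v[2]],
        [0.0,    0.0,   0.0, 0.0],
    ])

def apply_increment(T_prev, xi):
    """Compose the previous 4x4 pose with exp(hat(xi)); the composition order is an assumption."""
    return T_prev @ expm(se3_hat(xi))

# Previous estimate: identity pose. xi is a placeholder for the network output
# (a small translation along x plus a small rotation about z).
T_prev = np.eye(4)
xi = np.array([0.02, 0.0, 0.0, 0.0, 0.0, 0.05])
print(np.round(apply_increment(T_prev, xi), 4))
```

In the full tracking loop described in the abstract, this increment would come from the network comparing the current RGB-D observation against a rendering of the object at the previous best estimate, and the render-predict-compose step repeats every frame.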
Related papers
- Inverse Neural Rendering for Explainable Multi-Object Tracking [35.072142773300655]
We recast 3D multi-object tracking from RGB cameras as an Inverse Rendering (IR) problem.
We optimize an image loss over generative latent spaces that inherently disentangle shape and appearance properties.
We validate the generalization and scaling capabilities of our method by learning the generative prior exclusively from synthetic data.
arXiv Detail & Related papers (2024-04-18T17:37:53Z)
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- 6D Object Pose Estimation from Approximate 3D Models for Orbital Robotics [19.64111218032901]
We present a novel technique to estimate the 6D pose of objects from single images.
We employ a dense 2D-to-3D correspondence predictor that regresses 3D model coordinates for every pixel.
Our method achieves state-of-the-art performance on the SPEED+ dataset and has won the SPEC2021 post-mortem competition.
arXiv Detail & Related papers (2023-03-23T13:18:05Z)
- Learning 6D Pose Estimation from Synthetic RGBD Images for Robotic Applications [0.6299766708197883]
The proposed pipeline can efficiently generate large numbers of photo-realistic RGBD images for the object of interest.
We develop a real-time two-stage 6D pose estimation approach by integrating the object detector YOLO-V4-tiny and the 6D pose estimation algorithm PVN3D.
The resulting network shows competitive performance compared to state-of-the-art methods when evaluated on the LineMod dataset.
arXiv Detail & Related papers (2022-08-30T14:17:15Z)
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains [12.71983073907091]
This work proposes a data-driven optimization approach for long-term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
The proposed approach achieves consistently robust estimates and outperforms alternatives, even though they have been trained with real images.
arXiv Detail & Related papers (2020-07-27T21:09:36Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
- Single Shot 6D Object Pose Estimation [11.37625512264302]
We introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images.
A fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task.
With 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously.
arXiv Detail & Related papers (2020-04-27T11:59:11Z)
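As a rough illustration of the single-shot formulation summarized above (a fully convolutional network regressing poses over spatially discretized depth input), here is a minimal sketch; it is not the OP-Net architecture, and the layer sizes, pose parameterization, and per-cell output layout are all assumptions.

```python
# Minimal sketch (an assumption, not OP-Net): a fully convolutional network that
# regresses one pose vector per output cell from a discretized depth image, so
# several objects can be handled in a single forward pass.
import torch
import torch.nn as nn

class TinyPoseRegressor(nn.Module):
    def __init__(self, pose_dim: int = 7):  # hypothetical: quaternion (4) + translation (3)
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # 1x1 convolution head: one pose hypothesis per spatial cell.
        self.head = nn.Conv2d(64, pose_dim, kernel_size=1)

    def forward(self, depth):                   # depth: (B, 1, H, W)
        return self.head(self.backbone(depth))  # (B, pose_dim, H/8, W/8)

depth = torch.rand(1, 1, 128, 128)              # normalized, discretized depth input
print(TinyPoseRegressor()(depth).shape)         # torch.Size([1, 7, 16, 16])
```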
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.