ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking
- URL: http://arxiv.org/abs/2111.03821v1
- Date: Sat, 6 Nov 2021 07:30:00 GMT
- Title: ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking
- Authors: Nicola A. Piga, Yuriy Onyshchuk, Giulia Pasquale, Ugo Pattacini and
Lorenzo Natale
- Abstract summary: We introduce ROFT, a Kalman filtering approach for 6D object pose and velocity tracking from a stream of RGB-D images.
By leveraging real-time optical flow, ROFT synchronizes delayed outputs of low frame rate Convolutional Neural Networks for instance segmentation and 6D object pose estimation.
Results demonstrate that our approach outperforms state-of-the-art methods for 6D object pose tracking, while also providing 6D object velocity tracking.
- Score: 7.617467911329272
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 6D object pose tracking has been extensively studied in the robotics and
computer vision communities. The most promising solutions, leveraging deep
neural networks and/or filtering and optimization, exhibit notable performance
on standard benchmarks. However, to the best of our knowledge, these have not
been tested thoroughly against fast object motions. Tracking performance in this
scenario degrades significantly, especially for methods that do not achieve
real-time performance and introduce non-negligible delays. In this work, we
introduce ROFT, a Kalman filtering approach for 6D object pose and velocity
tracking from a stream of RGB-D images. By leveraging real-time optical flow,
ROFT synchronizes delayed outputs of low frame rate Convolutional Neural
Networks for instance segmentation and 6D object pose estimation with the RGB-D
input stream to achieve fast and precise 6D object pose and velocity tracking.
We test our method on a newly introduced photorealistic dataset, Fast-YCB,
which comprises fast moving objects from the YCB model set, and on the dataset
for object and hand pose estimation HO-3D. Results demonstrate that our
approach outperforms state-of-the-art methods for 6D object pose tracking,
while also providing 6D object velocity tracking. A video showing the
experiments is provided as supplementary material.
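The core idea of the abstract, fusing a high-rate prediction loop with low-rate, delayed pose measurements in a Kalman filter, can be illustrated with a minimal sketch. This is a generic constant-velocity linear Kalman filter over 3D position and velocity, not ROFT's actual filter (which also handles orientation and uses optical flow for synchronization); the 30 Hz loop rate, noise covariances, and 5-frame measurement interval are illustrative assumptions.

```python
import numpy as np

# Constant-velocity Kalman filter: state x = [position, velocity] (6-dim).
# Pose measurements arrive at a lower rate than the prediction loop,
# mimicking a slow CNN-based pose estimator; the filter coasts in between.

def make_cv_model(dt):
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)      # p' = p + v * dt
    H = np.zeros((3, 6))
    H[:, :3] = np.eye(3)            # we observe position only
    return F, H

def kf_predict(x, P, F, Q):
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

dt = 1.0 / 30.0                     # assumed 30 Hz input stream
F, H = make_cv_model(dt)
Q = 1e-4 * np.eye(6)                # process noise (illustrative)
R = 1e-4 * np.eye(3)                # measurement noise (illustrative)
x, P = np.zeros(6), np.eye(6)
true_v = np.array([0.3, 0.0, -0.1])  # metres per second

for k in range(60):
    x, P = kf_predict(x, P, F, Q)
    if k % 5 == 0:                  # low-rate "CNN" pose measurement
        z = true_v * (k * dt)       # noise-free position on a straight line
        x, P = kf_update(x, P, z, H, R)
```

After two seconds of tracking, the velocity block of the state converges to the true object velocity, which is the sense in which a pose filter also provides velocity tracking.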
Related papers
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and
Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - 3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for
Robust 6D Pose Estimation [50.15926681475939]
Inverse graphics aims to infer the 3D scene structure from 2D images.
We introduce 3D Neural Embedding Likelihood (3DNEL), a probabilistic model that quantifies uncertainty and achieves robustness in 6D pose estimation tasks.
3DNEL effectively combines learned neural embeddings from RGB with depth information to improve robustness in sim-to-real 6D object pose estimation from RGB-D images.
arXiv Detail & Related papers (2023-02-07T20:48:35Z) - Enhancing Generalizable 6D Pose Tracking of an In-Hand Object with
Tactile Sensing [31.49529551069215]
TEG-Track is a tactile-enhanced 6D pose tracking system.
It can track previously unseen objects held in hand.
Results demonstrate that TEG-Track consistently enhances state-of-the-art generalizable 6D pose trackers.
arXiv Detail & Related papers (2022-10-08T13:47:03Z) - Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondence network, our method finds corresponding points between an unseen object and a partial-view RGB-D image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z) - Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred
Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z) - VIPose: Real-time Visual-Inertial 6D Object Pose Tracking [3.44942675405441]
We introduce a novel Deep Neural Network (DNN) called VIPose to address the object pose tracking problem in real-time.
The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose.
The approach achieves accuracy comparable to state-of-the-art techniques, with the additional benefit of running in real time.
arXiv Detail & Related papers (2021-07-27T06:10:23Z) - Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic
Domains [6.187780920448869]
This work presents se(3)-TrackNet, a data-driven optimization approach for long term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
The network architecture disentangles the feature encoding to help reduce domain shift, and adopts an effective 3D orientation representation based on the Lie algebra.
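The Lie-algebra orientation representation mentioned here can be made concrete with a small sketch: a rotation is parameterized by a 3-vector in so(3) and mapped to a rotation matrix in SO(3) via the exponential map (Rodrigues' formula). The code below is a generic implementation of that mapping, not se(3)-TrackNet's network output head.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(w):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)            # near zero, exp is the identity
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# A 90-degree rotation about z maps the x axis onto the y axis.
R = so3_exp(np.array([0.0, 0.0, np.pi / 2]))
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 3))   # -> [0. 1. 0.]
```

Regressing such a tangent-space vector is attractive for tracking because small relative rotations live near the origin of so(3), where the parameterization is smooth and singularity-free.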
arXiv Detail & Related papers (2021-05-29T23:56:05Z) - Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z) - se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image
Residuals in Synthetic Domains [12.71983073907091]
This work proposes a data-driven optimization approach for long-term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
The proposed approach achieves consistently robust estimates and outperforms alternatives, even when those alternatives have been trained with real images.
arXiv Detail & Related papers (2020-07-27T21:09:36Z) - Single Shot 6D Object Pose Estimation [11.37625512264302]
We introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images.
A fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task.
With 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously.
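Single-shot regression over a spatially discretized input, as described above, can be sketched as decoding a grid of per-cell predictions. The shapes and channel layout below (one objectness score followed by a 6-dim pose vector per cell) are illustrative assumptions, not OP-Net's exact design.

```python
import numpy as np

def decode_grid(output, score_thresh=0.5):
    """Return (row, col, pose) for every grid cell above the threshold."""
    detections = []
    for r in range(output.shape[0]):
        for c in range(output.shape[1]):
            score, pose = output[r, c, 0], output[r, c, 1:]
            if score > score_thresh:
                detections.append((r, c, pose))
    return detections

# Fake 4x4 network output: 1 objectness channel + 6 pose channels per cell.
grid = np.zeros((4, 4, 7))
grid[1, 2, 0] = 0.9                                  # one confident cell
grid[1, 2, 1:] = [0.1, 0.0, 0.5, 0.0, 0.0, 1.57]     # translation + rotation
dets = decode_grid(grid)
print(len(dets))                                     # -> 1
```

Because every cell is decoded in one pass over a fixed-size tensor, the poses of multiple objects come out simultaneously, which is what makes the single-shot formulation fast.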
arXiv Detail & Related papers (2020-04-27T11:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.