ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking
- URL: http://arxiv.org/abs/2111.03821v1
- Date: Sat, 6 Nov 2021 07:30:00 GMT
- Title: ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking
- Authors: Nicola A. Piga, Yuriy Onyshchuk, Giulia Pasquale, Ugo Pattacini and
Lorenzo Natale
- Abstract summary: We introduce ROFT, a Kalman filtering approach for 6D object pose and velocity tracking from a stream of RGB-D images.
By leveraging real-time optical flow, ROFT synchronizes delayed outputs of low frame rate Convolutional Neural Networks for instance segmentation and 6D object pose estimation.
Results demonstrate that our approach outperforms state-of-the-art methods for 6D object pose tracking, while also providing 6D object velocity tracking.
- Score: 7.617467911329272
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 6D object pose tracking has been extensively studied in the robotics and
computer vision communities. The most promising solutions, leveraging deep
neural networks and/or filtering and optimization, exhibit notable performance
on standard benchmarks. However, to the best of our knowledge, these have not
been tested thoroughly against fast object motions. Tracking performance in this
scenario degrades significantly, especially for methods that do not achieve
real-time performance and introduce non-negligible delays. In this work, we
introduce ROFT, a Kalman filtering approach for 6D object pose and velocity
tracking from a stream of RGB-D images. By leveraging real-time optical flow,
ROFT synchronizes delayed outputs of low frame rate Convolutional Neural
Networks for instance segmentation and 6D object pose estimation with the RGB-D
input stream to achieve fast and precise 6D object pose and velocity tracking.
We test our method on a newly introduced photorealistic dataset, Fast-YCB,
which comprises fast moving objects from the YCB model set, and on the dataset
for object and hand pose estimation HO-3D. Results demonstrate that our
approach outperforms state-of-the-art methods for 6D object pose tracking,
while also providing 6D object velocity tracking. A video showing the
experiments is provided as supplementary material.
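The core idea of the abstract, fusing a high-rate prediction loop with low-rate, delayed pose measurements in a Kalman filter, can be illustrated with a minimal sketch. This is a generic constant-velocity linear Kalman filter over 3D position and velocity, not ROFT's actual filter (which also handles orientation and uses optical flow for synchronization); the 30 Hz loop rate, noise covariances, and 5-frame measurement interval are illustrative assumptions.

```python
import numpy as np

# Constant-velocity Kalman filter: state x = [position, velocity] (6-dim).
# Pose measurements arrive at a lower rate than the prediction loop,
# mimicking a slow CNN-based pose estimator; the filter coasts in between.

def make_cv_model(dt):
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)      # p' = p + v * dt
    H = np.zeros((3, 6))
    H[:, :3] = np.eye(3)            # we observe position only
    return F, H

def kf_predict(x, P, F, Q):
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

dt = 1.0 / 30.0                     # assumed 30 Hz input stream
F, H = make_cv_model(dt)
Q = 1e-4 * np.eye(6)                # process noise (illustrative)
R = 1e-4 * np.eye(3)                # measurement noise (illustrative)
x, P = np.zeros(6), np.eye(6)
true_v = np.array([0.3, 0.0, -0.1])  # metres per second

for k in range(60):
    x, P = kf_predict(x, P, F, Q)
    if k % 5 == 0:                  # low-rate "CNN" pose measurement
        z = true_v * (k * dt)       # noise-free position on a straight line
        x, P = kf_update(x, P, z, H, R)
```

After two seconds of tracking, the velocity block of the state converges to the true object velocity, which is the sense in which a pose filter also provides velocity tracking.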
Related papers
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and
Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - 3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for
Robust 6D Pose Estimation [50.15926681475939]
Inverse graphics aims to infer the 3D scene structure from 2D images.
We introduce 3D Neural Embedding Likelihood (3DNEL), a probabilistic model that quantifies uncertainty and achieves robustness in 6D pose estimation tasks.
3DNEL effectively combines learned neural embeddings from RGB with depth information to improve robustness in sim-to-real 6D object pose estimation from RGB-D images.
arXiv Detail & Related papers (2023-02-07T20:48:35Z) - Enhancing Generalizable 6D Pose Tracking of an In-Hand Object with
Tactile Sensing [31.49529551069215]
TEG-Track is a tactile-enhanced 6D pose tracking system.
It can track previously unseen objects held in hand.
Results demonstrate that TEG-Track consistently enhances state-of-the-art generalizable 6D pose trackers.
arXiv Detail & Related papers (2022-10-08T13:47:03Z) - Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondence network, our method finds corresponding points between an unseen object and a partial-view RGB-D image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z) - Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred
Objects in Videos [115.71874459429381]
We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.
Experiments on benchmark datasets demonstrate that our method outperforms previous methods for fast moving object deblurring and 3D reconstruction.
arXiv Detail & Related papers (2021-11-29T11:25:14Z) - VIPose: Real-time Visual-Inertial 6D Object Pose Tracking [3.44942675405441]
We introduce a novel Deep Neural Network (DNN) called VIPose to address the object pose tracking problem in real-time.
The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose.
The approach achieves accuracy comparable to state-of-the-art techniques, with the additional benefit of running in real time.
arXiv Detail & Related papers (2021-07-27T06:10:23Z) - Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic
Domains [6.187780920448869]
This work presents se(3)-TrackNet, a data-driven optimization approach for long term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
The network architecture disentangles the feature encoding to help reduce domain shift, and adopts an effective 3D orientation representation based on the Lie algebra.
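The Lie-algebra orientation representation mentioned here can be made concrete with a small sketch: a rotation is parameterized by a 3-vector in so(3) and mapped to a rotation matrix in SO(3) via the exponential map (Rodrigues' formula). The code below is a generic implementation of that mapping, not se(3)-TrackNet's network output head.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(w):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)            # near zero, exp is the identity
    K = hat(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# A 90-degree rotation about z maps the x axis onto the y axis.
R = so3_exp(np.array([0.0, 0.0, np.pi / 2]))
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 3))   # -> [0. 1. 0.]
```

Regressing such a tangent-space vector is attractive for tracking because small relative rotations live near the origin of so(3), where the parameterization is smooth and singularity-free.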
arXiv Detail & Related papers (2021-05-29T23:56:05Z) - Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z) - se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image
Residuals in Synthetic Domains [12.71983073907091]
This work proposes a data-driven optimization approach for long-term, 6D pose tracking.
It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model.
The proposed approach achieves consistently robust estimates and outperforms alternatives, even when those alternatives have been trained with real images.
arXiv Detail & Related papers (2020-07-27T21:09:36Z) - Single Shot 6D Object Pose Estimation [11.37625512264302]
We introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images.
A fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task.
With 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously.
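Single-shot regression over a spatially discretized input, as described above, can be sketched as decoding a grid of per-cell predictions. The shapes and channel layout below (one objectness score followed by a 6-dim pose vector per cell) are illustrative assumptions, not OP-Net's exact design.

```python
import numpy as np

def decode_grid(output, score_thresh=0.5):
    """Return (row, col, pose) for every grid cell above the threshold."""
    detections = []
    for r in range(output.shape[0]):
        for c in range(output.shape[1]):
            score, pose = output[r, c, 0], output[r, c, 1:]
            if score > score_thresh:
                detections.append((r, c, pose))
    return detections

# Fake 4x4 network output: 1 objectness channel + 6 pose channels per cell.
grid = np.zeros((4, 4, 7))
grid[1, 2, 0] = 0.9                                  # one confident cell
grid[1, 2, 1:] = [0.1, 0.0, 0.5, 0.0, 0.0, 1.57]     # translation + rotation
dets = decode_grid(grid)
print(len(dets))                                     # -> 1
```

Because every cell is decoded in one pass over a fixed-size tensor, the poses of multiple objects come out simultaneously, which is what makes the single-shot formulation fast.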
arXiv Detail & Related papers (2020-04-27T11:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.