SRT3D: A Sparse Region-Based 3D Object Tracking Approach for the Real
World
- URL: http://arxiv.org/abs/2110.12715v1
- Date: Mon, 25 Oct 2021 07:58:18 GMT
- Title: SRT3D: A Sparse Region-Based 3D Object Tracking Approach for the Real
World
- Authors: Manuel Stoiber, Martin Pfanne, Klaus H. Strobl, Rudolph Triebel, Alin
Albu-Schäffer
- Abstract summary: Region-based methods have become increasingly popular for model-based, monocular 3D tracking of texture-less objects in cluttered scenes.
However, most methods are computationally expensive, requiring significant resources to run in real-time.
We develop SRT3D, a sparse region-based approach to 3D object tracking that bridges this gap in efficiency.
- Score: 10.029003607782878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Region-based methods have become increasingly popular for model-based,
monocular 3D tracking of texture-less objects in cluttered scenes. However,
while they achieve state-of-the-art results, most methods are computationally
expensive, requiring significant resources to run in real-time. In the
following, we build on our previous work and develop SRT3D, a sparse
region-based approach to 3D object tracking that bridges this gap in
efficiency. Our method considers image information sparsely along so-called
correspondence lines that model the probability of the object's contour
location. We thereby improve on the current state of the art and introduce
smoothed step functions that consider a defined global and local uncertainty.
For the resulting probabilistic formulation, a thorough analysis is provided.
Finally, we use a pre-rendered sparse viewpoint model to create a joint
posterior probability for the object pose. The function is maximized using
second-order Newton optimization with Tikhonov regularization. During the pose
estimation, we differentiate between global and local optimization, using a
novel approximation for the first-order derivative employed in the Newton
method. In multiple experiments, we demonstrate that the resulting algorithm
improves the current state of the art both in terms of runtime and quality,
performing particularly well for noisy and cluttered images encountered in the
real world.
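The abstract mentions Newton optimization with Tikhonov regularization but gives no formulas, so the following is only a minimal, generic sketch of that technique and not the authors' implementation. It assumes a small toy objective; the names `regularized_newton_step`, `maximize`, and the single scalar weight `lam` are hypothetical and chosen for illustration.

```python
import numpy as np


def regularized_newton_step(gradient, hessian, lam):
    """Return one Newton update for MAXIMIZING a function.

    Tikhonov regularization adds lam * I to the negated Hessian, which
    keeps the linear system well conditioned and damps the step.
    (Illustrative assumption: one scalar weight for all parameters.)
    """
    g = np.asarray(gradient, dtype=float)
    H = np.asarray(hessian, dtype=float)
    A = -H + lam * np.eye(g.size)
    return np.linalg.solve(A, g)


def maximize(grad_fn, hess_fn, theta0, lam=1e-3, iterations=20):
    """Repeat the regularized Newton update a fixed number of times."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(iterations):
        step = regularized_newton_step(grad_fn(theta), hess_fn(theta), lam)
        theta = theta + step
    return theta


# Toy concave objective f(theta) = -0.5 * (theta - m)^T P (theta - m),
# standing in for a log-posterior over pose parameters.
P = np.diag([2.0, 0.5])
m = np.array([1.0, -3.0])
grad_fn = lambda theta: -P @ (theta - m)
hess_fn = lambda theta: -P

theta_hat = maximize(grad_fn, hess_fn, theta0=np.zeros(2), lam=0.1)
```

A larger `lam` shortens each step and slows convergence, while `lam = 0` recovers the plain Newton step, which reaches the maximum of this quadratic toy objective in a single iteration.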
Related papers
- GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce GS-SLAM, which is the first to utilize a 3D Gaussian representation in a Simultaneous Localization and Mapping (SLAM) system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica and TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z)
- Volumetric Semantically Consistent 3D Panoptic Mapping [77.13446499924977]
We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at generating semantic 3D maps suitable for autonomous agents in unstructured environments.
It introduces novel ways of integrating semantic prediction confidence during mapping, producing semantic and instance-consistent 3D regions.
The proposed method achieves accuracy superior to the state of the art on public large-scale datasets, improving on a number of widely used metrics.
arXiv Detail & Related papers (2023-09-26T08:03:10Z)
- FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction [37.81077373162092]
Reconstructing an accurate 3D object model from a few image observations remains a challenging problem in computer vision.
We present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses.
arXiv Detail & Related papers (2022-05-16T15:39:27Z)
- Stereo Neural Vernier Caliper [57.187088191829886]
We propose a new object-centric framework for learning-based stereo 3D object detection.
We tackle the problem of how to predict a refined update given an initial 3D cuboid guess.
Our approach achieves state-of-the-art performance on the KITTI benchmark.
arXiv Detail & Related papers (2022-03-21T14:36:07Z)
- LocATe: End-to-end Localization of Actions in 3D with Transformers [91.28982770522329]
LocATe is an end-to-end approach that jointly localizes and recognizes actions in a 3D sequence.
Unlike transformer-based object-detection and classification models which consider image or patch features as input, LocATe's transformer model is capable of capturing long-term correlations between actions in a sequence.
We introduce a new, challenging, and more realistic benchmark dataset, BABEL-TAL-20 (BT20), where the performance of state-of-the-art methods is significantly worse.
arXiv Detail & Related papers (2022-03-21T03:35:32Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z)
- Factor Graph based 3D Multi-Object Tracking in Point Clouds [8.411514688735183]
We propose a novel optimization-based approach that does not rely on explicit and fixed assignments.
We demonstrate its performance on the real-world KITTI tracking dataset and achieve better results than many state-of-the-art algorithms.
arXiv Detail & Related papers (2020-08-12T13:34:46Z)
- Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking [34.40019455462043]
We propose a joint spatial-temporal optimization-based stereo 3D object tracking method.
From the network, we detect corresponding 2D bounding boxes on adjacent images and regress an initial 3D bounding box.
Dense object cues associated with the object centroid are then predicted using a region-based network.
arXiv Detail & Related papers (2020-04-20T13:59:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.