Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking
- URL: http://arxiv.org/abs/2004.09305v1
- Date: Mon, 20 Apr 2020 13:59:46 GMT
- Title: Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking
- Authors: Peiliang Li, Jieqi Shi, Shaojie Shen
- Abstract summary: We propose a joint spatial-temporal optimization-based stereo 3D object tracking method.
From the network, we detect corresponding 2D bounding boxes on adjacent images and regress an initial 3D bounding box.
Dense object cues associated with the object centroid are then predicted using a region-based network.
- Score: 34.40019455462043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Directly learning multiple 3D objects motion from sequential images is
difficult, while the geometric bundle adjustment lacks the ability to localize
the invisible object centroid. To benefit from the powerful object
understanding capability of deep neural networks while retaining precise
geometric modeling for consistent trajectory estimation, we propose a joint
spatial-temporal optimization-based stereo 3D object tracking method. From the
network, we detect corresponding 2D bounding boxes on adjacent images and
regress an initial 3D bounding box. Dense object cues (local depth and local
coordinates) associated with the object centroid are then predicted using a
region-based network. Considering both instantaneous localization accuracy and
motion consistency, our optimization models the relations between the object
centroid and observed cues into a joint spatial-temporal error function. All
historical cues are summarized into the current estimate via a per-frame
marginalization strategy, without repeated computation. Quantitative
evaluation on the KITTI tracking dataset shows our approach outperforms
previous image-based 3D tracking methods by significant margins. We also report
extensive results on multiple categories and larger datasets (KITTI raw and
Argoverse Tracking) for future benchmarking.
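As a rough illustration of the abstract's core idea (not the authors' implementation), the sketch below poses a toy joint spatial-temporal error: per-frame spatial residuals tie the estimated centroid to observed cues, while temporal residuals enforce motion consistency via a constant-velocity prior. The function name, weights, and the linear 1D-per-axis formulation are all simplifying assumptions; the paper's actual optimization operates on dense stereo cues with marginalization.

```python
import numpy as np

def joint_spatial_temporal_estimate(obs, w_spatial=1.0, w_temporal=4.0):
    """Toy joint spatial-temporal least squares (hypothetical simplification).

    obs: (T, 3) noisy per-frame centroid observations (the "spatial cues").
    Spatial residuals:  w_spatial * (x_t - obs_t)            -> localization
    Temporal residuals: w_temporal * (x_{t+1} - 2x_t + x_{t-1}) -> motion
    Returns the (T, 3) trajectory minimizing the stacked error function.
    """
    T = obs.shape[0]
    rows, rhs = [], []
    # Spatial term: each frame's centroid should match its observed cue.
    for t in range(T):
        r = np.zeros(T)
        r[t] = w_spatial
        rows.append(r)
        rhs.append(w_spatial * obs[t])
    # Temporal term: penalize deviation from constant-velocity motion.
    for t in range(1, T - 1):
        r = np.zeros(T)
        r[t - 1], r[t], r[t + 1] = w_temporal, -2.0 * w_temporal, w_temporal
        rows.append(r)
        rhs.append(np.zeros(3))
    A = np.stack(rows)          # (2T - 2, T) stacked residual Jacobian
    b = np.stack(rhs)           # (2T - 2, 3) stacked targets
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x                    # (T, 3) smoothed centroid trajectory

# Example: denoise observations of a linearly moving object.
rng = np.random.default_rng(0)
truth = np.stack([np.linspace(0.0, 10.0, 20)] * 3, axis=1)
noisy = truth + 0.3 * rng.standard_normal((20, 3))
smoothed = joint_spatial_temporal_estimate(noisy)
```

In the paper, rather than re-solving over all frames each time as this toy does, historical residuals are folded into a prior on the current state by per-frame marginalization, keeping per-frame cost bounded.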
Related papers
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- OriCon3D: Effective 3D Object Detection using Orientation and Confidence [0.0]
We propose an advanced methodology for the detection of 3D objects from a single image.
We use a deep convolutional neural network-based 3D object weighted orientation regression paradigm.
Our approach significantly improves the accuracy of 3D object pose determination, surpassing baseline methodologies.
arXiv Detail & Related papers (2023-04-27T19:52:47Z)
- OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection [51.153003057515754]
OPA-3D is a single-stage, end-to-end, Occlusion-Aware Pixel-Wise Aggregation network.
It jointly estimates dense scene depth with depth-bounding box residuals and object bounding boxes.
It outperforms state-of-the-art methods on the main Car category.
arXiv Detail & Related papers (2022-11-02T14:19:13Z)
- 3D Multi-Object Tracking with Differentiable Pose Estimation [0.0]
We propose a novel approach for joint 3D multi-object tracking and reconstruction from RGB-D sequences in indoor environments.
We leverage those correspondences to inform a graph neural network to solve for the optimal, temporally-consistent 7-DoF pose trajectories of all objects.
Our method improves the accumulated MOTA score for all test sequences by 24.8% over existing state-of-the-art methods.
arXiv Detail & Related papers (2022-06-28T06:46:32Z)
- Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed as Homography Loss, is proposed to achieve the goal, which exploits both 2D and 3D information.
Our method yields the best performance compared with the other state-of-the-arts by a large margin on KITTI 3D datasets.
arXiv Detail & Related papers (2022-04-02T03:48:03Z)
- Joint stereo 3D object detection and implicit surface reconstruction [39.30458073540617]
We present a new learning-based framework S-3D-RCNN that can recover accurate object orientation in SO(3) and simultaneously predict implicit rigid shapes from stereo RGB images.
For orientation estimation, in contrast to previous studies that map local appearance to observation angles, we propose a progressive approach by extracting meaningful Intermediate Geometrical Representations (IGRs).
This approach features a deep model that transforms perceived intensities from one or two views to object part coordinates to achieve direct egocentric object orientation estimation in the camera coordinate system.
To further achieve a finer description inside 3D bounding boxes, we investigate the implicit shape estimation problem from stereo images.
arXiv Detail & Related papers (2021-11-25T05:52:30Z)
- M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than existing monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation [8.854112907350624]
3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
arXiv Detail & Related papers (2020-11-25T16:14:40Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of drawing effective samples is relatively small in 3D space.
We propose to start from an initial prediction and refine it gradually towards the ground truth, changing only one 3D parameter in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.