Digital Twin Tracking Dataset (DTTD): A New RGB+Depth 3D Dataset for
Longer-Range Object Tracking Applications
- URL: http://arxiv.org/abs/2302.05991v2
- Date: Tue, 11 Apr 2023 20:31:38 GMT
- Title: Digital Twin Tracking Dataset (DTTD): A New RGB+Depth 3D Dataset for
Longer-Range Object Tracking Applications
- Authors: Weiyu Feng, Seth Z. Zhao, Chuanyu Pan, Adam Chang, Yichen Chen, Zekun
Wang, Allen Y. Yang
- Abstract summary: Digital twin is a problem of augmenting real objects with their digital counterparts.
A critical component in a good digital-twin system is real-time, accurate 3D object tracking.
In this work, we create a novel RGB-D dataset, called Digital Twin Tracking dataset (DTTD)
- Score: 3.9776693020673677
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Digital twin is a problem of augmenting real objects with their digital
counterparts. It can underpin a wide range of applications in augmented reality
(AR), autonomy, and UI/UX. A critical component in a good digital-twin system
is real-time, accurate 3D object tracking. Most existing works solve 3D object
tracking through the lens of robotic grasping, employ older generations of
depth sensors, and measure performance metrics that may not apply to other
digital-twin applications such as in AR. In this work, we create a novel RGB-D
dataset, called Digital Twin Tracking Dataset (DTTD), to enable further
research of the problem and extend potential solutions towards longer ranges
and mm localization accuracy. To reduce point cloud noise from the input
source, we select the latest Microsoft Azure Kinect as the state-of-the-art
time-of-flight (ToF) camera. In total, 103 scenes of 10 common off-the-shelf
objects with rich textures are recorded, with each frame annotated with a
per-pixel semantic segmentation and ground-truth object poses provided by a
commercial motion capturing system. Through extensive experiments with
model-level and dataset-level analysis, we demonstrate that DTTD can help
researchers develop future object tracking methods and analyze new challenges.
The dataset, data generation, annotation, and model evaluation pipeline are
made publicly available as open source code at:
https://github.com/augcog/DTTDv1.
Related papers
- Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes [65.22070581594426]
"Implicit-Zoo" is a large-scale dataset requiring thousands of GPU training days to facilitate research and development in this field.
We showcase two immediate benefits as it enables to: (1) learn token locations for transformer models; (2) directly regress 3D cameras poses of 2D images with respect to NeRF models.
This in turn leads to an improved performance in all three task of image classification, semantic segmentation, and 3D pose regression, thereby unlocking new avenues for research.
arXiv Detail & Related papers (2024-06-25T10:20:44Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - 3D Data Augmentation for Driving Scenes on Camera [50.41413053812315]
We propose a 3D data augmentation approach termed Drive-3DAug, aiming at augmenting the driving scenes on camera in the 3D space.
We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects.
Then, augmented driving scenes can be obtained by placing the 3D objects with adapted location and orientation at the pre-defined valid region of backgrounds.
arXiv Detail & Related papers (2023-03-18T05:51:05Z) - PC-DAN: Point Cloud based Deep Affinity Network for 3D Multi-Object
Tracking (Accepted as an extended abstract in JRDB-ACT Workshop at CVPR21) [68.12101204123422]
A point cloud is a dense compilation of spatial data in 3D coordinates.
We propose a PointNet-based approach for 3D Multi-Object Tracking (MOT)
arXiv Detail & Related papers (2021-06-03T05:36:39Z) - Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z) - Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z) - Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking
from View Aggregation [8.854112907350624]
3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which is lack of reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
arXiv Detail & Related papers (2020-11-25T16:14:40Z) - 1st Place Solution for Waymo Open Dataset Challenge -- 3D Detection and
Domain Adaptation [7.807118356899879]
We propose a one-stage, anchor-free and NMS-free 3D point cloud object detector AFDet.
AFDet serves as a strong baseline in our winning solution.
We design stronger networks and enhance the point cloud data using densification and point painting.
arXiv Detail & Related papers (2020-06-28T04:49:39Z) - JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset [34.609125601292]
We present JRMOT, a novel 3D MOT system that integrates information from RGB images and 3D point clouds to achieve real-time tracking performance.
As part of our work, we release the JRDB dataset, a novel large scale 2D+3D dataset and benchmark.
The presented 3D MOT system demonstrates state-of-the-art performance against competing methods on the popular 2D tracking KITTI benchmark.
arXiv Detail & Related papers (2020-02-19T19:21:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.