TrackAgent: 6D Object Tracking via Reinforcement Learning
- URL: http://arxiv.org/abs/2307.15671v1
- Date: Fri, 28 Jul 2023 17:03:00 GMT
- Title: TrackAgent: 6D Object Tracking via Reinforcement Learning
- Authors: Konstantin Röhrl, Dominik Bauer, Timothy Patten, and Markus Vincze
- Abstract summary: We propose to simplify object tracking to a reinforced point cloud (depth only) alignment task.
This allows us to train a streamlined approach from scratch with limited amounts of sparse 3D point clouds.
We also show that the RL agent's uncertainty and a rendering-based mask propagation are effective reinitialization triggers.
- Score: 24.621588217873395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tracking an object's 6D pose, while either the object itself or the observing
camera is moving, is important for many robotics and augmented reality
applications. While exploiting temporal priors eases this problem,
object-specific knowledge is required to recover when tracking is lost. Under
the tight time constraints of the tracking task, RGB(D)-based methods are often
conceptually complex or rely on heuristic motion models. In comparison, we
propose to simplify object tracking to a reinforced point cloud (depth only)
alignment task. This allows us to train a streamlined approach from scratch
with limited amounts of sparse 3D point clouds, compared to the large datasets
of diverse RGBD sequences required in previous works. We incorporate temporal
frame-to-frame registration with object-based recovery by frame-to-model
refinement using a reinforcement learning (RL) agent that jointly solves for
both objectives. We also show that the RL agent's uncertainty and a
rendering-based mask propagation are effective reinitialization triggers.
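The idea of iterative alignment with an uncertainty-based reinitialization trigger can be illustrated with a toy sketch. This is not the authors' implementation: the learned RL agent is replaced by a hypothetical stand-in policy (a softmax over negative alignment error), only translations are considered, point correspondences are assumed known, and the names `stand_in_policy`, `track_frame`, `STEP`, and the entropy threshold are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical discrete action set: "stay" plus a small step along each axis.
STEP = 0.01
ACTIONS = np.vstack([np.zeros(3), STEP * np.eye(3), -STEP * np.eye(3)])  # (7, 3)


def stand_in_policy(source, target, temperature=2e-5):
    """Placeholder for a learned agent: softmax over negative alignment error.

    Assumes source[i] corresponds to target[i], which a real tracker
    would not know.
    """
    errors = np.array(
        [np.mean(np.sum((source + a - target) ** 2, axis=1)) for a in ACTIONS]
    )
    logits = -(errors - errors.min()) / temperature
    probs = np.exp(logits)
    return probs / probs.sum()


def normalized_entropy(probs):
    """Shannon entropy of a categorical distribution, scaled to [0, 1]."""
    probs = np.asarray(probs, dtype=float)
    nonzero = probs[probs > 0]
    return float(-(nonzero * np.log(nonzero)).sum() / np.log(len(probs)))


def track_frame(source, target, policy=stand_in_policy,
                iterations=10, reinit_threshold=0.95):
    """Refine a translation step by step; flag a reinitialization request
    when the action distribution is close to uniform (high uncertainty)."""
    offset = np.zeros(3)
    for _ in range(iterations):
        probs = policy(source + offset, target)
        if normalized_entropy(probs) > reinit_threshold:
            return offset, True  # tracking considered lost
        offset = offset + ACTIONS[np.argmax(probs)]
    return offset, False


rng = np.random.default_rng(0)
model_points = rng.normal(size=(50, 3))
observed_points = model_points + np.array([0.03, -0.02, 0.0])
estimated_offset, lost = track_frame(model_points, observed_points)
```

In this sketch a near-uniform action distribution stands in for the agent's uncertainty; the rendering-based mask propagation mentioned in the abstract is not modeled here.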
Related papers
- SuperPose: Improved 6D Pose Estimation with Robust Tracking and Mask-Free Initialization [5.298176595324931]
We developed a robust solution for real-time 6D object detection in industrial applications by integrating FoundationPose, SAM2, and LightGlue.
The algorithm requires only a CAD model of the target object, with the user clicking on its location in the live feed during the initial setup.
Tested on the YCB dataset and industrial components such as bleach cleanser and gears, the algorithm demonstrated reliable 6D detection and tracking.
arXiv Detail & Related papers (2024-09-30T06:26:49Z)
- Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking [2.487142846438629]
3D single object tracking within LiDAR point clouds is a pivotal task in computer vision.
Existing methods, which depend solely on appearance matching via networks or utilize information from successive frames, encounter significant challenges.
We design an innovative cross-frame bi-temporal motion tracker, named STMD-Tracker, to mitigate these challenges.
arXiv Detail & Related papers (2024-03-23T13:15:44Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
- 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation [0.0]
3D-FCT is a Siamese network architecture that utilizes temporal information to simultaneously perform the related tasks of 3D object detection and tracking.
Our proposed method is evaluated on the KITTI tracking dataset where it is shown to provide an improvement of 5.57% mAP over a state-of-the-art approach.
arXiv Detail & Related papers (2021-10-06T06:36:29Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
We also build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z)
- Tracking from Patterns: Learning Corresponding Patterns in Point Clouds for 3D Object Tracking [34.40019455462043]
We propose to learn 3D object correspondences from temporal point cloud data and infer the motion information from correspondence patterns.
Our method outperforms existing 3D tracking methods on both the KITTI dataset and the larger-scale nuScenes dataset.
arXiv Detail & Related papers (2020-10-20T06:07:20Z)
- How to track your dragon: A Multi-Attentional Framework for real-time RGB-D 6-DOF Object Pose Tracking [35.21561169636035]
We present a novel multi-attentional convolutional architecture to tackle the problem of real-time RGB-D 6D object pose tracking.
We consider the special geometrical properties of both the object's 3D model and the pose space, and we use a more sophisticated approach for data augmentation during training.
arXiv Detail & Related papers (2020-04-21T23:00:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.