Detecting Invisible People
- URL: http://arxiv.org/abs/2012.08419v1
- Date: Tue, 15 Dec 2020 16:54:45 GMT
- Title: Detecting Invisible People
- Authors: Tarasha Khurana, Achal Dave, Deva Ramanan
- Abstract summary: We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
We also build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
- Score: 58.49425715635312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular object detection and tracking have improved drastically in recent
years, but rely on a key assumption: that objects are visible to the camera.
Many offline tracking approaches reason about occluded objects post-hoc, by
linking together tracklets after the object re-appears, making use of
reidentification (ReID). However, online tracking in embodied robotic agents
(such as a self-driving vehicle) fundamentally requires object permanence,
which is the ability to reason about occluded objects before they re-appear. In
this work, we re-purpose tracking benchmarks and propose new metrics for the
task of detecting invisible objects, focusing on the illustrative case of
people. We demonstrate that current detection and tracking systems perform
dramatically worse on this task. We introduce two key innovations to recover
much of this performance drop. We treat occluded object detection in temporal
sequences as a short-term forecasting challenge, bringing to bear tools from
dynamic sequence prediction. Second, we build dynamic models that explicitly
reason in 3D, making use of observations produced by state-of-the-art monocular
depth estimation networks. To our knowledge, ours is the first work to
demonstrate the effectiveness of monocular depth estimation for the task of
tracking and detecting occluded objects. Our approach strongly improves by
11.4% over the baseline in ablations and by 5.0% over the state-of-the-art in
F1 score.
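To make the abstract's two ideas concrete, below is a minimal, hypothetical sketch (not the authors' implementation): a 2D detection is back-projected to camera-frame 3D using an estimated depth and assumed pinhole intrinsics (fx, fy, cx, cy), a simple constant-velocity model forecasts the 3D position while the person is occluded, and the forecast is re-projected to 2D. The paper's actual dynamic models and evaluation metrics are more involved; all function names and numbers here are illustrative.

```python
# Illustrative sketch (not the paper's exact formulation): lift 2D detections
# to 3D with a monocular depth estimate, then forecast the 3D position with a
# simple constant-velocity model while the person is occluded.
import numpy as np

def lift_to_3d(center_xy, depth, fx, fy, cx, cy):
    """Back-project a 2D pixel location to camera-frame 3D using depth."""
    u, v = center_xy
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def project_to_2d(point_3d, fx, fy, cx, cy):
    """Project a camera-frame 3D point back to pixel coordinates."""
    x, y, z = point_3d
    return np.array([fx * x / z + cx, fy * y / z + cy])

def forecast_occluded(track_3d, n_steps):
    """Constant-velocity forecast from the last observed 3D positions.

    track_3d: (T, 3) array of 3D centers from visible frames.
    Returns (n_steps, 3) predicted centers for the occluded frames.
    """
    velocity = track_3d[-1] - track_3d[-2]        # per-frame displacement
    steps = np.arange(1, n_steps + 1)[:, None]    # (n_steps, 1)
    return track_3d[-1] + steps * velocity        # broadcast to (n_steps, 3)

# Toy usage with made-up intrinsics and depths: a person walking away,
# then occluded for 3 frames.
fx = fy = 1000.0; cx, cy = 640.0, 360.0
observed = np.stack([
    lift_to_3d((700, 400), 5.0, fx, fy, cx, cy),
    lift_to_3d((690, 398), 5.3, fx, fy, cx, cy),
])
for p3d in forecast_occluded(observed, n_steps=3):
    print("predicted 2D center:", project_to_2d(p3d, fx, fy, cx, cy))
```

Reasoning in metric 3D rather than pixel space is what allows a simple motion prior like constant velocity to remain sensible when apparent image motion is dominated by depth changes.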
Related papers
- Uncertainty Estimation for 3D Object Detection via Evidential Learning [63.61283174146648]
We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector.
We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections.
arXiv Detail & Related papers (2024-10-31T13:13:32Z)
- Leveraging Object Priors for Point Tracking [25.030407197192]
Point tracking is a fundamental problem in computer vision with numerous applications in AR and robotics.
We propose a novel objectness regularization approach that guides points to be aware of object priors.
Our approach achieves state-of-the-art performance on three point tracking benchmarks.
arXiv Detail & Related papers (2024-09-09T16:48:42Z)
- Towards Unsupervised Object Detection From LiDAR Point Clouds [46.57452180314863]
OYSTER (Object Discovery via Spatio-Temporal Refinement) is able to detect objects in a zero-shot manner without supervised finetuning.
We propose a new planning-centric perception metric based on distance-to-collision.
arXiv Detail & Related papers (2023-11-03T16:12:01Z)
- Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection [50.959453059206446]
This paper aims for high-performance offline LiDAR-based 3D object detection.
We first observe that experienced human annotators annotate objects from a track-centric perspective.
We propose a high-performance offline detector in a track-centric perspective instead of the conventional object-centric perspective.
arXiv Detail & Related papers (2023-04-24T17:59:05Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- Long Range Object-Level Monocular Depth Estimation for UAVs [0.0]
We propose several novel extensions to state-of-the-art methods for monocular object detection from images at long range.
Firstly, we propose Sigmoid and ReLU-like encodings when modeling depth estimation as a regression task.
Secondly, we frame the depth estimation as a classification problem and introduce a Soft-Argmax function in the calculation of the training loss (a minimal sketch of the soft-argmax idea appears after this list).
arXiv Detail & Related papers (2023-02-17T15:26:04Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- ArTIST: Autoregressive Trajectory Inpainting and Scoring for Tracking [80.02322563402758]
One of the core components in online multiple object tracking (MOT) frameworks is associating new detections with existing tracklets.
We introduce a probabilistic autoregressive generative model to score tracklet proposals by directly measuring the likelihood that a tracklet represents natural motion.
arXiv Detail & Related papers (2020-04-16T06:43:11Z)
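The Soft-Argmax formulation mentioned in the "Long Range Object-Level Monocular Depth Estimation for UAVs" entry above can be illustrated generically: depth is discretized into bins, the network predicts per-bin logits, and the expected depth is the softmax-weighted average of the bin centers, which keeps a classification-style output differentiable with respect to a continuous depth value. The sketch below is an assumed, generic rendering; the bin layout, temperature, and names are illustrative and not the cited paper's exact loss.

```python
# Generic soft-argmax over depth bins (illustrative; not the cited paper's exact loss).
import numpy as np

def soft_argmax_depth(logits, bin_centers, temperature=1.0):
    """Softmax-weighted average of depth-bin centers.

    logits: (..., K) per-bin scores from the network.
    bin_centers: (K,) representative depth of each bin, in meters.
    """
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)            # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return (probs * bin_centers).sum(axis=-1)        # expected depth

# Example: 8 bins spanning 5-200 m; a prediction peaked at the third bin
# yields an expected depth close to that bin's center.
bin_centers = np.linspace(5.0, 200.0, 8)
logits = np.array([0.1, 1.0, 4.0, 1.5, 0.2, 0.0, 0.0, 0.0])
print(soft_argmax_depth(logits, bin_centers))
```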
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.