Detecting Invisible People
- URL: http://arxiv.org/abs/2012.08419v1
- Date: Tue, 15 Dec 2020 16:54:45 GMT
- Title: Detecting Invisible People
- Authors: Tarasha Khurana, Achal Dave, Deva Ramanan
- Abstract summary: We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
We also build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
- Score: 58.49425715635312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular object detection and tracking have improved drastically in recent
years, but rely on a key assumption: that objects are visible to the camera.
Many offline tracking approaches reason about occluded objects post-hoc, by
linking together tracklets after the object re-appears, making use of
reidentification (ReID). However, online tracking in embodied robotic agents
(such as a self-driving vehicle) fundamentally requires object permanence,
which is the ability to reason about occluded objects before they re-appear. In
this work, we re-purpose tracking benchmarks and propose new metrics for the
task of detecting invisible objects, focusing on the illustrative case of
people. We demonstrate that current detection and tracking systems perform
dramatically worse on this task. We introduce two key innovations to recover
much of this performance drop. First, we treat occluded object detection in
temporal sequences as a short-term forecasting challenge, bringing to bear
tools from dynamic sequence prediction. Second, we build dynamic models that explicitly
reason in 3D, making use of observations produced by state-of-the-art monocular
depth estimation networks. To our knowledge, ours is the first work to
demonstrate the effectiveness of monocular depth estimation for the task of
tracking and detecting occluded objects. Our approach improves over the
baseline by 11.4% in ablations and over the state-of-the-art by 5.0% in F1
score.
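The paper's second innovation, lifting 2D detections into 3D using monocular depth and forecasting the object through the occlusion, can be illustrated with a minimal sketch. This assumes a pinhole camera and a constant-velocity motion model; all function names and parameters here are illustrative, not the authors' implementation:

```python
import numpy as np

def backproject(u, v, z, fx, fy, cx, cy):
    """Lift a pixel (u, v) with estimated depth z into camera-frame 3D."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def project(p, fx, fy, cx, cy):
    """Project a camera-frame 3D point back to pixel coordinates."""
    x, y, z = p
    return np.array([fx * x / z + cx, fy * y / z + cy])

def forecast_occluded(track_2d, depths, intrinsics, n_steps):
    """Forecast a person's 2D position for n_steps frames after occlusion.

    track_2d:   list of (u, v) box centers from the visible frames.
    depths:     per-frame monocular depth estimates at those centers.
    intrinsics: pinhole parameters (fx, fy, cx, cy).
    Assumes constant 3D velocity over the visible history.
    """
    fx, fy, cx, cy = intrinsics
    pts = np.array([backproject(u, v, z, fx, fy, cx, cy)
                    for (u, v), z in zip(track_2d, depths)])
    # Average 3D displacement per frame over the visible history.
    vel = (pts[-1] - pts[0]) / (len(pts) - 1)
    # Extrapolate in 3D, then project each forecast back into the image.
    return [project(pts[-1] + (k + 1) * vel, fx, fy, cx, cy)
            for k in range(n_steps)]
```

Forecasting in 3D captures perspective effects that a purely 2D motion model misses: a person walking away from the camera traces a decelerating pixel trajectory, which a 2D constant-velocity extrapolation would overshoot, while the 3D model handles it naturally.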
Related papers
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- Towards Unsupervised Object Detection From LiDAR Point Clouds [46.57452180314863]
OYSTER (Object Discovery via Spatio-Temporal Refinement) is able to detect objects in a zero-shot manner without supervised finetuning.
We propose a new planning-centric perception metric based on distance-to-collision.
arXiv Detail & Related papers (2023-11-03T16:12:01Z)
- Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection [50.959453059206446]
This paper aims for high-performance offline LiDAR-based 3D object detection.
We first observe that experienced human annotators annotate objects from a track-centric perspective.
We propose a high-performance offline detector that adopts a track-centric perspective instead of the conventional object-centric perspective.
arXiv Detail & Related papers (2023-04-24T17:59:05Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- Long Range Object-Level Monocular Depth Estimation for UAVs [0.0]
We propose several novel extensions to state-of-the-art methods for monocular object detection from images at long range.
Firstly, we propose Sigmoid and ReLU-like encodings when modeling depth estimation as a regression task.
Secondly, we frame the depth estimation as a classification problem and introduce a Soft-Argmax function in the calculation of the training loss.
arXiv Detail & Related papers (2023-02-17T15:26:04Z)
- Object Permanence in Object Detection Leveraging Temporal Priors at Inference Time [11.255962936937744]
We introduce explicit object permanence into two stage detection approaches drawing inspiration from particle filters.
Our detector uses the predictions of previous frames as additional proposals for the current one at inference time.
Experiments confirm that the feedback loop improves detection performance by up to 10.3 mAP with little computational overhead.
arXiv Detail & Related papers (2022-11-28T16:24:08Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- ArTIST: Autoregressive Trajectory Inpainting and Scoring for Tracking [80.02322563402758]
One of the core components in online multiple object tracking (MOT) frameworks is associating new detections with existing tracklets.
We introduce a probabilistic autoregressive generative model to score tracklet proposals by directly measuring the likelihood that a tracklet represents natural motion.
arXiv Detail & Related papers (2020-04-16T06:43:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.