A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling
- URL: http://arxiv.org/abs/2001.04118v4
- Date: Tue, 27 Oct 2020 10:06:49 GMT
- Title: A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling
- Authors: Jonah Ong, Ba Tuong Vo, Ba Ngu Vo, Du Yong Kim, Sven Nordholm
- Abstract summary: The proposed algorithm has a linear complexity in the total number of detections across the cameras.
It operates in the 3D world frame, and provides 3D trajectory estimates of the objects.
The proposed algorithm is evaluated on the latest WILDTRACK dataset, and demonstrated to work in very crowded scenes.
- Score: 2.824395407508717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes an online multi-camera multi-object tracker that only
requires monocular detector training, independent of the multi-camera
configurations, allowing seamless extension/deletion of cameras without
retraining effort. The proposed algorithm has a linear complexity in the total
number of detections across the cameras, and hence scales gracefully with the
number of cameras. It operates in the 3D world frame, and provides 3D
trajectory estimates of the objects. The key innovation is a high fidelity yet
tractable 3D occlusion model, amenable to optimal Bayesian multi-view
multi-object filtering, which seamlessly integrates, into a single Bayesian
recursion, the sub-tasks of track management, state estimation, clutter
rejection, and occlusion/misdetection handling. The proposed algorithm is
evaluated on the latest WILDTRACK dataset, and demonstrated to work in very
crowded scenes on a new dataset.
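To make the recursion more concrete, below is a minimal, hedged sketch (not the authors' implementation) of a per-camera Bayesian update of the kind the abstract describes: a constant-velocity Kalman filter in the world frame plus a Bernoulli existence probability whose detection probability is scaled by an occlusion factor. The ground-plane geometry, the noise values, and the scalar occlusion factor are simplifying assumptions, not the paper's high-fidelity 3D occlusion model; the point of the sketch is that each camera's update is a single linear pass over that camera's detections.

```python
import numpy as np

DT = 1.0                                               # frame period [s]
F = np.block([[np.eye(2), DT * np.eye(2)],
              [np.zeros((2, 2)), np.eye(2)]])          # constant-velocity motion on the ground plane
Q = 0.05 * np.eye(4)                                   # process noise
H = np.hstack([np.eye(2), np.zeros((2, 2))])           # we observe ground-plane position only
R = 0.25 * np.eye(2)                                   # per-camera measurement noise

class Track:
    def __init__(self, xy):
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])    # [x, y, vx, vy] in the world frame
        self.P = np.eye(4)
        self.r = 0.5                                   # existence probability (track management)

    def predict(self, p_survive=0.99):
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q
        self.r = p_survive * self.r

    def update_camera(self, detections_xy, p_detect, occlusion_factor, clutter_density=1e-3):
        """One camera's update: a single linear pass over that camera's detections."""
        pd = p_detect * occlusion_factor               # occlusion lowers the effective detection probability
        S = H @ self.P @ H.T + R
        S_inv = np.linalg.inv(S)
        norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(S)))
        best_like, best_z = 0.0, None
        for z in detections_xy:                        # linear in the number of detections
            d = z - H @ self.x
            like = norm * np.exp(-0.5 * d @ S_inv @ d)
            if like > best_like:
                best_like, best_z = like, z
        if best_z is None or best_like < 1e-9:         # no plausible detection: occlusion / misdetection
            self.r = self.r * (1 - pd) / (1 - self.r * pd)
            return
        K = self.P @ H.T @ S_inv                       # Kalman update with the best-matching detection
        self.x = self.x + K @ (best_z - H @ self.x)
        self.P = (np.eye(4) - K @ H) @ self.P
        delta = (1 - pd) + pd * best_like / clutter_density   # clutter rejection via a likelihood ratio
        self.r = self.r * delta / (1 - self.r + self.r * delta)

# Toy usage: one track seen by two cameras, the second partially occluded.
track = Track(xy=(0.0, 0.0))
camera_detections = {"cam0": [np.array([0.1, -0.2]), np.array([5.0, 5.0])],
                     "cam1": [np.array([0.2, 0.1])]}
occlusion = {"cam0": 1.0, "cam1": 0.4}                 # 0.4: largely occluded in cam1
track.predict()
for cam, dets in camera_detections.items():
    track.update_camera(dets, p_detect=0.9, occlusion_factor=occlusion[cam])
print("state:", np.round(track.x, 3), "existence:", round(track.r, 3))
```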
Related papers
- RockTrack: A 3D Robust Multi-Camera Multi-Object Tracking Framework [28.359633046753228]
We propose RockTrack, a 3D MOT method for multi-camera detectors.
RockTrack incorporates a confidence-guided preprocessing module to extract reliable motion and image observations.
RockTrack achieves state-of-the-art performance on the nuScenes vision-only tracking leaderboard with 59.1% AMOTA.
arXiv Detail & Related papers (2024-09-18T07:08:08Z) - Track Initialization and Re-Identification for~3D Multi-View Multi-Object Tracking [12.389483990547223]
We propose a 3D multi-object tracking (MOT) solution using only 2D detections from monocular cameras.
We exploit the 2D detections and extracted features from multiple cameras to provide a better approximation of the multi-object filtering density.
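As a rough illustration of this kind of multi-camera fusion (the ground-plane mapping and the covariances below are assumptions for the example, not the paper's model), evidence from each camera can be expressed as a Gaussian in the world frame and combined as a product of Gaussians, which tightens the approximation of the filtering density:

```python
import numpy as np

def fuse_ground_plane_gaussians(means, covs):
    """Product of Gaussians in information (inverse-covariance) form."""
    info = np.zeros((2, 2))
    info_mean = np.zeros(2)
    for m, C in zip(means, covs):
        C_inv = np.linalg.inv(C)
        info += C_inv
        info_mean += C_inv @ m
    fused_cov = np.linalg.inv(info)
    return fused_cov @ info_mean, fused_cov

# Two cameras see the same person; each 2D detection has been back-projected to the
# ground plane with its own uncertainty (broad along the viewing ray, tight across it).
m_cam0, C_cam0 = np.array([2.0, 1.0]), np.diag([0.04, 0.50])
m_cam1, C_cam1 = np.array([2.1, 0.9]), np.diag([0.50, 0.04])
mean, cov = fuse_ground_plane_gaussians([m_cam0, m_cam1], [C_cam0, C_cam1])
print("fused position:", np.round(mean, 3))
print("fused covariance diagonal:", np.round(np.diag(cov), 3))
```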
arXiv Detail & Related papers (2024-05-28T21:36:16Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - Scatter Points in Space: 3D Detection from Multi-view Monocular Images [8.71944437852952]
3D object detection from monocular images is a challenging and long-standing problem in computer vision.
Recent methods tend to aggregate multi-view features by densely sampling a regular 3D grid in space.
We propose a learnable keypoint sampling method that scatters pseudo surface points in 3D space to preserve data sparsity.
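The sketch below illustrates, under assumed camera matrices and feature-map shapes (all names and values here are illustrative, not taken from the paper), why scattering a sparse set of pseudo surface points is cheaper than densely sampling a regular grid: multi-view features are gathered only at the projections of the sparse points.

```python
import numpy as np

def gather_multiview_features(points_3d, feature_maps, projections):
    """Project each 3D point into every view and average the sampled features."""
    gathered = []
    for feat, P in zip(feature_maps, projections):      # one (H, W, C) map and 3x4 matrix per view
        homog = np.hstack([points_3d, np.ones((len(points_3d), 1))])
        uvw = (P @ homog.T).T
        uv = uvw[:, :2] / uvw[:, 2:3]                   # perspective division to pixel coordinates
        u = np.clip(np.round(uv[:, 0]).astype(int), 0, feat.shape[1] - 1)
        v = np.clip(np.round(uv[:, 1]).astype(int), 0, feat.shape[0] - 1)
        gathered.append(feat[v, u])                     # nearest-pixel sampling for brevity
    return np.mean(gathered, axis=0)                    # (N_points, C) fused features

rng = np.random.default_rng(0)
feature_maps = [rng.standard_normal((60, 80, 16)) for _ in range(3)]
P = np.array([[50.0, 0.0, 0.0, 40.0],                   # assumed toy projection matrix
              [0.0, 50.0, 0.0, 30.0],
              [0.0, 0.0, 0.2, 1.0]])
projections = [P, P, P]
sparse_points = rng.uniform(-0.5, 0.5, size=(256, 3))   # scattered pseudo surface points
dense_grid = np.stack(np.meshgrid(*[np.linspace(-0.5, 0.5, 32)] * 3), -1).reshape(-1, 3)
print("sparse samples:", len(sparse_points), "vs dense grid samples:", len(dense_grid))
print("fused feature shape:", gather_multiview_features(sparse_points, feature_maps, projections).shape)
```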
arXiv Detail & Related papers (2022-08-31T09:38:05Z) - A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z) - MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z) - Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present a multi-view 3D pose estimation approach based on plane sweep stereo that jointly addresses cross-view fusion and 3D pose reconstruction in a single shot.
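A simplified sketch of the plane-sweep idea (the camera setup and scoring below are illustrative assumptions, not the paper's network): a joint detected in a reference view is swept over candidate depths, each hypothesis is reprojected into the other views, and the depth with the best cross-view agreement is kept, avoiding explicit cross-view matching.

```python
import numpy as np

def project(P, X):
    """Pinhole projection of a 3D point X with a 3x4 camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def backproject(K, uv, depth):
    """Lift a pixel in the reference view to a 3D point at the given depth."""
    return depth * np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])

def plane_sweep_depth(uv_ref, K_ref, other_cams, depth_candidates):
    best_depth, best_cost = None, np.inf
    for d in depth_candidates:                          # sweep over fronto-parallel depth planes
        X = backproject(K_ref, uv_ref, d)               # 3D hypothesis in the reference frame
        cost = 0.0
        for P, joints_2d in other_cams:                 # score agreement in every other view
            reproj = project(P, X)
            cost += min(np.linalg.norm(reproj - j) for j in joints_2d)
        if cost < best_cost:
            best_depth, best_cost = d, cost
    return best_depth

# Toy setup: reference camera at the origin, a second camera translated along x.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
X_true = np.array([0.2, -0.1, 4.0])                     # ground-truth joint, 4 m away
P_ref = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_other = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
uv_ref = project(P_ref, X_true)
uv_other = project(P_other, X_true)
depth = plane_sweep_depth(uv_ref, K, [(P_other, [uv_other])], np.linspace(2.0, 6.0, 81))
print("estimated depth:", round(float(depth), 2))        # should be close to 4.0
```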
arXiv Detail & Related papers (2021-04-06T03:49:35Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of drawing effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
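The refinement loop can be sketched as follows; the greedy policy here is only a stand-in for the learned RL agent, and the 7-parameter box, step size, and reward are illustrative assumptions rather than the paper's design.

```python
import numpy as np

PARAMS = ["x", "y", "z", "w", "h", "l", "yaw"]          # assumed 7-parameter 3D bounding box
STEP = 0.1

def apply_action(box, action):
    """An action changes exactly one parameter by +/- STEP."""
    idx, direction = action
    new_box = box.copy()
    new_box[idx] += direction * STEP
    return new_box

def greedy_policy(box, target):
    """Placeholder for the learned policy: move the single worst parameter."""
    idx = int(np.argmax(np.abs(target - box)))
    return idx, float(np.sign(target[idx] - box[idx]))

def refine(box, target, n_steps=20):
    start_err = np.abs(target - box).sum()
    for _ in range(n_steps):
        box = apply_action(box, greedy_policy(box, target))
    reward = start_err - np.abs(target - box).sum()     # episode reward: total error reduced over several steps
    return box, reward

initial = np.array([1.0, 0.5, 10.0, 1.6, 1.5, 3.8, 0.0])   # rough monocular prediction
truth = np.array([1.3, 0.4, 10.6, 1.7, 1.5, 4.0, 0.1])     # ground-truth box (available only in training)
refined, reward = refine(initial, truth)
print("refined box:", np.round(refined, 2), "episode reward:", round(float(reward), 2))
```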
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - siaNMS: Non-Maximum Suppression with Siamese Networks for Multi-Camera
3D Object Detection [65.03384167873564]
A siamese network is integrated into the pipeline of a well-known 3D object detector.
The associations it produces are exploited to enhance the 3D box regression of the object.
The experimental evaluation on the nuScenes dataset shows that the proposed method outperforms traditional NMS approaches.
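A toy sketch of embedding-driven cross-camera suppression in this spirit (the embeddings, similarity threshold, and merging rule are illustrative assumptions, not the siaNMS implementation): detections whose siamese embeddings are close are merged as one physical object instead of being kept twice.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def cross_camera_merge(detections, sim_threshold=0.8):
    """detections: list of dicts with a 3D 'box', a 'score', and an 'embedding'."""
    detections = sorted(detections, key=lambda d: d["score"], reverse=True)
    kept = []
    for det in detections:
        match = next((k for k in kept
                      if cosine_similarity(det["embedding"], k["embedding"]) > sim_threshold), None)
        if match is None:
            kept.append(dict(det))
        else:                                           # same object seen from another camera:
            w = match["score"] + det["score"]           # confidence-weighted box fusion
            match["box"] = (match["score"] * match["box"] + det["score"] * det["box"]) / w
            match["score"] = max(match["score"], det["score"])
    return kept

rng = np.random.default_rng(1)
emb = rng.standard_normal(32)
dets = [
    {"box": np.array([4.0, 2.0, 0.9, 1.8, 1.6, 4.2, 0.05]), "score": 0.9, "embedding": emb},
    {"box": np.array([4.1, 2.1, 0.9, 1.7, 1.6, 4.1, 0.00]), "score": 0.7,
     "embedding": emb + 0.05 * rng.standard_normal(32)},   # same person, neighbouring camera
    {"box": np.array([12.0, -3.0, 0.8, 1.6, 1.5, 3.9, 1.50]), "score": 0.8,
     "embedding": rng.standard_normal(32)},                # a different person
]
print("kept after cross-camera merge:", len(cross_camera_merge(dets)))  # expect 2
```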
arXiv Detail & Related papers (2020-02-19T15:32:38Z)