View Invariant Human Body Detection and Pose Estimation from Multiple
Depth Sensors
- URL: http://arxiv.org/abs/2005.04258v1
- Date: Fri, 8 May 2020 19:06:28 GMT
- Authors: Walid Bekhtaoui, Ruhan Sa, Brian Teixeira, Vivek Singh, Klaus
Kirchberg, Yao-jen Chang, Ankur Kapoor
- Abstract summary: We propose an end-to-end multi-person 3D pose estimation network, Point R-CNN, using multiple point cloud sources.
We conduct extensive experiments that simulate challenging real-world cases, such as individual camera failures, varied target appearances, and complex cluttered scenes.
We also show that our end-to-end network greatly outperforms cascaded state-of-the-art models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Point cloud based methods have produced promising results in areas such as 3D
object detection in autonomous driving. However, most of the recent point cloud
work focuses on single depth sensor data, whereas less work has been done on
indoor monitoring applications, such as operating room monitoring in hospitals
or indoor surveillance. In these scenarios, multiple cameras are often used to
tackle occlusion problems. We propose an end-to-end multi-person 3D pose
estimation network, Point R-CNN, using multiple point cloud sources. We conduct
extensive experiments to simulate challenging real-world cases, such as
individual camera failures, various target appearances, and complex cluttered
scenes, using the CMU Panoptic dataset and the MVOR operating room dataset.
Unlike most previous methods, which attempt to use multi-sensor information by
building complex fusion models that often generalize poorly, we take advantage
of the efficiency of concatenating point clouds to fuse the information at the
input level. At the same time, we show that our end-to-end network greatly
outperforms cascaded state-of-the-art models.
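The paper does not ship reference code, but the input-level fusion it describes is simple enough to sketch. The following is a minimal, assumed illustration (NumPy-based; the function name and toy calibration values are ours, not the authors'): each sensor's cloud is transformed into a shared world frame and the results are concatenated, so a failed camera simply contributes no points.

```python
import numpy as np

def fuse_point_clouds(clouds, extrinsics):
    """Transform each sensor's cloud into the world frame, then concatenate.

    clouds: list of (N_i, 3) arrays, one per depth sensor (sensor frame).
    extrinsics: list of (4, 4) sensor-to-world transforms.
    Empty clouds (e.g. a failed camera) are skipped, which is why
    input-level concatenation degrades gracefully under sensor dropout.
    """
    fused = []
    for pts, T in zip(clouds, extrinsics):
        if len(pts) == 0:  # simulate an individual camera failure
            continue
        homo = np.hstack([pts, np.ones((len(pts), 1))])  # (N, 4) homogeneous
        fused.append((homo @ T.T)[:, :3])                # back to (N, 3)
    return np.concatenate(fused, axis=0) if fused else np.empty((0, 3))

# Two toy sensors: one at the world origin, one shifted 1 m along x.
cloud_a = np.array([[0.0, 0.0, 1.0]])
cloud_b = np.array([[0.0, 0.0, 1.0]])
T_a = np.eye(4)
T_b = np.eye(4)
T_b[0, 3] = 1.0
fused = fuse_point_clouds([cloud_a, cloud_b], [T_a, T_b])
```

Because the fusion happens before the network sees any data, the downstream pose estimator is agnostic to how many cameras contributed points.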
Related papers
- One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection [71.78795573911512]
We propose OneDet3D, a universal one-for-all model that addresses 3D detection across different domains.
We propose domain-aware partitioning in scatter and context, guided by a routing mechanism, to address the data-interference issue.
The fully sparse structure and anchor-free head further accommodate point clouds with significant scale disparities.
arXiv Detail & Related papers (2024-11-03T14:21:56Z)
- Joint object detection and re-identification for 3D obstacle multi-camera systems [47.87501281561605]
This research paper introduces a novel modification to an object detection network that uses camera and lidar information.
It incorporates an additional branch designed for the task of re-identifying objects across adjacent cameras within the same vehicle.
The results underscore the superiority of this method over traditional Non-Maximum Suppression (NMS) techniques.
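For contrast with the learned re-identification branch above, here is what the traditional baseline it outperforms looks like: a greedy Non-Maximum Suppression pass. This is a generic sketch (2D axis-aligned boxes for simplicity, helper names ours), not code from the paper.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping rivals."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the two overlapping boxes collapse to one
```

NMS only reasons about geometric overlap within one view, which is exactly the limitation that cross-camera re-identification methods aim to fix.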
arXiv Detail & Related papers (2023-10-09T15:16:35Z)
- Multi-Camera Multi-Object Tracking on the Move via Single-Stage Global Association Approach [23.960847268459293]
This work introduces novel Single-Stage Global Association Tracking approaches to associate one or more detections from multiple cameras with tracked objects.
Our models also improve the detection accuracy of the standard vision-based 3D object detectors in the nuScenes detection challenge.
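The single-stage approach above learns the association end-to-end, which is not reproduced here. As a toy stand-in (all names and thresholds ours), a greedy nearest-neighbour pass illustrates the underlying detection-to-track matching problem being solved:

```python
def associate(tracks, detections, max_dist=2.0):
    """Greedily match detections (3D centers) to tracks by Euclidean distance.

    Returns (track_idx, det_idx) pairs. Each detection is used at most once;
    matches farther than max_dist metres are rejected. A learned global
    association replaces this hand-tuned matching with a trained network.
    """
    pairs = []
    for t, (x, y, z) in enumerate(tracks):
        used = {j for _, j in pairs}
        cands = [
            (((x - a) ** 2 + (y - b) ** 2 + (z - c) ** 2) ** 0.5, j)
            for j, (a, b, c) in enumerate(detections)
            if j not in used
        ]
        cands = [(d, j) for d, j in cands if d <= max_dist]
        if cands:
            _, j = min(cands)
            pairs.append((t, j))
    return pairs

tracks = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]        # existing 3D tracks
detections = [(0.1, 0.0, 0.0), (5.2, 0.0, 0.0)]    # new multi-camera detections
pairs = associate(tracks, detections)
```

Greedy matching like this is order-dependent and can make globally suboptimal choices, which is part of the motivation for learned global association.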
arXiv Detail & Related papers (2022-11-17T17:03:24Z)
- Semantic keypoint extraction for scanned animals using multi-depth-camera systems [2.513785998932353]
Keypoint annotation in point clouds is an important task for 3D reconstruction, object tracking and alignment.
In agriculture, it is a critical task for livestock automation, supporting condition assessment and behaviour recognition.
We propose a novel approach for semantic keypoint annotation in point clouds, by reformulating the keypoint extraction as a regression problem.
Our method is tested on data collected in the field, on moving beef cattle, with a calibrated system of multiple hardware-synchronised RGB-D cameras.
arXiv Detail & Related papers (2022-11-16T03:06:17Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- RoIFusion: 3D Object Detection from LiDAR and Vision [7.878027048763662]
We propose a novel fusion algorithm that projects a set of 3D Regions of Interest (RoIs) from the point clouds onto the 2D RoIs of the corresponding images.
Our approach achieves state-of-the-art performance on the challenging KITTI 3D object detection benchmark.
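The 3D-to-2D RoI projection step can be sketched with a standard pinhole camera model. This is an assumed illustration, not the paper's code: the intrinsic matrix, function name, and toy geometry below are all ours, and points are assumed to lie in front of the camera.

```python
import numpy as np

def project_roi(corners_3d, K):
    """Project 3D RoI corners (N, 3, camera frame) to a 2D RoI (x1, y1, x2, y2).

    K is a 3x3 pinhole intrinsic matrix; all corners must have z > 0.
    The 2D RoI is the tight bounding box of the projected corners.
    """
    uvw = corners_3d @ K.T                 # (N, 3) homogeneous image coords
    uv = uvw[:, :2] / uvw[:, 2:3]          # perspective divide
    return (uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max())

# Toy intrinsics: 500 px focal length, principal point at (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

# A 1 m cube, 5 m in front of the camera, centred on the optical axis.
corners = np.array([[x, y, z]
                    for x in (-0.5, 0.5)
                    for y in (-0.5, 0.5)
                    for z in (4.5, 5.5)])
roi = project_roi(corners, K)
```

Gathering the point-cloud and image features inside the paired 3D/2D RoIs is what lets a fusion detector combine the two modalities without densely fusing whole feature maps.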
arXiv Detail & Related papers (2020-09-09T20:23:27Z)
- siaNMS: Non-Maximum Suppression with Siamese Networks for Multi-Camera 3D Object Detection [65.03384167873564]
A Siamese network is integrated into the pipeline of a well-known 3D object detector, and cross-camera associations are exploited to enhance the 3D box regression of each object.
The experimental evaluation on the nuScenes dataset shows that the proposed method outperforms traditional NMS approaches.
arXiv Detail & Related papers (2020-02-19T15:32:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.