Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking Results of Two Cameras
- URL: http://arxiv.org/abs/2007.01514v1
- Date: Fri, 3 Jul 2020 06:46:49 GMT
- Title: Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking Results of Two Cameras
- Authors: Shinya Matsubara, Akihiko Honda, Yonghoon Ji, Kazunori Umeda
- Abstract summary: OpenPose is used for human detection.
A new stereo vision framework is proposed to cope with the problems.
The effectiveness of the proposed framework and the method is verified through target-tracking experiments.
- Score: 0.860255319568951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a process that uses two cameras to obtain
three-dimensional (3D) information of a target object for human tracking.
Results of human detection and tracking from two cameras are integrated to
obtain the 3D information. OpenPose is used for human detection. In typical
stereo-camera processing, a range image of the entire scene is acquired as
precisely as possible and then processed. However, this approach suffers from
problems such as incorrect matching and the computational cost of the
calibration process. A new stereo vision framework is proposed to cope
with the problems. The effectiveness of the proposed framework and the method
is verified through target-tracking experiments.
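As a concrete sketch of this fusion step (a minimal illustration assuming pre-calibrated cameras, not the authors' implementation), the pixel coordinates of the same OpenPose keypoint detected in the two images can be triangulated directly into a 3D point, with no dense range image of the scene required:

```python
import numpy as np

def triangulate_keypoint(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one keypoint seen by two cameras.

    P1, P2: (3, 4) camera projection matrices from a prior calibration.
    x1, x2: (2,) pixel coordinates of the same OpenPose joint in each image.
    Returns the estimated 3D point in world coordinates.
    """
    # Each view contributes two rows to the homogeneous system A @ X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of A with the smallest
    # singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

Triangulating only the tracked joints per frame (e.g. the neck keypoint reported by both 2D trackers) yields the target's 3D position at far lower cost than acquiring a precise range image of the entire scene.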
Related papers
- MonoSOWA: Scalable monocular 3D Object detector Without human Annotations [0.0]
We present the first method to train 3D object detectors for monocular RGB cameras without domain-specific human annotations.
Thanks to the newly proposed Canonical Object Space, the method can not only exploit data from a variety of datasets and camera setups to train a single 3D detector but, unlike previous work, also works out of the box in previously unseen camera setups.
arXiv Detail & Related papers (2025-01-16T11:35:22Z)
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- 3D Human Pose Estimation in Multi-View Operating Room Videos Using Differentiable Camera Projections [2.486571221735935]
We propose to directly optimise for localisation in 3D by training 2D CNNs end-to-end based on a 3D loss.
Using videos from the MVOR dataset, we show that this end-to-end approach outperforms optimisation in 2D space.
arXiv Detail & Related papers (2022-10-21T09:00:02Z)
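One common way to realise such a 3D loss is differentiable triangulation: if each view's 2D joint coordinates come from a differentiable head (e.g. soft-argmax over a heatmap), a DLT solve built from tensor operations lets a loss in 3D backpropagate into the 2D CNNs. A minimal sketch, with `Ps` and `xs` as assumed inputs rather than the paper's exact interface:

```python
import torch

def dlt_triangulate(Ps, xs):
    """Differentiable DLT triangulation of one joint from several views.

    Ps: list of (3, 4) projection matrices as tensors.
    xs: list of (2,) predicted 2D joint coordinates (e.g. soft-argmax
        outputs of a 2D CNN, so gradients can flow through them).
    """
    rows = []
    for P, x in zip(Ps, xs):
        rows.append(x[0] * P[2] - P[0])
        rows.append(x[1] * P[2] - P[1])
    A = torch.stack(rows)
    # torch.linalg.svd is differentiable, so a 3D loss on X reaches the CNNs.
    _, _, Vh = torch.linalg.svd(A)
    X = Vh[-1]
    return X[:3] / X[3]

# Training step (sketch): loss = ((dlt_triangulate(Ps, xs) - X_gt) ** 2).sum()
```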
- Aerial Monocular 3D Object Detection [67.20369963664314]
DVDET is proposed to achieve aerial monocular 3D object detection in both the 2D image space and the 3D physical space.
To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module.
To encourage more researchers to investigate this area, we will release the dataset and related code.
arXiv Detail & Related papers (2022-08-08T08:32:56Z)
- People Tracking in Panoramic Video for Guiding Robots [2.092922495279074]
A guiding robot aims to effectively bring people to and from specific places within environments that are possibly unknown to them.
During this operation the robot should be able to detect and track the accompanied person, trying never to lose sight of her/him.
One way to minimize such losses is to use an omnidirectional camera: its 360$^\circ$ Field of View (FoV) guarantees that any framed object cannot leave the FoV unless it is occluded or very far from the sensor.
We propose a set of targeted methods that effectively adapt a standard people detection and tracking pipeline, originally designed for perspective cameras, to panoramic videos.
arXiv Detail & Related papers (2022-06-06T16:44:38Z)
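One concrete adaptation can be sketched as follows (hypothetical helper names, not the paper's code): in an equirectangular panorama the natural state for a tracked person is an angular bearing, and bearing differences must respect the wrap-around at the image seam, which a perspective tracker never encounters:

```python
import numpy as np

def pixel_to_bearing(u, v, width, height):
    """Map an equirectangular pixel to (azimuth, elevation) in radians.

    The horizontal axis of a panoramic frame spans the full 360 degrees,
    so a person is described by a bearing rather than by a position
    inside a limited perspective frustum.
    """
    azimuth = (u / width) * 2.0 * np.pi - np.pi      # [-pi, pi)
    elevation = np.pi / 2.0 - (v / height) * np.pi   # [+pi/2, -pi/2]
    return azimuth, elevation

def bearing_difference(a, b):
    # Wrap-around-aware difference: a target crossing the image seam
    # (u = 0 / u = width) must not look like a 360-degree jump.
    return (a - b + np.pi) % (2.0 * np.pi) - np.pi
```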
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present Rope3D, the first high-diversity, challenging roadside perception 3D dataset, collected from a novel viewpoint.
The dataset consists of 50k images and over 1.5M 3D objects in various scenes.
We propose to leverage geometry constraints to resolve the inherent ambiguities caused by varying sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
- Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography [12.062095895630563]
This paper proposes a method to extract the position and pose of vehicles in the 3D world from a single traffic camera.
We observe that the homography between the road plane and the image plane is essential to 3D vehicle detection.
We propose a new regression target called \textit{tailed r-box} and a \textit{dual-view} network architecture that boosts the detection accuracy on warped BEV images.
arXiv Detail & Related papers (2021-03-29T02:57:37Z)
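The relation the method builds on can be sketched as follows (a hypothetical helper, assuming a known homography `H` from the road plane to the image): inverting `H` maps a vehicle's ground-contact pixel back to metric road-plane coordinates:

```python
import numpy as np

def image_to_road(H, u, v):
    """Back-project an image point onto the road plane via a homography.

    H: (3, 3) homography mapping road-plane points (X, Y, 1) to image
       points (u, v, 1).
    u, v: pixel coordinates of, e.g., a vehicle's ground-contact point.
    Returns (X, Y) on the road plane.
    """
    p = np.linalg.inv(H) @ np.array([u, v, 1.0])
    return p[:2] / p[2]  # dehomogenize
```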
- Simultaneous Multi-View Camera Pose Estimation and Object Tracking with Square Planar Markers [0.0]
This work proposes a novel method to simultaneously solve the above-mentioned problems.
From a video sequence showing a rigid set of planar markers recorded from multiple cameras, the proposed method is able to automatically obtain the three-dimensional configuration of the markers.
Once the parameters are obtained, tracking of the object can be done in real time with a low computational cost.
arXiv Detail & Related papers (2021-03-16T15:33:58Z)
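The low computational cost of the tracking stage can be illustrated with a sketch (assuming OpenCV; not the paper's exact solver): once the three-dimensional configuration of the markers is known, each new frame only requires solving a small PnP problem per detected marker:

```python
import numpy as np
import cv2

def marker_pose(corners_3d, corners_2d, K, dist):
    """Per-frame pose estimate from one detected square marker.

    corners_3d: (4, 3) marker corners in the object frame, known once the
                three-dimensional marker configuration has been obtained.
    corners_2d: (4, 2) corner pixels detected in the current frame.
    K, dist:    camera intrinsics and distortion coefficients.
    """
    ok, rvec, tvec = cv2.solvePnP(
        corners_3d.astype(np.float64),
        corners_2d.astype(np.float64),
        K, dist,
    )
    return ok, rvec, tvec  # rotation (Rodrigues vector) and translation
```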
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
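For context, the first of the two baseline steps, turning a depth map into a pseudo-LiDAR cloud, is a plain back-projection; a minimal sketch assuming a metric depth map and intrinsics `K`:

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project a dense depth map into a pseudo-LiDAR point cloud.

    depth: (H, W) metric depth estimates.
    K:     (3, 3) camera intrinsic matrix.
    Returns (H*W, 3) points in camera coordinates.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T       # viewing ray of each pixel
    return rays * depth.reshape(-1, 1)    # scale each ray by its depth
```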
- BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View [117.44028458220427]
On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices.
We present a fully end-to-end 3D object detection framework that can infer oriented 3D boxes solely from BEV images.
arXiv Detail & Related papers (2020-03-09T15:08:40Z)
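A minimal sketch (illustrative grid ranges and cell size, not the paper's configuration) of how a LiDAR cloud is rasterized into the kind of BEV image such a detector consumes:

```python
import numpy as np

def lidar_to_bev(points, x_range=(0.0, 80.0), y_range=(-40.0, 40.0), cell=0.1):
    """Rasterize a LiDAR cloud into a 2-channel bird's-eye-view image.

    points: (N, 3) LiDAR points (x forward, y left, z up, in meters).
    Returns channels [max height, point density]; empty cells stay zero.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    bev = np.zeros((2, nx, ny), dtype=np.float32)
    ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    # Scatter points into the grid: per-cell max height and point count.
    np.maximum.at(bev[0], (ix[keep], iy[keep]), points[keep, 2].astype(np.float32))
    np.add.at(bev[1], (ix[keep], iy[keep]), 1.0)
    return bev
```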
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.