Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking
Results of Two Cameras
- URL: http://arxiv.org/abs/2007.01514v1
- Date: Fri, 3 Jul 2020 06:46:49 GMT
- Title: Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking
Results of Two Cameras
- Authors: Shinya Matsubara, Akihiko Honda, Yonghoon Ji, Kazunori Umeda
- Abstract summary: OpenPose is used for human detection.
A new stereo vision framework is proposed to cope with the problems.
The effectiveness of the proposed framework and the method is verified through target-tracking experiments.
- Score: 0.860255319568951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a process that uses two cameras to obtain
three-dimensional (3D) information of a target object for human tracking.
Results of human detection and tracking from two cameras are integrated to
obtain the 3D information. OpenPose is used for human detection. In typical
stereo-camera processing, a range image of the entire scene is acquired as
precisely as possible, and then the range image is processed.
However, such processing suffers from problems such as incorrect stereo
matching and the computational cost of the calibration process. A new stereo
vision framework is proposed to cope
with the problems. The effectiveness of the proposed framework and the method
is verified through target-tracking experiments.
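The core operation described above, combining 2D detections of the same person from two calibrated cameras into a 3D position, can be sketched with standard linear triangulation (DLT). This is a hypothetical illustration, not the authors' code; the projection matrices and pixel coordinates below are made up for the example.

```python
# Hypothetical sketch (not the paper's implementation): fusing the 2D pixel
# positions of a person detected in two calibrated cameras into one 3D point
# by linear triangulation (DLT).
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Triangulate one 3D point from two 3x4 projection matrices and the
    corresponding pixel coordinates (u, v) observed in each view."""
    A = np.stack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    # The homogeneous 3D point is the right singular vector of A with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Two toy cameras: identical intrinsics, second camera shifted 0.5 m along x.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 3.0])          # ground-truth point, 3 m ahead
uv1 = P1 @ np.append(X_true, 1)
uv1 = uv1[:2] / uv1[2]                        # project into camera 1
uv2 = P2 @ np.append(X_true, 1)
uv2 = uv2[:2] / uv2[2]                        # project into camera 2

X_est = triangulate(P1, P2, uv1, uv2)         # recovers X_true (noise-free)
```

With noiseless correspondences the SVD solution recovers the point exactly; with real OpenPose detections, keypoint noise makes the least-squares nature of the DLT solution the relevant property.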
Related papers
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- 3D Data Augmentation for Driving Scenes on Camera [50.41413053812315]
We propose a 3D data augmentation approach termed Drive-3DAug, aiming at augmenting the driving scenes on camera in the 3D space.
We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects.
Then, augmented driving scenes can be obtained by placing the 3D objects with adapted location and orientation at the pre-defined valid region of backgrounds.
arXiv Detail & Related papers (2023-03-18T05:51:05Z)
- 3D Human Pose Estimation in Multi-View Operating Room Videos Using Differentiable Camera Projections [2.486571221735935]
We propose to directly optimise for localisation in 3D by training 2D CNNs end-to-end based on a 3D loss.
Using videos from the MVOR dataset, we show that this end-to-end approach outperforms optimisation in 2D space.
arXiv Detail & Related papers (2022-10-21T09:00:02Z)
- People Tracking in Panoramic Video for Guiding Robots [2.092922495279074]
A guiding robot aims to effectively bring people to and from specific places within environments that are possibly unknown to them.
During this operation the robot should be able to detect and track the accompanied person, trying never to lose sight of her/him.
A solution that minimizes this risk is to use an omnidirectional camera: its 360° Field of View (FoV) guarantees that a framed object can leave the FoV only when occluded or very far from the sensor.
We propose a set of targeted methods that effectively adapt a standard people detection and tracking pipeline, originally designed for perspective cameras, to panoramic videos.
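One step such an adaptation needs is mapping a panoramic pixel back to a direction. Assuming an equirectangular projection (a common choice for 360° video, though not stated in the summary above), a detection's bounding-box centre can be converted to a bearing the robot can steer toward; the frame size below is illustrative.

```python
# Illustrative sketch (assumed equirectangular projection, not the paper's
# code): convert a pixel in a 360-degree panoramic frame to the azimuth and
# elevation of the tracked person relative to the camera.
import math

def pixel_to_bearing(u, v, width, height):
    """Map pixel (u, v) of a width x height equirectangular image to
    (azimuth, elevation) in radians; azimuth 0 is the image centre."""
    azimuth = (u / width - 0.5) * 2.0 * math.pi    # in [-pi, pi)
    elevation = (0.5 - v / height) * math.pi       # in [-pi/2, pi/2]
    return azimuth, elevation

# The centre pixel of a 1920x960 panorama looks straight ahead:
az, el = pixel_to_bearing(960, 480, 1920, 960)
```

Because azimuth wraps around at ±π, a tracker adapted this way must also handle targets crossing the left/right image border, one of the differences from perspective cameras.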
arXiv Detail & Related papers (2022-06-06T16:44:38Z)
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present the first high-diversity challenging Roadside Perception 3D dataset- Rope3D from a novel view.
The dataset consists of 50k images and over 1.5M 3D objects in various scenes.
We propose to leverage the geometry constraint to solve the inherent ambiguities caused by various sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
- Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography [12.062095895630563]
This paper proposes a method to extract the position and pose of vehicles in the 3D world from a single traffic camera.
We observe that the homography between the road plane and the image plane is essential to 3D vehicle detection.
We propose a new regression target called tailed r-box and a dual-view network architecture, which boosts the detection accuracy on warped BEV images.
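The road-plane/image-plane homography mentioned above is a 3x3 matrix that maps pixels on the road surface to metric ground coordinates. A minimal sketch of that mapping follows; the matrix values are invented for illustration and are not taken from the paper.

```python
# Hedged sketch of the homography idea: map an image pixel on the road
# surface to road-plane coordinates. H here is a toy matrix (a pure scale
# of 0.01 m per pixel), not a calibrated traffic-camera homography.
import numpy as np

def image_to_road(H, u, v):
    """Map image pixel (u, v) to road-plane coordinates (x, y) via the
    3x3 image-to-road homography H, using homogeneous coordinates."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

H = np.array([[0.01, 0.0, 0.0],
              [0.0, 0.01, 0.0],
              [0.0, 0.0, 1.0]])
x, y = image_to_road(H, 250, 120)   # 250 px across -> 2.5 m, 120 px -> 1.2 m
```

In practice H would be estimated from point correspondences between known road locations and their pixels (e.g. with a least-squares or RANSAC fit), which is exactly why the homography is central to recovering vehicle position from a single uncalibrated camera.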
arXiv Detail & Related papers (2021-03-29T02:57:37Z)
- Simultaneous Multi-View Camera Pose Estimation and Object Tracking with Square Planar Markers [0.0]
This work proposes a novel method to simultaneously solve the above-mentioned problems.
From a video sequence showing a rigid set of planar markers recorded from multiple cameras, the proposed method is able to automatically obtain the three-dimensional configuration of the markers.
Once the parameters are obtained, tracking of the object can be done in real time with a low computational cost.
arXiv Detail & Related papers (2021-03-16T15:33:58Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
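The pseudo-LiDAR step described above, turning a depth map into a point cloud, is just a per-pixel back-projection through pinhole intrinsics. The following is a minimal sketch under that assumption, with illustrative intrinsics, not PLUME's actual pipeline.

```python
# Minimal sketch of the pseudo-LiDAR conversion: back-project every pixel
# of a depth map into a 3D point in the camera frame, assuming a pinhole
# camera with intrinsics (fx, fy, cx, cy). Values here are illustrative.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an HxW depth map (metres) into an (H*W, 3) point
    cloud in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx        # lateral offset from the optical axis
    y = (v - cy) * z / fy        # vertical offset from the optical axis
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# A flat 2x2 depth map, 5 m everywhere:
cloud = depth_to_point_cloud(np.full((2, 2), 5.0),
                             fx=500, fy=500, cx=1.0, cy=1.0)
```

Running a LiDAR-style detector on such a cloud is the two-step pipeline the PLUME summary contrasts with; unifying both steps in one metric space avoids materialising the intermediate cloud at all.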
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution [34.301501457959056]
We propose a temporal regression network with a gated convolution module to transform 2D joints to 3D.
A simple yet effective localization approach is also conducted to transform the normalized pose to the global trajectory.
Our proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods.
arXiv Detail & Related papers (2020-10-31T04:35:24Z)
- BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View [117.44028458220427]
On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices.
We present a fully end-to-end 3D object detection framework that can infer oriented 3D boxes solely from BEV images.
arXiv Detail & Related papers (2020-03-09T15:08:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.