Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking Results of Two Cameras
- URL: http://arxiv.org/abs/2007.01514v1
- Date: Fri, 3 Jul 2020 06:46:49 GMT
- Title: Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking Results of Two Cameras
- Authors: Shinya Matsubara, Akihiko Honda, Yonghoon Ji, Kazunori Umeda
- Abstract summary: OpenPose is used for human detection.
A new stereo vision framework is proposed to cope with the problems.
The effectiveness of the proposed framework and the method is verified through target-tracking experiments.
- Score: 0.860255319568951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a process that uses two cameras to obtain
three-dimensional (3D) information of a target object for human tracking.
Results of human detection and tracking from two cameras are integrated to
obtain the 3D information. OpenPose is used for human detection. In typical
stereo-camera processing, a range image of the entire scene is acquired as
precisely as possible and then processed. However, this approach suffers from
problems such as incorrect matching and the computational cost of the
calibration process. A new stereo vision framework is proposed to cope
with the problems. The effectiveness of the proposed framework and the method
is verified through target-tracking experiments.
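As a concrete sketch of this fusion step (a minimal illustration assuming pre-calibrated cameras, not the authors' implementation), the pixel coordinates of the same OpenPose keypoint detected in the two images can be triangulated directly into a 3D point, with no dense range image of the scene required:

```python
import numpy as np

def triangulate_keypoint(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one keypoint seen by two cameras.

    P1, P2: (3, 4) camera projection matrices from a prior calibration.
    x1, x2: (2,) pixel coordinates of the same OpenPose joint in each image.
    Returns the estimated 3D point in world coordinates.
    """
    # Each view contributes two rows to the homogeneous system A @ X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of A with the smallest
    # singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

Triangulating only the tracked joints per frame (e.g. the neck keypoint reported by both 2D trackers) yields the target's 3D position at far lower cost than acquiring a precise range image of the entire scene.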
Related papers
- MonoSOWA: Scalable monocular 3D Object detector Without human Annotations [0.0]
We present the first method to train 3D object detectors for monocular RGB cameras without domain-specific human annotations.
Thanks to the newly proposed Canonical Object Space, the method can not only exploit data from a variety of datasets and camera setups to train a single 3D detector but, unlike previous work, also works out of the box in previously unseen camera setups.
arXiv Detail & Related papers (2025-01-16T11:35:22Z)
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- 3D Human Pose Estimation in Multi-View Operating Room Videos Using Differentiable Camera Projections [2.486571221735935]
We propose to directly optimise for localisation in 3D by training 2D CNNs end-to-end based on a 3D loss.
Using videos from the MVOR dataset, we show that this end-to-end approach outperforms optimisation in 2D space.
arXiv Detail & Related papers (2022-10-21T09:00:02Z)
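One common way to realise such a 3D loss is differentiable triangulation: if each view's 2D joint coordinates come from a differentiable head (e.g. soft-argmax over a heatmap), a DLT solve built from tensor operations lets a loss in 3D backpropagate into the 2D CNNs. A minimal sketch, with `Ps` and `xs` as assumed inputs rather than the paper's exact interface:

```python
import torch

def dlt_triangulate(Ps, xs):
    """Differentiable DLT triangulation of one joint from several views.

    Ps: list of (3, 4) projection matrices as tensors.
    xs: list of (2,) predicted 2D joint coordinates (e.g. soft-argmax
        outputs of a 2D CNN, so gradients can flow through them).
    """
    rows = []
    for P, x in zip(Ps, xs):
        rows.append(x[0] * P[2] - P[0])
        rows.append(x[1] * P[2] - P[1])
    A = torch.stack(rows)
    # torch.linalg.svd is differentiable, so a 3D loss on X reaches the CNNs.
    _, _, Vh = torch.linalg.svd(A)
    X = Vh[-1]
    return X[:3] / X[3]

# Training step (sketch): loss = ((dlt_triangulate(Ps, xs) - X_gt) ** 2).sum()
```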
- Aerial Monocular 3D Object Detection [67.20369963664314]
DVDET is proposed to achieve aerial monocular 3D object detection in both the 2D image space and the 3D physical space.
To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module.
To encourage more researchers to investigate this area, we will release the dataset and related code.
arXiv Detail & Related papers (2022-08-08T08:32:56Z)
- People Tracking in Panoramic Video for Guiding Robots [2.092922495279074]
A guiding robot aims to effectively bring people to and from specific places within environments that are possibly unknown to them.
During this operation the robot should be able to detect and track the accompanied person, trying never to lose sight of her/him.
One way to minimize such losses is to use an omnidirectional camera: its 360$^\circ$ Field of View (FoV) guarantees that any framed object cannot leave the FoV unless it is occluded or very far from the sensor.
We propose a set of targeted methods that effectively adapt a standard people detection and tracking pipeline, originally designed for perspective cameras, to panoramic videos.
arXiv Detail & Related papers (2022-06-06T16:44:38Z)
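One concrete adaptation can be sketched as follows (hypothetical helper names, not the paper's code): in an equirectangular panorama the natural state for a tracked person is an angular bearing, and bearing differences must respect the wrap-around at the image seam, which a perspective tracker never encounters:

```python
import numpy as np

def pixel_to_bearing(u, v, width, height):
    """Map an equirectangular pixel to (azimuth, elevation) in radians.

    The horizontal axis of a panoramic frame spans the full 360 degrees,
    so a person is described by a bearing rather than by a position
    inside a limited perspective frustum.
    """
    azimuth = (u / width) * 2.0 * np.pi - np.pi      # [-pi, pi)
    elevation = np.pi / 2.0 - (v / height) * np.pi   # [+pi/2, -pi/2]
    return azimuth, elevation

def bearing_difference(a, b):
    # Wrap-around-aware difference: a target crossing the image seam
    # (u = 0 / u = width) must not look like a 360-degree jump.
    return (a - b + np.pi) % (2.0 * np.pi) - np.pi
```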
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present Rope3D, the first high-diversity, challenging roadside perception 3D dataset, collected from a novel viewpoint.
The dataset consists of 50k images and over 1.5M 3D objects in various scenes.
We propose to leverage geometry constraints to resolve the inherent ambiguities caused by varying sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
- Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography [12.062095895630563]
This paper proposes a method to extract the position and pose of vehicles in the 3D world from a single traffic camera.
We observe that the homography between the road plane and the image plane is essential to 3D vehicle detection.
We propose a new regression target called \textit{tailed r-box} and a \textit{dual-view} network architecture that boosts the detection accuracy on warped BEV images.
arXiv Detail & Related papers (2021-03-29T02:57:37Z)
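The relation the method builds on can be sketched as follows (a hypothetical helper, assuming a known homography `H` from the road plane to the image): inverting `H` maps a vehicle's ground-contact pixel back to metric road-plane coordinates:

```python
import numpy as np

def image_to_road(H, u, v):
    """Back-project an image point onto the road plane via a homography.

    H: (3, 3) homography mapping road-plane points (X, Y, 1) to image
       points (u, v, 1).
    u, v: pixel coordinates of, e.g., a vehicle's ground-contact point.
    Returns (X, Y) on the road plane.
    """
    p = np.linalg.inv(H) @ np.array([u, v, 1.0])
    return p[:2] / p[2]  # dehomogenize
```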
- Simultaneous Multi-View Camera Pose Estimation and Object Tracking with Square Planar Markers [0.0]
This work proposes a novel method to simultaneously solve the above-mentioned problems.
From a video sequence showing a rigid set of planar markers recorded from multiple cameras, the proposed method is able to automatically obtain the three-dimensional configuration of the markers.
Once the parameters are obtained, tracking of the object can be done in real time with a low computational cost.
arXiv Detail & Related papers (2021-03-16T15:33:58Z)
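The low computational cost of the tracking stage can be illustrated with a sketch (assuming OpenCV; not the paper's exact solver): once the three-dimensional configuration of the markers is known, each new frame only requires solving a small PnP problem per detected marker:

```python
import numpy as np
import cv2

def marker_pose(corners_3d, corners_2d, K, dist):
    """Per-frame pose estimate from one detected square marker.

    corners_3d: (4, 3) marker corners in the object frame, known once the
                three-dimensional marker configuration has been obtained.
    corners_2d: (4, 2) corner pixels detected in the current frame.
    K, dist:    camera intrinsics and distortion coefficients.
    """
    ok, rvec, tvec = cv2.solvePnP(
        corners_3d.astype(np.float64),
        corners_2d.astype(np.float64),
        K, dist,
    )
    return ok, rvec, tvec  # rotation (Rodrigues vector) and translation
```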
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
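For context, the first of the two baseline steps, turning a depth map into a pseudo-LiDAR cloud, is a plain back-projection; a minimal sketch assuming a metric depth map and intrinsics `K`:

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project a dense depth map into a pseudo-LiDAR point cloud.

    depth: (H, W) metric depth estimates.
    K:     (3, 3) camera intrinsic matrix.
    Returns (H*W, 3) points in camera coordinates.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T       # viewing ray of each pixel
    return rays * depth.reshape(-1, 1)    # scale each ray by its depth
```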
- BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View [117.44028458220427]
On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices.
We present a fully end-to-end 3D object detection framework that can infer oriented 3D boxes solely from BEV images.
arXiv Detail & Related papers (2020-03-09T15:08:40Z)
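A minimal sketch (illustrative grid ranges and cell size, not the paper's configuration) of how a LiDAR cloud is rasterized into the kind of BEV image such a detector consumes:

```python
import numpy as np

def lidar_to_bev(points, x_range=(0.0, 80.0), y_range=(-40.0, 40.0), cell=0.1):
    """Rasterize a LiDAR cloud into a 2-channel bird's-eye-view image.

    points: (N, 3) LiDAR points (x forward, y left, z up, in meters).
    Returns channels [max height, point density]; empty cells stay zero.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    bev = np.zeros((2, nx, ny), dtype=np.float32)
    ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    # Scatter points into the grid: per-cell max height and point count.
    np.maximum.at(bev[0], (ix[keep], iy[keep]), points[keep, 2].astype(np.float32))
    np.add.at(bev[1], (ix[keep], iy[keep]), 1.0)
    return bev
```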
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.