Single View Physical Distance Estimation using Human Pose
- URL: http://arxiv.org/abs/2106.10335v1
- Date: Fri, 18 Jun 2021 19:50:40 GMT
- Title: Single View Physical Distance Estimation using Human Pose
- Authors: Xiaohan Fei, Henry Wang, Xiangyu Zeng, Lin Lee Cheong, Meng Wang,
Joseph Tighe
- Abstract summary: We propose a fully automated system that simultaneously estimates the camera intrinsics, the ground plane, and physical distances between people from a single RGB image or video.
The proposed approach enables existing camera systems to measure physical distances without needing a dedicated calibration process or range sensors.
We contribute to the publicly available MEVA dataset with additional distance annotations, resulting in MEVADA -- the first evaluation benchmark in the world for the pose-based auto-calibration and distance estimation problem.
- Score: 18.9877515094788
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a fully automated system that simultaneously estimates the camera
intrinsics, the ground plane, and physical distances between people from a
single RGB image or video captured by a camera viewing a 3-D scene from a fixed
vantage point. To automate camera calibration and distance estimation, we
leverage priors about human pose and develop a novel direct formulation for
pose-based auto-calibration and distance estimation, which shows
state-of-the-art performance on publicly available datasets. The proposed
approach enables existing camera systems to measure physical distances without
needing a dedicated calibration process or range sensors, and is applicable to
a broad range of use cases such as social distancing and workplace safety.
Furthermore, to enable evaluation and drive research in this area, we
contribute to the publicly available MEVA dataset with additional distance
annotations, resulting in MEVADA -- the first evaluation benchmark in the world
for the pose-based auto-calibration and distance estimation problem.
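The distance-measurement step at the heart of this abstract can be sketched in a few lines: once the intrinsics K and the ground plane are known (the automatic, pose-based estimation of both is the paper's actual contribution and is not reproduced here), each person's foot keypoint is back-projected onto the plane and distances are measured in metres. All numeric values below are illustrative, not taken from the paper.

```python
import numpy as np

def backproject_to_ground(pixel, K, n, d):
    """Intersect the camera ray through `pixel` with the ground plane.

    The plane satisfies n . X = d in the camera frame; the ray is
    X = t * K^{-1} [u, v, 1]^T, so t = d / (n . K^{-1} p).
    """
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    t = d / (n @ ray)
    return t * ray  # 3D point on the ground plane, in the camera frame

# Illustrative values: focal length 1000 px, principal point (960, 540),
# camera 3 m above a ground plane slightly tilted toward the camera.
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
n = np.array([0.0, np.cos(0.2), -np.sin(0.2)])  # unit plane normal
d = 3.0                                          # camera height above plane

# Foot (ankle) keypoints of two detected people, in pixels.
p1 = backproject_to_ground((800.0, 900.0), K, n, d)
p2 = backproject_to_ground((1300.0, 850.0), K, n, d)
distance_m = np.linalg.norm(p1 - p2)
```

With calibrated intrinsics and a known plane, the whole measurement reduces to one ray-plane intersection per person, which is why auto-calibration is the hard part of the problem.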
Related papers
- GenDepth: Generalizing Monocular Depth Estimation for Arbitrary Camera Parameters via Ground Plane Embedding [8.289857214449372]
GenDepth is a novel model capable of performing metric depth estimation for arbitrary vehicle-camera setups.
We propose a novel embedding of camera parameters as the ground plane depth and present a novel architecture that integrates these embeddings with adversarial domain alignment.
We validate GenDepth on several autonomous driving datasets, demonstrating its state-of-the-art generalization capability for different vehicle-camera systems.
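The ground-plane depth cue this summary mentions has a simple closed form in the idealized case of a level pinhole camera, sketched below. GenDepth learns an embedding from this kind of cue rather than computing it directly, and all parameter values here are made up for illustration.

```python
import numpy as np

def ground_plane_depth(v, fy, cv, h):
    """Depth of the ground plane at image row v for a level pinhole camera.

    For a camera at height h with its optical axis parallel to the ground,
    a ground pixel at row v (below the horizon row cv) has depth
    Z = h * fy / (v - cv); rows at or above the horizon see no ground.
    """
    v = np.asarray(v, dtype=float)
    below = v > cv
    depth = np.full(v.shape, np.inf)
    depth[below] = h * fy / (v[below] - cv)
    return depth

rows = np.array([300.0, 540.0, 700.0, 1000.0])
depths = ground_plane_depth(rows, fy=1000.0, cv=540.0, h=1.5)
# row 700: Z = 1.5 * 1000 / 160 = 9.375 m; row 1000: Z ~= 3.26 m
```

This dependence of per-row depth on focal length and mounting height is exactly why a fixed vehicle-camera setup bakes itself into a depth network, and why encoding those parameters explicitly helps generalization.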
arXiv Detail & Related papers (2023-12-10T22:28:34Z)
- Multimodal Active Measurement for Human Mesh Recovery in Close Proximity [13.265259738826302]
In physical human-robot interaction (pHRI), a robot needs to accurately estimate the body pose of a target person.
In these pHRI scenarios, the robot cannot fully observe the target person's body with equipped cameras because the target person must be close to the robot for physical interaction.
We propose an active measurement and sensor fusion framework of the equipped cameras with touch and ranging sensors such as 2D LiDAR.
arXiv Detail & Related papers (2023-10-12T08:17:57Z)
- Scene-Aware 3D Multi-Human Motion Capture from a Single Camera [83.06768487435818]
We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
arXiv Detail & Related papers (2023-01-12T18:01:28Z)
- Extrinsic Camera Calibration with Semantic Segmentation [60.330549990863624]
We present an extrinsic camera calibration approach that automates the parameter estimation by utilizing semantic segmentation information.
Our approach relies on a coarse initial measurement of the camera pose and builds on lidar sensors mounted on a vehicle.
We evaluate our method on simulated and real-world data to demonstrate low error measurements in the calibration results.
arXiv Detail & Related papers (2022-08-08T07:25:03Z)
- Embodied Scene-aware Human Pose Estimation [25.094152307452]
We propose embodied scene-aware human pose estimation.
Our method is one-stage, causal, and recovers global 3D human poses in a simulated environment.
arXiv Detail & Related papers (2022-06-18T03:50:19Z) - A Quality Index Metric and Method for Online Self-Assessment of
Autonomous Vehicles Sensory Perception [164.93739293097605]
We propose a novel evaluation metric, named as the detection quality index (DQI), which assesses the performance of camera-based object detection algorithms.
We have developed a superpixel-based attention network (SPA-NET) that utilizes raw image pixels and superpixels as input to predict the proposed DQI evaluation metric.
arXiv Detail & Related papers (2022-03-04T22:16:50Z) - A Critical Analysis of Image-based Camera Pose Estimation Techniques [18.566761146552537]
Camera localization could benefit many computer vision fields, such as autonomous driving, robot navigation, and augmented reality (AR).
In this survey, we first introduce specific application areas and the evaluation metrics for camera localization pose according to different sub-tasks.
arXiv Detail & Related papers (2022-01-15T09:57:45Z) - TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z) - Self-supervised Human Detection and Segmentation via Multi-view
Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z) - Single Image Human Proxemics Estimation for Visual Social Distancing [37.84559773949066]
We propose a semi-automatic solution to approximate the homography matrix between the scene ground and image plane.
We then leverage an off-the-shelf pose detector to detect body poses on the image and to reason upon their inter-personal distances.
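The two-step pipeline this summary describes (a homography to the ground, then inter-personal distances) can be sketched with plain NumPy. The homography H below is a made-up example, not one estimated by the paper's semi-automatic procedure.

```python
import numpy as np

def to_ground(H, pixel):
    """Map an image point to metric ground-plane coordinates via homography H."""
    x = H @ np.array([pixel[0], pixel[1], 1.0])
    return x[:2] / x[2]  # dehomogenize

# Illustrative homography: image plane -> metric ground coordinates.
# (In practice H is estimated from scene annotations or reference objects.)
H = np.array([[0.01, 0.0,     -5.0],
              [0.0,  0.03,   -10.0],
              [0.0,  0.00005,  1.0]])

# Foot points of two detected poses, in pixels.
feet_a = to_ground(H, (600.0, 400.0))
feet_b = to_ground(H, (900.0, 450.0))
distance_m = np.linalg.norm(feet_a - feet_b)
```

Compared with the fully automated approach of the main paper, this formulation needs only a single 3x3 homography, but that homography must be supplied or annotated per scene.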
arXiv Detail & Related papers (2020-11-03T21:49:13Z) - Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
arXiv Detail & Related papers (2020-07-18T22:31:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.