Perceiving Humans: from Monocular 3D Localization to Social Distancing
- URL: http://arxiv.org/abs/2009.00984v2
- Date: Wed, 24 Mar 2021 10:30:19 GMT
- Title: Perceiving Humans: from Monocular 3D Localization to Social Distancing
- Authors: Lorenzo Bertoni, Sven Kreiss, Alexandre Alahi
- Abstract summary: We present a new cost-effective vision-based method that perceives humans' locations in 3D and their body orientation from a single image.
We show that it is possible to rethink the concept of "social distancing" as a form of social interaction in contrast to a simple location-based rule.
- Score: 93.03056743850141
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perceiving humans in the context of Intelligent Transportation Systems (ITS)
often relies on multiple cameras or expensive LiDAR sensors. In this work, we
present a new cost-effective vision-based method that perceives humans'
locations in 3D and their body orientation from a single image. We address the
challenges related to the ill-posed monocular 3D tasks by proposing a neural
network architecture that predicts confidence intervals in contrast to point
estimates. Our neural network estimates human 3D body locations and their
orientation with a measure of uncertainty. Our proposed solution (i) is
privacy-safe, (ii) works with any fixed or moving cameras, and (iii) does not
rely on ground plane estimation. We demonstrate the performance of our method
with respect to three applications: locating humans in 3D, detecting social
interactions, and verifying the compliance of recent safety measures due to the
COVID-19 outbreak. We show that it is possible to rethink the concept of
"social distancing" as a form of social interaction in contrast to a simple
location-based rule. We publicly share the source code towards an open science
mission.
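The abstract's key idea of predicting confidence intervals rather than point estimates is commonly realized by having the network output a distribution over distance instead of a single value (the authors' related MonoLoco work uses a Laplace likelihood for this). The sketch below is illustrative, not the paper's implementation: the function names and the plain-Python framing are assumptions, and in practice `pred` and `log_b` would be two outputs of the neural network.

```python
import math

def laplace_nll(pred, log_b, target):
    """Per-sample negative log-likelihood of a Laplace distribution
    (illustrative stand-in for the paper's uncertainty-aware loss).

    pred:   predicted distance to the person (metres)
    log_b:  predicted log of the Laplace scale b; a larger b means the
            network reports a wider confidence interval
    target: ground-truth distance
    """
    b = math.exp(log_b)  # ensures the scale is positive
    return abs(target - pred) / b + log_b

def confidence_interval(pred, log_b, coverage=0.95):
    """Symmetric interval around the prediction that contains
    `coverage` of the Laplace probability mass."""
    b = math.exp(log_b)
    half_width = -b * math.log(1.0 - coverage)  # Laplace quantile
    return pred - half_width, pred + half_width
```

Minimizing this loss lets the network grow `log_b` on hard, ambiguous detections (occluded or distant people) instead of being forced into an overconfident point estimate.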
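Rethinking "social distancing" as a form of social interaction, as the abstract proposes, amounts to combining the estimated 3D locations with the estimated body orientations rather than thresholding distance alone. A minimal sketch of such a rule follows; the 2-metre threshold, the 60-degree field of view, and the helper names are illustrative assumptions, not the paper's actual criterion.

```python
import math

def facing(px, py, yaw, qx, qy, fov_deg=60.0):
    """True if a person at (px, py) with body orientation `yaw`
    (radians, ground-plane frame) is turned towards point (qx, qy)."""
    angle_to_other = math.atan2(qy - py, qx - px)
    # smallest signed angular difference, wrapped into [-pi, pi]
    diff = abs((angle_to_other - yaw + math.pi) % (2 * math.pi) - math.pi)
    return diff <= math.radians(fov_deg) / 2

def social_interaction(p, q, max_dist=2.0, fov_deg=60.0):
    """Illustrative rule: two people p, q = (x, y, yaw) are interacting
    when they are close AND mutually oriented towards each other."""
    (px, py, pyaw), (qx, qy, qyaw) = p, q
    dist = math.hypot(qx - px, qy - py)
    return (dist <= max_dist
            and facing(px, py, pyaw, qx, qy, fov_deg)
            and facing(qx, qy, qyaw, px, py, fov_deg))
```

Under this rule, two people standing back-to-back at 1.5 m are not flagged as interacting, while the same pair facing each other is, which is the distinction a purely location-based rule cannot make.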
Related papers
- Social EgoMesh Estimation [7.021561988248192]
We propose a novel framework for Socialcentric Estimation of body MEshes (SEE-ME).
Our approach is the first to estimate the wearer's mesh using only a latent probabilistic diffusion model.
Overall, SEE-ME surpasses the current best technique, reducing the pose estimation error (MPJPE) by 53%.
arXiv Detail & Related papers (2024-11-07T10:28:49Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z) - Generative Proxemics: A Prior for 3D Social Interaction from Images [32.547187575678464]
Social interaction is a fundamental aspect of human behavior and communication.
We present a novel approach that learns a prior over the 3D proxemics of two people in close social interaction.
Our approach recovers accurate and plausible 3D social interactions from noisy initial estimates, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2023-06-15T17:59:20Z)
- ScanERU: Interactive 3D Visual Grounding based on Embodied Reference Understanding [67.21613160846299]
Embodied Reference Understanding (ERU) is first designed for this concern.
New dataset called ScanERU is constructed to evaluate the effectiveness of this idea.
arXiv Detail & Related papers (2023-03-23T11:36:14Z)
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps [66.24554680709417]
Knowing the exact 3D location of workers and robots in a collaborative environment enables several real applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
arXiv Detail & Related papers (2022-07-06T08:52:12Z)
- Dual networks based 3D Multi-Person Pose Estimation from Monocular Video [42.01876518017639]
Multi-person 3D pose estimation is more challenging than single pose estimation.
Existing top-down and bottom-up approaches to pose estimation suffer from detection errors.
We propose the integration of top-down and bottom-up approaches to exploit their strengths.
arXiv Detail & Related papers (2022-05-02T08:53:38Z)
- Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks [33.974241749058585]
In multi-person pose estimation, human detection can be erroneous and human-joints grouping unreliable.
Existing top-down methods rely on human detection and thus suffer from these problems.
We propose the integration of top-down and bottom-up approaches to exploit their strengths.
arXiv Detail & Related papers (2021-04-05T07:05:21Z)
- Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [71.29186299435423]
We introduce HPS (Human POSEitioning System), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment.
We show that our optimization-based integration exploits the benefits of the two, resulting in pose accuracy free of drift.
HPS could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera.
arXiv Detail & Related papers (2021-03-31T17:58:31Z)
- MonStereo: When Monocular and Stereo Meet at the Tail of 3D Human Localization [89.71926844164268]
We propose a novel unified learning framework that leverages the strengths of both monocular and stereo cues for 3D human localization.
Our method (i) associates humans in left-right images, (ii) deals with occluded and distant cases in stereo settings, and (iii) tackles the intrinsic ambiguity of monocular perspective projection.
arXiv Detail & Related papers (2020-08-25T09:47:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.