PresSim: An End-to-end Framework for Dynamic Ground Pressure Profile
Generation from Monocular Videos Using Physics-based 3D Simulation
- URL: http://arxiv.org/abs/2302.00391v1
- Date: Wed, 1 Feb 2023 12:02:04 GMT
- Authors: Lala Shakti Swarup Ray, Bo Zhou, Sungho Suh, Paul Lukowicz
- Abstract summary: Ground pressure exerted by the human body is a valuable source of information for human activity recognition (HAR) in pervasive sensing.
We present a novel end-to-end framework, PresSim, to synthesize sensor data from videos of human activities to reduce such effort significantly.
- Score: 8.107762252448195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ground pressure exerted by the human body is a valuable source of information
for human activity recognition (HAR) in unobtrusive pervasive sensing. Since
data collection from pressure sensors to develop HAR solutions requires
significant resources and effort, we present a novel end-to-end framework,
PresSim, to synthesize sensor data from videos of human activities to reduce
such effort significantly. PresSim adopts a 3-stage process: first, extract the
3D activity information from videos with computer vision architectures; then
simulate the floor mesh deformation profiles based on the 3D activity
information and gravity-included physics simulation; lastly, generate the
simulated pressure sensor data with deep learning models. We explored two
approaches for the 3D activity information: inverse kinematics with mesh
re-targeting, and volumetric pose and shape estimation. We validated PresSim
with an experimental setup with a monocular camera to provide input and a
pressure-sensing fitness mat (80x28 spatial resolution) to provide the sensor
ground truth, where nine participants performed a set of predefined yoga
sequences.
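The 3-stage process above (3D activity extraction, physics-based floor mesh deformation, learned pressure synthesis) can be sketched as three composed functions. This is a minimal illustrative sketch, not the authors' actual models or API: the function names, the 17-joint skeleton, the toy "splatting" physics, and the tanh sensor model are all assumptions; only the 80x28 grid comes from the paper.

```python
import numpy as np

def extract_3d_pose(video_frames):
    """Stage 1 stand-in: a vision model (inverse kinematics with mesh
    re-targeting, or volumetric pose and shape estimation) mapping video
    frames to 3D joint positions. Here: random 17-joint skeletons."""
    rng = np.random.default_rng(0)
    return rng.uniform(-1.0, 1.0, size=(len(video_frames), 17, 3))

def simulate_floor_deformation(poses_3d, grid=(80, 28)):
    """Stage 2 stand-in: a gravity-included physics simulation deforming
    a floor mesh under the body. We fake it by marking each joint's
    (x, y) position on the sensor grid."""
    frames = np.zeros((poses_3d.shape[0],) + grid)
    for t, joints in enumerate(poses_3d):
        xs = ((joints[:, 0] + 1) / 2 * (grid[0] - 1)).astype(int)
        ys = ((joints[:, 1] + 1) / 2 * (grid[1] - 1)).astype(int)
        frames[t, xs, ys] += 1.0
    return frames

def deformation_to_pressure(deformation):
    """Stage 3 stand-in: a learned model mapping mesh deformation to
    sensor readings; here, a fixed saturating nonlinearity."""
    return np.tanh(deformation)

video = [None] * 4  # 4 dummy frames
pressure = deformation_to_pressure(
    simulate_floor_deformation(extract_3d_pose(video)))
print(pressure.shape)  # one 80x28 pressure map per input frame
```

Each stage can be swapped out independently, which is what makes the two explored pose backends (inverse kinematics vs. volumetric estimation) interchangeable in the real pipeline.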
Related papers
- DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors [77.34056839349076]
We propose DreamPhysics, which estimates physical properties of 3D Gaussian Splatting with video diffusion priors.
Based on a material point method simulator with proper physical parameters, our method can generate 4D content with realistic motions.
arXiv Detail & Related papers (2024-06-03T16:05:25Z)
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z)
- PressureTransferNet: Human Attribute Guided Dynamic Ground Pressure Profile Transfer using 3D simulated Pressure Maps [7.421780713537146]
PressureTransferNet is an encoder-decoder model taking a source pressure map and a target human attribute vector as inputs.
We use a sensor simulation to create a diverse dataset with various human attributes and pressure profiles.
We visually confirm the fidelity of the synthesized pressure shapes using a physics-based deep learning model and achieve a binary R-square value of 0.79 on areas with ground contact.
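The R-square value restricted to "areas with ground contact" scores the prediction only on sensor cells the body actually touches, so untouched (and trivially correct) cells cannot inflate the score. A minimal sketch of that style of metric, assuming continuous pressure values and a simple threshold for contact; the paper's exact binarization is not specified here:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def contact_r_squared(true_map, pred_map, threshold=0.0):
    """Score only cells with ground contact in the ground truth
    (pressure above threshold)."""
    mask = true_map > threshold
    return r_squared(true_map[mask], pred_map[mask])

# Toy 8x8 maps: a 3x3 contact patch; prediction underestimates by 10%.
true_map = np.zeros((8, 8))
true_map[2:5, 2:5] = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]], float)
pred_map = 0.9 * true_map
print(round(contact_r_squared(true_map, pred_map), 4))  # 0.9225
```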
arXiv Detail & Related papers (2023-08-01T13:31:25Z)
- Development of a Realistic Crowd Simulation Environment for Fine-grained Validation of People Tracking Methods [0.7223361655030193]
This work develops an extension of a crowd simulator (named CrowdSim2) and proves its usability for validating people-tracking algorithms.
The simulator is developed using the very popular Unity 3D engine with particular emphasis on the aspects of realism in the environment.
Three tracking methods were used to validate the generated dataset: IOU-Tracker, Deep-Sort, and Deep-TAMA.
arXiv Detail & Related papers (2023-04-26T09:29:58Z)
- 3D-IntPhys: Towards More Generalized 3D-grounded Visual Intuitive Physics under Challenging Scenes [68.66237114509264]
We present a framework capable of learning 3D-grounded visual intuitive physics models from videos of complex scenes with fluids.
We show our model can make long-horizon future predictions by learning from raw images and significantly outperforms models that do not employ an explicit 3D representation space.
arXiv Detail & Related papers (2023-04-22T19:28:49Z)
- FLAG3D: A 3D Fitness Activity Dataset with Language Instruction [89.60371681477791]
We present FLAG3D, a large-scale 3D fitness activity dataset with language instruction containing 180K sequences of 60 categories.
We show that FLAG3D contributes great research value for various challenges, such as cross-domain human action recognition, dynamic human mesh recovery, and language-guided human action generation.
arXiv Detail & Related papers (2022-12-09T02:33:33Z)
- Active 3D Shape Reconstruction from Vision and Touch [66.08432412497443]
Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch.
In 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings.
We introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile priors to guide the shape exploration; and 3) a set of data-driven solutions with either tactile or visuotactile priors.
arXiv Detail & Related papers (2021-07-20T15:56:52Z)
- Hindsight for Foresight: Unsupervised Structured Dynamics Models from Physical Interaction [24.72947291987545]
A key challenge for an agent learning to interact with the world is reasoning about the physical properties of objects.
We propose a novel approach for modeling the dynamics of a robot's interactions directly from unlabeled 3D point clouds and images.
arXiv Detail & Related papers (2020-08-02T11:04:49Z)
- Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.