Hi4D: 4D Instance Segmentation of Close Human Interaction
- URL: http://arxiv.org/abs/2303.15380v1
- Date: Mon, 27 Mar 2023 16:53:09 GMT
- Title: Hi4D: 4D Instance Segmentation of Close Human Interaction
- Authors: Yifei Yin, Chen Guo, Manuel Kaufmann, Juan Jose Zarate, Jie Song,
Otmar Hilliges
- Abstract summary: Hi4D is a dataset of 4D textured scans of 20 subject pairs, 100 sequences, and a total of more than 11K frames.
This dataset contains rich interaction-centric annotations in 2D and 3D alongside accurately registered parametric body models.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Hi4D, a method and dataset for the automatic analysis of
physically close human-human interaction under prolonged contact. Robustly
disentangling several in-contact subjects is a challenging task due to
occlusions and complex shapes. Hence, existing multi-view systems typically
fuse 3D surfaces of close subjects into a single, connected mesh. To address
this issue we leverage i) individually fitted neural implicit avatars; ii) an
alternating optimization scheme that refines pose and surface through periods
of close proximity; and iii) thus segment the fused raw scans into individual
instances. From these instances we compile the Hi4D dataset of 4D textured scans of
20 subject pairs, 100 sequences, and a total of more than 11K frames. Hi4D
contains rich interaction-centric annotations in 2D and 3D alongside accurately
registered parametric body models. We define varied human pose and shape
estimation tasks on this dataset and provide results from state-of-the-art
methods on these benchmarks.
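The alternating optimization scheme described in the abstract can be illustrated with a toy example. The sketch below is not the Hi4D implementation (which fits neural implicit avatars to multi-view scans); it stands in each subject with a circle whose center plays the role of pose and whose radius plays the role of surface, alternates the two updates, and then segments the fused point set by nearest-surface assignment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fused "scan": points sampled from two circles in close contact.
true_centers = np.array([[0.0, 0.0], [1.8, 0.0]])
true_radii = np.array([1.0, 1.0])
angles = rng.uniform(0.0, 2.0 * np.pi, size=(2, 200))
scan = np.concatenate([
    c + r * np.stack([np.cos(a), np.sin(a)], axis=1)
    for c, r, a in zip(true_centers, true_radii, angles)
])

# Rough per-subject initialization (stand-ins for fitted avatars).
centers = np.array([[-0.5, 0.3], [2.5, -0.3]])
radii = np.array([0.5, 0.5])

for _ in range(20):
    # Assign each scan point to the closest current surface.
    dists = np.abs(
        np.linalg.norm(scan[:, None] - centers[None], axis=2) - radii
    )
    labels = dists.argmin(axis=1)
    for k in range(2):
        pts = scan[labels == k]
        if len(pts) == 0:
            continue
        # "Pose" step: move the center while the surface stays fixed.
        centers[k] = pts.mean(axis=0)
        # "Surface" step: refit the radius while the pose stays fixed.
        radii[k] = np.linalg.norm(pts - centers[k], axis=1).mean()

# `labels` now segments the fused scan into per-person instances.
print("recovered centers:\n", centers)
print("recovered radii:", radii)
```

The real pipeline replaces these closed-form updates with gradient-based refinement of parametric body poses and implicit surfaces, but the alternation structure is the point of the sketch.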
Related papers
- Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions
Harmony4D is a dataset for human-human interaction featuring in-the-wild activities such as wrestling, dancing, MMA, and more.
We use a flexible multi-view capture system to record these dynamic activities and provide annotations for human detection, tracking, 2D/3D pose estimation, and mesh recovery for closely interacting subjects.
arXiv Detail & Related papers (2024-10-27T00:05:15Z)
- CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement
We present CORE4D, a novel large-scale 4D human-object-human interaction dataset for collaborative object rearrangement.
With 1K human-object-human motion sequences captured in the real world, we enrich CORE4D with an iterative collaboration strategy that augments these motions to a variety of novel objects.
Benefiting from extensive motion patterns provided by CORE4D, we benchmark two tasks aiming at generating human-object interaction: human-object motion forecasting and interaction synthesis.
arXiv Detail & Related papers (2024-06-27T17:32:18Z)
- Decaf: Monocular Deformation Capture for Face and Hand Interactions
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z)
- Reconstructing Three-Dimensional Models of Interacting Humans
CHI3D is an accurate, lab-based 3D motion capture dataset with 631 sequences containing 2,525 contact events.
FlickrCI3D is a dataset of 11,216 images with 14,081 processed pairs of people and 81,233 facet-level surface correspondences.
arXiv Detail & Related papers (2023-08-03T16:20:33Z)
- LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling
We propose LoRD, a novel Local 4D implicit Representation for Dynamic clothed humans.
Our key insight is to encourage the network to learn the latent codes of a local part-level representation.
LoRD has a strong capability for representing 4D humans and outperforms state-of-the-art methods in practical applications.
arXiv Detail & Related papers (2022-08-18T03:49:44Z)
- BEHAVE: Dataset and Method for Tracking Human Object Interactions
We present the first full-body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits, along with the annotated contacts between them.
We use this data to learn a model that can jointly track humans and objects in natural environments with an easy-to-use portable multi-camera setup.
arXiv Detail & Related papers (2022-04-14T13:21:19Z)
- HAA4D: Few-Shot Human Atomic Action Recognition via 3D Spatio-Temporal Skeletal Alignment
This paper proposes HAA4D, a new 4D dataset consisting of more than 3,300 videos in 300 human atomic action classes.
The choice of atomic actions makes annotation even easier, because each video clip lasts for only a few seconds.
All training and testing 3D skeletons in HAA4D are globally aligned to the same global space using a deep alignment model; a classical stand-in for this alignment step is sketched after this entry.
arXiv Detail & Related papers (2022-02-15T10:55:21Z)
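HAA4D's alignment is performed by a learned model; as a classical stand-in for the same goal (mapping every skeleton into one shared global space), the sketch below rigidly aligns a skeleton to a reference with orthogonal Procrustes. The reference skeleton and joint layout here are invented for the example:

```python
import numpy as np

def procrustes_align(skeleton, reference):
    """Rigidly align one (J, 3) skeleton to a reference skeleton:
    remove translation, then solve for the optimal rotation via SVD."""
    src = skeleton - skeleton.mean(axis=0)
    dst = reference - reference.mean(axis=0)
    u, _, vt = np.linalg.svd(src.T @ dst)
    rot = u @ vt
    if np.linalg.det(rot) < 0:  # correct an improper (reflected) solution
        u[:, -1] *= -1
        rot = u @ vt
    return src @ rot + reference.mean(axis=0)

# Toy check: a rotated and translated copy aligns back onto the reference.
ref = np.array([[0.0, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]])
theta = np.pi / 3
rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
moved = ref @ rz.T + np.array([2.0, -1.0, 0.5])
print(np.allclose(procrustes_align(moved, ref), ref))  # True
```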
- 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface
We introduce 4DComplete, a novel data-driven approach that estimates the non-rigid motion for the unobserved geometry.
For network training, we constructed a large-scale synthetic dataset called DeformingThings4D.
arXiv Detail & Related papers (2021-05-05T07:39:12Z)
- Reconstructing Hand-Object Interactions in the Wild
We propose an optimization-based procedure which does not require direct 3D supervision.
We exploit all available related data (2D bounding boxes, 2D hand keypoints, 2D instance masks, 3D object models, 3D in-the-lab MoCap) to provide constraints for the 3D reconstruction; a toy sketch of this constraint fusion follows this entry.
Our method produces compelling reconstructions on the challenging in-the-wild data from the EPIC Kitchens and the 100 Days of Hands datasets.
arXiv Detail & Related papers (2020-12-17T18:59:58Z)
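As a loose illustration of the optimization described in this entry, heterogeneous 2D and 3D cues can be combined as weighted residual terms in a single energy. Everything below (the toy pinhole camera, the observations, the weights, and the `energy` function) is made up for illustration and is not the authors' actual objective:

```python
import numpy as np
from scipy.optimize import minimize

def project(points3d, f=500.0):
    """Toy pinhole projection onto the image plane."""
    return f * points3d[:, :2] / points3d[:, 2:3]

# Invented observations for a single rigid "object".
model_pts = np.array([[0.1, 0.0, 0.0],      # 3D object model (known)
                      [0.0, 0.1, 0.0],
                      [0.0, 0.0, 0.1]])
obs_2d = np.array([[60.0, 10.0],            # detected 2D keypoints
                   [10.0, 60.0],
                   [8.0, 9.0]])
pose_prior = np.array([0.0, 0.0, 1.0])      # stand-in for MoCap/prior cues

def energy(t, w_2d=1.0, w_prior=0.1):
    """Weighted sum of heterogeneous constraint terms over translation t."""
    e_2d = np.sum((project(model_pts + t) - obs_2d) ** 2)   # 2D keypoint term
    e_prior = np.sum((t - pose_prior) ** 2)                 # 3D prior term
    return w_2d * e_2d + w_prior * e_prior

res = minimize(energy, x0=np.array([0.0, 0.0, 2.0]), method="Nelder-Mead")
print("estimated translation:", res.x)
```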
- HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
This paper introduces a novel form of supervision: Hierarchical Multi-Person Ordinal Relations (HMOR).
HMOR encodes interaction information as the ordinal relations of depths and angles hierarchically; a simplified, depth-only sketch of this idea follows this entry.
An integrated top-down model is designed to leverage these ordinal relations in the learning process.
The proposed method significantly outperforms state-of-the-art methods on publicly available multi-person 3D pose datasets.
arXiv Detail & Related papers (2020-08-01T07:53:27Z)
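The HMOR entry above supervises relative (ordinal) relations rather than absolute values. As a self-contained illustration of the depth half of that idea (not the paper's hierarchical formulation, which also covers angles and operates at body, part, and joint level), a pairwise ranking loss over per-person depths could look like:

```python
import numpy as np

def ordinal_depth_loss(pred_depths, gt_depths):
    """Pairwise ranking loss over person depths: small when each
    predicted pair is ordered like the ground truth, large otherwise."""
    loss, n_pairs = 0.0, 0
    n = len(pred_depths)
    for i in range(n):
        for j in range(i + 1, n):
            target = np.sign(gt_depths[j] - gt_depths[i])
            if target == 0:
                continue  # equal depths give no ordinal constraint
            # Softplus ranking loss on the signed predicted depth gap.
            diff = target * (pred_depths[j] - pred_depths[i])
            loss += np.log1p(np.exp(-diff))
            n_pairs += 1
    return loss / max(n_pairs, 1)

# Ground truth: person 0 closest, person 2 farthest. The prediction
# swaps persons 1 and 2, so the violated pair dominates the loss.
print(ordinal_depth_loss(np.array([2.0, 5.0, 3.0]),
                         np.array([2.0, 3.0, 5.0])))
```

A loss of this shape constrains only orderings, which is what makes ordinal supervision cheap to annotate and robust to depth-scale ambiguity.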