Reconstructing Three-Dimensional Models of Interacting Humans
- URL: http://arxiv.org/abs/2308.01854v2
- Date: Fri, 4 Aug 2023 08:34:23 GMT
- Title: Reconstructing Three-Dimensional Models of Interacting Humans
- Authors: Mihai Fieraru, Mihai Zanfir, Elisabeta Oneata, Alin-Ionut Popa, Vlad
Olaru, Cristian Sminchisescu
- Abstract summary: CHI3D is a lab-based accurate 3d motion capture dataset with 631 sequences containing 2,525 contact events.
FlickrCI3D is a dataset of 11,216 images, with 14,081 processed pairs of people, and 81,233 facet-level surface correspondences.
- Score: 38.26269716290761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding 3d human interactions is fundamental for fine-grained scene
analysis and behavioural modeling. However, most existing models predict
incorrect, lifeless 3d estimates that miss the subtle human contact
aspects (the essence of the event) and are of little use for detailed
behavioral understanding. This paper addresses such issues with several
contributions: (1) we introduce models for interaction signature estimation
(ISP) encompassing contact detection, segmentation, and 3d contact signature
prediction; (2) we show how such components can be leveraged to ensure contact
consistency during 3d reconstruction; (3) we construct several large datasets
for learning and evaluating 3d contact prediction and reconstruction methods;
specifically, we introduce CHI3D, a lab-based accurate 3d motion capture
dataset with 631 sequences containing 2,525 contact events, 728,664 ground
truth 3d poses, as well as FlickrCI3D, a dataset of 11,216 images, with
14,081 processed pairs of people, and 81,233 facet-level surface
correspondences. Finally, (4) we propose methodology for recovering the
ground-truth pose and shape of interacting people in a controlled setup and (5)
annotate all 3d interaction motions in CHI3D with textual descriptions. Motion
data in multiple formats (GHUM and SMPLX parameters, Human3.6m 3d joints) is
made available for research purposes at https://ci3d.imar.ro, together
with an evaluation server and a public benchmark.
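The abstract above describes using facet-level contact correspondences to enforce contact consistency during 3d reconstruction. As a rough illustration only (not the paper's actual implementation), a minimal penalty of this kind can be sketched as the mean squared distance between surface points that annotations say should be touching; the function name, vertex-level indexing, and data layout here are all hypothetical stand-ins for the paper's facet-level formulation.

```python
import numpy as np

def contact_consistency_loss(verts_a, verts_b, correspondences):
    """Penalize distance between surface points annotated as in contact.

    verts_a, verts_b: (N, 3) and (M, 3) vertex arrays for the two
    reconstructed people. correspondences: list of (i, j) index pairs,
    meaning vertex i on person A is annotated as touching vertex j on
    person B (a vertex-level stand-in for facet-level correspondences).
    """
    idx_a, idx_b = zip(*correspondences)
    diffs = verts_a[list(idx_a)] - verts_b[list(idx_b)]
    # Mean squared Euclidean distance over all annotated contact pairs;
    # minimizing this pulls annotated contact regions together.
    return float(np.mean(np.sum(diffs ** 2, axis=1)))
```

In an optimization-based reconstruction, a term like this would be added to the data and prior terms so that the recovered body surfaces actually meet where the contact signature says they should.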
Related papers
- Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers [28.38686299271394]
We propose a framework for 3D sequence-to-sequence (seq2seq) human pose detection.
The spatial module represents the human pose feature from intra-image content, while the frame-image relation module extracts temporal relationships across frames.
Our method is evaluated on Human3.6M, a popular 3D human pose detection dataset.
arXiv Detail & Related papers (2024-01-30T03:00:25Z)
- Ins-HOI: Instance Aware Human-Object Interactions Recovery [44.02128629239429]
We propose an end-to-end Instance-aware Human-Object Interactions recovery (Ins-HOI) framework.
Ins-HOI supports instance-level reconstruction and provides reasonable and realistic invisible contact surfaces.
We collect a large-scale, high-fidelity 3D scan dataset, including 5.2k high-quality scans with real-world human-chair and hand-object interactions.
arXiv Detail & Related papers (2023-12-15T09:30:47Z)
- Hi4D: 4D Instance Segmentation of Close Human Interaction [32.51930800738743]
Hi4D is a dataset of 4D textured scans of 20 subject pairs, 100 sequences, and a total of more than 11K frames.
This dataset contains rich interaction-centric annotations in 2D and 3D alongside accurately registered parametric body models.
arXiv Detail & Related papers (2023-03-27T16:53:09Z)
- Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions.
CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process.
By learning the geometrical relationships in HOI, we devise the first model that leverages human pose estimation.
arXiv Detail & Related papers (2022-12-20T19:50:54Z)
- Reconstructing Action-Conditioned Human-Object Interactions Using Commonsense Knowledge Priors [42.17542596399014]
We present a method for inferring diverse 3D models of human-object interactions from images.
Our method extracts high-level commonsense knowledge from large language models.
We quantitatively evaluate the inferred 3D models on a large human-object interaction dataset.
arXiv Detail & Related papers (2022-09-06T13:32:55Z)
- BEHAVE: Dataset and Method for Tracking Human Object Interactions [105.77368488612704]
We present the first full-body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits, along with annotated contacts between them.
We use this data to learn a model that can jointly track humans and objects in natural environments with an easy-to-use portable multi-camera setup.
arXiv Detail & Related papers (2022-04-14T13:21:19Z)
- Estimating 3D Motion and Forces of Human-Object Interactions from Internet Videos [49.52070710518688]
We introduce a method to reconstruct the 3D motion of a person interacting with an object from a single RGB video.
Our method estimates the 3D poses of the person together with the object pose, the contact positions and the contact forces on the human body.
arXiv Detail & Related papers (2021-11-02T13:40:18Z)
- D3D-HOI: Dynamic 3D Human-Object Interactions from Videos [49.38319295373466]
We introduce D3D-HOI: a dataset of monocular videos with ground truth annotations of 3D object pose, shape and part motion during human-object interactions.
Our dataset consists of several common articulated objects captured from diverse real-world scenes and camera viewpoints.
We leverage the estimated 3D human pose for more accurate inference of the object spatial layout and dynamics.
arXiv Detail & Related papers (2021-08-19T00:49:01Z)
- Learning Complex 3D Human Self-Contact [33.83748199524761]
Existing 3d reconstruction methods do not focus on body regions in self-contact.
We develop a model for Self-Contact Prediction that estimates the body surface signature of self-contact.
We show how more expressive 3d reconstructions can be recovered under self-contact signature constraints.
arXiv Detail & Related papers (2020-12-18T17:09:34Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.