RHOBIN Challenge: Reconstruction of Human Object Interaction
- URL: http://arxiv.org/abs/2401.04143v1
- Date: Sun, 7 Jan 2024 23:37:07 GMT
- Title: RHOBIN Challenge: Reconstruction of Human Object Interaction
- Authors: Xianghui Xie and Xi Wang and Nikos Athanasiou and Bharat Lal Bhatnagar
and Chun-Hao P. Huang and Kaichun Mo and Hao Chen and Xia Jia and Zerui Zhang
and Liangxian Cui and Xiao Lin and Bingqiao Qian and Jie Xiao and Wenfei Yang
and Hyeongjin Nam and Daniel Sungho Jung and Kihoon Kim and Kyoung Mu Lee and
Otmar Hilliges and Gerard Pons-Moll
- Abstract summary: The first RHOBIN challenge on reconstruction of human-object interactions, held in conjunction with the RHOBIN workshop.
Our challenge consists of three tracks of 3D reconstruction from monocular RGB images with a focus on dealing with challenging interaction scenarios.
This paper describes the settings of our challenge and discusses the winning methods of each track in more detail.
- Score: 83.07185402102253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling the interaction between humans and objects has been an emerging
research direction in recent years. Capturing human-object interaction is,
however, a very challenging task due to heavy occlusion and complex dynamics:
it requires understanding not only 3D human pose and object pose but also
the interaction between them. Reconstruction of 3D humans and reconstruction of
objects have long been two separate research fields in computer vision. We hence
proposed the first RHOBIN challenge, on the reconstruction of human-object
interactions, in conjunction with the RHOBIN workshop. It was aimed at bringing
together the research communities of human reconstruction, object reconstruction,
and interaction modeling to discuss techniques and exchange ideas. Our
challenge consists of three tracks of 3D reconstruction from monocular RGB
images, with a focus on challenging interaction scenarios. It attracted more
than 100 participants with more than 300 submissions, indicating broad interest
across the research communities. This paper describes the settings of our
challenge and discusses the winning methods of each track in more detail. We
observe that the human reconstruction task is becoming mature even under heavy
occlusion, while object pose estimation and joint reconstruction remain
challenging. With the growing interest in interaction modeling, we hope this
report can provide useful insights and foster future research in this direction.
Our workshop website can be found at
\href{https://rhobin-challenge.github.io/}{https://rhobin-challenge.github.io/}.
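As context for the three tracks, reconstruction challenges of this kind are typically scored with distance-based mesh metrics. Below is a minimal sketch of one common choice, symmetric Chamfer distance; the metric choice is an assumption here, not the official RHOBIN evaluation code, which is specified on the workshop website.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Chamfer distance between two (N, 3) point sets.

    A common reconstruction metric; the official RHOBIN protocol may
    differ (e.g. Procrustes alignment, F-scores at fixed thresholds).
    """
    d_pred_to_gt, _ = cKDTree(gt).query(pred)    # nearest GT point per prediction
    d_gt_to_pred, _ = cKDTree(pred).query(gt)    # nearest prediction per GT point
    return float(d_pred_to_gt.mean() + d_gt_to_pred.mean())

# Toy usage: two noisy samplings of the same unit sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(2048, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
print(chamfer_distance(pts, pts + rng.normal(scale=0.01, size=pts.shape)))
```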
Related papers
- Generating Human Motion in 3D Scenes from Text Descriptions [60.04976442328767]
This paper focuses on the task of generating human motions in 3D indoor scenes given text descriptions of the human-scene interactions.
We propose a new approach that decomposes the complex problem into two more manageable sub-problems.
For language grounding of the target object, we leverage the power of large language models; for motion generation, we design an object-centric scene representation.
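The summary above does not define the object-centric scene representation; a minimal sketch of the basic idea, assuming it amounts to expressing world-frame scene points in the target object's local frame given the object pose (R, t):

```python
import numpy as np

def to_object_frame(scene_points: np.ndarray,
                    obj_rotation: np.ndarray,
                    obj_translation: np.ndarray) -> np.ndarray:
    """Express (N, 3) world-frame scene points in the target object's
    local frame: p_obj = R^T (p_world - t). The paper's actual
    representation is not given in the summary; this is illustrative."""
    return (scene_points - obj_translation) @ obj_rotation

# Toy usage: a point 1 m in front of an object sitting at (2, 0, 0).
R = np.eye(3)                    # object orientation (world-aligned here)
t = np.array([2.0, 0.0, 0.0])    # object position in the world
print(to_object_frame(np.array([[3.0, 0.0, 0.0]]), R, t))  # -> [[1. 0. 0.]]
```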
arXiv Detail & Related papers (2024-05-13T14:30:12Z)
Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer [58.98785899556135]
We present a novel joint 3D human-object reconstruction method (CONTHO) that effectively exploits contact information between humans and objects.
There are two core designs in our system: 1) 3D-guided contact estimation and 2) contact-based 3D human and object refinement.
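CONTHO's contact estimation is learned and 3D-guided; as a rough geometric stand-in for the underlying notion of contact, one can threshold distances from human vertices to the object surface (the threshold value below is an assumption):

```python
import numpy as np
from scipy.spatial import cKDTree

def contact_vertices(human_verts: np.ndarray,
                     obj_verts: np.ndarray,
                     thresh: float = 0.02) -> np.ndarray:
    """Indices of human vertices within `thresh` metres of the object
    surface. A simple geometric stand-in for the learned, 3D-guided
    contact estimation described in the paper."""
    dists, _ = cKDTree(obj_verts).query(human_verts)
    return np.nonzero(dists < thresh)[0]

# Toy usage: a vertex 1 cm from the object is flagged as in contact.
obj = np.array([[0.0, 0.0, 0.0]])
human = np.array([[0.0, 0.0, 0.01], [0.0, 0.0, 0.5]])
print(contact_vertices(human, obj))  # -> [0]
```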
arXiv Detail & Related papers (2024-04-07T06:01:49Z)
Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects [89.95728475983263]
A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition, and motion generation.
We design the HANDS23 challenge based on the AssemblyHands and ARCTIC datasets with carefully constructed training and testing splits.
Based on the results of the top submitted methods and more recent baselines on the leaderboards, we perform a thorough analysis on 3D hand(-object) reconstruction tasks.
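Leaderboards for hand(-object) pose are commonly ranked by mean per-joint position error (MPJPE); whether HANDS23 uses exactly this protocol is an assumption here. A minimal sketch:

```python
import numpy as np

def mpjpe(pred: np.ndarray, gt: np.ndarray, root: int = 0) -> float:
    """Mean per-joint position error over (J, 3) joint arrays after
    root-joint alignment; a standard hand/body pose metric, usually
    reported in millimetres. Exact HANDS23 protocol may differ."""
    pred = pred - pred[root]   # translate so the root joint is the origin
    gt = gt - gt[root]
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy usage: 21 hand joints with small prediction noise.
rng = np.random.default_rng(0)
gt_joints = rng.normal(size=(21, 3))
print(mpjpe(gt_joints + rng.normal(scale=0.005, size=(21, 3)), gt_joints))
```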
arXiv Detail & Related papers (2024-03-25T05:12:21Z)
Compositional 3D Human-Object Neural Animation [93.38239238988719]
Human-object interactions (HOIs) are crucial for human-centric scene understanding applications such as human-centric visual generation, AR/VR, and robotics.
In this paper, we address this challenge in HOI animation from a compositional perspective.
We adopt neural human-object deformation to model and render HOI dynamics based on implicit neural representations.
arXiv Detail & Related papers (2023-04-27T10:04:56Z)
Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions.
CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process.
By learning the geometrical relationships in HOI, we devise the first model that leverages human pose estimation for this task.
arXiv Detail & Related papers (2022-12-20T19:50:54Z)
Reconstructing Action-Conditioned Human-Object Interactions Using Commonsense Knowledge Priors [42.17542596399014]
We present a method for inferring diverse 3D models of human-object interactions from images.
Our method extracts high-level commonsense knowledge from large language models.
We quantitatively evaluate the inferred 3D models on a large human-object interaction dataset.
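The summary does not state how the extracted commonsense knowledge is represented; one plausible, entirely hypothetical encoding is a list of body-part/object-part contact pairs parsed from the language model's reply, which could then drive contact terms during fitting:

```python
import json

# Hypothetical LLM reply: the paper's actual prompt and output format
# are not given in the summary above.
llm_reply = '[{"body_part": "right_hand", "object_part": "handle"}]'

def parse_contact_priors(reply: str) -> list:
    """Parse commonsense contact pairs from a (hypothetical) JSON reply."""
    return [(c["body_part"], c["object_part"]) for c in json.loads(reply)]

print(parse_contact_priors(llm_reply))  # -> [('right_hand', 'handle')]
```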
arXiv Detail & Related papers (2022-09-06T13:32:55Z)
CHORE: Contact, Human and Object REconstruction from a single RGB image [40.817960406002506]
CHORE is a novel method that learns to jointly reconstruct the human and the object from a single RGB image.
We compute a neural reconstruction of human and object represented implicitly with two unsigned distance fields.
Experiments show that our joint reconstruction learned with the proposed strategy significantly outperforms the SOTA.
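CHORE predicts its two unsigned distance fields with a neural network; as a simple non-learned stand-in, an unsigned distance field over a sampled surface can be emulated with nearest-neighbour queries, one field each for the human and the object:

```python
import numpy as np
from scipy.spatial import cKDTree

class SampledUDF:
    """Unsigned distance field backed by surface samples. CHORE learns
    such fields with a network; nearest-neighbour distance to a sampled
    surface is only an illustrative, non-learned substitute."""
    def __init__(self, surface_points: np.ndarray):
        self.tree = cKDTree(surface_points)

    def __call__(self, queries: np.ndarray) -> np.ndarray:
        dists, _ = self.tree.query(queries)  # distance to nearest sample
        return dists

# One field for the human, one for the object, queried jointly.
rng = np.random.default_rng(0)
human_udf = SampledUDF(rng.normal(size=(1000, 3)))
object_udf = SampledUDF(rng.normal(size=(1000, 3)) + [1.0, 0.0, 0.0])
q = np.zeros((1, 3))
print(human_udf(q), object_udf(q))
```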
arXiv Detail & Related papers (2022-04-05T18:38:06Z)
DemoGrasp: Few-Shot Learning for Robotic Grasping with Human Demonstration [42.19014385637538]
We propose to teach a robot how to grasp an object with a simple and short human demonstration.
We first present a small sequence of RGB-D images displaying a human-object interaction.
This sequence is then leveraged to build associated hand and object meshes that represent the interaction.
arXiv Detail & Related papers (2021-12-06T08:17:12Z)