Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of
Articulated Objects
- URL: http://arxiv.org/abs/2112.11347v1
- Date: Tue, 21 Dec 2021 16:37:48 GMT
- Title: Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of
Articulated Objects
- Authors: Atsuhiro Noguchi, Umar Iqbal, Jonathan Tremblay, Tatsuya Harada,
Orazio Gallo
- Abstract summary: We learn both the appearance and the structure of previously unseen articulated objects by observing them move from multiple views.
Our insight is that adjacent parts that move relative to each other must be connected by a joint.
We show that our method works for different structures, from quadrupeds to single-arm robots to humans.
- Score: 73.23249640099516
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rendering articulated objects while controlling their poses is critical to
applications such as virtual reality or animation for movies. Manipulating the
pose of an object, however, requires the understanding of its underlying
structure, that is, its joints and how they interact with each other.
Unfortunately, assuming the structure to be known, as existing methods do,
precludes the ability to work on new object categories. We propose to learn
both the appearance and the structure of previously unseen articulated objects
by observing them move from multiple views, with no additional supervision,
such as joint annotations or information about the structure. Our insight is
that adjacent parts that move relative to each other must be connected by a
joint. To leverage this observation, we model the object parts in 3D as
ellipsoids, which allows us to identify joints. We combine this explicit
representation with an implicit one that compensates for the approximation it
introduces. We show that our method works for different structures, from
quadrupeds to single-arm robots to humans.
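The stated insight admits a simple geometric reading: a joint is a 3D point whose position stays fixed in the local frames of both adjacent parts. The sketch below is a hypothetical illustration of that idea only, not the paper's implementation; the function name, the use of per-part rigid poses as input, and the least-squares formulation are assumptions made for illustration.

```python
# Illustrative sketch (not the authors' code): given per-frame rigid poses of
# two candidate parts, estimate a joint as the point fixed in BOTH local frames.
import numpy as np

def estimate_joint(R_a, t_a, R_b, t_b):
    """
    R_a, R_b: (T, 3, 3) per-frame rotations of parts a and b (local -> world).
    t_a, t_b: (T, 3)    per-frame translations.
    Returns (p_a, p_b, residual): the joint expressed in each part's local
    frame, and the RMS disagreement of its world-space position over time.
    """
    T = R_a.shape[0]
    # For every frame t we want:  R_a[t] @ p_a + t_a[t] = R_b[t] @ p_b + t_b[t]
    # Stack into a linear system A @ [p_a; p_b] = b and solve in least squares.
    A = np.concatenate([R_a, -R_b], axis=2).reshape(3 * T, 6)
    b = (t_b - t_a).reshape(3 * T)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    p_a, p_b = x[:3], x[3:]
    # Residual: how consistently the two parts agree on the joint's world position.
    world_a = np.einsum('tij,j->ti', R_a, p_a) + t_a
    world_b = np.einsum('tij,j->ti', R_b, p_b) + t_b
    residual = float(np.sqrt(np.mean(np.sum((world_a - world_b) ** 2, axis=1))))
    return p_a, p_b, residual
```

A small residual suggests the two parts plausibly move about a shared joint; the paper itself recovers such structure by modeling parts as explicit 3D ellipsoids learned jointly with an implicit appearance model, rather than from given part poses.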
Related papers
- Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos [37.455535904703204]
We propose a new framework for creating and manipulating 3D models of arbitrary objects using casually captured videos.
Our core ingredient is a novel deformation hierarchy model, which captures the motion of objects with tree-structured bones.
Our framework offers several clear advantages: (1) users can obtain animatable 3D models of arbitrary objects with improved quality from their casual videos, (2) users can manipulate 3D models in an intuitive manner at minimal cost, and (3) users can interactively add or delete control points as necessary.
arXiv Detail & Related papers (2024-08-01T07:42:45Z) - HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and
Objects from Video [70.11702620562889]
HOLD is the first category-agnostic method that jointly reconstructs an articulated hand and an object from a monocular interaction video.
We develop a compositional articulated implicit model that can disentangle the 3D hand and object from 2D images.
Our method does not rely on 3D hand-object annotations while outperforming fully-supervised baselines in both in-the-lab and challenging in-the-wild settings.
arXiv Detail & Related papers (2023-11-30T10:50:35Z) - CAMM: Building Category-Agnostic and Animatable 3D Models from Monocular
Videos [3.356334042188362]
We propose a novel reconstruction method that learns an animatable kinematic chain for any articulated object.
Our approach is on par with state-of-the-art 3D surface reconstruction methods on various articulated object categories.
arXiv Detail & Related papers (2023-04-14T06:07:54Z) - Full-Body Articulated Human-Object Interaction [61.01135739641217]
CHAIRS is a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions.
CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process.
By learning the geometrical relationships in HOI, we devise the very first model that leverages human pose estimation.
arXiv Detail & Related papers (2022-12-20T19:50:54Z) - Unsupervised Kinematic Motion Detection for Part-segmented 3D Shape
Collections [14.899075941080541]
We present an unsupervised approach for discovering articulated motions in a part-segmented 3D shape collection.
Our approach is based on a concept we call category closure: any valid articulation of an object's parts should keep the object in the same semantic category.
We evaluate our approach by using it to re-discover part motions from the PartNet-Mobility dataset.
arXiv Detail & Related papers (2022-06-17T00:50:36Z) - ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation [68.80339307258835]
ARCTIC is a dataset of two hands that dexterously manipulate objects.
It contains 2.1M video frames paired with accurate 3D hand meshes and detailed, dynamic contact information.
arXiv Detail & Related papers (2022-04-28T17:23:59Z) - What's in your hands? 3D Reconstruction of Generic Objects in Hands [49.12461675219253]
Our work aims to reconstruct hand-held objects given a single RGB image.
In contrast to prior works that typically assume known 3D templates and reduce the problem to 3D pose estimation, our work reconstructs generic hand-held objects without knowing their 3D templates.
arXiv Detail & Related papers (2022-04-14T17:59:02Z) - Self-supervised Single-view 3D Reconstruction via Semantic Consistency [142.71430568330172]
We learn a self-supervised, single-view 3D reconstruction model that predicts the shape, texture and camera pose of a target object.
The proposed method does not necessitate 3D supervision, manually annotated keypoints, multi-view images of an object or a prior 3D template.
arXiv Detail & Related papers (2020-03-13T20:29:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.