CAMM: Building Category-Agnostic and Animatable 3D Models from Monocular
Videos
- URL: http://arxiv.org/abs/2304.06937v1
- Date: Fri, 14 Apr 2023 06:07:54 GMT
- Title: CAMM: Building Category-Agnostic and Animatable 3D Models from Monocular
Videos
- Authors: Tianshu Kuai, Akash Karthikeyan, Yash Kant, Ashkan Mirzaei, Igor
Gilitschenski
- Abstract summary: We propose a novel reconstruction method that learns an animatable kinematic chain for any articulated object.
Our approach is on par with state-of-the-art 3D surface reconstruction methods on various articulated object categories.
- Score: 3.356334042188362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Animating an object in 3D often requires an articulated structure, e.g. a
kinematic chain or skeleton of the manipulated object with proper skinning
weights, to obtain smooth movements and surface deformations. However, existing
models that allow direct pose manipulations are either limited to specific
object categories or built with specialized equipment. To reduce the work
needed for creating animatable 3D models, we propose a novel reconstruction
method that learns an animatable kinematic chain for any articulated object.
Our method operates on monocular videos without prior knowledge of the object's
shape or underlying structure. Our approach is on par with state-of-the-art 3D
surface reconstruction methods on various articulated object categories while
enabling direct pose manipulations by re-posing the learned kinematic chain.
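The skinning weights mentioned in the abstract are typically combined with per-bone rigid transforms via linear blend skinning. A minimal sketch of that standard mechanism (array shapes and names here are illustrative assumptions, not CAMM's actual implementation):

```python
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """Pose a rest-pose surface with a kinematic chain via linear blend skinning.

    vertices:        (V, 3) rest-pose vertex positions
    weights:         (V, B) skinning weights; each row sums to 1
    bone_transforms: (B, 4, 4) rigid rest-to-posed transform of each bone
    """
    # Homogeneous coordinates: (V, 4)
    v_h = np.concatenate([vertices, np.ones((len(vertices), 1))], axis=1)
    # Transform every vertex by every bone: (B, V, 4)
    per_bone = np.einsum('bij,vj->bvi', bone_transforms, v_h)
    # Blend per-bone results with the skinning weights, drop the w-coordinate
    return np.einsum('vb,bvi->vi', weights, per_bone)[:, :3]
```

Re-posing a learned kinematic chain then amounts to recomputing `bone_transforms` (e.g. by forward kinematics along the chain) and rerunning this blend.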
Related papers
- Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos [37.455535904703204]
We propose a new framework for creating and manipulating 3D models of arbitrary objects using casually captured videos.
Our core ingredient is a novel deformation hierarchy model, which captures motions of objects with tree-structured bones.
Our framework offers several clear advantages: (1) users can obtain animatable 3D models of arbitrary objects at improved quality from their casual videos, (2) users can manipulate 3D models in an intuitive manner with minimal cost, and (3) users can interactively add or delete control points as necessary.
arXiv Detail & Related papers (2024-08-01T07:42:45Z) - REACTO: Reconstructing Articulated Objects from a Single Video [64.89760223391573]
We propose a novel deformation model that enhances the rigidity of each part while maintaining flexible deformation of the joints.
Our method outperforms previous works in producing higher-fidelity 3D reconstructions of general articulated objects.
arXiv Detail & Related papers (2024-04-17T08:01:55Z) - HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and
Objects from Video [70.11702620562889]
HOLD is the first category-agnostic method that reconstructs an articulated hand and object jointly from a monocular interaction video.
We develop a compositional articulated implicit model that can disentangle the 3D hand and object from 2D images.
Our method does not rely on 3D hand-object annotations while outperforming fully-supervised baselines in both in-the-lab and challenging in-the-wild settings.
arXiv Detail & Related papers (2023-11-30T10:50:35Z) - Reconstructing Animatable Categories from Videos [65.14948977749269]
Building animatable 3D models is challenging due to the need for 3D scans, laborious registration, and manual rigging.
We present RAC, which builds category-level 3D models from monocular videos while disentangling variations over instances and motion over time.
We show that 3D models of humans, cats, and dogs can be learned from 50-100 internet videos.
arXiv Detail & Related papers (2023-05-10T17:56:21Z) - Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of
Articulated Objects [73.23249640099516]
We learn both the appearance and the structure of previously unseen articulated objects by observing them move from multiple views.
Our insight is that adjacent parts that move relative to each other must be connected by a joint (see the joint-estimation sketch after this list).
We show that our method works for different structures, from quadrupeds, to single-arm robots, to humans.
arXiv Detail & Related papers (2021-12-21T16:37:48Z) - DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to
the Third Dimension [71.71234436165255]
We contribute DensePose 3D, a method that can learn such reconstructions in a weakly supervised fashion from 2D image annotations only.
Because it does not require 3D scans, DensePose 3D can be used for learning a wide range of articulated categories such as different animal species.
We show significant improvements compared to state-of-the-art non-rigid structure-from-motion baselines on both synthetic and real data on categories of humans and animals.
arXiv Detail & Related papers (2021-08-31T18:33:55Z) - LASR: Learning Articulated Shape Reconstruction from a Monocular Video [97.92849567637819]
We introduce a template-free approach to learn 3D shapes from a single video.
Our method faithfully reconstructs nonrigid 3D structures from videos of humans, animals, and objects of unknown classes.
arXiv Detail & Related papers (2021-05-06T21:41:11Z) - Learning monocular 3D reconstruction of articulated categories from
motion [39.811816510186475]
Video self-supervision enforces consistency between consecutive 3D reconstructions via a motion-based cycle loss (a sketch follows this list).
We introduce an interpretable model of 3D template deformations that controls a 3D surface through the displacement of a small number of local, learnable handles (see the handle-deformation sketch after this list).
We obtain state-of-the-art reconstructions with diverse shapes, viewpoints and textures for multiple articulated object categories.
arXiv Detail & Related papers (2021-03-30T13:50:27Z)
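The joint-estimation sketch referenced above (Watch It Move): if two parts are connected by a joint, the joint is a point that both parts' rigid motions carry to (nearly) the same place in every frame, which reduces to a linear least-squares problem. This is a hedged illustration of that insight under assumed per-part pose inputs, not the paper's actual pipeline:

```python
import numpy as np

def estimate_joint_location(transforms_a, transforms_b):
    """Estimate the joint connecting two rigid parts from their per-frame poses.

    transforms_a, transforms_b: (T, 4, 4) world-from-part rigid transforms.
    A joint is a point x that both parts carry to (nearly) the same place in
    every frame, so we solve min_x sum_t ||T_a[t] @ x - T_b[t] @ x||^2.
    """
    A_rows, b_rows = [], []
    for Ta, Tb in zip(transforms_a, transforms_b):
        # R_a x + t_a = R_b x + t_b  =>  (R_a - R_b) x = t_b - t_a
        A_rows.append(Ta[:3, :3] - Tb[:3, :3])
        b_rows.append(Tb[:3, 3] - Ta[:3, 3])
    A = np.concatenate(A_rows, axis=0)  # (3T, 3)
    b = np.concatenate(b_rows, axis=0)  # (3T,)
    joint, *_ = np.linalg.lstsq(A, b, rcond=None)
    return joint  # (3,) joint position in the parts' shared canonical frame
```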
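The motion-based cycle loss mentioned above can be illustrated as a generic cycle-consistency term between consecutive reconstructions. The forward/backward deformation functions here are assumptions standing in for the paper's learned motion model, not its exact formulation:

```python
import numpy as np

def motion_cycle_loss(shape_t, shape_t1, deform_fwd, deform_bwd):
    """Cycle-consistency between reconstructions at consecutive frames.

    shape_t, shape_t1: (V, 3) vertices reconstructed independently at
                       frames t and t+1
    deform_fwd:        maps frame-t vertices forward to frame t+1
    deform_bwd:        maps frame-(t+1) vertices back to frame t
    """
    # Forward: deforming the frame-t shape should land on the t+1 shape.
    fwd = np.mean(np.sum((deform_fwd(shape_t) - shape_t1) ** 2, axis=1))
    # Cycle: going forward then backward should return to the start.
    cyc = np.mean(np.sum((deform_bwd(deform_fwd(shape_t)) - shape_t) ** 2,
                         axis=1))
    return fwd + cyc
```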
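The handle-deformation sketch referenced above: each vertex moves by a locality-weighted blend of a few handle displacements. The Gaussian falloff used here is an illustrative assumption; the paper learns its own handle influence:

```python
import numpy as np

def handle_deformation(vertices, handles, handle_offsets, sigma=0.1):
    """Deform a surface by displacing a small number of local control handles.

    vertices:       (V, 3) surface points
    handles:        (H, 3) handle positions on or near the surface
    handle_offsets: (H, 3) displacement of each handle
    """
    # (V, H) squared distances from every vertex to every handle
    d2 = np.sum((vertices[:, None, :] - handles[None, :, :]) ** 2, axis=2)
    # Gaussian falloff keeps each handle's influence local (an assumption)
    w = np.exp(-d2 / (2 * sigma ** 2))
    w = w / (w.sum(axis=1, keepdims=True) + 1e-8)  # normalize per vertex
    return vertices + w @ handle_offsets  # (V, 3) deformed surface
```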
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.