REACTO: Reconstructing Articulated Objects from a Single Video
- URL: http://arxiv.org/abs/2404.11151v1
- Date: Wed, 17 Apr 2024 08:01:55 GMT
- Title: REACTO: Reconstructing Articulated Objects from a Single Video
- Authors: Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu
- Abstract summary: We propose a novel deformation model that enhances the rigidity of each part while maintaining flexible deformation of the joints.
Our method outperforms previous works in producing higher-fidelity 3D reconstructions of general articulated objects.
- Score: 64.89760223391573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the challenge of reconstructing general articulated 3D objects from a single video. Existing works employing dynamic neural radiance fields have advanced the modeling of articulated objects like humans and animals from videos, but face challenges with piece-wise rigid general articulated objects due to limitations in their deformation models. To tackle this, we propose Quasi-Rigid Blend Skinning, a novel deformation model that enhances the rigidity of each part while maintaining flexible deformation of the joints. Our primary insight combines three distinct approaches: 1) an enhanced bone rigging system for improved component modeling, 2) the use of quasi-sparse skinning weights to boost part rigidity and reconstruction fidelity, and 3) the application of geodesic point assignment for precise motion and seamless deformation. Our method outperforms previous works in producing higher-fidelity 3D reconstructions of general articulated objects, as demonstrated on both real and synthetic datasets. Project page: https://chaoyuesong.github.io/REACTO.
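The abstract describes Quasi-Rigid Blend Skinning only at a high level. As a rough illustration of one of its ingredients, quasi-sparse skinning weights, the NumPy sketch below applies standard linear blend skinning after truncating each point's weights to its top-k bones and renormalizing; concentrating each point's influence on a few bones is what pushes each part toward rigid motion. The function names, shapes, and the top-k heuristic are assumptions for illustration, not REACTO's actual formulation.

```python
import numpy as np

def sparsify_weights(weights, k=2):
    """Keep only the top-k bone weights per point, then renormalize.

    A toy stand-in for quasi-sparse skinning weights: concentrating each
    point's influence on a few bones makes the parts move more rigidly.
    (Hypothetical illustration, not REACTO's actual formulation.)
    """
    drop = np.argsort(weights, axis=1)[:, :-k]         # indices of all but the top-k
    sparse = weights.copy()
    np.put_along_axis(sparse, drop, 0.0, axis=1)       # zero out the small weights
    return sparse / sparse.sum(axis=1, keepdims=True)  # rows sum to 1 again

def linear_blend_skinning(points, weights, bone_transforms):
    """Deform N points with B bones via standard linear blend skinning.

    points:          (N, 3) rest-pose positions
    weights:         (N, B) per-point bone weights, rows summing to 1
    bone_transforms: (B, 4, 4) rigid transform of each bone
    """
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)  # (N, 4)
    per_bone = np.einsum('bij,nj->nbi', bone_transforms, homo)          # each bone's result
    return np.einsum('nb,nbi->ni', weights, per_bone)[:, :3]            # weighted blend

# Example: 5 points, 3 bones; identity transforms leave the points unchanged.
pts = np.random.rand(5, 3)
w = sparsify_weights(np.random.dirichlet(np.ones(3), size=5), k=2)
assert np.allclose(linear_blend_skinning(pts, w, np.tile(np.eye(4), (3, 1, 1))), pts)
```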
Related papers
- DressRecon: Freeform 4D Human Reconstruction from Monocular Video [64.61230035671885]
We present a method to reconstruct time-consistent human body models from monocular videos.
We focus on extremely loose clothing or handheld object interactions.
DressRecon yields higher-fidelity 3D reconstructions than prior art.
arXiv Detail & Related papers (2024-09-30T17:59:15Z)
- LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation [32.27869897947267]
We introduce LEIA, a novel approach for representing dynamic 3D objects.
Our method involves observing the object at distinct time steps or "states" and conditioning a hypernetwork on the current state.
By interpolating between these states, we can generate novel articulation configurations in 3D space that were previously unseen.
arXiv Detail & Related papers (2024-09-10T17:59:53Z)
- S3O: A Dual-Phase Approach for Reconstructing Dynamic Shape and Skeleton of Articulated Objects from Single Monocular Video [13.510513575340106]
Reconstructing dynamic articulated objects from a single monocular video is challenging, requiring joint estimation of shape, motion, and camera parameters from limited views.
We propose Synergistic Shape and Skeleton Optimization (S3O), a novel two-phase method that efficiently learns parametric models including visible shapes and underlying skeletons.
Our experimental evaluations on standard benchmarks and the PlanetZoo dataset confirm that S3O produces more accurate 3D reconstructions and plausible skeletons, and reduces training time by approximately 60% compared to the state of the art.
arXiv Detail & Related papers (2024-05-21T09:01:00Z)
- SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes [75.9110646062442]
We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner.
Our method takes multi-view RGB videos and background images from static cameras with known camera parameters as input.
We show experimentally that, unlike prior work that only handles small motion, our method enables the reconstruction of studio-scale motions.
arXiv Detail & Related papers (2023-08-16T09:50:35Z)
- MoDA: Modeling Deformable 3D Objects from Casual Videos [84.29654142118018]
We propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation without skin-collapsing artifacts.
To register 2D pixels across different frames, we establish correspondences via canonical feature embeddings that encode 3D points within the canonical space.
Our approach reconstructs 3D models of humans and animals with better qualitative and quantitative performance than state-of-the-art methods; a minimal sketch of the dual quaternion blending that NeuDBS builds on appears after this list.
arXiv Detail & Related papers (2023-04-17T13:49:04Z)
- CAMM: Building Category-Agnostic and Animatable 3D Models from Monocular Videos [3.356334042188362]
We propose a novel reconstruction method that learns an animatable kinematic chain for any articulated object.
Our approach is on par with state-of-the-art 3D surface reconstruction methods on various articulated object categories.
arXiv Detail & Related papers (2023-04-14T06:07:54Z)
- Class-agnostic Reconstruction of Dynamic Objects from Videos [127.41336060616214]
We introduce REDO, a class-agnostic framework to REconstruct Dynamic Objects from RGBD or calibrated videos.
We develop two novel modules. First, we introduce a canonical 4D implicit function which is pixel-aligned with aggregated temporal visual cues.
Second, we develop a 4D transformation module which captures object dynamics to support temporal propagation and aggregation.
arXiv Detail & Related papers (2021-12-03T18:57:47Z)
- Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image [58.69732754597448]
Given a picture of a chair, could we extract the 3-D shape of the chair, animate its plausible articulations and motions, and render it in-situ in its original image space?
We devise an automated approach to extract and manipulate articulated objects in single images.
arXiv Detail & Related papers (2021-08-05T16:20:12Z)
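The MoDA entry above builds on dual quaternion blend skinning. As background, here is a minimal NumPy sketch of classic dual quaternion blending, which blends bone transforms as unit dual quaternions so that the result remains a rigid motion and avoids the volume loss of naive linear blending. The helper names (qmul, rigid_to_dq, dqb_point) and the per-point formulation are illustrative assumptions, not MoDA's NeuDBS.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def rigid_to_dq(rot_q, t):
    """Rotation quaternion + translation -> unit dual quaternion (real, dual)."""
    t_q = np.array([0.0, *t])
    return rot_q, 0.5 * qmul(t_q, rot_q)   # standard encoding: dual = 0.5 * t * q

def dqb_point(point, weights, dqs):
    """Deform one point by dual quaternion blending of its bone transforms."""
    real, dual = np.zeros(4), np.zeros(4)
    pivot = dqs[0][0]
    for w, (r, d) in zip(weights, dqs):
        s = 1.0 if np.dot(r, pivot) >= 0 else -1.0  # keep quaternions in one hemisphere
        real += w * s * r
        dual += w * s * d
    n = np.linalg.norm(real)
    real, dual = real / n, dual / n                 # renormalize: result is again rigid
    p_q = np.array([0.0, *point])
    conj = real * np.array([1, -1, -1, -1])
    rotated = qmul(qmul(real, p_q), conj)[1:]       # rotate point: q * p * q^-1
    t = 2.0 * qmul(dual, conj)[1:]                  # recover blended translation
    return rotated + t

# Example: two bones with identity rotation, translations (1,0,0) and (0,1,0),
# blended with equal weights; the origin moves to roughly (0.5, 0.5, 0).
ident = np.array([1.0, 0.0, 0.0, 0.0])
dqs = [rigid_to_dq(ident, [1, 0, 0]), rigid_to_dq(ident, [0, 1, 0])]
print(dqb_point(np.zeros(3), [0.5, 0.5], dqs))
```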