Link to the Past: Temporal Propagation for Fast 3D Human Reconstruction from Monocular Video
- URL: http://arxiv.org/abs/2505.07333v1
- Date: Mon, 12 May 2025 08:16:19 GMT
- Title: Link to the Past: Temporal Propagation for Fast 3D Human Reconstruction from Monocular Video
- Authors: Matthew Marchellus, Nadhira Noor, In Kyu Park
- Abstract summary: We present TemPoFast3D, a novel method that leverages temporal coherency of human appearance to reduce redundant computation. Our approach is a "plug-and-play" solution that transforms pixel-aligned reconstruction networks to handle continuous video streams. Extensive experiments demonstrate that TemPoFast3D matches or exceeds state-of-the-art methods across standard metrics.
- Score: 3.065513003860787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fast 3D clothed human reconstruction from monocular video remains a significant challenge in computer vision, particularly in balancing computational efficiency with reconstruction quality. Current approaches either target static image reconstruction and are too computationally intensive, or achieve high quality through per-video optimization that requires minutes to hours of processing, making them unsuitable for real-time applications. To this end, we present TemPoFast3D, a novel method that leverages temporal coherency of human appearance to reduce redundant computation while maintaining reconstruction quality. Our approach is a "plug-and-play" solution that uniquely transforms pixel-aligned reconstruction networks to handle continuous video streams by maintaining and refining a canonical appearance representation through efficient coordinate mapping. Extensive experiments demonstrate that TemPoFast3D matches or exceeds state-of-the-art methods across standard metrics while providing high-quality textured reconstruction across diverse poses and appearances, with a maximum speed of 12 FPS.
Related papers
- Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass [68.78222900840132]
We propose Fast 3D Reconstruction (Fast3R), a novel multi-view generalization to DUSt3R that achieves efficient and scalable 3D reconstruction by processing many views in parallel. Fast3R demonstrates state-of-the-art performance, with significant improvements in inference speed and reduced error accumulation.
arXiv Detail & Related papers (2025-01-23T18:59:55Z)
- VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment [63.21396416244634]
VideoLifter is a novel video-to-3D pipeline that leverages a local-to-global strategy on a fragment basis. It significantly accelerates the reconstruction process, reducing training time by over 82% while holding better visual quality than current SOTA methods.
arXiv Detail & Related papers (2025-01-03T18:52:36Z)
- RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks [93.18404922542702]
We present a novel video generative model designed to address long-term spatial and temporal dependencies.
Our approach incorporates a hybrid explicit-implicit tri-plane representation inspired by 3D-aware generative frameworks.
Our model synthesizes high-fidelity video clips at a resolution of $256\times256$ pixels, with durations extending to more than $5$ seconds at a frame rate of 30 fps.
arXiv Detail & Related papers (2024-01-11T16:48:44Z)
- 3D Gaussian Splatting for Real-Time Radiance Field Rendering [4.320393382724066]
We introduce three key elements that allow us to achieve state-of-the-art visual quality while maintaining competitive training times.
We demonstrate state-of-the-art visual quality and real-time rendering on several established datasets.
arXiv Detail & Related papers (2023-08-08T06:37:06Z)
- Real-time volumetric rendering of dynamic humans [83.08068677139822]
We present a method for fast 3D reconstruction and real-time rendering of dynamic humans from monocular videos.
Our method can reconstruct a dynamic human in less than 3h using a single GPU, compared to recent state-of-the-art alternatives that take up to 72h.
A novel local ray marching rendering allows visualizing the neural human on a mobile VR device at 40 frames per second with minimal loss of visual quality.
arXiv Detail & Related papers (2023-03-21T14:41:25Z)
- Neural 3D Reconstruction in the Wild [86.6264706256377]
We introduce a new method that enables efficient and accurate surface reconstruction from Internet photo collections.
We present a new benchmark and protocol for evaluating reconstruction performance on such in-the-wild scenes.
arXiv Detail & Related papers (2022-05-25T17:59:53Z)
- Temporal Consistency Loss for High Resolution Textured and Clothed 3DHuman Reconstruction from Monocular Video [35.42021156572568]
We present a novel method to learn temporally consistent 3D reconstruction of clothed people from a monocular video.
The proposed advances improve the temporal consistency and accuracy of both the 3D reconstruction and texture prediction from a monocular video.
arXiv Detail & Related papers (2021-04-19T13:04:29Z)
- A Real-time Action Representation with Temporal Encoding and Deep Compression [115.3739774920845]
We propose a new real-time convolutional architecture, called Temporal Convolutional 3D Network (T-C3D), for action representation.
T-C3D learns video action representations in a hierarchical multi-granularity manner while obtaining a high process speed.
Our method achieves clear improvements on the UCF101 action recognition benchmark over state-of-the-art real-time methods: 5.4% higher accuracy and 2 times faster inference, with a model that requires less than 5MB of storage.
arXiv Detail & Related papers (2020-06-17T06:30:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.