Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures
- URL: http://arxiv.org/abs/2412.13183v1
- Date: Tue, 17 Dec 2024 18:57:38 GMT
- Title: Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures
- Authors: Guoxing Sun, Rishabh Dabral, Heming Zhu, Pascal Fua, Christian Theobalt, Marc Habermann
- Abstract summary: Real-time free-view human rendering from sparse-view RGB inputs is a challenging task due to sensor scarcity and the tight time budget. Recent methods leverage 2D CNNs operating in texture space to learn rendering primitives. We present Double Unprojected Textures, which at its core disentangles coarse geometric deformation estimation from appearance synthesis.
- Score: 87.80984588545589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time free-view human rendering from sparse-view RGB inputs is a challenging task due to sensor scarcity and the tight time budget. To ensure efficiency, recent methods leverage 2D CNNs operating in texture space to learn rendering primitives. However, they either jointly learn geometry and appearance, or completely ignore sparse image information for geometry estimation, significantly harming visual quality and robustness to unseen body poses. To address these issues, we present Double Unprojected Textures, which at its core disentangles coarse geometric deformation estimation from appearance synthesis, enabling robust and photorealistic 4K rendering in real-time. Specifically, we first introduce a novel image-conditioned template deformation network, which estimates the coarse deformation of the human template from a first unprojected texture. This updated geometry is then used to apply a second and more accurate texture unprojection. The resulting texture map has fewer artifacts and better alignment with input views, which benefits our learning of finer-level geometry and appearance represented by Gaussian splats. We validate the effectiveness and efficiency of the proposed method in quantitative and qualitative experiments, showing that it significantly surpasses other state-of-the-art methods.
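To make the unprojection step concrete, below is a minimal PyTorch sketch of a single texture-unprojection pass, the operation the abstract describes applying twice (first on the template to drive deformation estimation, then on the deformed template). The tensor shapes, helper names, and the cosine view weighting are illustrative assumptions, not the authors' implementation; in particular, a z-buffer visibility test would be needed on top of the weighting to handle self-occlusion.
```python
import torch
import torch.nn.functional as F

def unproject_texture(images, K, RT, pos_map, normal_map):
    """Back-project sparse-view colors onto a UV texture map.

    images:     (V, 3, H, W) input RGB views
    K:          (V, 3, 3)    camera intrinsics
    RT:         (V, 3, 4)    world-to-camera extrinsics (OpenCV convention)
    pos_map:    (T, T, 3)    3D surface point for each texel of the template's UV layout
    normal_map: (T, T, 3)    world-space surface normal per texel
    returns:    (3, T, T) blended texture and (1, T, T) coverage mask
    """
    V, _, H, W = images.shape
    T = pos_map.shape[0]
    pts = pos_map.reshape(-1, 3)                                   # (N, 3)
    pts_h = torch.cat([pts, torch.ones_like(pts[:, :1])], dim=-1)  # (N, 4)

    # Project every texel's surface point into every view.
    cam = torch.einsum('vij,nj->vni', RT, pts_h)                   # (V, N, 3)
    pix = torch.einsum('vij,vnj->vni', K, cam)
    uv = pix[..., :2] / pix[..., 2:].clamp(min=1e-6)               # pixel coords

    # Bilinearly sample colors (grid_sample expects coords in [-1, 1]).
    grid = torch.stack([uv[..., 0] / (W - 1) * 2 - 1,
                        uv[..., 1] / (H - 1) * 2 - 1], dim=-1)     # (V, N, 2)
    colors = F.grid_sample(images, grid.unsqueeze(2),
                           align_corners=True).squeeze(-1)         # (V, 3, N)

    # Weight each view by how frontally it sees the surface; a proper
    # z-buffer visibility test would be layered on top of this.
    n_cam = torch.einsum('vij,nj->vni', RT[:, :, :3],
                         normal_map.reshape(-1, 3))                # per-camera normals
    view_dir = -F.normalize(cam, dim=-1)                           # point -> camera
    w = torch.einsum('vni,vni->vn', view_dir, n_cam).clamp(min=0)
    w = (w * (cam[..., 2] > 0)).unsqueeze(1)                       # (V, 1, N)

    tex = (colors * w).sum(0) / w.sum(0).clamp(min=1e-6)           # (3, N)
    mask = (w.sum(0) > 0).float()
    return tex.reshape(3, T, T), mask.reshape(1, T, T)
```
Under this sketch, the first pass would use texel positions from the posed template and feed the resulting texture to the deformation network; the second pass would repeat the sampling with the deformed surface points, yielding the better-aligned texture from which the Gaussian-splat geometry and appearance are learned.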
Related papers
- SMPL-GPTexture: Dual-View 3D Human Texture Estimation using Text-to-Image Generation Models [7.436391283592317]
SMPL-GPTexture is a novel pipeline that takes natural language prompts as input and leverages a state-of-the-art text-to-image generation model to synthesize human textures.
We show that our pipeline can generate high-resolution textures aligned with the user's prompts.
arXiv Detail & Related papers (2025-04-17T23:28:38Z) - Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds [11.238020531599405]
We present Make-A-Texture, a new framework that efficiently synthesizes high-resolution texture maps from textual prompts for given 3D geometries. A significant feature of our method is its remarkable efficiency, achieving full texture generation within an end-to-end runtime of just 3.07 seconds on a single NVIDIA H100 GPU. Our work significantly improves the applicability and practicality of texture generation models for real-world 3D content creation, including interactive creation and text-guided texture editing.
arXiv Detail & Related papers (2024-12-10T18:58:29Z) - FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis [51.193297565630886]
The challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images.
This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets.
We propose leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization.
arXiv Detail & Related papers (2024-10-13T01:25:05Z) - UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling [71.87807614875497]
We propose UV Gaussians, which models the 3D human body by jointly learning mesh deformations and 2D UV-space Gaussian textures.
We collect and process a new dataset of human motion, which includes multi-view images, scanned models, parametric model registration, and corresponding texture maps. Experimental results demonstrate that our method achieves state-of-the-art novel-view and novel-pose synthesis.
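As an illustration of the representation, here is a minimal PyTorch sketch of decoding a UV-space Gaussian texture into splat parameters anchored on a deformed mesh. The channel layout and names are assumptions for exposition, not UV Gaussians' actual interface; per-Gaussian colors could similarly be read from an RGB texture.
```python
import torch
import torch.nn.functional as F

def texture_to_gaussians(gauss_tex, pos_map, valid_mask):
    """gauss_tex:  (11, T, T) per-texel params, assumed layout:
                   3 offset + 3 log-scale + 4 rotation quaternion + 1 opacity logit
       pos_map:    (3, T, T)  posed-mesh surface position per texel
       valid_mask: (T, T)     bool, texels covered by the UV layout
    """
    C = gauss_tex.shape[0]
    flat = valid_mask.reshape(-1)
    params = gauss_tex.reshape(C, -1)[:, flat]        # (11, N)
    anchors = pos_map.reshape(3, -1)[:, flat]         # (3, N)

    means = anchors + params[0:3]                     # learned offset from the mesh surface
    scales = params[3:6].exp()                        # strictly positive scales
    quats = F.normalize(params[6:10], dim=0)          # unit quaternions
    opacity = params[10:11].sigmoid()
    return means.T, scales.T, quats.T, opacity.T      # one Gaussian per row
```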
arXiv Detail & Related papers (2024-03-18T09:03:56Z) - Shape-biased Texture Agnostic Representations for Improved Textureless and Metallic Object Detection and 6D Pose Estimation [9.227450931458907]
Textureless and metallic objects still pose a significant challenge due to the scarcity of visual cues and the texture bias of CNNs.
We propose a strategy for inducing a shape bias during CNN training.
This methodology allows seamless data rendering and results in training data without consistent textural surfaces.
arXiv Detail & Related papers (2024-02-07T14:18:19Z) - Neural Texture Puppeteer: A Framework for Neural Geometry and Texture Rendering of Articulated Shapes, Enabling Re-Identification at Interactive Speed [2.8544822698499255]
We present a neural rendering pipeline for textured articulated shapes that we call Neural Texture Puppeteer.
A texture auto-encoder makes use of the geometric information to encode textured images into a global latent code.
Our method can be applied to endangered species where data is limited.
arXiv Detail & Related papers (2023-11-28T10:51:05Z) - Mesh2Tex: Generating Mesh Textures from Image Queries [45.32242590651395]
In particular, textures generated for meshes of real objects should match the real image observations.
We present Mesh2Tex, which learns a realistic object texture manifold from uncorrelated collections of 3D object geometry and photorealistic RGB images.
arXiv Detail & Related papers (2023-04-12T13:58:25Z) - HQ3DAvatar: High Quality Controllable 3D Head Avatar [65.70885416855782]
This paper presents a novel approach to building highly photorealistic digital head avatars.
Our method learns a canonical space via an implicit function parameterized by a neural network.
At test time, our method is driven by a monocular RGB video.
arXiv Detail & Related papers (2023-03-25T13:56:33Z) - Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement [78.48648360358193]
We present a novel framework that generates textured surface meshes from images.
Our approach begins by efficiently initializing the geometry and view-dependent appearance with a NeRF.
We jointly refine the appearance with geometry and bake it into texture images for real-time rendering.
arXiv Detail & Related papers (2023-03-03T17:14:44Z) - TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic texture of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z) - FastHuman: Reconstructing High-Quality Clothed Human in Minutes [18.643091757385626]
We propose an approach for optimizing high-quality clothed human body shapes in minutes.
Our method uses a mesh-based patch warping technique to ensure multi-view photometric consistency.
Our approach has demonstrated promising results on both synthetic and real-world datasets.
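To illustrate what patch-level multi-view photometric consistency means, the simplified PyTorch sketch below projects a small tangent-plane patch around a surface point into two calibrated views and compares the sampled colors. The names, shapes, and patch parameterization are assumptions for exposition; FastHuman's actual mesh-based patch warping may differ.
```python
import torch
import torch.nn.functional as F

def project(points, K, RT):
    """points: (N, 3) world coords -> (N, 2) pixel coords."""
    cam = points @ RT[:, :3].T + RT[:, 3]
    pix = cam @ K.T
    return pix[:, :2] / pix[:, 2:].clamp(min=1e-6)

def sample(image, uv):
    """image: (3, H, W), uv: (N, 2) pixel coords -> (N, 3) bilinear colors."""
    _, H, W = image.shape
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1)
    out = F.grid_sample(image.unsqueeze(0), grid.view(1, -1, 1, 2),
                        align_corners=True)
    return out.squeeze(0).squeeze(-1).T

def patch_photometric_loss(x, tangent_u, tangent_v, img_a, img_b,
                           K_a, RT_a, K_b, RT_b, half=2, step=0.005):
    """Photometric difference of a (2*half+1)^2 tangent-plane patch at x."""
    r = torch.arange(-half, half + 1, dtype=x.dtype) * step
    du, dv = torch.meshgrid(r, r, indexing='ij')
    # Small 3D patch on the local tangent plane of the surface point.
    patch = x + du.reshape(-1, 1) * tangent_u + dv.reshape(-1, 1) * tangent_v
    col_a = sample(img_a, project(patch, K_a, RT_a))
    col_b = sample(img_b, project(patch, K_b, RT_b))
    return (col_a - col_b).abs().mean()
```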
arXiv Detail & Related papers (2022-11-26T05:16:39Z) - MvDeCor: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation [91.6658845016214]
We propose to utilize self-supervised techniques in the 2D domain for fine-grained 3D shape segmentation tasks.
We render a 3D shape from multiple views, and set up a dense correspondence learning task within the contrastive learning framework.
As a result, the learned 2D representations are view-invariant and geometrically consistent.
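As a sketch of the underlying idea, dense correspondence learning within a contrastive framework can be written as an InfoNCE loss over matching pixels of two renders. The PyTorch snippet below is an illustrative stand-in, not the paper's exact formulation; the pixel correspondences are presumed to come from the known rendering geometry.
```python
import torch
import torch.nn.functional as F

def dense_nce_loss(feat_a, feat_b, idx_a, idx_b, tau=0.07):
    """feat_a, feat_b: (C, H, W) per-pixel embeddings of two views of one shape
       idx_a, idx_b:   (N,) flat pixel indices of N corresponding pixels
    """
    C = feat_a.shape[0]
    za = F.normalize(feat_a.reshape(C, -1)[:, idx_a], dim=0)  # (C, N)
    zb = F.normalize(feat_b.reshape(C, -1)[:, idx_b], dim=0)
    logits = za.T @ zb / tau                                  # (N, N) similarities
    target = torch.arange(logits.shape[0], device=logits.device)
    # Matching pixels are positives; all other sampled pixels act as negatives.
    return F.cross_entropy(logits, target)
```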
arXiv Detail & Related papers (2022-08-18T00:48:15Z) - Fine Detailed Texture Learning for 3D Meshes with Generative Models [33.42114674602613]
This paper presents a method to reconstruct high-quality textured 3D models from both multi-view and single-view images.
In the first stage, we focus on learning accurate geometry, whereas in the second stage, we focus on learning the texture with a generative adversarial network.
We demonstrate that our method achieves superior 3D textured models compared to the previous works.
arXiv Detail & Related papers (2022-03-17T14:50:52Z) - NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras [17.18904717379273]
4D reconstruction and rendering of human activities is critical for immersive VR/AR experience.
Recent advances still fail to recover fine geometry and texture at the level of detail present in the input images from sparse multi-view RGB cameras.
We propose a real-time neural human performance capture and rendering system to generate both high-quality geometry and photo-realistic texture of human activities in arbitrary novel views.
arXiv Detail & Related papers (2021-03-13T12:03:38Z) - OSTeC: One-Shot Texture Completion [86.23018402732748]
We propose an unsupervised approach for one-shot 3D facial texture completion.
The proposed approach rotates an input image in 3D and fills in the unseen regions by reconstructing the rotated image with a 2D face generator.
We frontalize the target image by projecting the completed texture into the generator.
arXiv Detail & Related papers (2020-12-30T23:53:26Z) - Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z) - Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
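A toy PyTorch sketch of the stage-wise idea follows: a first network predicts shape (here, normals plus depth) from the two input shots, and a second network predicts SVBRDF maps conditioned on both the images and the shape estimate. The layer sizes and channel layout are placeholders, not the paper's architecture.
```python
import torch
import torch.nn as nn

class StageWiseEstimator(nn.Module):
    def __init__(self):
        super().__init__()
        # Stage 1: normals (3) + depth (1) from the two stacked input shots.
        self.shape_net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 4, 3, padding=1))
        # Stage 2: diffuse (3) + specular (3) + roughness (1), conditioned
        # on the input images and the stage-1 shape estimate.
        self.svbrdf_net = nn.Sequential(
            nn.Conv2d(6 + 4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 7, 3, padding=1))

    def forward(self, two_shots):            # (B, 6, H, W): two RGB images stacked
        shape = self.shape_net(two_shots)    # (B, 4, H, W)
        svbrdf = self.svbrdf_net(torch.cat([two_shots, shape], dim=1))
        return shape, svbrdf
```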
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.