Rendering Humans from Object-Occluded Monocular Videos
- URL: http://arxiv.org/abs/2308.04622v1
- Date: Tue, 8 Aug 2023 23:12:33 GMT
- Title: Rendering Humans from Object-Occluded Monocular Videos
- Authors: Tiange Xiang, Adam Sun, Jiajun Wu, Ehsan Adeli, Li Fei-Fei
- Abstract summary: 3D understanding and rendering of moving humans from monocular videos is a challenging task.
Existing methods cannot handle the partial occlusions that arise when obstacles block the camera view, for two reasons.
We present OccNeRF, a neural rendering method that achieves better rendering of humans in severely occluded scenes.
- Score: 32.67336188239284
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D understanding and rendering of moving humans from monocular videos is a
challenging task. Despite recent progress, the task remains difficult in
real-world scenarios, where obstacles may block the camera view and cause
partial occlusions in the captured videos. Existing methods cannot handle such
defects for two reasons. First, the standard rendering strategy relies on
point-to-point mapping, which can lead to dramatic disparities between the
visible and occluded areas of the body. Second, the naive direct regression
approach does not consider any feasibility criteria (i.e., prior information) for
rendering under occlusions. To address these drawbacks, we present OccNeRF,
a neural rendering method that achieves better rendering of humans in severely
occluded scenes. As direct solutions to the two drawbacks, we propose
surface-based rendering by integrating geometry and visibility priors. We
validate our method on both simulated and real-world occlusions and demonstrate
our method's superiority.
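To make the occlusion problem concrete, below is a minimal, hypothetical Python sketch of how a visibility prior could modulate standard NeRF-style volume rendering along a single ray. The function name, inputs, and the simple multiplicative weighting are illustrative assumptions, not OccNeRF's actual formulation.

```python
import numpy as np

def render_ray_with_visibility(densities, colors, deltas, visibility):
    """Hypothetical occlusion-aware volume rendering for one camera ray.

    densities : (N,) per-sample volume densities (sigma)
    colors    : (N, 3) per-sample RGB colors
    deltas    : (N,) distances between adjacent samples
    visibility: (N,) prior in [0, 1]; ~0 where an object blocks the body,
                ~1 where the body is visible to the camera (assumed input).
    """
    # Standard NeRF alpha compositing: alpha_i = 1 - exp(-sigma_i * delta_i),
    # T_i = prod_{j<i}(1 - alpha_j), w_i = T_i * alpha_i.
    alphas = 1.0 - np.exp(-densities * deltas)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = transmittance * alphas

    # Down-weight samples the visibility prior marks as occluded, so the
    # rendered color is dominated by body regions the camera can actually see.
    weights = weights * visibility
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb
```

Without the visibility term, samples on the occluding object contribute to the pixel color exactly like samples on the body, which is one way to picture the disparities between visible and occluded regions that the abstract describes.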
Related papers
- OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering [55.50438181721271]
The previous method, which uses NeRF-based surface rendering to recover occluded areas, requires more than one day to train and several seconds to render the occluded regions.
We propose OccGaussian, based on 3D Gaussian Splatting, which can be trained within 6 minutes and produces high-quality human renderings at up to 160 FPS from occluded input.
arXiv Detail & Related papers (2024-04-12T13:00:06Z) - Wild2Avatar: Rendering Humans Behind Occlusions [18.869570134874365]
We present Wild2Avatar, a neural rendering approach designed for occluded in-the-wild monocular videos.
arXiv Detail & Related papers (2023-12-31T09:01:34Z) - Decaf: Monocular Deformation Capture for Face and Hand Interactions [77.75726740605748]
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z) - Relightable and Animatable Neural Avatar from Sparse-View Video [66.77811288144156]
This paper tackles the challenge of creating relightable and animatable neural avatars from sparse-view (or even monocular) videos of dynamic humans under unknown illumination.
arXiv Detail & Related papers (2023-08-15T17:42:39Z) - Differentiable Blocks World: Qualitative 3D Decomposition by Rendering
Primitives [70.32817882783608]
We present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives.
Unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images.
We show that the resulting textured primitives faithfully reconstruct the input images and accurately model the visible 3D points.
arXiv Detail & Related papers (2023-07-11T17:58:31Z) - Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via
Self-supervised Scene Decomposition [40.46674919612935]
We present Vid2Avatar, a method to learn human avatars from monocular in-the-wild videos.
Our method does not require any ground-truth supervision or priors extracted from large datasets of clothed human scans.
It solves the tasks of scene decomposition and surface reconstruction directly in 3D by modeling both the human and the background in the scene jointly.
arXiv Detail & Related papers (2023-02-22T18:59:17Z) - PointAvatar: Deformable Point-based Head Avatars from Videos [103.43941945044294]
PointAvatar is a deformable point-based representation that disentangles the source color into intrinsic albedo and normal-dependent shading.
We show that our method is able to generate animatable 3D avatars using monocular videos from multiple sources.
arXiv Detail & Related papers (2022-12-16T10:05:31Z) - MonoNHR: Monocular Neural Human Renderer [51.396845817689915]
We propose Monocular Neural Human Renderer (MonoNHR), a novel approach that renders robust free-viewpoint images of an arbitrary human given only a single image.
First, we propose to disentangle 3D geometry and texture features and to condition the texture inference on the 3D geometry features.
Second, we introduce a Mesh Inpainter module that inpaints the occluded parts exploiting human structural priors such as symmetry.
arXiv Detail & Related papers (2022-10-02T21:01:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all content) and is not responsible for any consequences arising from its use.