Enhancing Neural Rendering Methods with Image Augmentations
- URL: http://arxiv.org/abs/2306.08904v1
- Date: Thu, 15 Jun 2023 07:18:27 GMT
- Title: Enhancing Neural Rendering Methods with Image Augmentations
- Authors: Juan C. Pérez and Sara Rojas and Jesus Zarzar and Bernard Ghanem
- Abstract summary: We study the use of image augmentations in learning neural rendering methods (NRMs) for 3D scenes.
We find that introducing image augmentations during training presents challenges such as geometric and photometric inconsistencies.
Our experiments demonstrate the benefits of incorporating augmentations when learning NRMs, including improved photometric quality and surface reconstruction.
- Score: 59.00067936686825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Faithfully reconstructing 3D geometry and generating novel views of scenes
are critical tasks in 3D computer vision. Despite the widespread use of image
augmentations across computer vision applications, their potential remains
underexplored when learning neural rendering methods (NRMs) for 3D scenes. This
paper presents a comprehensive analysis of the use of image augmentations in
NRMs, where we explore different augmentation strategies. We found that
introducing image augmentations during training presents challenges such as
geometric and photometric inconsistencies for learning NRMs from images.
Specifically, geometric inconsistencies arise from alterations in shapes,
positions, and orientations from the augmentations, disrupting spatial cues
necessary for accurate 3D reconstruction. On the other hand, photometric
inconsistencies arise from changes in pixel intensities introduced by the
augmentations, affecting the ability to capture the underlying 3D structures of
the scene. We alleviate these issues by focusing on color manipulations and
introducing learnable appearance embeddings that allow NRMs to explain away
photometric variations. Our experiments demonstrate the benefits of
incorporating augmentations when learning NRMs, including improved photometric
quality and surface reconstruction, as well as enhanced robustness against data
quality issues, such as reduced training data and image degradations.
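The abstract names two ingredients: augmentations restricted to color manipulations (so pixel positions, and therefore camera geometry, stay consistent) and learnable appearance embeddings that let the model account for the resulting photometric variation. The following is a minimal NumPy sketch of that idea, not the authors' implementation. The affine color change, the `AppearanceEmbeddings` container, the toy linear `color_head`, and all sizes are assumptions made for illustration; in practice the embedding table and the head would be trained jointly with the rest of the NRM.

```python
import numpy as np

rng = np.random.default_rng(0)


def color_augment(image, gain, bias):
    """Per-channel affine color change (assumed form); pixel positions are untouched."""
    return np.clip(image * gain + bias, 0.0, 1.0)


class AppearanceEmbeddings:
    """Hypothetical table holding one vector per augmented copy of the data."""

    def __init__(self, num_variants, dim):
        # Row 0 is reserved for the unaugmented images in this sketch.
        self.table = np.zeros((num_variants, dim))

    def lookup(self, variant_ids):
        return self.table[variant_ids]


def color_head(features, embedding, weights):
    """Toy linear color head: RGB from point features concatenated with the embedding."""
    x = np.concatenate([features, embedding], axis=-1)
    return 1.0 / (1.0 + np.exp(-(x @ weights)))  # sigmoid keeps RGB in [0, 1]


# Two variants of one image: the identity and one color manipulation.
image = rng.random((4, 4, 3))
gains = np.array([[1.0, 1.0, 1.0], [1.2, 0.9, 1.0]])
biases = np.array([[0.0, 0.0, 0.0], [0.05, 0.0, -0.05]])
variants = [color_augment(image, g, b) for g, b in zip(gains, biases)]

# Rays sampled from variant 1 are rendered with embedding 1, so the photometric
# shift can be absorbed by the embedding instead of distorting the geometry.
emb = AppearanceEmbeddings(num_variants=2, dim=8)
features = rng.random((16, 32))            # stand-in for per-ray features
weights = rng.normal(size=(32 + 8, 3)) * 0.1
variant_ids = np.full(16, 1)
rgb = color_head(features, emb.lookup(variant_ids), weights)
```

Because only the color head sees the embedding in this sketch, the part of the model that represents geometry is shared across all augmented copies, which is one plausible reading of how the embeddings "explain away" photometric variations.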
Related papers
- GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement [51.97726804507328]
We propose a novel approach for 3D mesh reconstruction from multi-view images.
Our method takes inspiration from large reconstruction models that use a transformer-based triplane generator and a Neural Radiance Field (NeRF) model trained on multi-view images.
arXiv Detail & Related papers (2024-06-09T05:19:24Z)
- 3D Facial Expressions through Analysis-by-Neural-Synthesis [30.2749903946587]
SMIRK (Spatial Modeling for Image-based Reconstruction of Kinesics) faithfully reconstructs expressive 3D faces from images.
We identify two key limitations in existing methods: shortcomings in their self-supervised training formulation, and a lack of expression diversity in the training images.
Our qualitative, quantitative, and particularly our perceptual evaluations demonstrate that SMIRK achieves new state-of-the-art performance on accurate expression reconstruction.
arXiv Detail & Related papers (2024-04-05T14:00:07Z)
- 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D [16.66666619143761]
Multi-view (MV) 3D reconstruction is a promising solution to fuse generated MV images into consistent 3D objects.
However, the generated images usually suffer from inconsistent lighting, misaligned geometry, and sparse views, leading to poor reconstruction quality.
We present a novel 3D reconstruction framework that leverages intrinsic decomposition guidance, transient-mono prior guidance, and view augmentation to cope with the three issues.
arXiv Detail & Related papers (2024-01-29T02:30:31Z)
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame.
arXiv Detail & Related papers (2023-10-15T05:15:45Z)
- CVRecon: Rethinking 3D Geometric Feature Learning For Neural Reconstruction [12.53249207602695]
We propose an end-to-end 3D neural reconstruction framework CVRecon.
We exploit the rich geometric embedding in the cost volumes to facilitate 3D geometric feature learning.
arXiv Detail & Related papers (2023-04-28T05:30:19Z)
- Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos [47.94545609011594]
We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild.
Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field to achieve fine-grained control and photorealism.
arXiv Detail & Related papers (2023-04-04T01:10:04Z)
- Geometry-aware data augmentation for monocular 3D object detection [18.67567745336633]
This paper focuses on monocular 3D object detection, one of the essential modules in autonomous driving systems.
A key challenge is that the depth recovery problem is ill-posed in monocular data.
We conduct a thorough analysis to reveal how existing methods fail to robustly estimate depth when different geometry shifts occur.
We convert the aforementioned manipulations into four corresponding 3D-aware data augmentation techniques.
arXiv Detail & Related papers (2021-04-12T23:12:48Z)
- PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers [111.55817466296402]
We introduce Perspective Crop Layers (PCLs) - a form of perspective crop of the region of interest based on the camera geometry.
PCLs deterministically remove the location-dependent perspective effects while leaving end-to-end training and the number of parameters of the underlying neural network unchanged.
PCL offers an easy way to improve the accuracy of existing 3D reconstruction networks by making them geometry aware.
arXiv Detail & Related papers (2020-11-27T08:48:43Z)
- Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images [64.53227129573293]
We investigate the problem of learning to generate 3D parametric surface representations for novel object instances, as seen from one or more views.
We design neural networks capable of generating high-quality parametric 3D surfaces which are consistent between views.
Our method is supervised and trained on a public dataset of shapes from common object categories.
arXiv Detail & Related papers (2020-08-18T06:33:40Z)