RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis
- URL: http://arxiv.org/abs/2205.07058v1
- Date: Sat, 14 May 2022 13:15:32 GMT
- Title: RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis
- Authors: Jonathan Tremblay, Moustafa Meshry, Alex Evans, Jan Kautz, Alexander
Keller, Sameh Khamis, Charles Loop, Nathan Morrical, Koki Nagano, Towaki
Takikawa, Stan Birchfield
- Abstract summary: We present a large-scale synthetic dataset for novel view synthesis consisting of 300k images rendered from nearly 2000 complex scenes.
The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis.
Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures.
- Score: 104.53930611219654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a large-scale synthetic dataset for novel view synthesis
consisting of ~300k images rendered from nearly 2000 complex scenes using
high-quality ray tracing at high resolution (1600 x 1600 pixels). The dataset
is orders of magnitude larger than existing synthetic datasets for novel view
synthesis, thus providing a large unified benchmark for both training and
evaluation. Using 4 distinct sources of high-quality 3D meshes, the scenes of
our dataset exhibit challenging variations in camera views, lighting, shape,
materials, and textures. Because our dataset is too large for existing methods
to process, we propose Sparse Voxel Light Field (SVLF), an efficient
voxel-based light field approach for novel view synthesis that achieves
comparable performance to NeRF on synthetic data, while being an order of
magnitude faster to train and two orders of magnitude faster to render. SVLF
achieves this speed by relying on a sparse voxel octree, careful voxel sampling
(requiring only a handful of queries per ray), a reduced network structure, and
ground truth depth maps at training time. Our dataset is generated
by NViSII, a Python-based ray tracing renderer, which is designed to be simple
for non-experts to use and share, flexible and powerful through its use of
scripting, and able to create high-quality and physically-based rendered
images. Experiments with a subset of our dataset allow us to compare standard
methods like NeRF and mip-NeRF for single-scene modeling, and pixelNeRF for
category-level modeling, pointing toward the need for future improvements in
this area.
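The SVLF description above is concrete enough to sketch. Below is a minimal illustration of voxel-guided ray sampling, assuming a hypothetical `octree.intersect(origin, direction)` traversal and a `radiance_net(points, direction)` callable; the names and compositing details are illustrative assumptions, not the authors' SVLF implementation.

```python
import numpy as np

def render_ray(octree, radiance_net, origin, direction, n_samples=8):
    """Sketch of voxel-guided ray rendering: query the network only inside
    occupied voxels reported by a sparse voxel octree traversal."""
    # 1. Traverse the octree to find the occupied voxels the ray crosses.
    #    Each hit is a (t_enter, t_exit) interval along the ray (assumed API).
    hits = octree.intersect(origin, direction)
    if not hits:
        return np.zeros(3)  # ray misses all geometry -> background

    # 2. Place only a handful of samples inside the occupied span, instead of
    #    hundreds of samples along the whole ray as in vanilla NeRF. (At
    #    training time the paper additionally uses ground truth depth maps.)
    t_near, t_far = hits[0][0], hits[-1][1]
    ts = np.linspace(t_near, t_far, n_samples)
    points = origin[None, :] + ts[:, None] * direction[None, :]

    # 3. A small network predicts color and opacity at those few samples;
    #    standard front-to-back alpha compositing accumulates the result.
    rgb, alpha = radiance_net(points, direction)  # (n_samples, 3), (n_samples,)
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))
    weights = alpha * transmittance
    return (weights[:, None] * rgb).sum(axis=0)
```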
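Because the dataset is generated by scripting NViSII, a short rendering script conveys the intended workflow. The calls below follow the publicly released `nvisii` Python package; this is a minimal sketch, not the RTMV generation code, and argument names may differ slightly between package versions.

```python
import nvisii

# Start the ray tracer without an interactive window and enable the denoiser
# so a moderate sample count still yields clean images.
nvisii.initialize(headless=True)
nvisii.enable_denoiser()

# In nvisii, renderable things are entities that bundle a transform with a
# camera or a mesh/material pair.
camera = nvisii.entity.create(
    name="camera",
    transform=nvisii.transform.create("camera_tf"),
    camera=nvisii.camera.create(name="camera_cam", aspect=1.0),
)
camera.get_transform().look_at(at=(0, 0, 0), up=(0, 0, 1), eye=(2, 2, 1))
nvisii.set_camera_entity(camera)

# A single placeholder object; the RTMV scenes instead place meshes drawn
# from four sources of high-quality 3D assets under varied lighting.
nvisii.entity.create(
    name="sphere",
    mesh=nvisii.mesh.create_sphere("sphere_mesh"),
    transform=nvisii.transform.create("sphere_tf"),
    material=nvisii.material.create("sphere_mat"),
)

# Ray-traced render at the dataset's 1600 x 1600 resolution:
# width, height, samples per pixel, output path.
nvisii.render_to_file(1600, 1600, 256, "view_0000.png")
nvisii.deinitialize()
```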
Related papers
- HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces [71.1071688018433]
Neural radiance fields provide state-of-the-art view synthesis quality but tend to be slow to render.
We propose a method, HybridNeRF, that leverages the strengths of both surface and volumetric representations by rendering most objects as surfaces.
We improve error rates by 15-30% while achieving real-time framerates (at least 36 FPS) for virtual-reality resolutions (2Kx2K).
arXiv Detail & Related papers (2023-12-05T22:04:49Z)
- Learning Neural Duplex Radiance Fields for Real-Time View Synthesis [33.54507228895688]
We propose a novel approach to distill and bake NeRFs into highly efficient mesh-based neural representations.
We demonstrate the effectiveness and superiority of our approach via extensive experiments on a range of standard datasets.
arXiv Detail & Related papers (2023-04-20T17:59:52Z)
- SPARF: Large-Scale Learning of 3D Sparse Radiance Fields from Few Input Images [62.64942825962934]
We present SPARF, a large-scale ShapeNet-based synthetic dataset for novel view synthesis.
We propose a novel pipeline (SuRFNet) that learns to generate sparse voxel radiance fields from only a few views.
SuRFNet employs partial SRFs from few/one images and a specialized SRF loss to learn to generate high-quality sparse voxel radiance fields.
arXiv Detail & Related papers (2022-12-18T14:56:22Z)
- Fast Non-Rigid Radiance Fields from Monocularized Data [66.74229489512683]
This paper proposes a new method for full 360° inward-facing novel view synthesis of non-rigidly deforming scenes.
At the core of our method are 1) An efficient deformation module that decouples the processing of spatial and temporal information for accelerated training and inference; and 2) A static module representing the canonical scene as a fast hash-encoded neural radiance field.
In both cases, our method is significantly faster than previous methods, converging in less than 7 minutes and achieving real-time framerates at 1K resolution, while obtaining a higher visual accuracy for generated novel views.
arXiv Detail & Related papers (2022-12-02T18:51:10Z)
- Progressively-connected Light Field Network for Efficient View Synthesis [69.29043048775802]
We present a Progressively-connected Light Field network (ProLiF) for the novel view synthesis of complex forward-facing scenes.
ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.
arXiv Detail & Related papers (2022-07-10T13:47:20Z)
- Learning Neural Light Fields with Ray-Space Embedding Networks [51.88457861982689]
We propose a novel neural light field representation that is compact and directly predicts integrated radiance along rays.
Our method achieves state-of-the-art quality on dense forward-facing datasets such as the Stanford Light Field dataset.
arXiv Detail & Related papers (2021-12-02T18:59:51Z)
- The Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentation [1.5293427903448025]
We develop an approach to rapidly and cheaply generate large and diverse virtual environments from which we can capture synthetic overhead imagery for training segmentation CNNs.
We use several benchmark datasets to demonstrate that Synthinel-1 is consistently beneficial when used to augment real-world training imagery.
arXiv Detail & Related papers (2020-01-15T04:30:45Z)