A Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View
Synthesis and Implicit Scene Reconstruction
- URL: http://arxiv.org/abs/2301.06782v1
- Date: Tue, 17 Jan 2023 10:15:32 GMT
- Title: A Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View
Synthesis and Implicit Scene Reconstruction
- Authors: Chongshan Lu, Fukun Yin, Xin Chen, Tao Chen, Gang YU, Jiayuan Fan
- Abstract summary: Neural Radiance Fields (NeRF) have achieved impressive results in single-object scene reconstruction and novel view synthesis.
There is no unified outdoor scene dataset for large-scale NeRF evaluation due to expensive data acquisition and calibration costs.
In this paper, we propose OMMO, a large-scale outdoor multi-modal dataset containing complex land objects and scenes with calibrated images, point clouds, and prompt annotations.
- Score: 26.122654478946227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Radiance Fields (NeRF) have achieved impressive results in single-object
scene reconstruction and novel view synthesis, as demonstrated on many single-modality,
single-object-focused indoor datasets such as DTU, BMVS, and NeRF Synthetic. However, the
study of NeRF for large-scale outdoor scene reconstruction remains limited, as there is no
unified outdoor scene dataset for large-scale NeRF evaluation, owing to expensive data
acquisition and calibration costs. In this paper, we propose OMMO, a large-scale outdoor
multi-modal dataset containing complex land objects and scenes with calibrated images,
point clouds, and prompt annotations. We also establish a new benchmark for several outdoor
NeRF-based tasks, such as novel view synthesis, surface reconstruction, and multi-modal
NeRF. To create the dataset, we capture and collect a large number of real fly-view videos
and select high-quality, high-resolution clips from them. We then design a quality-review
module that refines images and removes low-quality frames and scenes that fail to
calibrate, combining learning-based automatic evaluation with manual review. Finally,
volunteers add text descriptions for each scene and key frame to support future multi-modal
requirements. Compared with existing NeRF datasets, our dataset contains abundant
real-world urban and natural scenes with various scales, camera trajectories, and lighting
conditions. Experiments show that our dataset can benchmark most state-of-the-art NeRF
methods on different tasks. We will release the dataset and model weights soon.
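As a rough illustration of the quality-review stage described above (this is not the authors' released pipeline; the sharpness metric, threshold, and function names are our own assumptions), low-quality frames could be pre-filtered with a simple no-reference score before manual review:

```python
# Minimal sketch of a frame pre-filter standing in for OMMO's quality-review
# module: score each frame by sharpness (variance of the Laplacian) and keep
# frames above a threshold. The actual pipeline combines a learning-based
# evaluator with manual review; this is only an illustrative approximation.
import cv2

def sharpness_score(frame) -> float:
    """Variance of the Laplacian; higher values indicate sharper frames."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def select_frames(video_path: str, min_sharpness: float = 100.0, stride: int = 10):
    """Yield (frame_index, frame) pairs that pass the sharpness check."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % stride == 0 and sharpness_score(frame) >= min_sharpness:
            yield idx, frame
        idx += 1
    cap.release()

# Usage: kept = list(select_frames("flyover.mp4"))
```

A learned quality model could replace sharpness_score without changing the surrounding selection loop.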
Related papers
- Crowd-Sourced NeRF: Collecting Data from Production Vehicles for 3D Street View Reconstruction [11.257007940120582]
We present a crowd-sourced framework that utilizes substantial data captured by production vehicles to reconstruct scenes with the NeRF model.
Our comprehensive framework integrates data selection, sparse 3D reconstruction, sequence appearance embedding, depth supervision of the ground surface, and occlusion completion.
We also propose an application, named first-view navigation, which leverages the NeRF model to generate 3D street views and guide the driver with a synthesized video.
arXiv Detail & Related papers (2024-06-24T03:30:20Z)
- LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment [59.320414108383055]
We present LiveHPS, a novel single-LiDAR-based approach for scene-level human pose and shape estimation.
We also propose a large human motion dataset, named FreeMotion, collected in various scenarios with diverse human poses.
arXiv Detail & Related papers (2024-02-27T03:08:44Z)
- ReconFusion: 3D Reconstruction with Diffusion Priors [104.73604630145847]
We present ReconFusion to reconstruct real-world scenes using only a few photos.
Our approach leverages a diffusion prior for novel view synthesis, trained on synthetic and multiview datasets.
Our method synthesizes realistic geometry and texture in underconstrained regions while preserving the appearance of observed regions.
arXiv Detail & Related papers (2023-12-05T18:59:58Z)
- ReLight My NeRF: A Dataset for Novel View Synthesis and Relighting of Real World Objects [14.526827265012045]
We introduce a novel dataset, dubbed ReNe (Relighting NeRF), framing real-world objects under one-light-at-a-time (OLAT) conditions.
We release a total of 20 scenes depicting a variety of objects with complex geometry and challenging materials.
Each scene includes 2,000 images, acquired from 50 different viewpoints under 40 different OLAT conditions.
arXiv Detail & Related papers (2023-04-20T16:43:58Z)
- ViewNeRF: Unsupervised Viewpoint Estimation Using Category-Level Neural Radiance Fields [35.89557494372891]
We introduce ViewNeRF, a Neural Radiance Field-based viewpoint estimation method.
Our method uses an analysis-by-synthesis approach, combining a conditional NeRF with a viewpoint predictor and a scene encoder.
Our model shows competitive results on synthetic and real datasets.
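A minimal PyTorch sketch of the analysis-by-synthesis idea (the module interfaces below are our own placeholders, not ViewNeRF's code):

```python
# Illustrative analysis-by-synthesis step, simplified from the summary above:
# an encoder summarizes the image, a head predicts a viewpoint, and a
# conditional renderer re-synthesizes the image; the photometric error trains
# all three modules jointly, without viewpoint labels.
import torch
import torch.nn as nn

class AnalysisBySynthesis(nn.Module):
    def __init__(self, encoder: nn.Module, viewpoint_head: nn.Module,
                 conditional_nerf: nn.Module):
        super().__init__()
        self.encoder = encoder                    # image -> scene latent
        self.viewpoint_head = viewpoint_head      # latent -> pose parameters
        self.conditional_nerf = conditional_nerf  # (latent, pose) -> image

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        latent = self.encoder(image)
        pose = self.viewpoint_head(latent)
        rendered = self.conditional_nerf(latent, pose)
        # Reconstruction loss drives viewpoint estimation without pose labels.
        return nn.functional.mse_loss(rendered, image)
```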
arXiv Detail & Related papers (2022-12-01T11:16:11Z)
- CLONeR: Camera-Lidar Fusion for Occupancy Grid-aided Neural Representations [77.90883737693325]
This paper proposes CLONeR, which significantly improves upon NeRF by allowing it to model large outdoor driving scenes observed from sparse input sensor views.
This is achieved by decoupling occupancy and color learning within the NeRF framework into separate Multi-Layer Perceptrons (MLPs) trained using LiDAR and camera data, respectively.
In addition, this paper proposes a novel method to build differentiable 3D Occupancy Grid Maps (OGM) alongside the NeRF model, and leverage this occupancy grid for improved sampling of points along a ray for rendering in metric space.
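A minimal sketch of the decoupling idea (widths, depths, and the omitted positional encodings are our own placeholders, not the CLONeR implementation):

```python
# Sketch of CLONeR's core design as summarized above: geometry (occupancy)
# and appearance (color) live in separate MLPs, so LiDAR rays can supervise
# geometry while camera rays supervise color. All hyperparameters are made up.
import torch
import torch.nn as nn

def mlp(in_dim: int, out_dim: int, hidden: int = 128, layers: int = 4) -> nn.Sequential:
    mods, d = [], in_dim
    for _ in range(layers):
        mods += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    mods.append(nn.Linear(d, out_dim))
    return nn.Sequential(*mods)

occupancy_mlp = mlp(3, 1)  # xyz -> occupancy logit, trained on LiDAR rays
color_mlp = mlp(3 + 3, 3)  # xyz + view direction -> RGB, trained on camera rays

xyz = torch.randn(1024, 3)
view_dir = torch.randn(1024, 3)
occupancy = torch.sigmoid(occupancy_mlp(xyz))
rgb = torch.sigmoid(color_mlp(torch.cat([xyz, view_dir], dim=-1)))
```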
arXiv Detail & Related papers (2022-09-02T17:44:50Z)
- RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis [104.53930611219654]
We present a large-scale synthetic dataset for novel view synthesis consisting of 300k images rendered from nearly 2000 complex scenes.
The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis.
Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures.
arXiv Detail & Related papers (2022-05-14T13:15:32Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via neural feature rendering.
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
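The neural-feature renderer and the depth-inpainting module are not reproduced here; the sketch below only shows the standard RGBD back-projection that such a pipeline starts from (intrinsics are assumed given):

```python
# Back-project an RGBD frame into a colored point cloud using pinhole
# intrinsics; this is the conventional first step of the pipeline summarized
# above. Pixels with missing depth are skipped (they are what the paper's
# depth-inpainting module would fill in).
import numpy as np

def rgbd_to_pointcloud(rgb: np.ndarray, depth: np.ndarray,
                       fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """rgb: (H, W, 3); depth: (H, W) in meters. Returns (N, 6) xyz+rgb."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    valid = z > 0  # drop pixels with no depth measurement
    xyz = np.stack([x, y, z], axis=-1)[valid]
    return np.concatenate([xyz, rgb[valid]], axis=-1)
```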
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
- iNeRF: Inverting Neural Radiance Fields for Pose Estimation [68.91325516370013]
We present iNeRF, a framework that performs mesh-free pose estimation by "inverting" a Neural Radiance Field (NeRF).
NeRFs have been shown to be remarkably effective for the task of view synthesis.
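The core loop can be sketched as gradient descent on a photometric loss through a frozen NeRF (the renderer and the 6-DoF pose parameterization below are placeholders; iNeRF additionally samples interest-region pixels rather than rendering full images):

```python
# iNeRF-style pose estimation sketch: keep a trained NeRF frozen and optimize
# the camera pose so the rendering matches the observed image. `render` is an
# assumed differentiable function mapping a 6-vector pose to an image.
import torch

def estimate_pose(render, target: torch.Tensor, init_pose: torch.Tensor,
                  steps: int = 300, lr: float = 1e-2) -> torch.Tensor:
    pose = init_pose.clone().requires_grad_(True)
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(render(pose), target)
        loss.backward()  # gradients flow through the frozen NeRF renderer
        opt.step()
    return pose.detach()
```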
arXiv Detail & Related papers (2020-12-10T18:36:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.