Novel View Extrapolation with Video Diffusion Priors
- URL: http://arxiv.org/abs/2411.14208v1
- Date: Thu, 21 Nov 2024 15:16:48 GMT
- Title: Novel View Extrapolation with Video Diffusion Priors
- Authors: Kunhao Liu, Ling Shao, Shijian Lu
- Abstract summary: ViewExtrapolator is a novel view synthesis approach that leverages the generative priors of Stable Video Diffusion (SVD) for realistic novel view extrapolation.
ViewExtrapolator can work with different types of 3D rendering such as views rendered from point clouds when only a single view or monocular video is available.
- Score: 98.314893665023
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The field of novel view synthesis has made significant strides thanks to the development of radiance field methods. However, most radiance field techniques are far better at novel view interpolation than novel view extrapolation, where the synthesized novel views lie far beyond the observed training views. We design ViewExtrapolator, a novel view synthesis approach that leverages the generative priors of Stable Video Diffusion (SVD) for realistic novel view extrapolation. By redesigning the SVD denoising process, ViewExtrapolator refines the artifact-prone views rendered by radiance fields, greatly enhancing the clarity and realism of the synthesized novel views. ViewExtrapolator is a generic novel view extrapolator that can work with different types of 3D rendering, such as views rendered from point clouds when only a single view or monocular video is available. Additionally, ViewExtrapolator requires no fine-tuning of SVD, making it both data-efficient and computation-efficient. Extensive experiments demonstrate the superiority of ViewExtrapolator in novel view extrapolation. Project page: https://kunhao-liu.github.io/ViewExtrapolator/.
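The refinement idea described in the abstract resembles guided denoising: the artifact-prone render initializes an intermediate step of the diffusion process instead of pure noise, so the prior repairs artifacts while the render anchors scene content. Below is a minimal sketch of that general pattern, not the authors' implementation; the `denoiser` callable, the linear alpha-bar schedule, and the `strength` parameter are illustrative placeholders.

```python
import numpy as np

def refine_render(frames, denoiser, num_steps=50, strength=0.4, seed=0):
    """Sketch of diffusion-based refinement of rendered frames.

    frames: (T, H, W, C) float array in [0, 1]
    denoiser: callable (noisy, t) -> estimate of the clean frames
    strength: fraction of the noise schedule to traverse (0 = copy input)
    """
    rng = np.random.default_rng(seed)
    # Linear alpha-bar schedule stand-in for a real diffusion schedule.
    alpha_bar = np.linspace(1.0, 0.01, num_steps)
    start = int(strength * (num_steps - 1))

    # Jump to an intermediate noise level using the rendered frames.
    a = alpha_bar[start]
    x = np.sqrt(a) * frames + np.sqrt(1 - a) * rng.standard_normal(frames.shape)

    # Walk the remaining steps back toward a clean sample (DDIM-style updates).
    for t in range(start, 0, -1):
        x0_hat = denoiser(x, t)  # the prior's guess at the clean frames
        eps = (x - np.sqrt(alpha_bar[t]) * x0_hat) / np.sqrt(1 - alpha_bar[t])
        a_prev = alpha_bar[t - 1]
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1 - a_prev) * eps
    return np.clip(x, 0.0, 1.0)
```

With an identity denoiser that always returns the input frames, the routine reproduces them, which makes the data flow easy to check before plugging in a real diffusion model.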
Related papers
- Exploiting Radiance Fields for Grasp Generation on Novel Synthetic Views [7.305342793164903]
We show initial results which indicate that novel view synthesis can provide additional context in generating grasp poses. Our experiments on the Graspnet-1billion dataset show that novel views contributed force-closure grasps. In the future we hope this work can be extended to improve grasp extraction from radiance fields constructed with a single input image.
arXiv Detail & Related papers (2025-05-16T17:23:09Z) - Synthesizing Consistent Novel Views via 3D Epipolar Attention without Re-Training [102.82553402539139]
Large diffusion models demonstrate remarkable zero-shot capabilities in novel view synthesis from a single image.
These models often face challenges in maintaining consistency across novel and reference views.
We propose to use epipolar geometry to locate and retrieve overlapping information from the input view.
This information is then incorporated into the generation of target views, eliminating the need for training or fine-tuning.
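The epipolar constraint behind this retrieval is standard two-view geometry: a pixel x1 in the input view restricts its correspondence in the target view to the line l2 = E x1. The toy example below illustrates that constraint only; the camera rotation, translation, and 3D point are hypothetical values, not anything from the paper's pipeline.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_matrix(R, t):
    """E = [t]_x R relates normalized image coordinates: x2^T E x1 = 0."""
    return skew(t) @ R

def epipolar_line(E, x1):
    """Line l2 = E @ x1 in view 2 on which x1's match must lie."""
    return E @ x1

# Hypothetical second camera: a small yaw plus a sideways translation.
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.0])

# Project one 3D point into both (normalized) views.
X = np.array([0.3, -0.2, 4.0])
x1 = X / X[2]                 # view 1: canonical camera
Xc2 = R @ X + t
x2 = Xc2 / Xc2[2]             # view 2

E = essential_matrix(R, t)
l2 = epipolar_line(E, x1)
residual = float(x2 @ l2)     # zero (up to float error) when x2 is on l2
```

Searching along l2 instead of the whole image is what lets overlapping content from the input view be located and reused without any retraining.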
arXiv Detail & Related papers (2025-02-25T14:04:22Z) - FreeVS: Generative View Synthesis on Free Driving Trajectory [55.49370963413221]
FreeVS is a novel fully generative approach that can synthesize camera views on free new trajectories in real driving scenes.
FreeVS can be applied to any validation sequences without reconstruction process and synthesis views on novel trajectories.
arXiv Detail & Related papers (2024-10-23T17:59:11Z) - ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis [63.169364481672915]
We propose ViewCrafter, a novel method for synthesizing high-fidelity novel views of generic scenes from single or sparse images.
Our method takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by a point-based representation to generate high-quality video frames.
arXiv Detail & Related papers (2024-09-03T16:53:19Z) - Sampling for View Synthesis: From Local Light Field Fusion to Neural Radiance Fields and Beyond [27.339452004523082]
Local light field fusion proposes an algorithm for practical view synthesis from an irregular grid of sampled views.
We achieve the perceptual quality of Nyquist rate view sampling while using up to 4000x fewer views.
We reprise some of the recent results on sparse and even single image view synthesis.
arXiv Detail & Related papers (2024-08-08T16:56:03Z) - SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields [19.740018132105757]
SceneRF is a self-supervised monocular scene reconstruction method using only posed image sequences for training.
At inference, a single input image suffices to hallucinate novel depth views which are fused together to obtain 3D scene reconstruction.
arXiv Detail & Related papers (2022-12-05T18:59:57Z) - Fast Non-Rigid Radiance Fields from Monocularized Data [66.74229489512683]
This paper proposes a new method for full 360° inward-facing novel view synthesis of non-rigidly deforming scenes.
At the core of our method are 1) An efficient deformation module that decouples the processing of spatial and temporal information for accelerated training and inference; and 2) A static module representing the canonical scene as a fast hash-encoded neural radiance field.
In both cases, our method is significantly faster than previous methods, converging in less than 7 minutes and achieving real-time framerates at 1K resolution, while obtaining a higher visual accuracy for generated novel views.
arXiv Detail & Related papers (2022-12-02T18:51:10Z) - Generalizable Patch-Based Neural Rendering [46.41746536545268]
We propose a new paradigm for learning models that can synthesize novel views of unseen scenes.
Our method is capable of predicting the color of a target ray in a novel scene directly, just from a collection of patches sampled from the scene.
We show that our approach outperforms the state-of-the-art on novel view synthesis of unseen scenes even when being trained with considerably less data than prior work.
arXiv Detail & Related papers (2022-07-21T17:57:04Z) - Remote Sensing Novel View Synthesis with Implicit Multiplane Representations [26.33490094119609]
We propose a novel remote sensing view synthesis method by leveraging the recent advances in implicit neural representations.
Considering the overhead and far depth imaging of remote sensing images, we represent the 3D space by combining implicit multiplane images (MPI) representation and deep neural networks.
Images from any novel views can be freely rendered on the basis of the reconstructed model.
arXiv Detail & Related papers (2022-05-18T13:03:55Z) - Ray Priors through Reprojection: Improving Neural Radiance Fields for Novel View Extrapolation [35.47411859184933]
We study the novel view extrapolation setting that (1) the training images can well describe an object, and (2) there is a notable discrepancy between the training and test viewpoints' distributions.
We propose a random ray casting policy that allows training unseen views using seen views.
A ray atlas pre-computed from the observed rays' viewing directions could further enhance the rendering quality for extrapolated views.
arXiv Detail & Related papers (2022-05-12T07:21:17Z) - NeLF: Practical Novel View Synthesis with Neural Light Field [93.41020940730915]
We present a practical and robust deep learning solution for the novel view synthesis of complex scenes.
In our approach, a continuous scene is represented as a light field, i.e., a set of rays, each of which has a corresponding color.
Our method achieves state-of-the-art novel view synthesis results while maintaining an interactive frame rate.
arXiv Detail & Related papers (2021-05-15T01:20:30Z) - NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis [78.5281048849446]
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes.
Our algorithm represents a scene using a fully-connected (non-convolutional) deep network.
Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses.
arXiv Detail & Related papers (2020-03-19T17:57:23Z)
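The differentiable rendering step this abstract relies on is numerical quadrature of the volume rendering integral along each ray: C = Σ_i T_i (1 − exp(−σ_i δ_i)) c_i, with transmittance T_i = exp(−Σ_{j<i} σ_j δ_j). Here is a NumPy sketch of that compositing step; the densities and colors are toy values rather than outputs of a trained network.

```python
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Alpha-composite samples along one ray (NeRF-style quadrature).

    sigmas: (N,) volume densities at the samples
    colors: (N, 3) RGB at the samples
    deltas: (N,) distances between adjacent samples
    Returns the rendered RGB for the ray.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)  # per-sample opacity
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = trans * alphas                  # compositing weights
    return weights @ colors

# A fully opaque first sample should return exactly its own color,
# since nothing behind it can contribute.
sigmas = np.array([1e9, 1.0])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
deltas = np.array([0.1, 0.1])
rgb = composite_ray(sigmas, colors, deltas)
```

Because every operation here is differentiable in `sigmas` and `colors`, gradients from a photometric loss on posed images can flow back into whatever network predicts them, which is the property the abstract highlights.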
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.