Sequence Matters: Harnessing Video Models in 3D Super-Resolution
- URL: http://arxiv.org/abs/2412.11525v3
- Date: Sat, 21 Dec 2024 10:41:08 GMT
- Title: Sequence Matters: Harnessing Video Models in 3D Super-Resolution
- Authors: Hyun-kyu Ko, Dongheok Park, Youngin Park, Byeonghyeon Lee, Juhee Han, Eunbyung Park
- Abstract summary: 3D super-resolution aims to reconstruct high-fidelity 3D models from low-resolution (LR) multi-view images.
We perform a comprehensive study of 3D super-resolution by leveraging video super-resolution (VSR) models.
Our findings reveal that VSR models can perform remarkably well even on sequences that lack precise spatial alignment.
- Score: 3.009577929630171
- Abstract: 3D super-resolution aims to reconstruct high-fidelity 3D models from low-resolution (LR) multi-view images. Early studies primarily relied on single-image super-resolution (SISR) models to upsample LR images into high-resolution ones. However, these methods often lack view consistency because they operate independently on each image. Although various post-processing techniques have been explored to mitigate these inconsistencies, they have yet to fully resolve the issue. In this paper, we perform a comprehensive study of 3D super-resolution by leveraging video super-resolution (VSR) models. By utilizing VSR models, we ensure a higher degree of spatial consistency and can reference surrounding spatial information, leading to more accurate and detailed reconstructions. Our findings reveal that VSR models can perform remarkably well even on sequences that lack precise spatial alignment. Given this observation, we propose a simple yet practical approach to align LR images without fine-tuning or generating a 'smooth' trajectory from 3D models trained on the LR images. Experimental results show that these surprisingly simple algorithms achieve state-of-the-art results on 3D super-resolution tasks with standard benchmark datasets, such as the NeRF-synthetic and MipNeRF-360 datasets. Project page: https://ko-lani.github.io/Sequence-Matters
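The core idea the abstract describes (ordering unstructured multi-view LR images into a video-like sequence so a VSR model can exploit neighbouring frames) can be made concrete with a minimal sketch. This is not the authors' code: the greedy nearest-neighbour ordering over camera positions and the `vsr_model` / `load_views` names are assumptions introduced purely for illustration.

```python
# Minimal sketch (assumed, not the paper's implementation): order LR views into a
# pseudo-video sequence by greedy nearest-neighbour on camera positions, then
# upsample the ordered sequence with a generic VSR model.
import numpy as np


def greedy_order(cam_positions: np.ndarray, start: int = 0) -> list[int]:
    """Greedily chain views so neighbouring frames have nearby cameras,
    giving the VSR model a roughly smooth, video-like input."""
    remaining = set(range(len(cam_positions)))
    order = [start]
    remaining.remove(start)
    while remaining:
        last = cam_positions[order[-1]]
        # pick the unvisited camera closest to the last one in the sequence
        nxt = min(remaining, key=lambda i: np.linalg.norm(cam_positions[i] - last))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def upsample_views(images: list[np.ndarray], cam_positions: np.ndarray, vsr_model):
    """Reorder LR views into a sequence, run VSR, and restore the original view order."""
    order = greedy_order(cam_positions)
    sequence = [images[i] for i in order]   # pseudo-video in camera order
    sr_sequence = vsr_model(sequence)       # hypothetical VSR call on the ordered frames
    sr_images = [None] * len(images)
    for rank, idx in enumerate(order):
        sr_images[idx] = sr_sequence[rank]  # map upsampled frames back to view indices
    return sr_images
```

The actual sequencing in the paper may differ (for example, it could also weigh camera orientation or split views into several sub-sequences); the sketch only illustrates why a sensible ordering lets a VSR model treat unordered multi-view captures as a video.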
Related papers
- SuperNeRF-GAN: A Universal 3D-Consistent Super-Resolution Framework for Efficient and Enhanced 3D-Aware Image Synthesis [59.73403876485574]
We propose SuperNeRF-GAN, a universal framework for 3D-consistent super-resolution.
A key highlight of SuperNeRF-GAN is its seamless integration with NeRF-based 3D-aware image synthesis methods.
Experimental results demonstrate the superior efficiency, 3D-consistency, and quality of our approach.
arXiv Detail & Related papers (2025-01-12T10:31:33Z)
- DSplats: 3D Generation by Denoising Splats-Based Multiview Diffusion Models [67.50989119438508]
We introduce DSplats, a novel method that directly denoises multiview images using Gaussian-based Reconstructors to produce realistic 3D assets.
Our experiments demonstrate that DSplats not only produces high-quality, spatially consistent outputs, but also sets a new standard in single-image to 3D reconstruction.
arXiv Detail & Related papers (2024-12-11T07:32:17Z)
- DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF [50.458896463542494]
DiSR-NeRF is a diffusion-guided framework for view-consistent super-resolution (SR) NeRF.
We propose Iterative 3D Synchronization (I3DS) to mitigate the inconsistency problem via the inherent multi-view consistency property of NeRF.
arXiv Detail & Related papers (2024-04-01T03:06:23Z)
- ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance [76.7746870349809]
We present ComboVerse, a 3D generation framework that produces high-quality 3D assets with complex compositions by learning to combine multiple models.
Our proposed framework emphasizes spatial alignment of objects, compared with standard score distillation sampling.
arXiv Detail & Related papers (2024-03-19T03:39:43Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis [92.25145204543904]
StyleNeRF is a 3D-aware generative model for high-resolution image synthesis with high multi-view consistency.
It integrates the neural radiance field (NeRF) into a style-based generator.
It can synthesize high-resolution images at interactive rates while preserving 3D consistency at high quality.
arXiv Detail & Related papers (2021-10-18T02:37:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.