Related papers: One is All: Bridging the Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation

URL: http://arxiv.org/abs/2211.15977v2
Date: Wed, 30 Nov 2022 01:40:36 GMT
Title: One is All: Bridging the Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation
Authors: Shuangkang Fang, Weixin Xu, Heng Wang, Yi Yang, Yufeng Wang, Shuchang Zhou
Abstract summary: Neural Radiance Fields (NeRF) methods have proved effective as compact, high-quality and versatile representations for 3D scenes. Various neural architectures are vying for the core structure of NeRF, including the plain Multi-Layer Perceptron (MLP), sparse tensors, low-rank tensors, hashtables and their compositions. In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions.
Score: 26.144617488670963
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural Radiance Fields (NeRF) methods have proved effective as compact, high-quality and versatile representations for 3D scenes, and enable downstream tasks such as editing, retrieval, navigation, etc. Various neural architectures are vying for the core structure of NeRF, including the plain Multi-Layer Perceptron (MLP), sparse tensors, low-rank tensors, hashtables and their compositions. Each of these representations has its particular set of trade-offs. For example, the hashtable-based representations admit faster training and rendering but their lack of clear geometric meaning hampers downstream tasks like spatial-relation-aware editing. In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions between different architectures, including MLP, sparse or low-rank tensors, hashtables and their compositions. PVD consequently empowers downstream applications to optimally adapt the neural representations for the task at hand in a post hoc fashion. The conversions are fast, as distillation is progressively performed on different levels of volume representations, from shallower to deeper. We also employ special treatment of density to deal with its specific numerical instability problem. Empirical evidence is presented to validate our method on the NeRF-Synthetic, LLFF and TanksAndTemples datasets. For example, with PVD, an MLP-based NeRF model can be distilled from a hashtable-based Instant-NGP model 10X~20X faster than training the original NeRF from scratch, while achieving a superior level of synthesis quality. Code is available at https://github.com/megvii-research/AAAI2023-PVD.

Related papers

F-INR: Functional Tensor Decomposition for Implicit Neural Representations [7.183424522250937]
Implicit Neural Representation (INR) has emerged as a powerful tool for encoding discrete signals into continuous, differentiable functions using neural networks. We propose F-INR, a framework that reformulates INR learning through functional decomposition, breaking down high-dimensional tasks into lightweight, axis-specific sub-networks.
arXiv Detail & Related papers (2025-03-27T13:51:31Z)
Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections [25.154665328053333]
We introduce Splatfacto-W, an approach that integrates per-Gaussian neural color features and per-image appearance embeddings into a rendering process. Our method improves the Peak Signal-to-Noise Ratio (PSNR) by an average of 5.3 dB compared to 3DGS, enhances training speed by 150 times compared to NeRF-based methods, and achieves a similar rendering speed to 3DGS.
arXiv Detail & Related papers (2024-07-17T04:02:54Z)
N-BVH: Neural ray queries with bounding volume hierarchies [51.430495562430565]
In 3D computer graphics, the bulk of a scene's memory usage is due to polygons and textures. We devise N-BVH, a neural compression architecture designed to answer arbitrary ray queries in 3D. Our method provides faithful approximations of visibility, depth, and appearance attributes.
arXiv Detail & Related papers (2024-05-25T13:54:34Z)
ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge [63.00793292863]
ToddlerDiffusion is a novel approach to decomposing the complex task of RGB image generation into simpler, interpretable stages. Our method, termed ToddlerDiffusion, cascades modality-specific models, each responsible for generating an intermediate representation. ToddlerDiffusion consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-11-24T15:20:01Z)
VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations [25.88881764546414]
VQ-NeRF is an efficient pipeline for enhancing implicit neural representations via vector quantization. We present an innovative multi-scale NeRF sampling scheme that concurrently optimizes the NeRF model at both compressed and original scales. We incorporate a semantic loss function to improve the geometric fidelity and semantic coherence of our 3D reconstructions.
arXiv Detail & Related papers (2023-10-23T01:41:38Z)
Adaptive Multi-NeRF: Exploit Efficient Parallelism in Adaptive Multiple Scale Neural Radiance Field Rendering [3.8200916793910973]
Recent advances in Neural Radiance Fields (NeRF) have demonstrated significant potential for representing 3D scene appearances as implicit neural networks. However, the lengthy training and rendering process hinders the widespread adoption of this promising technique for real-time rendering applications. We present an effective adaptive multi-NeRF method designed to accelerate the neural rendering process for large scenes.
arXiv Detail & Related papers (2023-10-03T08:34:49Z)
Progressive Fourier Neural Representation for Sequential Video Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions. We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session. We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z)
Progressive Volume Distillation with Active Learning for Efficient NeRF Architecture Conversion [27.389511043400635]
Neural Radiance Fields (NeRF) have been widely adopted as practical and versatile representations for 3D scenes. We propose Progressive Volume Distillation with Active Learning (PVD-AL), a systematic distillation method. PVD-AL decomposes each structure into two parts and progressively performs distillation from shallower to deeper volume representation.
arXiv Detail & Related papers (2023-04-08T13:59:18Z)
AligNeRF: High-Fidelity Neural Radiance Fields via Alignment-Aware Training [100.33713282611448]
We conduct the first pilot study on training NeRF with high-resolution data. We propose the corresponding solutions, including marrying the multilayer perceptron with convolutional layers. Our approach is nearly free without introducing obvious training/testing costs.
arXiv Detail & Related papers (2022-11-17T17:22:28Z)
InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering [55.70938412352287]
We present an information-theoretic regularization technique for few-shot novel view synthesis based on neural implicit representation. The proposed approach minimizes potential reconstruction inconsistency that happens due to insufficient viewpoints. We achieve consistently improved performance compared to existing neural view synthesis methods by large margins on multiple standard benchmarks.
arXiv Detail & Related papers (2021-12-31T11:56:01Z)
Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution. We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids. The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.