Compressible-composable NeRF via Rank-residual Decomposition
- URL: http://arxiv.org/abs/2205.14870v1
- Date: Mon, 30 May 2022 06:18:59 GMT
- Title: Compressible-composable NeRF via Rank-residual Decomposition
- Authors: Jiaxiang Tang, Xiaokang Chen, Jingbo Wang, Gang Zeng
- Abstract summary: Neural Radiance Field (NeRF) has emerged as a compelling method to represent 3D objects and scenes for photo-realistic rendering.
We present an explicit neural field representation that enables efficient and convenient manipulation of models.
Our method achieves rendering quality comparable to state-of-the-art methods, while additionally enabling compression and composition.
- Score: 21.92736190195887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Radiance Field (NeRF) has emerged as a compelling method to represent
3D objects and scenes for photo-realistic rendering. However, its implicit
representation makes the models difficult to manipulate, unlike explicit mesh
representations. Recent advances in NeRF manipulation are typically restricted
by a shared renderer network or suffer from large model sizes. To circumvent
this hurdle, in this paper we present an explicit neural field representation
that enables efficient and convenient manipulation of models. To
achieve this goal, we learn a hybrid tensor rank decomposition of the scene
without neural networks. Motivated by the low-rank approximation property of
the SVD algorithm, we propose a rank-residual learning strategy to encourage
the preservation of primary information in lower ranks. The model size can then
be dynamically adjusted by rank truncation to control the levels of detail,
achieving near-optimal compression without extra optimization. Furthermore,
different models can be arbitrarily transformed and composed into one scene by
concatenating along the rank dimension. The growth of storage cost can also be
mitigated by compressing the unimportant objects in the composed scene. We
demonstrate that our method is able to achieve comparable rendering quality to
state-of-the-art methods, while enabling extra capability of compression and
composition. Code will be made available at
https://github.com/ashawkey/CCNeRF.
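To make the two manipulation mechanics in the abstract concrete, the sketch below illustrates compression by rank truncation and composition by concatenation along the rank dimension on a plain CP-style decomposition in NumPy. This is a minimal illustration under simplified assumptions, not the authors' implementation (the paper uses a hybrid tensor rank decomposition modeling density and appearance); the names `cp_reconstruct`, `truncate`, and `compose` are hypothetical.

```python
# Minimal sketch (not the authors' code) of two ideas from the abstract:
# (1) compressing a rank-decomposed scene tensor by truncating ranks, and
# (2) composing two scenes by concatenating along the rank axis.
# A CP decomposition approximates a 3D grid T as a sum of R rank-1 terms:
#   T[i, j, k] ~= sum_r A[i, r] * B[j, r] * C[k, r]
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild the dense grid from CP factors A (I,R), B (J,R), C (K,R)."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def truncate(factors, keep):
    """Keep only the first `keep` rank components (a coarser level of detail).
    If primary information is concentrated in the lower ranks (which the
    rank-residual training strategy encourages), this truncation gives
    near-optimal compression without any re-optimization."""
    A, B, C = factors
    return A[:, :keep], B[:, :keep], C[:, :keep]

def compose(factors_a, factors_b):
    """Compose two decomposed scenes by concatenating along the rank axis;
    the result is one decomposition whose rank is the sum of the two."""
    return tuple(np.concatenate([fa, fb], axis=1)
                 for fa, fb in zip(factors_a, factors_b))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    I = J = K = 8
    scene1 = tuple(rng.normal(size=(n, 4)) for n in (I, J, K))  # rank-4 scene
    scene2 = tuple(rng.normal(size=(n, 3)) for n in (I, J, K))  # rank-3 scene

    low_detail = truncate(scene1, keep=2)   # smaller model, coarser detail
    combined = compose(scene1, scene2)      # a single rank-7 decomposition
    print(cp_reconstruct(*combined).shape)  # (8, 8, 8)
```

Because composition is just concatenation along the rank axis, the composed scene stays in the same decomposed format and can itself be truncated, which is how the growth of storage cost can be mitigated by compressing unimportant objects in the composed scene.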
Related papers
- Scaled Inverse Graphics: Efficiently Learning Large Sets of 3D Scenes [8.847448988112903]
We introduce a framework termed "scaled inverse graphics", aimed at efficiently learning large sets of scene representations.
It operates in two stages: (i) training a compression model on a subset of scenes, then (ii) training NeRF models on the resulting smaller representations.
In practice, we compact the representation of scenes by learning NeRFs in a latent space to reduce the image resolution, and sharing information across scenes to reduce NeRF representation complexity.
arXiv Detail & Related papers (2024-10-31T08:58:00Z)
- SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-Guided Masked Video Modeling (SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z)
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z)
- Neural NeRF Compression [19.853882143024]
Recent NeRFs utilize feature grids to improve rendering quality and speed.
These representations introduce significant storage overhead.
This paper presents a novel method for efficiently compressing a grid-based NeRF model.
arXiv Detail & Related papers (2024-06-13T09:12:26Z)
- N-BVH: Neural ray queries with bounding volume hierarchies [51.430495562430565]
In 3D computer graphics, the bulk of a scene's memory usage is due to polygons and textures.
We devise N-BVH, a neural compression architecture designed to answer arbitrary ray queries in 3D.
Our method provides faithful approximations of visibility, depth, and appearance attributes.
arXiv Detail & Related papers (2024-05-25T13:54:34Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- Lossy Image Compression with Conditional Diffusion Models [25.158390422252097]
This paper outlines an end-to-end optimized lossy image compression framework using diffusion generative models.
In contrast to VAE-based neural compression, where the (mean) decoder is a deterministic neural network, our decoder is a conditional diffusion model.
Our approach yields stronger reported FID scores than the GAN-based model, while also yielding competitive performance with VAE-based models in several distortion metrics.
arXiv Detail & Related papers (2022-09-14T21:53:27Z)
- PeRFception: Perception using Radiance Fields [72.99583614735545]
We create the first large-scale implicit-representation datasets for perception tasks, called PeRFception.
It shows a significant memory compression rate (96.4%) from the original dataset, while containing both 2D and 3D information in a unified form.
We construct the classification and segmentation models that directly take as input this implicit format and also propose a novel augmentation technique to avoid overfitting on backgrounds of images.
arXiv Detail & Related papers (2022-08-24T13:32:46Z)
- ERF: Explicit Radiance Field Reconstruction From Scratch [12.254150867994163]
We propose a novel explicit dense 3D reconstruction approach that takes a set of images of a scene, together with sensor poses and calibrations, and estimates a photo-realistic digital model.
One of the key innovations is that the underlying volumetric representation is completely explicit.
We show that our method is general and practical: it does not require a highly controlled lab setup for capture, and it can reconstruct scenes containing a wide variety of objects.
arXiv Detail & Related papers (2022-02-28T19:37:12Z)
- InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering [55.70938412352287]
We present an information-theoretic regularization technique for few-shot novel view synthesis based on neural implicit representation.
The proposed approach minimizes potential reconstruction inconsistency that arises from insufficient viewpoints.
We achieve consistently improved performance compared to existing neural view synthesis methods by large margins on multiple standard benchmarks.
arXiv Detail & Related papers (2021-12-31T11:56:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.