Plenoptic PNG: Real-Time Neural Radiance Fields in 150 KB
- URL: http://arxiv.org/abs/2409.15689v1
- Date: Tue, 24 Sep 2024 03:06:22 GMT
- Title: Plenoptic PNG: Real-Time Neural Radiance Fields in 150 KB
- Authors: Jae Yong Lee, Yuqun Wu, Chuhang Zou, Derek Hoiem, Shenlong Wang
- Abstract summary: This paper aims to encode a 3D scene into an extremely compact representation from 2D images.
It also aims to enable transmission, decoding, and rendering of that representation in real time across various platforms.
- Score: 29.267039546199094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The goal of this paper is to encode a 3D scene into an extremely compact representation from 2D images and to enable its transmittance, decoding and rendering in real-time across various platforms. Despite the progress in NeRFs and Gaussian Splats, their large model size and specialized renderers make it challenging to distribute free-viewpoint 3D content as easily as images. To address this, we have designed a novel 3D representation that encodes the plenoptic function into sinusoidal function indexed dense volumes. This approach facilitates feature sharing across different locations, improving compactness over traditional spatial voxels. The memory footprint of the dense 3D feature grid can be further reduced using spatial decomposition techniques. This design combines the strengths of spatial hashing functions and voxel decomposition, resulting in a model size as small as 150 KB for each 3D scene. Moreover, PPNG features a lightweight rendering pipeline with only 300 lines of code that decodes its representation into standard GL textures and fragment shaders. This enables real-time rendering using the traditional GL pipeline, ensuring universal compatibility and efficiency across various platforms without additional dependencies.
Related papers
- 3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes [87.01284850604495]
We introduce 3D Convex Splatting (3DCS), which leverages 3D smooth convexes as primitives for modeling geometrically meaningful radiance fields from multi-view images.
3DCS achieves superior performance over 3DGS on benchmarks such as Mip-NeRF360, Tanks and Temples, and Deep Blending.
Our results highlight the potential of 3D Convex Splatting to become the new standard for high-quality scene reconstruction.
arXiv Detail & Related papers (2024-11-22T14:31:39Z) - Direct and Explicit 3D Generation from a Single Image [25.207277983430608]
We introduce a novel framework to directly generate explicit surface geometry and texture using multi-view 2D depth and RGB images.
We incorporate epipolar attention into the latent-to-pixel decoder for pixel-level multi-view consistency.
By back-projecting the generated depth pixels into 3D space, we create a structured 3D representation.
arXiv Detail & Related papers (2024-11-17T03:14:50Z) - EVER: Exact Volumetric Ellipsoid Rendering for Real-time View Synthesis [72.53316783628803]
We present Exact Volumetric Ellipsoid Rendering (EVER), a method for real-time differentiable emission-only volume rendering.
Unlike the recent rasterization-based approach of 3D Gaussian Splatting (3DGS), our primitive-based representation allows for exact volume rendering.
We show that our method is more accurate, with fewer blending issues, than 3DGS and follow-up work on view-consistent rendering.
arXiv Detail & Related papers (2024-10-02T17:59:09Z) - Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields [13.729716867839509]
We propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance.
In addition, we propose a compact but effective representation of view-dependent color by employing a grid-based neural field.
Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
arXiv Detail & Related papers (2024-08-07T14:56:34Z) - Lightplane: Highly-Scalable Components for Neural 3D Fields [54.59244949629677]
Lightplane Render and Splatter significantly reduce memory usage in 2D-3D mapping.
These innovations enable the processing of vastly more and higher resolution images with small memory and computational costs.
arXiv Detail & Related papers (2024-04-30T17:59:51Z) - Compress3D: a Compressed Latent Space for 3D Generation from a Single Image [27.53099431097921]
Triplane autoencoder encodes 3D models into a compact triplane latent space to compress both the 3D geometry and texture information.
We introduce a 3D-aware cross-attention mechanism, which utilizes low-resolution latent representations to query features from a high-resolution 3D feature volume.
Our approach enables the generation of high-quality 3D assets in merely 7 seconds on a single A100 GPU.
arXiv Detail & Related papers (2024-03-20T11:51:04Z) - Compact 3D Gaussian Representation for Radiance Field [14.729871192785696]
We propose a learnable mask strategy to reduce the number of 3D Gaussian points without sacrificing performance.
We also propose a compact but effective representation of view-dependent color by employing a grid-based neural field.
Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
arXiv Detail & Related papers (2023-11-22T20:31:16Z) - EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices [53.28220984270622]
We present an implicit textured surface reconstruction method on mobile devices.
Our method can reconstruct high-quality appearance and accurate mesh on both synthetic and real-world datasets.
Our method can be trained in just 1-2 hours using a single GPU and run on mobile devices at over 40 FPS (Frames Per Second).
arXiv Detail & Related papers (2023-11-16T11:30:56Z) - Efficient 3D Articulated Human Generation with Layered Surface Volumes [131.3802971483426]
We introduce layered surface volumes (LSVs) as a new 3D object representation for articulated digital humans.
LSVs represent a human body using multiple textured layers around a conventional template.
They exhibit exceptional efficiency in GAN settings, where a 2D generator learns to synthesize the RGBA textures for the individual layers.
arXiv Detail & Related papers (2023-07-11T17:50:02Z) - High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.