ACORN: Adaptive Coordinate Networks for Neural Scene Representation
- URL: http://arxiv.org/abs/2105.02788v1
- Date: Thu, 6 May 2021 16:21:38 GMT
- Title: ACORN: Adaptive Coordinate Networks for Neural Scene Representation
- Authors: Julien N. P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan,
Marco Monteiro and Gordon Wetzstein
- Abstract summary: Current neural representations fail to accurately represent images at resolutions greater than a megapixel or 3D scenes with more than a few hundred thousand polygons.
We introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference.
We demonstrate the first experiments that fit gigapixel images to nearly 40 dB peak signal-to-noise ratio.
- Score: 40.04760307540698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural representations have emerged as a new paradigm for applications in
rendering, imaging, geometric modeling, and simulation. Compared to traditional
representations such as meshes, point clouds, or volumes, they can be flexibly
incorporated into differentiable learning-based pipelines. While recent
improvements to neural representations now make it possible to represent
signals with fine details at moderate resolutions (e.g., for images and 3D
shapes), adequately representing large-scale or complex scenes has proven a
challenge. Current neural representations fail to accurately represent images
at resolutions greater than a megapixel or 3D scenes with more than a few
hundred thousand polygons. Here, we introduce a new hybrid implicit-explicit
network architecture and training strategy that adaptively allocates resources
during training and inference based on the local complexity of a signal of
interest. Our approach uses a multiscale block-coordinate decomposition,
similar to a quadtree or octree, that is optimized during training. The network
architecture operates in two stages: using the bulk of the network parameters,
a coordinate encoder generates a feature grid in a single forward pass. Then,
hundreds or thousands of samples within each block can be efficiently evaluated
using a lightweight feature decoder. With this hybrid implicit-explicit network
architecture, we demonstrate the first experiments that fit gigapixel images to
nearly 40 dB peak signal-to-noise ratio. Notably, this represents an increase in
scale of over 1000x compared to the resolution of previously demonstrated
image-fitting experiments. Moreover, our approach is able to represent 3D
shapes significantly faster and better than previous techniques; it reduces
training times from days to hours or minutes and memory requirements by over an
order of magnitude.
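The two-stage design described in the abstract can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: the grid resolution, feature width, and MLP shapes are assumptions, and the networks are random rather than trained. It shows the key idea that one encoder forward pass per block produces an explicit feature grid, which a lightweight decoder then queries thousands of times via interpolation.

```python
import numpy as np

# Hypothetical sketch of ACORN's two-stage evaluation (all sizes and names are
# illustrative assumptions, not the paper's implementation). A coordinate
# encoder maps a block's global coordinate to a small feature grid in ONE
# forward pass; a lightweight decoder then evaluates many samples inside the
# block by bilinearly interpolating that grid.

rng = np.random.default_rng(0)

GRID = 8   # feature grid resolution per block (assumed)
FDIM = 16  # feature channels (assumed)
HID = 32   # hidden width (assumed)

# Encoder: toy MLP from a block's (center, scale) to a GRID x GRID x FDIM grid.
W1 = rng.standard_normal((3, HID)) * 0.1
W2 = rng.standard_normal((HID, GRID * GRID * FDIM)) * 0.1

def encode_block(center, scale):
    """One forward pass per block -> explicit feature grid."""
    h = np.tanh(np.array([center[0], center[1], scale]) @ W1)
    return (h @ W2).reshape(GRID, GRID, FDIM)

# Decoder: tiny MLP applied to bilinearly interpolated features.
V1 = rng.standard_normal((FDIM, HID)) * 0.1
V2 = rng.standard_normal((HID, 3)) * 0.1  # e.g. RGB output

def decode(grid, local_xy):
    """Evaluate many samples in a block from its cached feature grid."""
    xy = np.asarray(local_xy) * (GRID - 1)  # local coords in [0,1]^2
    i0 = np.clip(xy.astype(int), 0, GRID - 2)
    f = xy - i0  # bilinear weights
    g = (grid[i0[:, 0],     i0[:, 1]]     * (1 - f[:, :1]) * (1 - f[:, 1:]) +
         grid[i0[:, 0] + 1, i0[:, 1]]     * f[:, :1]       * (1 - f[:, 1:]) +
         grid[i0[:, 0],     i0[:, 1] + 1] * (1 - f[:, :1]) * f[:, 1:] +
         grid[i0[:, 0] + 1, i0[:, 1] + 1] * f[:, :1]       * f[:, 1:])
    return np.tanh(g @ V1) @ V2

# One encoder pass is amortized over thousands of decoder queries per block:
grid = encode_block(center=(0.25, 0.75), scale=0.5)
samples = rng.random((2048, 2))  # 2048 points inside the block
rgb = decode(grid, samples)
print(rgb.shape)  # (2048, 3)
```

Because the expensive encoder runs once per quadtree/octree block while the decoder is tiny, most compute is spent only where the adaptive decomposition places small blocks, i.e., where the signal is locally complex.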
Related papers
- N-BVH: Neural ray queries with bounding volume hierarchies [51.430495562430565]
In 3D computer graphics, the bulk of a scene's memory usage is due to polygons and textures.
We devise N-BVH, a neural compression architecture designed to answer arbitrary ray queries in 3D.
Our method provides faithful approximations of visibility, depth, and appearance attributes.
arXiv Detail & Related papers (2024-05-25T13:54:34Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D reconstruction from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown strong performance in natural language processing.
In this paper, we design a novel attention mechanism whose cost scales linearly with resolution, derived via a Taylor expansion; based on this attention, we build a network called $T$-former for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- Neural Contourlet Network for Monocular 360 Depth Estimation [37.82642960470551]
We provide a new perspective that constructs an interpretable and sparse representation for a 360 image.
We propose a neural contourlet network consisting of a convolutional neural network and a contourlet transform branch.
In the encoder stage, we design a spatial-spectral fusion module to effectively fuse two types of cues.
arXiv Detail & Related papers (2022-08-03T02:25:55Z)
- CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture [2.6912336656165805]
Implicit neural representations with multi-layer perceptrons (MLPs) have recently gained prominence for a wide variety of tasks.
We propose a new split architecture, CoordX, to accelerate inference and training of coordinate-based representations.
We demonstrate a speedup of up to 2.92x compared to the baseline model for image, video, and 3D shape representation and rendering tasks.
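The split-architecture idea behind CoordX can be illustrated with a short NumPy sketch. This is an assumption-laden toy, not the paper's network: the branch widths and the multiplicative fusion rule are invented for illustration. The point it demonstrates is that processing each coordinate axis in its own branch lets a W x H image be covered with W + H branch evaluations instead of W * H joint MLP passes.

```python
import numpy as np

# Illustrative sketch of a split coordinate MLP in the spirit of CoordX
# (layer sizes and the fusion rule are assumptions, not the paper's design).
# Each input dimension is processed by its own branch; features are fused
# only at the end, so a W x H image needs W + H branch evaluations
# instead of W * H full-network passes.

rng = np.random.default_rng(1)
HID = 32  # branch feature width (assumed)

Wx = rng.standard_normal((1, HID)) * 0.5  # branch for x coordinates
Wy = rng.standard_normal((1, HID)) * 0.5  # branch for y coordinates
Wo = rng.standard_normal((HID, 3)) * 0.5  # fusion head -> RGB

def render(xs, ys):
    fx = np.tanh(xs[:, None] @ Wx)           # (W, HID): one pass per column
    fy = np.tanh(ys[:, None] @ Wy)           # (H, HID): one pass per row
    fused = fx[:, None, :] * fy[None, :, :]  # (W, H, HID): outer-product fusion
    return fused @ Wo                        # (W, H, 3)

img = render(np.linspace(0, 1, 64), np.linspace(0, 1, 48))
print(img.shape)  # (64, 48, 3)
```

Here 64 + 48 = 112 branch evaluations cover all 64 * 48 = 3072 pixels; the only per-pixel work is the cheap fusion, which is where the reported speedups over a monolithic coordinate MLP come from.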
arXiv Detail & Related papers (2022-01-28T21:30:42Z)
- Meta-Learning Sparse Implicit Neural Representations [69.15490627853629]
Implicit neural representations are a promising new avenue of representing general signals.
The current approach is difficult to scale to a large number of signals or to large datasets.
We show that meta-learned sparse neural representations achieve a much smaller loss than dense meta-learned models.
arXiv Detail & Related papers (2021-10-27T18:02:53Z)
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [77.6741486264257]
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs.
We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works.
arXiv Detail & Related papers (2021-01-26T18:50:22Z)
- Learned Initializations for Optimizing Coordinate-Based Neural Representations [47.408295381897815]
Coordinate-based neural representations have shown significant promise as an alternative to discrete, array-based representations.
We propose applying standard meta-learning algorithms to learn the initial weight parameters for these fully-connected networks.
We explore these benefits across a variety of tasks, including representing 2D images, reconstructing CT scans, and recovering 3D shapes and scenes from 2D image observations.
arXiv Detail & Related papers (2020-12-03T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.