ACORN: Adaptive Coordinate Networks for Neural Scene Representation
- URL: http://arxiv.org/abs/2105.02788v1
- Date: Thu, 6 May 2021 16:21:38 GMT
- Title: ACORN: Adaptive Coordinate Networks for Neural Scene Representation
- Authors: Julien N. P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan,
Marco Monteiro and Gordon Wetzstein
- Abstract summary: Current neural representations fail to accurately represent images at resolutions greater than a megapixel or 3D scenes with more than a few hundred thousand polygons.
We introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference.
We demonstrate the first experiments that fit gigapixel images to nearly 40 dB peak signal-to-noise ratio.
- Score: 40.04760307540698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural representations have emerged as a new paradigm for applications in
rendering, imaging, geometric modeling, and simulation. Compared to traditional
representations such as meshes, point clouds, or volumes, they can be flexibly
incorporated into differentiable learning-based pipelines. While recent
improvements to neural representations now make it possible to represent
signals with fine details at moderate resolutions (e.g., for images and 3D
shapes), adequately representing large-scale or complex scenes has proven a
challenge. Current neural representations fail to accurately represent images
at resolutions greater than a megapixel or 3D scenes with more than a few
hundred thousand polygons. Here, we introduce a new hybrid implicit-explicit
network architecture and training strategy that adaptively allocates resources
during training and inference based on the local complexity of a signal of
interest. Our approach uses a multiscale block-coordinate decomposition,
similar to a quadtree or octree, that is optimized during training. The network
architecture operates in two stages: using the bulk of the network parameters,
a coordinate encoder generates a feature grid in a single forward pass. Then,
hundreds or thousands of samples within each block can be efficiently evaluated
using a lightweight feature decoder. With this hybrid implicit-explicit network
architecture, we demonstrate the first experiments that fit gigapixel images to
nearly 40 dB peak signal-to-noise ratio. Notably, this represents an increase in
scale of over 1000x compared to the resolution of previously demonstrated
image-fitting experiments. Moreover, our approach is able to represent 3D
shapes significantly faster and better than previous techniques; it reduces
training times from days to hours or minutes and memory requirements by over an
order of magnitude.
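The two-stage design described in the abstract can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: the grid resolution, feature width, and MLP shapes are assumptions, and the networks are random rather than trained. It shows the key idea that one encoder forward pass per block produces an explicit feature grid, which a lightweight decoder then queries thousands of times via interpolation.

```python
import numpy as np

# Hypothetical sketch of ACORN's two-stage evaluation (all sizes and names are
# illustrative assumptions, not the paper's implementation). A coordinate
# encoder maps a block's global coordinate to a small feature grid in ONE
# forward pass; a lightweight decoder then evaluates many samples inside the
# block by bilinearly interpolating that grid.

rng = np.random.default_rng(0)

GRID = 8   # feature grid resolution per block (assumed)
FDIM = 16  # feature channels (assumed)
HID = 32   # hidden width (assumed)

# Encoder: toy MLP from a block's (center, scale) to a GRID x GRID x FDIM grid.
W1 = rng.standard_normal((3, HID)) * 0.1
W2 = rng.standard_normal((HID, GRID * GRID * FDIM)) * 0.1

def encode_block(center, scale):
    """One forward pass per block -> explicit feature grid."""
    h = np.tanh(np.array([center[0], center[1], scale]) @ W1)
    return (h @ W2).reshape(GRID, GRID, FDIM)

# Decoder: tiny MLP applied to bilinearly interpolated features.
V1 = rng.standard_normal((FDIM, HID)) * 0.1
V2 = rng.standard_normal((HID, 3)) * 0.1  # e.g. RGB output

def decode(grid, local_xy):
    """Evaluate many samples in a block from its cached feature grid."""
    xy = np.asarray(local_xy) * (GRID - 1)  # local coords in [0,1]^2
    i0 = np.clip(xy.astype(int), 0, GRID - 2)
    f = xy - i0  # bilinear weights
    g = (grid[i0[:, 0],     i0[:, 1]]     * (1 - f[:, :1]) * (1 - f[:, 1:]) +
         grid[i0[:, 0] + 1, i0[:, 1]]     * f[:, :1]       * (1 - f[:, 1:]) +
         grid[i0[:, 0],     i0[:, 1] + 1] * (1 - f[:, :1]) * f[:, 1:] +
         grid[i0[:, 0] + 1, i0[:, 1] + 1] * f[:, :1]       * f[:, 1:])
    return np.tanh(g @ V1) @ V2

# One encoder pass is amortized over thousands of decoder queries per block:
grid = encode_block(center=(0.25, 0.75), scale=0.5)
samples = rng.random((2048, 2))  # 2048 points inside the block
rgb = decode(grid, samples)
print(rgb.shape)  # (2048, 3)
```

Because the expensive encoder runs once per quadtree/octree block while the decoder is tiny, most compute is spent only where the adaptive decomposition places small blocks, i.e., where the signal is locally complex.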
Related papers
- N-BVH: Neural ray queries with bounding volume hierarchies [51.430495562430565]
In 3D computer graphics, the bulk of a scene's memory usage is due to polygons and textures.
We devise N-BVH, a neural compression architecture designed to answer arbitrary ray queries in 3D.
Our method provides faithful approximations of visibility, depth, and appearance attributes.
arXiv Detail & Related papers (2024-05-25T13:54:34Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D reconstruction from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown strong performance in natural language processing.
In this paper, we design a novel attention mechanism whose cost scales linearly with resolution, derived via a Taylor expansion; based on this attention, we build a network called $T$-former for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- Neural Contourlet Network for Monocular 360 Depth Estimation [37.82642960470551]
We provide a new perspective that constructs an interpretable and sparse representation for a 360 image.
We propose a neural contourlet network consisting of a convolutional neural network and a contourlet transform branch.
In the encoder stage, we design a spatial-spectral fusion module to effectively fuse two types of cues.
arXiv Detail & Related papers (2022-08-03T02:25:55Z)
- CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture [2.6912336656165805]
Implicit neural representations with multi-layer perceptrons (MLPs) have recently gained prominence for a wide variety of tasks.
We propose a new split architecture, CoordX, to accelerate inference and training of coordinate-based representations.
We demonstrate a speedup of up to 2.92x compared to the baseline model for image, video, and 3D shape representation and rendering tasks.
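The split-architecture idea behind CoordX can be illustrated with a short NumPy sketch. This is an assumption-laden toy, not the paper's network: the branch widths and the multiplicative fusion rule are invented for illustration. The point it demonstrates is that processing each coordinate axis in its own branch lets a W x H image be covered with W + H branch evaluations instead of W * H joint MLP passes.

```python
import numpy as np

# Illustrative sketch of a split coordinate MLP in the spirit of CoordX
# (layer sizes and the fusion rule are assumptions, not the paper's design).
# Each input dimension is processed by its own branch; features are fused
# only at the end, so a W x H image needs W + H branch evaluations
# instead of W * H full-network passes.

rng = np.random.default_rng(1)
HID = 32  # branch feature width (assumed)

Wx = rng.standard_normal((1, HID)) * 0.5  # branch for x coordinates
Wy = rng.standard_normal((1, HID)) * 0.5  # branch for y coordinates
Wo = rng.standard_normal((HID, 3)) * 0.5  # fusion head -> RGB

def render(xs, ys):
    fx = np.tanh(xs[:, None] @ Wx)           # (W, HID): one pass per column
    fy = np.tanh(ys[:, None] @ Wy)           # (H, HID): one pass per row
    fused = fx[:, None, :] * fy[None, :, :]  # (W, H, HID): outer-product fusion
    return fused @ Wo                        # (W, H, 3)

img = render(np.linspace(0, 1, 64), np.linspace(0, 1, 48))
print(img.shape)  # (64, 48, 3)
```

Here 64 + 48 = 112 branch evaluations cover all 64 * 48 = 3072 pixels; the only per-pixel work is the cheap fusion, which is where the reported speedups over a monolithic coordinate MLP come from.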
arXiv Detail & Related papers (2022-01-28T21:30:42Z)
- Meta-Learning Sparse Implicit Neural Representations [69.15490627853629]
Implicit neural representations are a promising new avenue of representing general signals.
The current approach is difficult to scale to a large number of signals or to large datasets.
We show that meta-learned sparse neural representations achieve a much smaller loss than dense meta-learned models.
arXiv Detail & Related papers (2021-10-27T18:02:53Z)
- Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes [77.6741486264257]
We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs.
We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works.
arXiv Detail & Related papers (2021-01-26T18:50:22Z)
- Learned Initializations for Optimizing Coordinate-Based Neural Representations [47.408295381897815]
Coordinate-based neural representations have shown significant promise as an alternative to discrete, array-based representations.
We propose applying standard meta-learning algorithms to learn the initial weight parameters for these fully-connected networks.
We explore these benefits across a variety of tasks, including representing 2D images, reconstructing CT scans, and recovering 3D shapes and scenes from 2D image observations.
arXiv Detail & Related papers (2020-12-03T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.