ReLU Fields: The Little Non-linearity That Could
- URL: http://arxiv.org/abs/2205.10824v2
- Date: Mon, 3 Jul 2023 00:27:54 GMT
- Title: ReLU Fields: The Little Non-linearity That Could
- Authors: Animesh Karnewar and Tobias Ritschel and Oliver Wang and Niloy J.
Mitra
- Abstract summary: We investigate what is the smallest change to grid-based representations that allows for retaining the high fidelity results of MLPs.
We show that such an approach becomes competitive with the state-of-the-art.
- Score: 62.228229880658404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many recent works, multi-layer perceptrons (MLPs) have been shown to be
suitable for modeling complex spatially-varying functions including images and
3D scenes. Although MLPs are able to represent complex scenes with
unprecedented quality and memory footprint, this expressive power
comes at the cost of long training and inference times. On the other
hand, bilinear/trilinear interpolation on regular grid based representations
can give fast training and inference times, but cannot match the quality of
MLPs without requiring significant additional memory. Hence, in this work, we
investigate what is the smallest change to grid-based representations that
allows for retaining the high fidelity result of MLPs while enabling fast
reconstruction and rendering times. We introduce a surprisingly simple change
that achieves this task -- simply allowing a fixed non-linearity (ReLU) on
interpolated grid values. When combined with coarse-to-fine optimization, we
show that such an approach becomes competitive with the state-of-the-art. We
report results on radiance and occupancy fields, and compare against
multiple existing alternatives. Code and data for the paper are available at
https://geometry.cs.ucl.ac.uk/projects/2022/relu_fields.
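The core change the abstract describes can be sketched in a few lines: sample a raw value grid with trilinear interpolation, then apply a fixed ReLU to the interpolated result. Below is a minimal NumPy sketch of that idea; the function name, grid layout, and scalar (single-channel) field are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def relu_field_sample(grid, pts):
    """Sample a ReLU field: trilinear interpolation of raw grid
    values followed by a fixed ReLU non-linearity.

    grid: (R, R, R) array of learned scalar values (e.g. density).
    pts:  (N, 3) query points with coordinates in [0, 1].
    """
    R = grid.shape[0]
    # Continuous grid coordinates and the surrounding corner indices.
    x = np.clip(pts, 0.0, 1.0) * (R - 1)
    lo = np.floor(x).astype(int)
    hi = np.minimum(lo + 1, R - 1)
    t = x - lo  # per-axis interpolation weights in [0, 1)

    # Trilinear interpolation over the 8 surrounding corners.
    out = np.zeros(len(pts))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                ix = hi[:, 0] if dx else lo[:, 0]
                iy = hi[:, 1] if dy else lo[:, 1]
                iz = hi[:, 2] if dz else lo[:, 2]
                w = ((t[:, 0] if dx else 1 - t[:, 0])
                     * (t[:, 1] if dy else 1 - t[:, 1])
                     * (t[:, 2] if dz else 1 - t[:, 2]))
                out += w * grid[ix, iy, iz]

    # The "little non-linearity": applying ReLU *after* interpolation
    # lets the field represent sharp transitions inside a single cell,
    # which plain bilinear/trilinear grids cannot.
    return np.maximum(out, 0.0)
```

Because the ReLU is applied after interpolation rather than stored in the grid, the zero crossing of the pre-activation can fall anywhere inside a cell, which is what recovers MLP-like sharpness at grid-level speed.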
Related papers
- MeshFeat: Multi-Resolution Features for Neural Fields on Meshes [38.93284476165776]
Parametric feature grid encodings have gained significant attention as an encoding approach for neural fields.
We propose MeshFeat, a parametric feature encoding tailored to meshes, for which we adapt the idea of multi-resolution feature grids from Euclidean space.
We show a significant speed-up compared to previous representations while maintaining comparable reconstruction quality for texture reconstruction and BRDF representation.
arXiv Detail & Related papers (2024-07-18T15:29:48Z) - ResFields: Residual Neural Fields for Spatiotemporal Signals [61.44420761752655]
ResFields is a novel class of networks specifically designed to effectively represent complex temporal signals.
We conduct comprehensive analysis of the properties of ResFields and propose a matrix factorization technique to reduce the number of trainable parameters.
We demonstrate the practical utility of ResFields by showcasing its effectiveness in capturing dynamic 3D scenes from sparse RGBD cameras.
arXiv Detail & Related papers (2023-09-06T16:59:36Z) - Strip-MLP: Efficient Token Interaction for Vision MLP [31.02197585697145]
We introduce Strip-MLP to enrich the token interaction power in three ways.
Strip-MLP significantly improves the performance of spatial-based models on small datasets.
Models achieve higher average Top-1 accuracy than existing MLP-based models by +2.44% on Caltech-101 and +2.16% on CIFAR-100.
arXiv Detail & Related papers (2023-07-21T09:40:42Z) - Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering [84.37776381343662]
Mip-NeRF proposes a multiscale representation as a conical frustum to encode scale information.
We propose mip voxel grids (Mip-VoG), an explicit multiscale representation for real-time anti-aliasing rendering.
Our approach is the first to offer multiscale training and real-time anti-aliasing rendering simultaneously.
arXiv Detail & Related papers (2023-04-20T04:05:22Z) - Hybrid Mesh-neural Representation for 3D Transparent Object
Reconstruction [30.66452291775852]
We propose a novel method to reconstruct the 3D shapes of transparent objects using hand-held captured images under natural light conditions.
It combines the advantages of an explicit mesh and a multi-layer perceptron (MLP) network in a hybrid representation, simplifying the capture setup used in recent contributions.
arXiv Detail & Related papers (2022-03-23T17:58:56Z) - CoordX: Accelerating Implicit Neural Representation with a Split MLP
Architecture [2.6912336656165805]
Implicit neural representations with multi-layer perceptrons (MLPs) have recently gained prominence for a wide variety of tasks.
We propose a new split architecture, CoordX, to accelerate inference and training of coordinate-based representations.
We demonstrate a speedup of up to 2.92x compared to the baseline model for image, video, and 3D shape representation and rendering tasks.
arXiv Detail & Related papers (2022-01-28T21:30:42Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP model that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - Hire-MLP: Vision MLP via Hierarchical Rearrangement [58.33383667626998]
Hire-MLP is a simple yet competitive vision architecture via rearrangement.
The proposed Hire-MLP architecture is built with simple channel-mixing operations, thus enjoys high flexibility and inference speed.
Experiments show that our Hire-MLP achieves state-of-the-art performance on the ImageNet-1K benchmark.
arXiv Detail & Related papers (2021-08-30T16:11:04Z) - CycleMLP: A MLP-like Architecture for Dense Prediction [26.74203747156439]
CycleMLP is a versatile backbone for visual recognition and dense predictions.
It can cope with various image sizes and achieves computational complexity linear in image size by using local windows.
CycleMLP aims to provide a competitive baseline on object detection, instance segmentation, and semantic segmentation for MLP models.
arXiv Detail & Related papers (2021-07-21T17:23:06Z) - Recurrent Multi-view Alignment Network for Unsupervised Surface
Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
arXiv Detail & Related papers (2020-11-24T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.