ReLU Fields: The Little Non-linearity That Could
- URL: http://arxiv.org/abs/2205.10824v2
- Date: Mon, 3 Jul 2023 00:27:54 GMT
- Title: ReLU Fields: The Little Non-linearity That Could
- Authors: Animesh Karnewar and Tobias Ritschel and Oliver Wang and Niloy J.
Mitra
- Abstract summary: We investigate what is the smallest change to grid-based representations that allows for retaining the high fidelity results of MLPs.
We show that such an approach becomes competitive with the state-of-the-art.
- Score: 62.228229880658404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many recent works, multi-layer perceptrons (MLPs) have been shown to be
suitable for modeling complex spatially-varying functions including images and
3D scenes. Although MLPs are able to represent complex scenes with
unprecedented quality and memory footprint, this expressive power
comes at the cost of long training and inference times. On the other
hand, bilinear/trilinear interpolation on regular grid based representations
can give fast training and inference times, but cannot match the quality of
MLPs without requiring significant additional memory. Hence, in this work, we
investigate what is the smallest change to grid-based representations that
allows for retaining the high fidelity result of MLPs while enabling fast
reconstruction and rendering times. We introduce a surprisingly simple change
that achieves this task -- simply allowing a fixed non-linearity (ReLU) on
interpolated grid values. When combined with coarse-to-fine optimization, we
show that such an approach becomes competitive with the state-of-the-art. We
report results on radiance and occupancy fields, and compare against
multiple existing alternatives. Code and data for the paper are available at
https://geometry.cs.ucl.ac.uk/projects/2022/relu_fields.
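The core change the abstract describes can be sketched in a few lines: sample a raw value grid with trilinear interpolation, then apply a fixed ReLU to the interpolated result. Below is a minimal NumPy sketch of that idea; the function name, grid layout, and scalar (single-channel) field are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def relu_field_sample(grid, pts):
    """Sample a ReLU field: trilinear interpolation of raw grid
    values followed by a fixed ReLU non-linearity.

    grid: (R, R, R) array of learned scalar values (e.g. density).
    pts:  (N, 3) query points with coordinates in [0, 1].
    """
    R = grid.shape[0]
    # Continuous grid coordinates and the surrounding corner indices.
    x = np.clip(pts, 0.0, 1.0) * (R - 1)
    lo = np.floor(x).astype(int)
    hi = np.minimum(lo + 1, R - 1)
    t = x - lo  # per-axis interpolation weights in [0, 1)

    # Trilinear interpolation over the 8 surrounding corners.
    out = np.zeros(len(pts))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                ix = hi[:, 0] if dx else lo[:, 0]
                iy = hi[:, 1] if dy else lo[:, 1]
                iz = hi[:, 2] if dz else lo[:, 2]
                w = ((t[:, 0] if dx else 1 - t[:, 0])
                     * (t[:, 1] if dy else 1 - t[:, 1])
                     * (t[:, 2] if dz else 1 - t[:, 2]))
                out += w * grid[ix, iy, iz]

    # The "little non-linearity": applying ReLU *after* interpolation
    # lets the field represent sharp transitions inside a single cell,
    # which plain bilinear/trilinear grids cannot.
    return np.maximum(out, 0.0)
```

Because the ReLU is applied after interpolation rather than stored in the grid, the zero crossing of the pre-activation can fall anywhere inside a cell, which is what recovers MLP-like sharpness at grid-level speed.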
Related papers
- MeshFeat: Multi-Resolution Features for Neural Fields on Meshes [38.93284476165776]
Parametric feature grid encodings have gained significant attention as an encoding approach for neural fields.
We propose MeshFeat, a parametric feature encoding tailored to meshes, for which we adapt the idea of multi-resolution feature grids from Euclidean space.
We show a significant speed-up compared to previous representations while maintaining comparable reconstruction quality for texture reconstruction and BRDF representation.
arXiv Detail & Related papers (2024-07-18T15:29:48Z) - ResFields: Residual Neural Fields for Spatiotemporal Signals [61.44420761752655]
ResFields is a novel class of networks specifically designed to effectively represent complex temporal signals.
We conduct comprehensive analysis of the properties of ResFields and propose a matrix factorization technique to reduce the number of trainable parameters.
We demonstrate the practical utility of ResFields by showcasing its effectiveness in capturing dynamic 3D scenes from sparse RGBD cameras.
arXiv Detail & Related papers (2023-09-06T16:59:36Z) - Strip-MLP: Efficient Token Interaction for Vision MLP [31.02197585697145]
We introduce Strip-MLP to enrich the token interaction power in three ways.
Strip-MLP significantly improves the performance of spatial-based models on small datasets.
Models achieve higher average Top-1 accuracy than existing MLP-based models by +2.44% on Caltech-101 and +2.16% on CIFAR-100.
arXiv Detail & Related papers (2023-07-21T09:40:42Z) - Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering [84.37776381343662]
Mip-NeRF proposes a multiscale representation as a conical frustum to encode scale information.
We propose mip voxel grids (Mip-VoG), an explicit multiscale representation for real-time anti-aliasing rendering.
Our approach is the first to offer multiscale training and real-time anti-aliasing rendering simultaneously.
arXiv Detail & Related papers (2023-04-20T04:05:22Z) - Hybrid Mesh-neural Representation for 3D Transparent Object
Reconstruction [30.66452291775852]
We propose a novel method to reconstruct the 3D shapes of transparent objects using hand-held captured images under natural light conditions.
It combines the advantages of an explicit mesh and a multi-layer perceptron (MLP) network in a hybrid representation, simplifying the capture setup used in recent contributions.
arXiv Detail & Related papers (2022-03-23T17:58:56Z) - CoordX: Accelerating Implicit Neural Representation with a Split MLP
Architecture [2.6912336656165805]
Implicit neural representations with multi-layer perceptrons (MLPs) have recently gained prominence for a wide variety of tasks.
We propose a new split architecture, CoordX, to accelerate inference and training of coordinate-based representations.
We demonstrate a speedup of up to 2.92x compared to the baseline model for image, video, and 3D shape representation and rendering tasks.
arXiv Detail & Related papers (2022-01-28T21:30:42Z) - RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP model that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z) - Hire-MLP: Vision MLP via Hierarchical Rearrangement [58.33383667626998]
Hire-MLP is a simple yet competitive vision architecture via rearrangement.
The proposed Hire-MLP architecture is built with simple channel-mixing operations, thus enjoys high flexibility and inference speed.
Experiments show that our Hire-MLP achieves state-of-the-art performance on the ImageNet-1K benchmark.
arXiv Detail & Related papers (2021-08-30T16:11:04Z) - CycleMLP: A MLP-like Architecture for Dense Prediction [26.74203747156439]
CycleMLP is a versatile backbone for visual recognition and dense predictions.
It can cope with various image sizes and achieves computational complexity linear in image size by using local windows.
CycleMLP aims to provide a competitive baseline on object detection, instance segmentation, and semantic segmentation for MLP models.
arXiv Detail & Related papers (2021-07-21T17:23:06Z) - Recurrent Multi-view Alignment Network for Unsupervised Surface
Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
arXiv Detail & Related papers (2020-11-24T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.