CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture
- URL: http://arxiv.org/abs/2201.12425v1
- Date: Fri, 28 Jan 2022 21:30:42 GMT
- Title: CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture
- Authors: Ruofan Liang, Hongyi Sun, Nandita Vijaykumar
- Abstract summary: Implicit neural representations with multi-layer perceptrons (MLPs) have recently gained prominence for a wide variety of tasks.
We propose a new split architecture, CoordX, to accelerate inference and training of coordinate-based representations.
We demonstrate a speedup of up to 2.92x compared to the baseline model for image, video, and 3D shape representation and rendering tasks.
- Score: 2.6912336656165805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Implicit neural representations with multi-layer perceptrons (MLPs) have
recently gained prominence for a wide variety of tasks such as novel view
synthesis and 3D object representation and rendering. However, a significant
challenge with these representations is that both training and inference with
an MLP over a large number of input coordinates to learn and represent an
image, video, or 3D object require large amounts of computation and incur long
processing times. In this work, we aim to accelerate inference and training of
coordinate-based MLPs for implicit neural representations by proposing a new
split MLP architecture, CoordX. With CoordX, the initial layers are split to
learn each dimension of the input coordinates separately. The intermediate
features are then fused by the last layers to generate the learned signal at
the corresponding coordinate point. This significantly reduces the amount of
computation required and leads to large speedups in training and inference,
while achieving accuracy similar to that of the baseline MLP. The approach thus
first learns functions that decompose the original signal and then fuses them
to generate the learned signal. Our proposed architecture can be applied to
many implicit neural representation tasks with no additional memory overhead.
We demonstrate a speedup of up to 2.92x compared
to the baseline model for image, video, and 3D shape representation and
rendering tasks.
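
The split-and-fuse design described in the abstract can be illustrated with a short sketch. The PyTorch code below is not the authors' implementation; it is a minimal illustration under stated assumptions: each coordinate axis is fed through its own branch MLP, the per-axis features are broadcast over the coordinate grid and fused with an element-wise product (the fusion operator, layer counts, and widths here are assumptions), and shared final layers decode the fused features into the output signal.

```python
import torch
import torch.nn as nn


def mlp(d_in, d_hidden, d_out, n_layers):
    """Plain ReLU MLP used for both the per-axis branches and the fusion head."""
    layers, d = [], d_in
    for _ in range(n_layers - 1):
        layers += [nn.Linear(d, d_hidden), nn.ReLU()]
        d = d_hidden
    layers.append(nn.Linear(d, d_out))
    return nn.Sequential(*layers)


class SplitCoordMLP(nn.Module):
    """Minimal split coordinate MLP in the spirit of CoordX (illustrative only)."""

    def __init__(self, n_dims=2, hidden=256, out_channels=3,
                 branch_layers=3, fuse_layers=2):
        super().__init__()
        # One branch per coordinate axis: the early layers see each axis separately,
        # so an H x W grid costs H + W branch evaluations instead of H * W.
        self.branches = nn.ModuleList(
            [mlp(1, hidden, hidden, branch_layers) for _ in range(n_dims)])
        # Shared final layers decode the fused per-axis features into the signal.
        self.head = mlp(hidden, hidden, out_channels, fuse_layers)

    def forward(self, axes):
        # axes: list of 1D tensors, e.g. [x of shape (H,), y of shape (W,)].
        feats = [b(a.unsqueeze(-1)) for b, a in zip(self.branches, axes)]
        # Broadcast each axis feature over the grid; the element-wise product is
        # an assumed fusion operator, used here only to keep the sketch simple.
        fused = feats[0]
        for i, f in enumerate(feats[1:], start=1):
            fused = fused.unsqueeze(i) * f  # shape grows to (N1, ..., Ni, hidden)
        return self.head(fused)  # (N1, ..., Nd, out_channels)


# Example: evaluate a 256 x 256 RGB image from 256 + 256 branch inputs.
model = SplitCoordMLP(n_dims=2)
x = torch.linspace(-1.0, 1.0, 256)
y = torch.linspace(-1.0, 1.0, 256)
rgb = model([x, y])  # (256, 256, 3)
```

In practice a positional encoding of each axis (e.g. Fourier features) would typically be applied before the branches, and training would minimize a reconstruction loss against the target signal sampled on the same grid; these details are omitted from the sketch.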
Related papers
- Coordinates Are NOT Lonely -- Codebook Prior Helps Implicit Neural 3D Representations [29.756718435405983]
Implicit neural 3D representation has achieved impressive results in surface or scene reconstruction and novel view synthesis.
Existing approaches, such as Neural Radiance Field (NeRF) and its variants, usually require dense input views.
We introduce a novel coordinate-based model, CoCo-INR, for implicit neural 3D representation.
arXiv Detail & Related papers (2022-10-20T11:13:50Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- ReLU Fields: The Little Non-linearity That Could [62.228229880658404]
We investigate what is the smallest change to grid-based representations that allows for retaining the high-fidelity results of MLPs.
We show that such an approach becomes competitive with the state-of-the-art.
arXiv Detail & Related papers (2022-05-22T13:42:31Z)
- UNeXt: MLP-based Rapid Medical Image Segmentation Network [80.16644725886968]
UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years.
We propose UNeXt, a convolutional multilayer perceptron (MLP) based network for image segmentation.
We show that we reduce the number of parameters by 72x, decrease the computational complexity by 68x, and improve the inference speed by 10x while also obtaining better segmentation performance.
arXiv Detail & Related papers (2022-03-09T18:58:22Z)
- MINER: Multiscale Implicit Neural Representations [43.36327238440042]
We introduce a new neural signal representation designed for the efficient high-resolution representation of large-scale signals.
The key innovation in our multiscale implicit neural representation (MINER) is an internal representation via a Laplacian pyramid.
We demonstrate that it requires fewer than 25% of the parameters, 33% of the memory footprint, and 10% of the time of competing techniques such as ACORN to reach the same representation error.
arXiv Detail & Related papers (2022-02-07T21:49:33Z)
- Meta-Learning Sparse Implicit Neural Representations [69.15490627853629]
Implicit neural representations are a promising new avenue of representing general signals.
Current approaches are difficult to scale to a large number of signals or a large data set.
We show that meta-learned sparse neural representations achieve a much smaller loss than dense meta-learned models.
arXiv Detail & Related papers (2021-10-27T18:02:53Z)
- Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages, i.e. person localization and pose estimation.
And we propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
- ACORN: Adaptive Coordinate Networks for Neural Scene Representation [40.04760307540698]
Current neural representations fail to accurately represent images at resolutions greater than a megapixel or 3D scenes with more than a few hundred thousand polygons.
We introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference.
We demonstrate the first experiments that fit gigapixel images to nearly 40 dB peak signal-to-noise ratio.
arXiv Detail & Related papers (2021-05-06T16:21:38Z)
- RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition [123.59890802196797]
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition.
We construct convolutional layers inside a RepMLP during training and merge them into the FC for inference.
By inserting RepMLP into traditional CNNs, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs.
arXiv Detail & Related papers (2021-05-05T06:17:40Z)
- Learned Initializations for Optimizing Coordinate-Based Neural Representations [47.408295381897815]
Coordinate-based neural representations have shown significant promise as an alternative to discrete, array-based representations.
We propose applying standard meta-learning algorithms to learn the initial weight parameters for these fully-connected networks.
We explore these benefits across a variety of tasks, including representing 2D images, reconstructing CT scans, and recovering 3D shapes and scenes from 2D image observations.
arXiv Detail & Related papers (2020-12-03T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.