Neural Implicit Dictionary via Mixture-of-Expert Training
- URL: http://arxiv.org/abs/2207.03691v1
- Date: Fri, 8 Jul 2022 05:07:19 GMT
- Title: Neural Implicit Dictionary via Mixture-of-Expert Training
- Authors: Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
- Abstract summary: We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID)
Our NID assembles a group of coordinate-based subnetworks which are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes 2 orders of magnitude faster while using up to 98% less input data.
- Score: 111.08941206369508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Representing visual signals by coordinate-based deep fully-connected networks
has been shown to be more advantageous in fitting complex details and solving
inverse problems than discrete grid-based representations. However, acquiring
such a continuous Implicit Neural Representation (INR) requires tedious
per-scene training on massive numbers of signal measurements, which limits its
practicality. In this
paper, we present a generic INR framework that achieves both data and training
efficiency by learning a Neural Implicit Dictionary (NID) from a data
collection and representing INR as a functional combination of basis sampled
from the dictionary. Our NID assembles a group of coordinate-based subnetworks
which are tuned to span the desired function space. After training, one can
instantly and robustly acquire an unseen scene representation by solving for
the coding coefficients. To optimize a large group of networks in parallel,
we borrow the idea of Mixture-of-Experts (MoE) to design and train our
network with a sparse gating mechanism. Our experiments show that NID can
reconstruct 2D images or 3D scenes 2 orders of magnitude faster while using
up to 98% less input data. We further demonstrate various applications of NID
in image inpainting and occlusion removal, which are considered challenging
for vanilla INRs. Our code is available at
https://github.com/VITA-Group/Neural-Implicit-Dict.
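The abstract's pipeline has three parts: a dictionary of coordinate-based basis functions, instant acquisition of a new scene by solving for coding coefficients, and MoE-style sparse gating. A toy sketch of those three steps is below. Everything in it is a hypothetical illustration, not the paper's implementation: random sinusoids stand in for trained subnetworks, and top-k truncation stands in for the learned sparse gating.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained Neural Implicit Dictionary:
# each "expert" is a coordinate-based basis function f_i(x).
# Random sinusoids replace the paper's trained subnetworks.
num_experts = 32
freqs = rng.uniform(0.5, 8.0, size=num_experts)
phases = rng.uniform(0.0, 2.0 * np.pi, size=num_experts)

def dictionary(x):
    """Evaluate all basis 'subnetworks' at coordinates x -> (len(x), num_experts)."""
    return np.sin(np.outer(x, freqs) + phases)

# An unseen 1D "scene": samples of a target signal.
x = np.linspace(0.0, 1.0, 256)
target = np.sin(2 * np.pi * 2.0 * x) + 0.3 * np.sin(2 * np.pi * 5.0 * x)

# Acquire the representation by solving for the coding coefficients
# (here plain least squares, analogous to NID's instant fitting of a new scene).
Phi = dictionary(x)
coeffs, *_ = np.linalg.lstsq(Phi, target, rcond=None)

# Sparse gating in the spirit of MoE: keep only the top-k experts by magnitude.
k = 8
top = np.argsort(np.abs(coeffs))[-k:]
sparse = np.zeros_like(coeffs)
sparse[top] = coeffs[top]

recon = Phi @ sparse
err = np.sqrt(np.mean((recon - target) ** 2))
```

Because least squares minimizes the residual over all coefficient vectors, the truncated (sparse) reconstruction can only match or exceed the full solution's error; the learned gating in the paper turns that sparsity/accuracy trade-off into something trained end-to-end.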
Related papers
- Adaptively Placed Multi-Grid Scene Representation Networks for Large-Scale Data Visualization [16.961769402078264]
Scene representation networks (SRNs) have been recently proposed for compression and visualization of scientific data.
We address this shortcoming with an adaptively placed multi-grid SRN (APMGSRN).
We also release an open-source neural volume rendering application that allows plug-and-play rendering with any PyTorch-based SRN.
arXiv Detail & Related papers (2023-07-16T19:36:19Z) - Progressive Fourier Neural Representation for Sequential Video
Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z) - Deep Learning on Implicit Neural Representations of Shapes [14.596732196310978]
Implicit Neural Representations (INRs) have emerged as a powerful tool for continuously encoding a variety of signals.
In this paper, we propose inr2vec, a framework that can compute a compact latent representation for an input INR in a single inference pass.
We verify that inr2vec can effectively embed the 3D shapes represented by the input INRs and show how the produced embeddings can be fed into deep learning pipelines.
arXiv Detail & Related papers (2023-02-10T18:55:49Z) - Versatile Neural Processes for Learning Implicit Neural Representations [57.090658265140384]
We propose Versatile Neural Processes (VNP), which greatly increases the capability of approximating functions.
Specifically, we introduce a bottleneck encoder that produces fewer but more informative context tokens, relieving the high computational cost.
We demonstrate the effectiveness of the proposed VNP on a variety of tasks involving 1D, 2D and 3D signals.
arXiv Detail & Related papers (2023-01-21T04:08:46Z) - Signal Processing for Implicit Neural Representations [80.38097216996164]
Implicit Neural Representations (INRs) encode continuous multimedia data via multi-layer perceptrons.
Existing works manipulate such continuous representations by processing their discretized instances.
We propose an implicit neural signal processing network, dubbed INSP-Net, built on differential operators applied directly to INRs.
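The INSP-Net entry hinges on applying operators to the continuous representation itself rather than to a sampled grid. A minimal sketch of that idea (using a closed-form function as a stand-in for a trained INR, not the paper's actual network) might look like:

```python
import math

def inr(x):
    """Hypothetical continuous signal f(x) = sin(3x), standing in for a trained INR."""
    return math.sin(3.0 * x)

def d_dx(f, x, h=1e-5):
    """Central-difference approximation of the differential operator d/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Because the representation is continuous, the operator can be queried at
# arbitrary coordinates, with no discretization or resampling step.
slope_at_zero = d_dx(inr, 0.0)  # analytically, d/dx sin(3x) at x=0 is 3
```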
arXiv Detail & Related papers (2022-10-17T06:29:07Z) - Coordinate Translator for Learning Deformable Medical Image Registration [15.057534618761268]
We propose a novel deformable registration network, im2grid, that uses multiple Coordinate Translators (CoTrs) with hierarchical features extracted from a CNN encoder.
We compare im2grid with the state-of-the-art DL and non-DL methods for unsupervised 3D magnetic resonance image registration.
Our experiments show that im2grid outperforms these methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-03-05T21:23:03Z) - Spatial Dependency Networks: Neural Layers for Improved Generative Image
Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z) - Learning Deep Interleaved Networks with Asymmetric Co-Attention for
Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA), which is attached at each interleaved node to model feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z) - When CNNs Meet Random RNNs: Towards Multi-Level Analysis for RGB-D
Object and Scene Recognition [10.796613905980609]
We propose a novel framework that extracts discriminative feature representations from multi-modal RGB-D images for object and scene recognition tasks.
To cope with the high dimensionality of CNN activations, a randomly weighted pooling scheme is proposed.
Experiments verify that the fully randomized structure in the RNN stage successfully encodes CNN activations into discriminative features.
arXiv Detail & Related papers (2020-04-26T10:58:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.