Locality-Aware Generalizable Implicit Neural Representation
- URL: http://arxiv.org/abs/2310.05624v2
- Date: Thu, 12 Oct 2023 05:33:19 GMT
- Title: Locality-Aware Generalizable Implicit Neural Representation
- Authors: Doyup Lee, Chiheon Kim, Minsu Cho, Wook-Shin Han
- Abstract summary: Generalizable implicit neural representation (INR) enables a single continuous function to represent multiple data instances.
We propose a novel framework for generalizable INR that combines a transformer encoder with a locality-aware INR decoder.
Our framework significantly outperforms previous generalizable INRs and validates the usefulness of the locality-aware latents for downstream tasks.
- Score: 54.93702310461174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalizable implicit neural representation (INR) enables a single
continuous function, i.e., a coordinate-based neural network, to represent
multiple data instances by modulating its weights or intermediate features
using latent codes. However, the expressive power of the state-of-the-art
modulation is limited due to its inability to localize and capture fine-grained
details of data entities such as specific pixels and rays. To address this
issue, we propose a novel framework for generalizable INR that combines a
transformer encoder with a locality-aware INR decoder. The transformer encoder
predicts a set of latent tokens from a data instance to encode local
information into each latent token. The locality-aware INR decoder extracts a
modulation vector by selectively aggregating the latent tokens via
cross-attention for a coordinate input and then predicts the output by
progressively decoding with coarse-to-fine modulation through multiple
frequency bandwidths. The selective token aggregation and the multi-band
feature modulation enable us to learn locality-aware representation in spatial
and spectral aspects, respectively. Our framework significantly outperforms
previous generalizable INRs and validates the usefulness of the locality-aware
latents for downstream tasks such as image generation.
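The decoding path described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the class name `LocalityAwareDecoderSketch`, the layer sizes, the bandwidth values, the random weights, and the exact way bands are combined are all assumptions chosen for illustration, and random vectors stand in for the latent tokens that the transformer encoder would predict. It only shows the two ideas named above: a coordinate query selectively aggregates latent tokens through cross-attention to form a modulation vector, and that vector then modulates Fourier features of increasing bandwidth in a coarse-to-fine order.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fourier_features(coords, num_freqs, max_freq):
    """Map (N, d) coordinates to (N, 2*d*num_freqs) sin/cos features."""
    freqs = np.linspace(1.0, max_freq, num_freqs)        # (F,)
    angles = 2.0 * np.pi * coords[:, :, None] * freqs    # (N, d, F)
    angles = angles.reshape(coords.shape[0], -1)         # (N, d*F)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

class LocalityAwareDecoderSketch:
    """Hypothetical, untrained decoder; all sizes are illustrative assumptions."""

    def __init__(self, coord_dim=2, token_dim=32, hidden=64, out_dim=3,
                 bandwidths=(2.0, 8.0, 32.0), num_freqs=8, seed=0):
        rng = np.random.default_rng(seed)
        feat_dim = 2 * coord_dim * num_freqs

        def init(rows, cols):
            return rng.normal(0.0, 1.0 / np.sqrt(rows), size=(rows, cols))

        self.bandwidths = bandwidths
        self.num_freqs = num_freqs
        self.W_q = init(feat_dim, token_dim)    # coordinate features -> query
        self.W_k = init(token_dim, token_dim)   # latent token -> key
        self.W_v = init(token_dim, token_dim)   # latent token -> value
        self.W_band = [init(feat_dim, hidden) for _ in bandwidths]
        self.W_mod = [init(token_dim, hidden) for _ in bandwidths]
        self.W_hid = [init(hidden, hidden) for _ in bandwidths[1:]]
        self.W_out = init(hidden, out_dim)

    def aggregate(self, coords, tokens):
        """Selective token aggregation: one cross-attention read per coordinate."""
        q = fourier_features(coords, self.num_freqs, self.bandwidths[0]) @ self.W_q
        k = tokens @ self.W_k
        v = tokens @ self.W_v
        attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)  # (N, T)
        return attn @ v, attn                                    # (N, token_dim)

    def __call__(self, coords, tokens):
        m, _ = self.aggregate(coords, tokens)
        h = None
        # Coarse-to-fine: low-bandwidth features first, higher bands refine them.
        for i, bw in enumerate(self.bandwidths):
            band = fourier_features(coords, self.num_freqs, bw) @ self.W_band[i]
            z = np.maximum(band + m @ self.W_mod[i], 0.0)  # modulated band features
            h = z if h is None else z + h @ self.W_hid[i - 1]
        return h @ self.W_out

rng = np.random.default_rng(1)
tokens = rng.normal(size=(16, 32))   # stand-in for encoder-predicted latent tokens
coords = rng.uniform(size=(5, 2))    # five query pixel coordinates in [0, 1]^2
decoder = LocalityAwareDecoderSketch()
rgb = decoder(coords, tokens)        # one 3-channel prediction per coordinate
```

Because each coordinate produces its own attention weights over the tokens, nearby coordinates can draw on different tokens than distant ones, which is the spatial locality the abstract refers to. The per-band modulation is a simplified stand-in for the spectral side.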
Related papers
- INCODE: Implicit Neural Conditioning with Prior Knowledge Embeddings [4.639495398851869]
Implicit Neural Representations (INRs) have revolutionized signal representation by leveraging neural networks to provide continuous and smooth representations of complex data.
We introduce INCODE, a novel approach that enhances the control of the sinusoidal-based activation function in INRs using deep prior knowledge.
Our approach not only excels in representation, but also extends its prowess to tackle complex tasks such as audio, image, and 3D shape reconstructions.
arXiv Detail & Related papers (2023-10-28T23:16:49Z)
- Disorder-invariant Implicit Neural Representation [32.510321385245774]
Implicit neural representation (INR) characterizes the attributes of a signal as a function of corresponding coordinates.
We propose the disorder-invariant implicit neural representation (DINER) by augmenting a hash-table to a traditional INR backbone.
arXiv Detail & Related papers (2023-04-03T09:28:48Z)
- DINER: Disorder-Invariant Implicit Neural Representation [33.10256713209207]
Implicit neural representation (INR) characterizes the attributes of a signal as a function of corresponding coordinates.
We propose the disorder-invariant implicit neural representation (DINER) by augmenting a hash-table to a traditional INR backbone.
arXiv Detail & Related papers (2022-11-15T03:34:24Z)
- Signal Processing for Implicit Neural Representations [80.38097216996164]
Implicit Neural Representations (INRs) encode continuous multi-media data via multi-layer perceptrons.
Existing works manipulate such continuous representations via processing on their discretized instance.
We propose an implicit neural signal processing network, dubbed INSP-Net, via differential operators on INR.
arXiv Detail & Related papers (2022-10-17T06:29:07Z)
- Neural Implicit Dictionary via Mixture-of-Expert Training [111.08941206369508]
We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID).
Our NID assembles a group of coordinate-based subnetworks which are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes up to 2 orders of magnitude faster with up to 98% less input data.
arXiv Detail & Related papers (2022-07-08T05:07:19Z)
- PINs: Progressive Implicit Networks for Multi-Scale Neural Representations [68.73195473089324]
We propose a progressive positional encoding, exposing a hierarchical structure to incremental sets of frequency encodings.
Our model accurately reconstructs scenes with wide frequency bands and learns a scene representation at progressive levels of detail.
Experiments on several 2D and 3D datasets show improvements in reconstruction accuracy, representational capacity and training speed compared to baselines.
arXiv Detail & Related papers (2022-02-09T20:33:37Z)
- Rethinking Global Context in Crowd Counting [70.54184500538338]
A pure transformer is used to extract features with global information from overlapping image patches.
Inspired by classification, we add a context token to the input sequence, to facilitate information exchange with tokens corresponding to image patches.
arXiv Detail & Related papers (2021-05-23T12:44:27Z)
- Volumetric Transformer Networks [88.85542905676712]
We introduce a learnable module, the volumetric transformer network (VTN).
VTN predicts channel-wise warping fields so as to reconfigure intermediate CNN features spatially and channel-wisely.
Our experiments show that VTN consistently boosts the features' representation power and consequently the networks' accuracy on fine-grained image recognition and instance-level image retrieval.
arXiv Detail & Related papers (2020-07-18T14:00:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.