ASMR: Activation-sharing Multi-resolution Coordinate Networks For Efficient Inference
- URL: http://arxiv.org/abs/2405.12398v1
- Date: Mon, 20 May 2024 22:35:34 GMT
- Title: ASMR: Activation-sharing Multi-resolution Coordinate Networks For Efficient Inference
- Authors: Jason Chun Lok Li, Steven Tin Sui Luo, Le Xu, Ngai Wong,
- Abstract summary: Coordinate network or implicit neural representation (INR) is a fast-emerging method for encoding natural signals.
We propose the Activation-Sharing Multi-Resolution (ASMR) coordinate network that combines multi-resolution coordinate decomposition with hierarchical modulations.
We show that ASMR can reduce the MAC of a vanilla SIREN model by up to 500x while achieving an even higher reconstruction quality than its SIREN baseline.
- Score: 6.005712471509875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Coordinate network or implicit neural representation (INR) is a fast-emerging method for encoding natural signals (such as images and videos) with the benefits of a compact neural representation. While numerous methods have been proposed to increase the encoding capabilities of an INR, an often overlooked aspect is the inference efficiency, usually measured in multiply-accumulate (MAC) count. This is particularly critical in use cases where inference throughput is greatly limited by hardware constraints. To this end, we propose the Activation-Sharing Multi-Resolution (ASMR) coordinate network that combines multi-resolution coordinate decomposition with hierarchical modulations. Specifically, an ASMR model enables the sharing of activations across grids of the data. This largely decouples its inference cost from its depth which is directly correlated to its reconstruction capability, and renders a near O(1) inference complexity irrespective of the number of layers. Experiments show that ASMR can reduce the MAC of a vanilla SIREN model by up to 500x while achieving an even higher reconstruction quality than its SIREN baseline.
Related papers
- Single-Layer Learnable Activation for Implicit Neural Representation (SL$^{2}$A-INR) [6.572456394600755]
Implicit Representation (INR) leveraging a neural network to transform coordinate input into corresponding attributes has driven significant advances in vision-related domains.
We propose SL$2$A-INR with a single-layer learnable activation function, prompting the effectiveness of traditional ReLU-baseds.
Our method performs superior across diverse tasks, including image representation, 3D shape reconstruction, single image super-resolution, CT reconstruction, and novel view.
arXiv Detail & Related papers (2024-09-17T02:02:15Z) - Locality-Aware Generalizable Implicit Neural Representation [54.93702310461174]
Generalizable implicit neural representation (INR) enables a single continuous function to represent multiple data instances.
We propose a novel framework for generalizable INR that combines a transformer encoder with a locality-aware INR decoder.
Our framework significantly outperforms previous generalizable INRs and validates the usefulness of the locality-aware latents for downstream tasks.
arXiv Detail & Related papers (2023-10-09T11:26:58Z) - Multi-Dimensional Refinement Graph Convolutional Network with Robust
Decouple Loss for Fine-Grained Skeleton-Based Action Recognition [19.031036881780107]
We propose a flexible attention block called Channel-Variable Spatial-Temporal Attention (CVSTA) to enhance the discriminative power of spatial-temporal joints.
Based on CVSTA, we construct a Multi-Dimensional Refinement Graph Convolutional Network (MDR-GCN), which can improve the discrimination among channel-, joint- and frame-level features.
Furthermore, we propose a Robust Decouple Loss (RDL), which significantly boosts the effect of the CVSTA and reduces the impact of noise.
arXiv Detail & Related papers (2023-06-27T09:23:36Z) - Joint Channel Estimation and Feedback with Masked Token Transformers in
Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z) - Over-and-Under Complete Convolutional RNN for MRI Reconstruction [57.95363471940937]
Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture.
We propose an Over-and-Under Complete Convolu?tional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network(CRNN)
The proposed method achieves significant improvements over the compressed sensing and popular deep learning-based methods with less number of trainable parameters.
arXiv Detail & Related papers (2021-06-16T15:56:34Z) - Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z) - Compute and memory efficient universal sound source separation [23.152611264259225]
We provide a family of efficient neural network architectures for general purpose audio source separation.
The backbone structure of this convolutional network is the SUccessive DOwnsampling and Resampling of Multi-Resolution Features (SuDoRM-RF)
Our experiments show that SuDoRM-RF models perform comparably and even surpass several state-of-the-art benchmarks.
arXiv Detail & Related papers (2021-03-03T19:16:53Z) - Efficient Low-Latency Speech Enhancement with Mobile Audio Streaming
Networks [6.82469220191368]
We propose Mobile Audio Streaming Networks (MASnet) for efficient low-latency speech enhancement.
MASnet processes linear-scale spectrograms, transforming successive noisy frames into complex-valued ratio masks.
arXiv Detail & Related papers (2020-08-17T12:18:34Z) - Deep Denoising Neural Network Assisted Compressive Channel Estimation
for mmWave Intelligent Reflecting Surfaces [99.34306447202546]
This paper proposes a deep denoising neural network assisted compressive channel estimation for mmWave IRS systems.
We first introduce a hybrid passive/active IRS architecture, where very few receive chains are employed to estimate the uplink user-to-IRS channels.
The complete channel matrix can be reconstructed from the limited measurements based on compressive sensing.
arXiv Detail & Related papers (2020-06-03T12:18:57Z) - Deep Adaptive Inference Networks for Single Image Super-Resolution [72.7304455761067]
Single image super-resolution (SISR) has witnessed tremendous progress in recent years owing to the deployment of deep convolutional neural networks (CNNs)
In this paper, we take a step forward to address this issue by leveraging the adaptive inference networks for deep SISR (AdaDSR)
Our AdaDSR involves an SISR model as backbone and a lightweight adapter module which takes image features and resource constraint as input and predicts a map of local network depth.
arXiv Detail & Related papers (2020-04-08T10:08:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.