MINR: Efficient Implicit Neural Representations for Multi-Image Encoding
- URL: http://arxiv.org/abs/2508.13471v1
- Date: Tue, 19 Aug 2025 03:05:14 GMT
- Title: MINR: Efficient Implicit Neural Representations for Multi-Image Encoding
- Authors: Wenyong Zhou, Taiqiang Wu, Zhengwu Liu, Yuxin Cheng, Chen Zhang, Ngai Wong
- Abstract summary: Implicit Neural Representations (INRs) aim to parameterize discrete signals through implicit continuous functions. We propose MINR, which shares specific layers to encode multiple images efficiently. MINR scales effectively to handle 100 images, maintaining an average peak signal-to-noise ratio (PSNR) of 34 dB.
- Score: 10.161354612863295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Implicit Neural Representations (INRs) aim to parameterize discrete signals through implicit continuous functions. However, formulating each image with a separate neural network (typically a Multi-Layer Perceptron (MLP)) leads to computational and storage inefficiencies when encoding multiple images. To address this issue, we propose MINR, which shares specific layers to encode multiple images efficiently. We first compare the layer-wise weight distributions of several trained INRs and find that corresponding intermediate layers follow highly similar distribution patterns. Motivated by this, we share these intermediate layers across multiple images while keeping the input and output layers input-specific. In addition, we design a novel extra projection layer for each image to capture its unique features. Experimental results on image reconstruction and super-resolution tasks demonstrate that MINR can save up to 60% of parameters while maintaining comparable performance. In particular, MINR scales effectively to handle 100 images, maintaining an average peak signal-to-noise ratio (PSNR) of 34 dB. Further analysis of various backbones confirms the robustness of the proposed MINR.
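The architecture described above, with shared intermediate layers and image-specific input, output, and projection layers, can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the authors' code: the layer sizes, the residual form of the projection, and its placement before the shared trunk are all assumptions made for illustration.

```python
import numpy as np

class MINRSketch:
    """Toy sketch of MINR-style weight sharing (hypothetical, not the paper's code).

    Intermediate layers are shared across all images; each image keeps its
    own input layer, output layer, and an extra per-image projection layer.
    """

    def __init__(self, n_images, in_dim=2, hidden=64, out_dim=3, n_shared=3, seed=0):
        rng = np.random.default_rng(seed)
        # Shared intermediate layers: one set of weights reused by every image.
        self.shared = [rng.normal(0, 0.1, (hidden, hidden)) for _ in range(n_shared)]
        # Image-specific layers: input, projection, and output per image.
        self.inputs = [rng.normal(0, 0.1, (in_dim, hidden)) for _ in range(n_images)]
        self.projs = [rng.normal(0, 0.1, (hidden, hidden)) for _ in range(n_images)]
        self.outputs = [rng.normal(0, 0.1, (hidden, out_dim)) for _ in range(n_images)]

    def forward(self, coords, img_idx):
        h = np.tanh(coords @ self.inputs[img_idx])   # image-specific input layer
        h = h + h @ self.projs[img_idx]              # per-image projection (assumed residual)
        for w in self.shared:                        # shared trunk
            h = np.tanh(h @ w)
        return h @ self.outputs[img_idx]             # image-specific output layer

def param_count(model):
    """Total parameters: the shared trunk is counted once, not per image."""
    shared = sum(w.size for w in model.shared)
    per_img = sum(w.size for ws in (model.inputs, model.projs, model.outputs)
                  for w in ws)
    return shared + per_img
```

The saving comes from counting the trunk once: with n images, a fully independent-MLP baseline pays for the intermediate layers n times, whereas the shared design pays once plus the small per-image layers.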
Related papers
- OSDM-MReg: Multimodal Image Registration based One Step Diffusion Model [8.619958921346184]
Multimodal remote sensing image registration aligns images from different sensors for data fusion and analysis. We propose OSDM-MReg, a novel multimodal image registration framework based on image-to-image translation. Experiments demonstrate superior accuracy and efficiency across various multimodal registration tasks.
arXiv Detail & Related papers (2025-04-08T13:32:56Z) - Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers [55.87192133758051]
Diffusion Transformers (DiTs) have achieved state-of-the-art (SOTA) image generation quality but suffer from high latency and memory inefficiency. We propose DiffCR, a dynamic DiT inference framework with differentiable compression ratios.
arXiv Detail & Related papers (2024-12-22T02:04:17Z) - HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution [16.652558917081954]
We propose HIIF for continuous image super-resolution. Our approach embeds a multi-head linear attention mechanism within the implicit image function by taking additional non-local information into account. Our experiments show that, when integrated with different backbone encoders, HIIF outperforms state-of-the-art continuous image super-resolution methods by up to 0.17 dB in PSNR.
arXiv Detail & Related papers (2024-12-04T22:35:20Z) - SL$^{2}$A-INR: Single-Layer Learnable Activation for Implicit Neural Representation [6.572456394600755]
Implicit Neural Representation (INR), which leverages a neural network to transform coordinate inputs into corresponding attributes, has driven significant advances in vision-related domains. We show that these challenges can be alleviated by introducing a novel approach to INR architecture. Specifically, we propose SL$^{2}$A-INR, a hybrid network that combines a single-layer learnable activation function with an MLP that uses traditional ReLU activations.
arXiv Detail & Related papers (2024-09-17T02:02:15Z) - Invertible Residual Rescaling Models [46.28263683643467]
Invertible Rescaling Networks (IRNs) and their variants have witnessed remarkable achievements in various image processing tasks like image rescaling.
We propose Invertible Residual Rescaling Models (IRRM) for image rescaling by learning a bijection between a high-resolution image and its low-resolution counterpart with a specific distribution.
Our IRRM achieves PSNR gains of at least 0.3 dB over HCFlow and IRN in x4 rescaling while using only 60% of the parameters and 50% of the FLOPs.
arXiv Detail & Related papers (2024-05-05T14:14:49Z) - NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856]
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals.
By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
arXiv Detail & Related papers (2024-03-27T02:42:52Z) - Increasing diversity of omni-directional images generated from single image using cGAN based on MLPMixer [0.0]
Previous methods have relied on generative adversarial networks (GANs) based on convolutional neural networks (CNNs). The MLP-Mixer has been proposed as an alternative to self-attention in the transformer, capturing long-range dependencies and contextual information.
As a result, competitive performance has been achieved with reduced memory consumption and computational cost.
arXiv Detail & Related papers (2023-09-15T03:43:29Z) - MLP-SRGAN: A Single-Dimension Super Resolution GAN using MLP-Mixer [0.05219568203653523]
We propose a novel architecture called MLP-SRGAN, a single-dimension Super Resolution Generative Adversarial Network (SRGAN).
MLP-SRGAN is trained and validated using high-resolution (HR) FLAIR MRI from the MSSEG2 challenge dataset.
Results show MLP-SRGAN produces sharper edges, less blurring, and better-preserved texture and fine anatomical detail, with fewer parameters, faster training/evaluation time, and smaller model size than existing methods.
arXiv Detail & Related papers (2023-03-11T04:05:57Z) - Signal Processing for Implicit Neural Representations [80.38097216996164]
Implicit Neural Representations (INRs) encode continuous multi-media data via multi-layer perceptrons.
Existing works manipulate such continuous representations via processing on their discretized instance.
We propose an implicit neural signal processing network, dubbed INSP-Net, via differential operators on INR.
arXiv Detail & Related papers (2022-10-17T06:29:07Z) - Multi-Stage Progressive Image Restoration [167.6852235432918]
We propose a novel synergistic design that can optimally balance these competing goals.
Our main proposal is a multi-stage architecture, that progressively learns restoration functions for the degraded inputs.
The resulting tightly interlinked multi-stage architecture, named as MPRNet, delivers strong performance gains on ten datasets.
arXiv Detail & Related papers (2021-02-04T18:57:07Z) - MDCN: Multi-scale Dense Cross Network for Image Super-Resolution [35.59973281360541]
We propose a Multi-scale Dense Cross Network (MDCN), which achieves great performance with fewer parameters and less execution time.
MDCN consists of multi-scale dense cross blocks (MDCBs), a hierarchical feature distillation block (HFDB), and a dynamic reconstruction block (DRB).
arXiv Detail & Related papers (2020-08-30T03:50:19Z) - Cross-Scale Internal Graph Neural Network for Image Super-Resolution [147.77050877373674]
Non-local self-similarity in natural images has been well studied as an effective prior in image restoration.
For single image super-resolution (SISR), most existing deep non-local methods only exploit similar patches within the same scale of the low-resolution (LR) input image.
This is achieved using a novel cross-scale internal graph neural network (IGNN).
arXiv Detail & Related papers (2020-06-30T10:48:40Z) - Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
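The per-location masking described above can be sketched in a few lines. This is an illustrative toy version under assumed semantics (single channel, binary masks, cross-correlation as in deep-learning convolutions), not the paper's implementation, which operates on learned weights inside an autoregressive model.

```python
import numpy as np

def locally_masked_conv2d(x, weight, masks):
    """Toy locally masked convolution: a different binary mask is applied
    to the shared kernel at every spatial location (assumed semantics).

    x:      (H, W) single-channel image
    weight: (k, k) kernel shared across locations
    masks:  (H, W, k, k) binary mask applied to the kernel at each location
    """
    H, W = x.shape
    k = weight.shape[0]
    pad = k // 2
    xp = np.pad(x, pad)                      # zero-pad so output keeps (H, W)
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + k, j:j + k]
            # Mask the kernel for this location, then cross-correlate.
            out[i, j] = np.sum(patch * (weight * masks[i, j]))
    return out
```

With an all-ones mask this reduces to a standard convolution; varying the masks over locations is what lets a single weight tensor serve different generation orders.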
arXiv Detail & Related papers (2020-06-22T17:59:07Z) - CoMIR: Contrastive Multimodal Image Representation for Registration [4.543268895439618]
We propose contrastive coding to learn shared, dense image representations, referred to as CoMIRs (Contrastive Multimodal Image Representations).
CoMIRs enable the registration of multimodal images where existing registration methods often fail due to a lack of sufficiently similar image structures.
arXiv Detail & Related papers (2020-06-11T10:51:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.