Related papers: Latent Modulated Function for Computational Optimal Continuous Image Representation

Latent Modulated Function for Computational Optimal Continuous Image Representation

URL: http://arxiv.org/abs/2404.16451v1
Date: Thu, 25 Apr 2024 09:30:38 GMT
Title: Latent Modulated Function for Computational Optimal Continuous Image Representation
Authors: Zongyao He, Zhi Jin,
Abstract summary: We propose a novel Latent Modulated Rendering (LMF) algorithm for continuous image representation. We show that converting existing INR-based methods to LMF can reduce the computational cost by up to 99.9%. Experiments demonstrate that converting existing INR-based methods to LMF can reduce inference by up to 57 times, and save up to 76% parameters.
Score: 20.678662838709542
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The recent work Local Implicit Image Function (LIIF) and subsequent Implicit Neural Representation (INR) based works have achieved remarkable success in Arbitrary-Scale Super-Resolution (ASSR) by using MLP to decode Low-Resolution (LR) features. However, these continuous image representations typically implement decoding in High-Resolution (HR) High-Dimensional (HD) space, leading to a quadratic increase in computational cost and seriously hindering the practical applications of ASSR. To tackle this problem, we propose a novel Latent Modulated Function (LMF), which decouples the HR-HD decoding process into shared latent decoding in LR-HD space and independent rendering in HR Low-Dimensional (LD) space, thereby realizing the first computational optimal paradigm of continuous image representation. Specifically, LMF utilizes an HD MLP in latent space to generate latent modulations of each LR feature vector. This enables a modulated LD MLP in render space to quickly adapt to any input feature vector and perform rendering at arbitrary resolution. Furthermore, we leverage the positive correlation between modulation intensity and input image complexity to design a Controllable Multi-Scale Rendering (CMSR) algorithm, offering the flexibility to adjust the decoding efficiency based on the rendering precision. Extensive experiments demonstrate that converting existing INR-based ASSR methods to LMF can reduce the computational cost by up to 99.9%, accelerate inference by up to 57 times, and save up to 76% of parameters, while maintaining competitive performance. The code is available at https://github.com/HeZongyao/LMF.

Related papers

Pixel to Gaussian: Ultra-Fast Continuous Super-Resolution with 2D Gaussian Modeling [50.34513854725803]
Arbitrary-scale super-resolution (ASSR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs with arbitrary upsampling factors. We propose a novel ContinuousSR framework with a Pixel-to-Gaussian paradigm, which explicitly reconstructs 2D continuous HR signals from LR images using Gaussian Splatting.
arXiv Detail & Related papers (2025-03-09T13:43:57Z)
Progressive Mixed-Precision Decoding for Efficient LLM Inference [49.05448842542558]
We introduce Progressive Mixed-Precision Decoding (PMPD) to address the memory-boundedness of decoding. PMPD achieves 1.4$-$12.2$times$ speedup in matrix-vector multiplications over fp16 models. Our approach delivers a throughput gain of 3.8$-$8.0$times$ over fp16 models and up to 1.54$times$ over uniform quantization approaches.
arXiv Detail & Related papers (2024-10-17T11:46:33Z)
Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496]
We develop the first attempt to integrate the Vision State Space Model (Mamba) for remote sensing image (RSI) super-resolution. To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR. Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM)
arXiv Detail & Related papers (2024-05-08T11:09:24Z)
MixNet: Efficient Global Modeling for Ultra-High-Definition Image Restoration [36.15948393000783]
We propose a novel image restoration method called MixNet, which introduces an alternative approach to global modeling approaches. To capture the longrange dependency of features without introducing excessive computational complexity, we present the Global Feature Modulation Layer (GFML) We conduct extensive experiments on four UHD image restoration tasks, including low-light image enhancement, underwater image enhancement, image deblurring and image demoireing, and the comprehensive results demonstrate that our proposed method surpasses the performance of current state-of-the-art methods.
arXiv Detail & Related papers (2024-01-19T12:40:54Z)
Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR) CFSR inherits the advantages of both convolution-based and transformer-based approaches. Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
Efficient Model Agnostic Approach for Implicit Neural Representation Based Arbitrary-Scale Image Super-Resolution [5.704360536038803]
Single image super-resolution (SISR) has experienced significant advancements, primarily driven by deep convolutional networks. Traditional networks are limited to upscaling images to a fixed scale, leading to the utilization of implicit neural functions for generating arbitrarily scaled images. We introduce a novel and efficient framework, the Mixture of Experts Implicit Super-Resolution (MoEISR), which enables super-resolution at arbitrary scales.
arXiv Detail & Related papers (2023-11-20T05:34:36Z)
Low-Resolution Self-Attention for Semantic Segmentation [96.81482872022237]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost. Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution. We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
arXiv Detail & Related papers (2023-10-08T06:10:09Z)
Implicit Diffusion Models for Continuous Super-Resolution [65.45848137914592]
This paper introduces an Implicit Diffusion Model (IDM) for high-fidelity continuous image super-resolution. IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework. The scaling factor regulates the resolution and accordingly modulates the proportion of the LR information and generated features in the final output.
arXiv Detail & Related papers (2023-03-29T07:02:20Z)
OPE-SR: Orthogonal Position Encoding for Designing a Parameter-free Upsampling Module in Arbitrary-scale Image Super-Resolution [11.74426147465809]
Implicit neural representation (INR) is a popular approach for arbitrary-scale image super-resolution. We propose an OPE-Upscale module to replace the INR-based upsampling module for arbitrary-scale image super-resolution.
arXiv Detail & Related papers (2023-03-02T09:26:14Z)
Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block. Proposed method is $3times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z)
DCS-RISR: Dynamic Channel Splitting for Efficient Real-world Image Super-Resolution [15.694407977871341]
Real-world image super-resolution (RISR) has received increased focus for improving the quality of SR images under unknown complex degradation. Existing methods rely on the heavy SR models to enhance low-resolution (LR) images of different degradation levels. We propose a novel Dynamic Channel Splitting scheme for efficient Real-world Image Super-Resolution, termed DCS-RISR.
arXiv Detail & Related papers (2022-12-15T04:34:57Z)
Deep Generative Adversarial Residual Convolutional Networks for Real-World Super-Resolution [31.934084942626257]
We propose a deep Super-Resolution Residual Convolutional Generative Adversarial Network (SRResCGAN) It follows the real-world degradation settings by adversarial training the model with pixel-wise supervision in the HR domain from its generated LR counterpart. The proposed network exploits the residual learning by minimizing the energy-based objective function with powerful image regularization and convex optimization techniques.
arXiv Detail & Related papers (2020-05-03T00:12:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.