Toward DNN of LUTs: Learning Efficient Image Restoration with Multiple Look-Up Tables
- URL: http://arxiv.org/abs/2303.14506v1
- Date: Sat, 25 Mar 2023 16:00:33 GMT
- Title: Toward DNN of LUTs: Learning Efficient Image Restoration with Multiple Look-Up Tables
- Authors: Jiacheng Li, Chang Chen, Zhen Cheng, Zhiwei Xiong
- Abstract summary: High-definition screens on edge devices stimulate a strong demand for efficient image restoration algorithms.
The size of a single look-up table grows exponentially with the increase of its indexing capacity.
We propose a universal method to construct multiple LUTs like a neural network, termed MuLUT.
- Score: 47.15181829317732
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The widespread usage of high-definition screens on edge devices stimulates a
strong demand for efficient image restoration algorithms. The way of caching
deep learning models in a look-up table (LUT) is recently introduced to respond
to this demand. However, the size of a single LUT grows exponentially with the
increase of its indexing capacity, which restricts its receptive field and thus
the performance. To overcome this intrinsic limitation of the single-LUT
solution, we propose a universal method to construct multiple LUTs like a
neural network, termed MuLUT. Firstly, we devise novel complementary indexing
patterns, as well as a general implementation for arbitrary patterns, to
construct multiple LUTs in parallel. Secondly, we propose a re-indexing
mechanism to enable hierarchical indexing between cascaded LUTs. Finally, we
introduce channel indexing to allow cross-channel interaction, enabling LUTs to
process color channels jointly. In these principled ways, the total size of
MuLUT is linear to its indexing capacity, yielding a practical solution to
obtain superior performance with the enlarged receptive field. We examine the
advantage of MuLUT on various image restoration tasks, including
super-resolution, demosaicing, denoising, and deblocking. MuLUT achieves a
significant improvement over the single-LUT solution, e.g., up to 1.1dB PSNR
for super-resolution and up to 2.8dB PSNR for grayscale denoising, while
preserving its efficiency, which is 100$\times$ less in energy cost compared
with lightweight deep neural networks. Our code and trained models are publicly
available at https://github.com/ddlee-cn/MuLUT.
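To make the size argument concrete, below is a minimal sketch (in NumPy, not the authors' released code) of the underlying idea: a single LUT indexed by k pixels needs LEVELS^k entries, so instead several small LUTs with complementary sampling patterns run in parallel and a second LUT stage is cascaded by re-quantizing ("re-indexing") the intermediate output, keeping total size linear in the number of LUTs. The quantization scheme, sampling offsets, fusion by averaging, and all function names here are illustrative assumptions, not the paper's exact design.

```python
# Toy sketch of LUT-based restoration in the spirit of MuLUT.
# All values, offsets, and names are illustrative assumptions.
import numpy as np

BITS = 4                 # assumed uniform quantization of 8-bit pixels
LEVELS = 2 ** BITS       # 16 index levels per pixel

def quantize(x):
    """Map 8-bit intensities to LUT indices (assumed uniform quantization)."""
    return (x >> (8 - BITS)).astype(np.int64)

def make_lut(num_pixels, rng):
    """A toy LUT over `num_pixels` indexed pixels, filled with random values.
    Its size is LEVELS ** num_pixels -- exponential in indexing capacity,
    which is the single-LUT limitation discussed in the abstract."""
    return rng.integers(0, 256, size=(LEVELS,) * num_pixels).astype(np.float32)

def lookup(lut, patch):
    """Index a LUT with a tuple of quantized pixels."""
    return lut[tuple(quantize(patch))]

def stage(img, luts_patterns):
    """One LUT stage: fuse the outputs of parallel LUTs (here by simple
    averaging), which widens the effective receptive field while the total
    size stays linear in the number of LUTs."""
    h, w = img.shape
    out = np.zeros((h - 1, w - 1), dtype=np.float32)
    for lut, pat in luts_patterns:
        for i in range(h - 1):
            for j in range(w - 1):
                patch = np.array([img[i + dy, j + dx] for dy, dx in pat])
                out[i, j] += lookup(lut, patch)
    return out / len(luts_patterns)

rng = np.random.default_rng(0)

# Two parallel LUTs with complementary 2-pixel sampling patterns
# (the paper indexes larger neighborhoods; 2 pixels keeps the toy small).
lut_a, lut_b = make_lut(2, rng), make_lut(2, rng)
pattern_a = [(0, 0), (0, 1)]     # horizontal neighbor
pattern_b = [(0, 0), (1, 0)]     # vertical neighbor

img = rng.integers(0, 256, size=(8, 8)).astype(np.uint8)
mid = stage(img, [(lut_a, pattern_a), (lut_b, pattern_b)])

# Cascade: re-quantize the intermediate result and index a second LUT stage.
mid_u8 = np.clip(mid, 0, 255).astype(np.uint8)
lut_c = make_lut(2, rng)
final = stage(mid_u8, [(lut_c, pattern_a)])
print(final.shape)
```

In the actual method, the LUT entries are not random: following the "caching deep learning models in a look-up table" idea from the abstract, they would be obtained by precomputing a trained network's outputs over all quantized input combinations.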
Related papers
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- PolyLUT-Add: FPGA-based LUT Inference with Wide Inputs [1.730979251211628]
This work introduces PolyLUT-Add, a technique that enhances neuron connectivity by combining $A$ PolyLUT sub-neurons via addition to improve accuracy.
We evaluate our implementation on the MNIST, Jet Substructure classification, and Network Intrusion Detection benchmarks and find that, for similar accuracy, PolyLUT-Add achieves a LUT reduction of 2.0-13.9$\times$ with a 1.2-1.6$\times$ decrease in latency.
arXiv Detail & Related papers (2024-06-07T13:00:57Z)
- NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions [2.7086888205833968]
Field-Programmable Gate Array (FPGA) accelerators have proven successful in handling latency- and resource-critical deep neural network (DNN) inference tasks.
We propose relaxing the boundaries of neurons and mapping entire sub-networks to a single LUT.
We validate our proposed method on a known latency-critical task, jet substructure tagging, and on the classical computer vision task, digit classification using MNIST.
arXiv Detail & Related papers (2024-02-29T16:10:21Z)
- Hundred-Kilobyte Lookup Tables for Efficient Single-Image Super-Resolution [7.403264755337134]
Super-resolution (SR) schemes make heavy use of convolutional neural networks (CNNs), which involve intensive multiply-accumulate (MAC) operations.
This contradicts the regime of edge AI that often runs on devices strained by power, computing, and storage resources.
This work tackles this storage hurdle by introducing hundred-kilobyte LUT (HKLUT) models amenable to on-chip cache.
arXiv Detail & Related papers (2023-12-11T04:07:34Z)
- Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is 3$\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z)
- Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling with a single trained model.
It is shown to achieve state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z)
- Neural Implicit Dictionary via Mixture-of-Expert Training [111.08941206369508]
We present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID).
Our NID assembles a group of coordinate-based implicit networks which are tuned to span the desired function space.
Our experiments show that NID can reconstruct 2D images or 3D scenes 2 orders of magnitude faster with up to 98% less input data.
arXiv Detail & Related papers (2022-07-08T05:07:19Z)
- MultiRes-NetVLAD: Augmenting Place Recognition Training with Low-Resolution Imagery [28.875236694573815]
We augment NetVLAD representation learning with low-resolution image pyramid encoding.
The resultant multi-resolution feature pyramid can be conveniently aggregated through VLAD into a single compact representation.
We show that the underlying learnt feature tensor can be combined with existing multi-scale approaches to improve their baseline performance.
arXiv Detail & Related papers (2022-02-18T11:53:01Z)
- Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [67.33850633281803]
We present a versatile new input encoding that permits the use of a smaller network without sacrificing quality.
A small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through gradient descent.
We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds.
arXiv Detail & Related papers (2022-01-16T07:22:47Z)
- FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale Context Aggregation and Feature Space Super-resolution [14.226301825772174]
We introduce a novel and efficient module called Cascaded Factorized Atrous Spatial Pyramid Pooling (CF-ASPP).
It is a lightweight cascaded structure for Convolutional Neural Networks (CNNs) to efficiently leverage context information.
We achieve 68.4% mIoU at 84 fps on the Cityscapes test set with a single Nvidia Titan X (Maxwell) GPU card.
arXiv Detail & Related papers (2020-03-09T03:53:57Z)