4K-Resolution Photo Exposure Correction at 125 FPS with ~8K Parameters
- URL: http://arxiv.org/abs/2311.08759v1
- Date: Wed, 15 Nov 2023 08:01:12 GMT
- Title: 4K-Resolution Photo Exposure Correction at 125 FPS with ~8K Parameters
- Authors: Yijie Zhou, Chao Li, Jin Liang, Tianyi Xu, Xin Liu, Jun Xu
- Abstract summary: In this paper, we propose extremely lightweight (only ~8K parameters) Multi-Scale Linear Transformation (MSLT) networks.
MSLT networks can process 4K-resolution sRGB images at 125 frames per second (FPS) on a Titan RTX GPU.
Experiments on two benchmark datasets demonstrate the efficiency of MSLTs against state-of-the-art methods for photo exposure correction.
- Score: 9.410502389242815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The illumination of improperly exposed photographs has been widely corrected
using deep convolutional neural networks or Transformers. Despite their
promising performance, these methods usually suffer from large parameter
counts and heavy computational cost (FLOPs) on high-resolution photographs. In this
paper, we propose extremely lightweight (only ~8K parameters) Multi-Scale
Linear Transformation (MSLT) networks under the multi-layer perceptron
architecture, which can process 4K-resolution sRGB images at 125
frames per second (FPS) on a Titan RTX GPU. Specifically, the proposed MSLT
networks first decompose an input image into high- and low-frequency layers using
a Laplacian pyramid, and then sequentially correct the different layers by
pixel-adaptive linear transformation, which is implemented by efficient
bilateral grid learning or 1x1 convolutions. Experiments on two benchmark
datasets demonstrate the efficiency of our MSLTs against state-of-the-art methods
on photo exposure correction. Extensive ablation studies validate the
effectiveness of our contributions. The code is available at
https://github.com/Zhou-Yijie/MSLTNet.
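The two steps named in the abstract (Laplacian pyramid decomposition, then a pixel-adaptive linear transformation of each layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, 2x2 average pooling and nearest-neighbour upsampling stand in for the usual Gaussian filtering, and the per-pixel coefficients are hand-picked here, whereas in MSLT they would be predicted by the network (for example via a learned bilateral grid).

```python
import numpy as np

def downsample(img):
    """Halve height and width by 2x2 average pooling (assumed stand-in for blur + subsample)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(img):
    """Double height and width by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels=3):
    """Return [high_0, ..., high_{levels-1}, low]: residual layers plus the coarsest image."""
    pyramid, current = [], img
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # high-frequency residual
        current = smaller
    pyramid.append(current)  # low-frequency layer
    return pyramid

def reconstruct(pyramid):
    """Invert build_laplacian_pyramid by upsampling and adding the residuals back."""
    current = pyramid[-1]
    for high in reversed(pyramid[:-1]):
        current = upsample(current) + high
    return current

def pixelwise_affine(layer, weights, biases):
    """Apply a separate 3x3 colour matrix and bias at every pixel.

    layer: (H, W, 3), weights: (H, W, 3, 3), biases: (H, W, 3).
    """
    return np.einsum("hwij,hwj->hwi", weights, layer) + biases

# Hypothetical usage: brighten only the low-frequency layer, then rebuild the image.
rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))          # height and width must be divisible by 2**levels
layers = build_laplacian_pyramid(image, levels=3)
low = layers[-1]
gain = np.broadcast_to(1.5 * np.eye(3), low.shape[:2] + (3, 3))  # hand-picked, not learned
layers[-1] = pixelwise_affine(low, gain, np.zeros_like(low))
corrected = reconstruct(layers)
```

Because each residual is defined as the difference between a layer and its down-then-up-sampled copy, reconstruction is exact whatever resampling filter is used. The transformation itself operates on small layers, which is one plausible reading of why such a design stays cheap at 4K resolution.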
Related papers
- An Image is Worth 32 Tokens for Reconstruction and Generation [54.24414696392026]
Transformer-based 1-Dimensional Tokenizer (TiTok) is an innovative approach that tokenizes images into 1D latent sequences.
TiTok achieves competitive performance to state-of-the-art approaches.
Our best-performing variant significantly surpasses DiT-XL/2 (gFID 2.13 vs. 3.04) while generating high-quality samples 74x faster.
arXiv Detail & Related papers (2024-06-11T17:59:56Z) - Reciprocal Attention Mixing Transformer for Lightweight Image Restoration [6.3159191692241095]
We propose a lightweight image restoration network, the Reciprocal Attention Mixing Transformer (RAMiT).
It employs bi-dimensional (spatial and channel) self-attentions in parallel with different numbers of multi-heads.
It achieves state-of-the-art performance on multiple lightweight IR tasks, including super-resolution, color denoising, grayscale denoising, low-light enhancement, and deraining.
arXiv Detail & Related papers (2023-05-19T06:55:04Z) - CoordFill: Efficient High-Resolution Image Inpainting via Parameterized
Coordinate Querying [52.91778151771145]
In this paper, we try to break the limitations for the first time thanks to the recent development of continuous implicit representation.
Experiments show that the proposed method achieves real-time performance on 2048×2048 images using a single GTX 2080 Ti GPU.
arXiv Detail & Related papers (2023-03-15T11:13:51Z) - Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and
Transformer-Based Method [51.30748775681917]
We consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution.
We conduct systematic benchmarking studies and provide a comparison of current LLIE algorithms.
As a second contribution, we introduce LLFormer, a transformer-based low-light enhancement method.
arXiv Detail & Related papers (2022-12-22T09:05:07Z) - μSplit: efficient image decomposition for microscopy data [50.794670705085835]
μSplit is a dedicated approach for trained image decomposition in the context of fluorescence microscopy images.
We introduce lateral contextualization (LC), a novel meta-architecture that enables the memory-efficient incorporation of large image context.
We apply μSplit to five decomposition tasks: one on a synthetic dataset and four derived from real microscopy data.
arXiv Detail & Related papers (2022-11-23T11:26:24Z) - CUF: Continuous Upsampling Filters [25.584630142930123]
In this paper, we consider one of the most important operations in image processing: upsampling.
We propose to parameterize upsampling kernels as neural fields.
This parameterization leads to a compact architecture that obtains a 40-fold reduction in the number of parameters when compared with competing arbitrary-scale super-resolution architectures.
arXiv Detail & Related papers (2022-10-13T12:45:51Z) - Progressively-connected Light Field Network for Efficient View Synthesis [69.29043048775802]
We present a Progressively-connected Light Field network (ProLiF) for the novel view synthesis of complex forward-facing scenes.
ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.
arXiv Detail & Related papers (2022-07-10T13:47:20Z) - Single UHD Image Dehazing via Interpretable Pyramid Network [10.00144096602321]
Currently, most single-image dehazing models cannot process an ultra-high-resolution (UHD) image on a single GPU in real time.
We introduce the principle of infinite approximation of Taylor's theorem with the Laplace pyramid pattern to build a model which is capable of handling 4K images in real-time.
arXiv Detail & Related papers (2022-02-17T11:14:12Z) - Spatial-Separated Curve Rendering Network for Efficient and
High-Resolution Image Harmonization [59.19214040221055]
We propose a novel spatial-separated curve rendering network (S²CRNet) for efficient and high-resolution image harmonization.
The proposed method reduces the number of parameters by more than 90% compared with previous methods.
Our method runs smoothly on higher-resolution images in real time and is more than 10× faster than existing methods.
arXiv Detail & Related papers (2021-09-13T07:20:16Z) - High-Resolution Photorealistic Image Translation in Real-Time: A
Laplacian Pyramid Translation Network [23.981019687483506]
We focus on speeding up high-resolution photorealistic I2IT tasks based on closed-form Laplacian pyramid decomposition and reconstruction.
We propose a Laplacian Pyramid Translation Network (LPTN) to simultaneously perform these two tasks.
Our model avoids most of the heavy computation consumed by processing high-resolution feature maps and faithfully preserves the image details.
arXiv Detail & Related papers (2021-05-19T15:05:22Z)