Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time
- URL: http://arxiv.org/abs/2009.14468v1
- Date: Wed, 30 Sep 2020 06:34:57 GMT
- Title: Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time
- Authors: Hui Zeng, Jianrui Cai, Lida Li, Zisheng Cao, Lei Zhang
- Abstract summary: In this paper, we learn image-adaptive 3-dimensional lookup tables (3D LUTs) to achieve fast and robust photo enhancement.
We learn 3D LUTs from annotated data using pairwise or unpaired learning.
We learn multiple basis 3D LUTs and a small convolutional neural network (CNN) simultaneously in an end-to-end manner.
- Score: 33.93249921871407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have witnessed the increasing popularity of learning based
methods to enhance the color and tone of photos. However, many existing photo
enhancement methods either deliver unsatisfactory results or consume too much
computational and memory resources, hindering their application to
high-resolution images (usually with more than 12 megapixels) in practice. In
this paper, we learn image-adaptive 3-dimensional lookup tables (3D LUTs) to
achieve fast and robust photo enhancement. 3D LUTs are widely used for
manipulating the color and tone of photos, but they are usually manually tuned and
fixed in the camera imaging pipeline or photo editing tools. To the best of our
knowledge, we are the first to propose learning 3D LUTs from annotated data using
pairwise or unpaired learning. More importantly, our learned 3D LUT is
image-adaptive for flexible photo enhancement. We learn multiple basis 3D LUTs
and a small convolutional neural network (CNN) simultaneously in an end-to-end
manner. The small CNN works on the down-sampled version of the input image to
predict content-dependent weights to fuse the multiple basis 3D LUTs into an
image-adaptive one, which is employed to transform the color and tone of source
images efficiently. Our model contains less than 600K parameters and takes less
than 2 ms to process an image of 4K resolution using one Titan RTX GPU. While
being highly efficient, our model also outperforms the state-of-the-art photo
enhancement methods by a large margin in terms of PSNR, SSIM and a color
difference metric on two publicly available benchmark datasets.
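The architecture described in the abstract is compact enough to sketch. The following PyTorch snippet is a minimal illustration, not the authors' released code: the sizes (NUM_LUTS = 3 basis LUTs on a 33-point grid) and the toy weight-predictor CNN are assumptions, and training losses and regularizers are omitted.

```python
# Minimal sketch of image-adaptive 3D LUT enhancement (illustrative sizes,
# not the authors' released implementation). A small CNN looks at a
# downsampled copy of the image, predicts one weight per basis LUT, the
# weighted sum gives an image-adaptive LUT, and a trilinear lookup applies it.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_LUTS, LUT_DIM = 3, 33  # assumed: 3 basis LUTs on a 33x33x33 RGB grid

class ImageAdaptiveLUT(nn.Module):
    def __init__(self):
        super().__init__()
        # Learnable basis LUTs: (num_luts, 3 out channels, D, H, W) over the RGB cube.
        self.luts = nn.Parameter(0.01 * torch.randn(NUM_LUTS, 3, LUT_DIM, LUT_DIM, LUT_DIM))
        # Toy weight predictor; the real backbone is also small (<600K params total).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, NUM_LUTS),
        )

    def forward(self, img):  # img: (B, 3, H, W), values in [0, 1]
        thumb = F.interpolate(img, size=(256, 256), mode='bilinear', align_corners=False)
        weights = self.cnn(thumb)                                # (B, NUM_LUTS)
        # Fuse the basis LUTs into one image-adaptive LUT per image.
        lut = torch.einsum('bn,ncdhw->bcdhw', weights, self.luts)
        # Trilinear lookup: RGB values act as 3D sampling coordinates in [-1, 1].
        # (Channel-to-axis ordering is a detail a real implementation pins down.)
        grid = img.permute(0, 2, 3, 1).unsqueeze(1) * 2.0 - 1.0  # (B, 1, H, W, 3)
        out = F.grid_sample(lut, grid, mode='bilinear', align_corners=True)
        return out.squeeze(2)                                    # (B, 3, H, W)
```

Per image, the full-resolution work is a single trilinear lookup; the CNN only ever sees a thumbnail. That division of labor is what makes the reported sub-2 ms 4K runtime plausible.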
Related papers
- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation scheme that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z)
- Splatter Image: Ultra-Fast Single-View 3D Reconstruction [67.96212093828179]
Splatter Image is based on Gaussian Splatting, which allows fast and high-quality reconstruction of 3D scenes from multiple images.
We learn a neural network that, at test time, performs reconstruction in a feed-forward manner, at 38 FPS.
On several synthetic, real, multi-category and large-scale benchmark datasets, we achieve better results in terms of PSNR, LPIPS, and other metrics while training and evaluating much faster than prior works.
arXiv Detail & Related papers (2023-12-20T16:14:58Z)
- NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement [82.75363196702381]
3D lookup tables (3D LUTs) are a key component for image enhancement.
Current approaches for learning and applying 3D LUTs are notably fast, yet not so memory-efficient.
We propose a Neural Implicit LUT (NILUT), an implicitly defined continuous 3D color transformation parameterized by a neural network.
arXiv Detail & Related papers (2023-06-20T22:06:39Z)
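A minimal sketch of the NILUT idea, under assumptions: a plain ReLU MLP with a residual connection stands in for the paper's architecture, and the conditional (multi-style) part is omitted. The point is that the color transform is a continuous function of RGB rather than a discrete grid.

```python
# Minimal sketch of an implicit neural LUT (assumed architecture): an MLP
# maps each input RGB value directly to an output RGB value, so the "table"
# is continuous and needs no dense 3D grid in memory.
import torch
import torch.nn as nn

class ImplicitLUT(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, img):  # img: (B, 3, H, W), values in [0, 1]
        b, _, h, w = img.shape
        rgb = img.permute(0, 2, 3, 1).reshape(-1, 3)  # one RGB triplet per pixel
        out = rgb + self.mlp(rgb)                     # residual keeps it near identity
        return out.reshape(b, h, w, 3).permute(0, 3, 1, 2)
```

Such a network can be fitted to reproduce an existing 3D LUT and then evaluated at arbitrary color precision, which is where the memory advantage over dense grids comes from.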
- Generative Multiplane Neural Radiance for 3D-Aware Image Generation [102.15322193381617]
We present a method to efficiently generate 3D-aware high-resolution images that are view-consistent across multiple target views.
Our GMNR model generates 3D-aware images of 1024 X 1024 pixels with 17.6 FPS on a single V100.
arXiv Detail & Related papers (2023-04-03T17:41:20Z)
- 4D LUT: Learnable Context-Aware 4D Lookup Table for Image Enhancement [50.49396123016185]
We propose a novel learnable context-aware 4-dimensional lookup table (4D LUT).
It achieves content-dependent enhancement of different regions in each image by adaptively learning the photo context.
Compared with a traditional 3D LUT, i.e., an RGB-to-RGB mapping, 4D LUT enables finer control of color transformations for pixels with different content in each image.
arXiv Detail & Related papers (2022-09-05T04:00:57Z)
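The extra dimension can be sketched as quadrilinear interpolation: two trilinear RGB lookups in adjacent "context" slices, blended linearly. In the sketch below the per-pixel context map ctx is a hypothetical input (the actual method learns its context from the image); shapes and slice count are assumptions.

```python
# Minimal sketch of a context-aware 4D lookup (assumed shapes; the context
# map would come from a learned encoder in the actual method). Each pixel is
# looked up trilinearly in the two context slices around its context value,
# and the two results are blended linearly -- quadrilinear interpolation.
import torch
import torch.nn.functional as F

def apply_4d_lut(lut, img, ctx):
    # lut: (K, 3, D, D, D), K >= 2 context slices, each a 3D RGB->RGB LUT
    # img: (B, 3, H, W) and ctx: (B, 1, H, W), both with values in [0, 1]
    k, b = lut.shape[0], img.shape[0]
    grid = img.permute(0, 2, 3, 1).unsqueeze(1) * 2.0 - 1.0      # (B, 1, H, W, 3)
    # Trilinear RGB lookup in every context slice.
    outs = torch.stack([
        F.grid_sample(lut[i].unsqueeze(0).expand(b, -1, -1, -1, -1),
                      grid, mode='bilinear', align_corners=True).squeeze(2)
        for i in range(k)])                                       # (K, B, 3, H, W)
    # Linear interpolation along the context axis (the fourth dimension).
    pos = (ctx * (k - 1)).clamp(0, k - 1 - 1e-4)                  # (B, 1, H, W)
    frac = pos - pos.floor()                                      # blend factor
    lo = pos.floor().long().expand(-1, 3, -1, -1)                 # lower slice index
    low = outs.gather(0, lo.unsqueeze(0)).squeeze(0)              # (B, 3, H, W)
    high = outs.gather(0, (lo + 1).unsqueeze(0)).squeeze(0)
    return low * (1 - frac) + high * frac
```

This is the "finer control" the summary mentions: two pixels with identical RGB but different context values land in different slices and can receive different colors.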
- SepLUT: Separable Image-adaptive Lookup Tables for Real-time Image Enhancement [21.963622337032344]
We present SepLUT (separable image-adaptive lookup table) to tackle the limitations of existing LUT-based schemes.
Specifically, we separate a single color transform into a cascade of component-independent and component-correlated sub-transforms instantiated as 1D and 3D LUTs.
In this way, the capabilities of two sub-transforms can facilitate each other, where the 3D LUT complements the ability to mix up color components.
arXiv Detail & Related papers (2022-07-18T02:27:19Z)
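The cascade itself is simple to sketch. Below, identity-initialized 1D curves handle the component-independent stage and a small identity-initialized 3D LUT handles the component-correlated one; in SepLUT proper both sub-transforms are predicted per image by a backbone, which this static-parameter sketch leaves out.

```python
# Minimal sketch of a 1D->3D LUT cascade (assumed sizes; SepLUT predicts the
# tables per image rather than learning them as fixed parameters).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SepLUTSketch(nn.Module):
    def __init__(self, dim1d=256, dim3d=9):
        super().__init__()
        # Component-independent stage: one identity-initialized curve per channel.
        self.lut1d = nn.Parameter(torch.linspace(0, 1, dim1d).repeat(3, 1))
        # Component-correlated stage: small identity-initialized 3D LUT.
        r = torch.linspace(0, 1, dim3d)
        zz, yy, xx = torch.meshgrid(r, r, r, indexing='ij')
        self.lut3d = nn.Parameter(torch.stack([xx, yy, zz]))     # (3, D, D, D)

    def _apply_curves(self, img):
        # Per-channel linear interpolation into the 1D curves.
        b, c, h, w = img.shape
        n = self.lut1d.shape[1]
        pos = img.clamp(0, 1) * (n - 1)
        lo = pos.floor().long().clamp(max=n - 2)
        frac = pos - lo.float()
        table = self.lut1d.unsqueeze(0).expand(b, -1, -1)        # (B, 3, n)
        low = table.gather(2, lo.reshape(b, c, -1)).reshape(b, c, h, w)
        high = table.gather(2, (lo + 1).reshape(b, c, -1)).reshape(b, c, h, w)
        return low * (1 - frac) + high * frac

    def forward(self, img):  # img: (B, 3, H, W), values in [0, 1]
        x = self._apply_curves(img).clamp(0, 1)  # independent tone curves first
        grid = x.permute(0, 2, 3, 1).unsqueeze(1) * 2.0 - 1.0
        out = F.grid_sample(self.lut3d.unsqueeze(0).expand(img.shape[0], -1, -1, -1, -1),
                            grid, mode='bilinear', align_corners=True)
        return out.squeeze(2)                    # cross-channel color mixing last
```

The 1D stage buys high per-channel resolution cheaply, so the 3D stage can stay coarse and focus on mixing color components, which is the mutual facilitation the summary describes.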
- Data Efficient 3D Learner via Knowledge Transferred from 2D Model [30.077342050473515]
We deal with the data scarcity challenge of 3D tasks by transferring knowledge from strong 2D models via RGB-D images.
We utilize a strong, well-trained semantic segmentation model for 2D images to augment RGB-D images with pseudo-labels.
Our method already outperforms existing state-of-the-art approaches tailored for 3D label efficiency.
arXiv Detail & Related papers (2022-03-16T09:14:44Z)
- Real-time Image Enhancer via Learnable Spatial-aware 3D Lookup Tables [12.4260963890153]
We propose a novel real-time image enhancer via learnable spatial-aware 3-dimensional lookup tables (LUTs).
We learn the spatial-aware 3D LUTs and fuse them according to predicted weight maps in an end-to-end manner.
Our model outperforms SOTA image enhancement methods on public datasets both subjectively and objectively.
arXiv Detail & Related papers (2021-08-19T14:04:59Z)
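Relative to the image-level fusion sketched after the abstract above, the change here is that the network outputs a weight map, so every pixel gets its own blend of the basis LUTs. A minimal sketch under assumed shapes and a toy weight head (the paper's weight generator and exact fusion differ):

```python
# Minimal sketch of spatially-aware LUT fusion (assumed sizes). The weights
# are per-pixel maps; since trilinear lookup is linear in the LUT entries,
# blending the per-LUT outputs pixel-wise is equivalent to applying a
# per-pixel fused LUT.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_LUTS, LUT_DIM = 3, 17  # assumed sizes

class SpatialAwareLUT(nn.Module):
    def __init__(self):
        super().__init__()
        self.luts = nn.Parameter(0.01 * torch.randn(NUM_LUTS, 3, LUT_DIM, LUT_DIM, LUT_DIM))
        self.weight_head = nn.Sequential(        # emits a weight MAP, not a vector
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, NUM_LUTS, 3, padding=1), nn.Softmax(dim=1),
        )

    def forward(self, img):  # img: (B, 3, H, W), values in [0, 1]
        w = self.weight_head(img)                                # (B, NUM_LUTS, H, W)
        grid = img.permute(0, 2, 3, 1).unsqueeze(1) * 2.0 - 1.0
        outs = torch.stack([
            F.grid_sample(self.luts[i].unsqueeze(0).expand(img.shape[0], -1, -1, -1, -1),
                          grid, mode='bilinear', align_corners=True).squeeze(2)
            for i in range(NUM_LUTS)], dim=1)                    # (B, NUM_LUTS, 3, H, W)
        return (outs * w.unsqueeze(2)).sum(dim=1)                # per-pixel fusion
```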
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.