Real-time Image Enhancer via Learnable Spatial-aware 3D Lookup Tables
- URL: http://arxiv.org/abs/2108.08697v1
- Date: Thu, 19 Aug 2021 14:04:59 GMT
- Title: Real-time Image Enhancer via Learnable Spatial-aware 3D Lookup Tables
- Authors: Tao Wang, Yong Li, Jingyang Peng, Yipeng Ma, Xian Wang, Fenglong Song,
Youliang Yan
- Abstract summary: We propose a novel real-time image enhancer via learnable spatial-aware 3-dimensional lookup tables (LUTs).
We learn the spatial-aware 3D LUTs and fuse them according to the aforementioned weights in an end-to-end manner.
Our model outperforms SOTA image enhancement methods on public datasets both subjectively and objectively.
- Score: 12.4260963890153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, deep learning-based image enhancement algorithms have
achieved state-of-the-art (SOTA) performance on several publicly available
datasets. However, most existing methods fail to meet practical requirements
for either visual perception or computational efficiency, especially for
high-resolution images. In this paper, we propose a novel real-time image
enhancer via learnable spatial-aware 3-dimensional lookup tables (3D LUTs),
which takes both the global scenario and local spatial information into
account. Specifically, we introduce a lightweight two-head weight predictor:
one head outputs a 1D weight vector for image-level scenario adaptation, while
the other outputs a 3D weight map for pixel-wise category fusion. We learn the
spatial-aware 3D LUTs and fuse them according to these weights in an
end-to-end manner. The fused LUT is then used to efficiently transform the
source image into the target tone. Extensive experiments show that our model
outperforms SOTA image enhancement methods on public datasets both
subjectively and objectively, and that it takes only about 4 ms to process a
4K-resolution image on a single NVIDIA V100 GPU.
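The pipeline the abstract describes (K basis 3D LUTs, a 1D scenario weight vector, and a pixel-wise weight map) can be sketched in NumPy. This is a minimal illustration under assumed shapes; the names (`enhance`, `w_image`, `w_pixel`) and the nearest-neighbour lookup (the paper uses trilinear interpolation) are simplifications, not the authors' implementation:

```python
import numpy as np

def enhance(image, luts, w_image, w_pixel):
    """Fuse K basis 3D LUTs and transform an image.

    Assumed shapes (illustrative, not the paper's exact formulation):
      image:   (H, W, 3) RGB in [0, 1]
      luts:    (K, D, D, D, 3) -- K learnable basis 3D LUTs of grid size D
      w_image: (K,)      -- 1D weight vector for image-level scenario adaptation
      w_pixel: (K, H, W) -- weight map for pixel-wise category fusion
    """
    D = luts.shape[1]
    # Nearest-neighbour lookup for brevity; the paper uses trilinear interpolation.
    idx = np.clip(np.rint(image * (D - 1)).astype(int), 0, D - 1)
    # Each basis LUT's output at every pixel: (K, H, W, 3).
    per_lut = luts[:, idx[..., 0], idx[..., 1], idx[..., 2]]
    # Combine image-level and pixel-wise weights, then normalize over the K LUTs.
    weights = w_image[:, None, None] * w_pixel            # (K, H, W)
    weights = weights / weights.sum(axis=0, keepdims=True)
    return (weights[..., None] * per_lut).sum(axis=0)     # (H, W, 3)
```

Because the per-pixel work reduces to one lookup and a small weighted sum, the transform stays cheap even at 4K resolution, which is consistent with the ~4 ms runtime the paper reports.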
Related papers
- OpenDlign: Enhancing Open-World 3D Learning with Depth-Aligned Images [17.344430840048094]
We present OpenDlign, a novel open-world 3D model using depth-aligned images for robust multimodal alignment.
OpenDlign achieves high zero-shot and few-shot performance on diverse 3D tasks, despite only fine-tuning 6 million parameters.
arXiv Detail & Related papers (2024-04-25T11:53:36Z)
- Compress3D: a Compressed Latent Space for 3D Generation from a Single Image [27.53099431097921]
Triplane autoencoder encodes 3D models into a compact triplane latent space to compress both the 3D geometry and texture information.
We introduce a 3D-aware cross-attention mechanism, which utilizes low-resolution latent representations to query features from a high-resolution 3D feature volume.
Our approach enables the generation of high-quality 3D assets in merely 7 seconds on a single A100 GPU.
arXiv Detail & Related papers (2024-03-20T11:51:04Z)
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation.
For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, implying its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- Learning Image-adaptive 3D Lookup Tables for High Performance Photo Enhancement in Real-time [33.93249921871407]
In this paper, we learn image-adaptive 3-dimensional lookup tables (3D LUTs) to achieve fast and robust photo enhancement.
We learn 3D LUTs from annotated data using pairwise or unpaired learning.
We learn multiple basis 3D LUTs and a small convolutional neural network (CNN) simultaneously in an end-to-end manner.
arXiv Detail & Related papers (2020-09-30T06:34:57Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2d detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module -- adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
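Both the headline paper and the image-adaptive 3D LUT entry above apply a learned 3D LUT to an image via trilinear interpolation over the colour grid. A minimal sketch of that lookup, assuming a (D, D, D, 3) LUT and an image normalized to [0, 1] (illustrative only, not any paper's actual code):

```python
import numpy as np

def trilinear_lookup(image, lut):
    """Interpolate an (H, W, 3) image through a (D, D, D, 3) LUT grid."""
    D = lut.shape[0]
    pos = image * (D - 1)                              # continuous grid coordinates
    lo = np.clip(np.floor(pos).astype(int), 0, D - 2)  # lower corner of each cell
    frac = pos - lo                                    # fractional offset in the cell
    out = np.zeros(image.shape, dtype=float)
    # Blend the 8 LUT entries surrounding each pixel's colour cell.
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                w = ((frac[..., 0] if dr else 1 - frac[..., 0]) *
                     (frac[..., 1] if dg else 1 - frac[..., 1]) *
                     (frac[..., 2] if db else 1 - frac[..., 2]))
                corner = lut[lo[..., 0] + dr, lo[..., 1] + dg, lo[..., 2] + db]
                out += w[..., None] * corner
    return out
```

An identity LUT (grid entry (i, j, k) holding the colour (i, j, k)/(D-1)) maps any image to itself, which makes a convenient sanity check for an implementation like this.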
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.