FastLLVE: Real-Time Low-Light Video Enhancement with Intensity-Aware
Lookup Table
- URL: http://arxiv.org/abs/2308.06749v1
- Date: Sun, 13 Aug 2023 11:54:14 GMT
- Title: FastLLVE: Real-Time Low-Light Video Enhancement with Intensity-Aware
Lookup Table
- Authors: Wenhao Li, Guangyang Wu, Wenyi Wang, Peiran Ren and Xiaohong Liu
- Abstract summary: We propose an efficient pipeline named FastLLVE to maintain inter-frame brightness consistency effectively.
FastLLVE can process 1,080p videos at $\mathit{50+}$ Frames Per Second (FPS), which is $\mathit{2\times}$ faster than CNN-based methods in inference time.
- Score: 21.77469059123589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-Light Video Enhancement (LLVE) has received considerable attention in
recent years. One of the critical requirements of LLVE is inter-frame
brightness consistency, which is essential for maintaining the temporal
coherence of the enhanced video. However, most existing single-image-based
methods fail to address this issue, resulting in a flickering effect that
degrades the overall quality after enhancement. Moreover, 3D Convolutional
Neural Network (CNN)-based methods, which are designed for videos to maintain
inter-frame consistency, are computationally expensive, making them impractical
for real-time applications. To address these issues, we propose an efficient
pipeline named FastLLVE that leverages the Look-Up-Table (LUT) technique to
maintain inter-frame brightness consistency effectively. Specifically, we
design a learnable Intensity-Aware LUT (IA-LUT) module for adaptive
enhancement, which addresses the low-dynamic problem in low-light scenarios.
This enables FastLLVE to perform low-latency and low-complexity enhancement
operations while maintaining high-quality results. Experimental results on
benchmark datasets demonstrate that our method achieves the State-Of-The-Art
(SOTA) performance in terms of both image quality and inter-frame brightness
consistency. More importantly, our FastLLVE can process 1,080p videos at
$\mathit{50+}$ Frames Per Second (FPS), which is $\mathit{2 \times}$ faster
than SOTA CNN-based methods in inference time, making it a promising solution
for real-time applications. The code is available at
https://github.com/Wenhao-Li-777/FastLLVE.
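For readers unfamiliar with LUT-based enhancement, the sketch below illustrates the general idea behind an intensity-aware lookup: each pixel's (R, G, B) value, together with a per-pixel intensity value, indexes a learned 4D table that stores the enhanced color. This is a minimal illustration only, not the authors' implementation; the function name apply_ia_lut, the LUT shape (3, S, S, S, S), and the nearest-bin lookup (FastLLVE uses a learnable IA-LUT with interpolated lookups inside a trained pipeline) are assumptions made for demonstration.

```python
# Minimal, illustrative 4D intensity-aware LUT lookup (NOT the authors'
# implementation). Assumptions: a LUT of shape (3, S, S, S, S) indexed by
# (R, G, B, intensity) bins, and nearest-bin lookup instead of the
# learnable, interpolated lookup used by IA-LUT in the paper.
import numpy as np

def apply_ia_lut(frame, intensity, lut):
    """Map a low-light frame to an enhanced frame via a 4D LUT.

    frame:     (H, W, 3) float32 in [0, 1]
    intensity: (H, W)    float32 in [0, 1], e.g. a per-pixel intensity map
    lut:       (3, S, S, S, S) float32, enhanced RGB stored per 4D bin
    """
    s = lut.shape[1]
    # Quantize each input dimension to its nearest LUT bin index.
    rgb_idx = np.clip(np.rint(frame * (s - 1)).astype(int), 0, s - 1)
    i_idx = np.clip(np.rint(intensity * (s - 1)).astype(int), 0, s - 1)
    r, g, b = rgb_idx[..., 0], rgb_idx[..., 1], rgb_idx[..., 2]
    # Gather the enhanced RGB value stored in each (R, G, B, I) cell.
    out = lut[:, r, g, b, i_idx]          # shape (3, H, W)
    return np.transpose(out, (1, 2, 0))   # shape (H, W, 3)

# Usage example with an identity-like LUT and a random 1080p frame.
S = 17
grid = np.linspace(0.0, 1.0, S, dtype=np.float32)
identity_lut = np.stack(np.meshgrid(grid, grid, grid, grid, indexing="ij"))[:3]
frame = np.random.rand(1080, 1920, 3).astype(np.float32)
intensity = frame.mean(axis=-1)           # stand-in for a learned intensity map
enhanced = apply_ia_lut(frame, intensity, identity_lut)
```

Because enhancement reduces to per-pixel table lookups (plus interpolation in the full method), the cost per frame does not grow with network depth, which is the property that makes LUT-based pipelines attractive for real-time 1080p processing.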
Related papers
- SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity [15.872209884833977]
We propose a memory-efficient scheduling method to eliminate memory overhead and an online adjustment mechanism to minimize accuracy degradation.
SparseTem achieves a speedup of 1.79x for EfficientDet and 4.72x for CRNN, with minimal accuracy drop and no additional memory overhead.
arXiv Detail & Related papers (2024-10-28T07:13:25Z)
- Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design [18.57172631588624]
We propose a Dynamic Deep neural network assisted by a Content-Aware data processing pipeline to reduce the model number down to one.
Our method achieves better PSNR and real-time performance (33 FPS) on an off-the-shelf mobile phone.
arXiv Detail & Related papers (2024-07-03T05:17:26Z)
- Binarized Low-light Raw Video Enhancement [49.65466843856074]
Deep neural networks have achieved excellent performance on low-light raw video enhancement.
In this paper, we explore the feasibility of applying the extremely compact binary neural network (BNN) to low-light raw video enhancement.
arXiv Detail & Related papers (2024-03-29T02:55:07Z)
- Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior arts, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds their encoding quality ($34.07 \rightarrow 34.57$, measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z)
- Deep Parametric 3D Filters for Joint Video Denoising and Illumination Enhancement in Video Super Resolution [96.89588203312451]
This paper presents a new parametric representation called Deep Parametric 3D Filters (DP3DF)
DP3DF incorporates local information to enable simultaneous denoising, illumination enhancement, and SR efficiently in a single encoder-and-decoder network.
Also, a dynamic residual frame is jointly learned with the DP3DF via a shared backbone to further boost the SR quality.
arXiv Detail & Related papers (2022-07-05T03:57:25Z)
- Investigating Tradeoffs in Real-World Video Super-Resolution [90.81396836308085]
Real-world video super-resolution (VSR) models are often trained with diverse degradations to improve generalizability.
To alleviate the first tradeoff, we propose a degradation scheme that reduces up to 40% of training time without sacrificing performance.
To facilitate fair comparisons, we propose the new VideoLQ dataset, which contains a large variety of real-world low-quality video sequences.
arXiv Detail & Related papers (2021-11-24T18:58:21Z)
- Efficient Spatio-Temporal Recurrent Neural Network for Video Deblurring [39.63844562890704]
Real-time deblurring still remains a challenging task due to the complexity of spatially and temporally varying blur itself.
We adopt residual dense blocks into RNN cells, so as to efficiently extract the spatial features of the current frame.
We contribute a novel dataset (BSD) to the community, by collecting paired blurry/sharp video clips using a co-axis beam splitter acquisition system.
arXiv Detail & Related papers (2021-06-30T12:53:02Z)
- Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization [96.73647162960842]
TAL is a fundamental yet challenging task in video understanding.
Existing TAL methods rely on pre-training a video encoder through action classification supervision.
We introduce a novel low-fidelity end-to-end (LoFi) video encoder pre-training method.
arXiv Detail & Related papers (2021-03-28T22:18:14Z)
- Deep Space-Time Video Upsampling Networks [47.62807427163614]
Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems.
We propose an end-to-end framework for the space-time video upsampling by efficiently merging VSR and FI into a joint framework.
Our framework shows better results both quantitatively and qualitatively, while reducing the runtime (7x faster) and the number of parameters (by 30%) compared to baselines.
arXiv Detail & Related papers (2020-04-06T07:04:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.