UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement
- URL: http://arxiv.org/abs/2404.14542v2
- Date: Sat, 27 Apr 2024 09:55:51 GMT
- Title: UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement
- Authors: Yaofeng Xie, Lingwei Kong, Kai Chen, Ziqiang Zheng, Xiao Yu, Zhibin Yu, Bing Zheng
- Abstract summary: We construct the first large-scale high-resolution underwater video enhancement benchmark (UVEB)
It contains 1,308 pairs of video sequences and more than 453,000 high-resolution frame pairs, 38% of which are Ultra-High-Definition (UHD) 4K.
We also propose the first supervised underwater video enhancement method, UVE-Net. UVE-Net converts the current frame information into convolutional kernels and passes them to adjacent frames for efficient inter-frame information exchange.
- Score: 17.98919244870724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning-based underwater image enhancement (UIE) methods have made great progress. However, the lack of large-scale, high-quality paired training samples has become the main bottleneck hindering the development of UIE. The inter-frame information in underwater videos can accelerate or optimize the UIE process. Thus, we constructed the first large-scale high-resolution underwater video enhancement benchmark (UVEB) to promote the development of underwater vision. It contains 1,308 pairs of video sequences and more than 453,000 high-resolution frame pairs, 38\% of which are Ultra-High-Definition (UHD) 4K. UVEB comes from multiple countries and contains various scenes and video degradation types, adapting to diverse and complex underwater environments. We also propose the first supervised underwater video enhancement method, UVE-Net. UVE-Net converts the current frame's information into convolutional kernels and passes them to adjacent frames for efficient inter-frame information exchange. By fully utilizing the redundant degraded information of underwater videos, UVE-Net achieves better video enhancement. Experiments demonstrate the effectiveness of the network design and the strong performance of UVE-Net.
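The kernel-passing idea behind UVE-Net can be illustrated with a toy sketch. This is only an assumption-laden stand-in: in the paper the kernel-generation branch is a learned network, whereas here the current frame is simply pooled into a small normalized filter that is then convolved over an adjacent frame, mimicking the inter-frame information exchange.

```python
import numpy as np

def kernel_from_frame(frame, ksize=3):
    # Toy stand-in for UVE-Net's learned kernel-generation branch:
    # block-pool the current frame down to a ksize x ksize kernel,
    # then normalize it to sum to 1.
    h, w = frame.shape
    cropped = frame[:(h // ksize) * ksize, :(w // ksize) * ksize]
    kernel = cropped.reshape(ksize, h // ksize, ksize, w // ksize).mean(axis=(1, 3))
    return kernel / (kernel.sum() + 1e-8)

def apply_kernel(frame, kernel):
    # Valid 2-D convolution of a neighbouring frame with the
    # kernel derived from the current frame (naive sliding window).
    k = kernel.shape[0]
    h, w = frame.shape
    out = np.zeros((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (frame[i:i + k, j:j + k] * kernel).sum()
    return out

# The current frame's content becomes a filter applied to its neighbour.
rng = np.random.default_rng(0)
cur = rng.random((16, 16))
nxt = rng.random((16, 16))
k = kernel_from_frame(cur)
enhanced_neighbour = apply_kernel(nxt, k)
```

In the actual method this exchange is bidirectional and operates on feature maps rather than raw pixels; the sketch only shows the structural idea of frames parameterizing convolutions for their neighbours.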
Related papers
- Enhance-A-Video: Better Generated Video for Free [57.620595159855064]
We introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos.
Our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning.
arXiv Detail & Related papers (2025-02-11T12:22:35Z) - UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors [9.438388237767105]
We propose a two-stage framework for enhancing underwater videos.
The first stage uses a denoising diffusion probabilistic model to learn a generative prior from unlabeled data.
In the second stage, this prior is incorporated into a physics-based image formation model for spatial enhancement.
Our method enables real-time and computationally efficient processing of high-resolution underwater videos at lower resolutions.
arXiv Detail & Related papers (2024-11-08T11:16:36Z) - Enhancing Underwater Imaging with 4-D Light Fields: Dataset and Method [77.80712860663886]
4-D light fields (LFs) can enhance underwater imaging, which is plagued by light absorption, scattering, and other challenges.
We propose a progressive framework for underwater 4-D LF image enhancement and depth estimation.
We construct the first 4-D LF-based underwater image dataset for quantitative evaluation and supervised training of learning-based methods.
arXiv Detail & Related papers (2024-08-30T15:06:45Z) - LU2Net: A Lightweight Network for Real-time Underwater Image Enhancement [4.353142366661057]
Lightweight Underwater Unet (LU2Net) is a novel U-shape network designed specifically for real-time enhancement of underwater images.
LU2Net is capable of providing well-enhanced underwater images at a speed 8 times faster than the current state-of-the-art underwater image enhancement method.
arXiv Detail & Related papers (2024-06-21T08:33:13Z) - Hierarchical Patch Diffusion Models for High-Resolution Video Generation [50.42746357450949]
We develop deep context fusion, which propagates context information from low-scale to high-scale patches in a hierarchical manner.
We also propose adaptive computation, which allocates more network capacity and computation towards coarse image details.
The resulting model sets a new state-of-the-art FVD score of 66.32 and Inception Score of 87.68 in class-conditional video generation.
arXiv Detail & Related papers (2024-06-12T01:12:53Z) - AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction [88.70116693750452]
Text-guided video prediction (TVP) involves predicting the motion of future frames from the initial frame according to an instruction.
Previous TVP methods make significant breakthroughs by adapting Stable Diffusion for this task.
We introduce the Multi-Modal Large Language Model (MLLM) to predict future video states based on initial frames and text instructions.
arXiv Detail & Related papers (2024-06-10T17:02:08Z) - End-To-End Underwater Video Enhancement: Dataset and Model [6.153714458213646]
Underwater video enhancement (UVE) aims to improve the visibility and frame quality of underwater videos.
Existing methods primarily focus on developing image enhancement algorithms to enhance each frame independently.
This study represents the first comprehensive exploration of UVE to our knowledge.
arXiv Detail & Related papers (2024-03-18T06:24:46Z) - Event-guided Multi-patch Network with Self-supervision for Non-uniform Motion Deblurring [113.96237446327795]
We present a novel self-supervised event-guided deep hierarchical Multi-patch Network to deal with blurry images and videos.
We also propose an event-guided architecture to exploit motion cues contained in videos to tackle complex blur in videos.
Our MPN achieves state-of-the-art results on the GoPro and VideoDeblurring datasets with a 40x faster runtime than current multi-scale methods.
arXiv Detail & Related papers (2023-02-14T15:58:00Z) - Medium Transmission Map Matters for Learning to Restore Real-World Underwater Images [3.0980025155565376]
We introduce the medium transmission map as guidance to assist in image enhancement.
The proposed method achieves 22.6 dB on the challenging Test-R90 while running 30 times faster than existing models.
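The transmission map's role can be illustrated with the classical simplified underwater image formation model, I(x) = J(x) t(x) + B (1 - t(x)), where J is the scene radiance, t the medium transmission, and B the background light. The paper uses the map as learned guidance inside a network; the sketch below only shows the analytic inversion that motivates it (variable names and the minimum-transmission clamp are illustrative assumptions).

```python
import numpy as np

def restore_with_transmission(observed, transmission, background, t_min=0.1):
    # Invert the simplified formation model
    #   I(x) = J(x) * t(x) + B * (1 - t(x))
    # for the scene radiance J, clamping t to avoid division blow-up.
    t = np.clip(transmission, t_min, 1.0)
    restored = (observed - background * (1.0 - t)) / t
    return np.clip(restored, 0.0, 1.0)

# Synthetic round trip: degrade a known scene, then restore it.
scene = np.full((4, 4), 0.8)
t = np.full((4, 4), 0.5)
bg = 0.2
degraded = scene * t + bg * (1 - t)   # 0.8*0.5 + 0.2*0.5 = 0.5
recovered = restore_with_transmission(degraded, t, bg)  # ~0.8 everywhere
```

With a perfectly known transmission map the inversion is exact; in practice t must be estimated, which is why learned guidance helps.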
arXiv Detail & Related papers (2022-03-17T16:13:52Z) - Deep Video Prior for Video Consistency and Propagation [58.250209011891904]
We present a novel and general approach for blind video temporal consistency.
Our method is only trained on a pair of original and processed videos directly instead of a large dataset.
We show that temporal consistency can be achieved by training a convolutional neural network on a video with Deep Video Prior.
arXiv Detail & Related papers (2022-01-27T16:38:52Z) - LAFFNet: A Lightweight Adaptive Feature Fusion Network for Underwater Image Enhancement [6.338178373376447]
We propose a lightweight adaptive feature fusion network (LAFFNet) for underwater image enhancement.
Our method reduces the number of parameters from 2.5M to 0.15M yet outperforms state-of-the-art algorithms in extensive experiments.
arXiv Detail & Related papers (2021-05-04T05:31:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.