End-To-End Underwater Video Enhancement: Dataset and Model
- URL: http://arxiv.org/abs/2403.11506v1
- Date: Mon, 18 Mar 2024 06:24:46 GMT
- Title: End-To-End Underwater Video Enhancement: Dataset and Model
- Authors: Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Jianwei Niu
- Abstract summary: Underwater video enhancement (UVE) aims to improve the visibility and frame quality of underwater videos.
Existing methods primarily focus on developing image enhancement algorithms to enhance each frame independently.
This study represents the first comprehensive exploration of UVE to our knowledge.
- Score: 6.153714458213646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Underwater video enhancement (UVE) aims to improve the visibility and frame quality of underwater videos, which has significant implications for marine research and exploration. However, existing methods primarily focus on developing image enhancement algorithms to enhance each frame independently. There is a lack of supervised datasets and models specifically tailored for UVE tasks. To fill this gap, we construct the Synthetic Underwater Video Enhancement (SUVE) dataset, comprising 840 diverse underwater-style videos paired with ground-truth reference videos. Based on this dataset, we train a novel underwater video enhancement model, UVENet, which utilizes inter-frame relationships to achieve better enhancement performance. Through extensive experiments on both synthetic and real underwater videos, we demonstrate the effectiveness of our approach. This study represents the first comprehensive exploration of UVE to our knowledge. The code is available at https://anonymous.4open.science/r/UVENet.
Related papers
- Enhance-A-Video: Better Generated Video for Free [57.620595159855064]
We introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos.
Our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning.
arXiv Detail & Related papers (2025-02-11T12:22:35Z)
- VideoWorld: Exploring Knowledge Learning from Unlabeled Videos [119.35107657321902]
This work explores whether a deep generative model can learn complex knowledge solely from visual input.
We develop VideoWorld, an auto-regressive video generation model trained on unlabeled video data, and test its knowledge acquisition abilities in video-based Go and robotic control tasks.
arXiv Detail & Related papers (2025-01-16T18:59:10Z)
- HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning [62.264673293638175]
Existing underwater image enhancement methods primarily focus on improving visual quality while overlooking practical implications.
We propose an invertible network for underwater perception enhancement, dubbed HUPE, which enhances visual quality and demonstrates flexibility in handling other downstream tasks.
arXiv Detail & Related papers (2024-11-27T12:37:03Z)
- UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors [9.438388237767105]
We propose a two-stage framework for enhancing underwater videos.
The first stage uses a denoising diffusion probabilistic model to learn a generative prior from unlabeled data.
In the second stage, this prior is incorporated into a physics-based image formulation for spatial enhancement.
Our method enables real-time and computationally-efficient processing of high-resolution underwater videos at lower resolutions.
arXiv Detail & Related papers (2024-11-08T11:16:36Z)
- Video Instruction Tuning With Synthetic Data [84.64519990333406]
We create a high-quality synthetic dataset specifically for video instruction-following, namely LLaVA-Video-178K.
This dataset includes key tasks such as detailed captioning, open-ended question-answering (QA), and multiple-choice QA.
By training on this dataset, in combination with existing visual instruction tuning data, we introduce LLaVA-Video, a new video LMM.
arXiv Detail & Related papers (2024-10-03T17:36:49Z)
- What Matters in Detecting AI-Generated Videos like Sora? [51.05034165599385]
The gap between synthetic and real-world videos remains under-explored.
In this study, we compare real-world videos with those generated by a state-of-the-art AI model, Stable Video Diffusion.
Our model is capable of detecting videos generated by Sora with high accuracy, even without exposure to any Sora videos during training.
arXiv Detail & Related papers (2024-06-27T23:03:58Z)
- UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement [17.98919244870724]
We construct the first large-scale high-resolution underwater video enhancement benchmark (UVEB).
It contains 1,308 pairs of video sequences and more than 453,000 high-resolution frame pairs, 38% of which are Ultra-High-Definition (UHD) 4K.
We also propose the first supervised underwater video enhancement method, UVE-Net. UVE-Net converts the current frame information into convolutional kernels and passes them to adjacent frames for efficient inter-frame information exchange.
arXiv Detail & Related papers (2024-04-22T19:29:12Z)
- Separated Attention: An Improved Cycle GAN Based Under Water Image Enhancement Method [0.0]
We use the cycle-consistent learning technique of the state-of-the-art Cycle GAN model with a modified loss function.
We trained the Cycle GAN model with the modified loss functions on the benchmarked Enhancing Underwater Visual Perception dataset.
The enhanced images yield better results than conventional models and further benefit underwater navigation, pose estimation, saliency prediction, object detection, and tracking.
arXiv Detail & Related papers (2024-04-11T11:12:06Z)
- InternVideo: General Video Foundation Models via Generative and Discriminative Learning [52.69422763715118]
We present general video foundation models, InternVideo, for dynamic and complex video-level understanding tasks.
InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives.
InternVideo achieves state-of-the-art performance on 39 video datasets from extensive tasks including video action recognition/detection, video-language alignment, and open-world video applications.
arXiv Detail & Related papers (2022-12-06T18:09:49Z)
- Perceptual underwater image enhancement with deep learning and physical priors [35.37760003463292]
We propose two perceptual enhancement models, each of which uses a deep enhancement model with a detection perceptor.
Due to the lack of training data, a hybrid underwater image synthesis model, which fuses physical priors and data-driven cues, is proposed to synthesize training data.
Experimental results show the superiority of our proposed method over several state-of-the-art methods on both real-world and synthetic underwater datasets.
arXiv Detail & Related papers (2020-08-21T22:11:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.