UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors
- URL: http://arxiv.org/abs/2411.05886v1
- Date: Fri, 08 Nov 2024 11:16:36 GMT
- Title: UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors
- Authors: Suhas Srinath, Aditya Chandrasekar, Hemang Jamadagni, Rajiv Soundararajan, Prathosh A P
- Abstract summary: We propose a two-stage framework for enhancing underwater videos.
The first stage uses a denoising diffusion probabilistic model to learn a generative prior from unlabeled data.
In the second stage, this prior is incorporated into a physics-based image formulation for spatial enhancement.
Our method enables real-time and computationally efficient processing of high-resolution underwater videos at lower resolutions.
- Abstract: With the rise of marine exploration, underwater imaging has gained significant attention as a research topic. Underwater video enhancement has become crucial for real-time computer vision tasks in marine exploration. However, most existing methods focus on enhancing individual frames and neglect video temporal dynamics, leading to visually poor enhancements. Furthermore, the lack of ground-truth references limits the use of abundant available underwater video data in many applications. To address these issues, we propose a two-stage framework for enhancing underwater videos. The first stage uses a denoising diffusion probabilistic model to learn a generative prior from unlabeled data, capturing robust and descriptive feature representations. In the second stage, this prior is incorporated into a physics-based image formulation for spatial enhancement, while also enforcing temporal consistency between video frames. Our method enables real-time and computationally efficient processing of high-resolution underwater videos at lower resolutions, and offers efficient enhancement in the presence of diverse water types. Extensive experiments on four datasets show that our approach generalizes well and outperforms existing enhancement methods. Our code is available at github.com/suhas-srinath/undive.
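The temporal consistency described in the abstract is commonly realized as a flow-warped photometric loss between neighboring enhanced frames. Below is a minimal PyTorch sketch of that idea; it is an illustration rather than the authors' implementation, and it assumes the optical flow and an occlusion/validity mask are supplied by an off-the-shelf estimator.

```python
import torch
import torch.nn.functional as F

def warp(frame: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp a (B, C, H, W) frame with a (B, 2, H, W) optical flow."""
    _, _, h, w = frame.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=frame.dtype, device=frame.device),
        torch.arange(w, dtype=frame.dtype, device=frame.device),
        indexing="ij",
    )
    # Shift a pixel-coordinate grid by the flow, then normalize to [-1, 1].
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow
    gx = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(frame, torch.stack((gx, gy), dim=-1),
                         align_corners=True)

def temporal_consistency_loss(enh_t, enh_prev, flow_prev_to_t, valid_mask):
    """L1 distance between enhanced frame t and the flow-warped enhanced
    frame t-1, masked to exclude occluded or invalid pixels."""
    warped = warp(enh_prev, flow_prev_to_t)
    return (valid_mask * (enh_t - warped).abs()).mean()
```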
Related papers
- Video Depth Anything: Consistent Depth Estimation for Super-Long Videos [60.857723250653976]
We propose Video Depth Anything for high-quality, consistent depth estimation in super-long videos.
Our model is trained on a joint dataset of video depth and unlabeled images, similar to Depth Anything V2.
Our approach sets a new state-of-the-art in zero-shot video depth estimation.
arXiv Detail & Related papers (2025-01-21T18:53:30Z) - HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning [62.264673293638175]
Existing underwater image enhancement methods primarily focus on improving visual quality while overlooking practical implications.
We propose an invertible network for underwater perception enhancement, dubbed HUPE, which enhances visual quality and demonstrates flexibility in handling other downstream tasks.
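The summary names only an invertible network; such networks are typically built from coupling blocks. A generic additive-coupling sketch (NICE/RealNVP style) follows, purely illustrative and not HUPE's confirmed architecture:

```python
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    """One invertible additive-coupling block: split channels, transform
    one half conditioned on the other. Assumes an even channel count."""
    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.net = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1), nn.ReLU(),
            nn.Conv2d(half, half, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = x.chunk(2, dim=1)
        return torch.cat([a, b + self.net(a)], dim=1)

    def inverse(self, y: torch.Tensor) -> torch.Tensor:
        # Exact inverse: subtract the same conditioned transform.
        a, b = y.chunk(2, dim=1)
        return torch.cat([a, b - self.net(a)], dim=1)
```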
arXiv Detail & Related papers (2024-11-27T12:37:03Z) - Depth Any Video with Scalable Synthetic Data [98.42356740981839]
We develop a scalable synthetic data pipeline, capturing real-time video depth data from diverse synthetic environments.
We leverage the powerful priors of generative video diffusion models to handle real-world videos effectively.
Our model outperforms all previous generative depth models in terms of spatial accuracy and temporal consistency.
arXiv Detail & Related papers (2024-10-14T17:59:46Z) - LU2Net: A Lightweight Network for Real-time Underwater Image Enhancement [4.353142366661057]
Lightweight Underwater Unet (LU2Net) is a novel U-shape network designed specifically for real-time enhancement of underwater images.
LU2Net is capable of providing well-enhanced underwater images at a speed 8 times faster than the current state-of-the-art underwater image enhancement method.
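The abstract does not state how LU2Net achieves its 8x speedup; depthwise-separable convolutions are a common building block behind such lightweight real-time networks and are sketched here only as an assumption, not as LU2Net's confirmed design:

```python
import torch.nn as nn

def dw_separable(cin: int, cout: int, stride: int = 1) -> nn.Sequential:
    """Depthwise-separable convolution: a per-channel spatial conv followed
    by a 1x1 pointwise conv, far cheaper than a dense 3x3 convolution."""
    return nn.Sequential(
        nn.Conv2d(cin, cin, 3, stride=stride, padding=1, groups=cin),
        nn.BatchNorm2d(cin), nn.ReLU(inplace=True),
        nn.Conv2d(cin, cout, 1),
        nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )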
arXiv Detail & Related papers (2024-06-21T08:33:13Z) - WaterMono: Teacher-Guided Anomaly Masking and Enhancement Boosting for Robust Underwater Self-Supervised Monocular Depth Estimation [4.909989222186828]
We propose WaterMono, a novel framework for depth estimation and image enhancement.
It incorporates the following key measures: (1) We present a Teacher-Guided Anomaly Mask to identify dynamic regions within the images; (2) We employ depth information combined with the Underwater Image Formation Model to generate enhanced images, which in turn contribute to the depth estimation task; and (3) We utilize a rotated distillation strategy to enhance the model's rotational robustness.
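Measure (2) uses the standard simplified underwater image formation model, in which transmission decays exponentially with scene depth. A minimal NumPy sketch of that model and its inversion (parameter names are illustrative):

```python
import numpy as np

def form_underwater(clear, depth, beta, backscatter):
    """I = J * t + B * (1 - t), with per-channel transmission
    t = exp(-beta * depth). clear: (H, W, 3), depth: (H, W),
    beta/backscatter: (3,) attenuation coefficients and background light."""
    t = np.exp(-beta[None, None, :] * depth[:, :, None])
    return clear * t + backscatter[None, None, :] * (1.0 - t)

def invert_underwater(observed, depth, beta, backscatter, t_min=1e-3):
    """Recover an enhanced estimate J given depth and water parameters;
    clamp the transmission to avoid amplifying noise at large depths."""
    t = np.exp(-beta[None, None, :] * depth[:, :, None])
    return (observed - backscatter[None, None, :] * (1.0 - t)) / np.maximum(t, t_min)
```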
arXiv Detail & Related papers (2024-06-19T08:49:45Z) - UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement [17.98919244870724]
We construct the first large-scale high-resolution underwater video enhancement benchmark (UVEB)
It contains 1,308 pairs of video sequences and more than 453,000 high-resolution frame pairs, 38% of which are Ultra-High-Definition (UHD) 4K.
We also propose the first supervised underwater video enhancement method, UVE-Net. UVE-Net converts the current frame information into convolutional kernels and passes them to adjacent frames for efficient inter-frame information exchange.
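The kernel-passing idea can be illustrated with dynamic (sample-conditioned) convolution: kernels predicted from the current frame's features are applied to a neighboring frame's features. The sketch below is hypothetical and not the released UVE-Net:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrameConditionedConv(nn.Module):
    """Predict per-sample conv kernels from a reference frame's features
    and apply them to a neighboring frame's features."""
    def __init__(self, channels: int = 16, ksize: int = 3):
        super().__init__()
        self.channels, self.ksize = channels, ksize
        self.kernel_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, channels * channels * ksize * ksize),
        )

    def forward(self, ref_feat: torch.Tensor, nbr_feat: torch.Tensor):
        b, c, h, w = nbr_feat.shape
        k = self.kernel_head(ref_feat).view(b * c, c, self.ksize, self.ksize)
        # Grouped conv applies each sample's predicted kernels to its own frame.
        out = F.conv2d(nbr_feat.reshape(1, b * c, h, w), k,
                       padding=self.ksize // 2, groups=b)
        return out.view(b, c, h, w)
```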
arXiv Detail & Related papers (2024-04-22T19:29:12Z) - End-To-End Underwater Video Enhancement: Dataset and Model [6.153714458213646]
Underwater video enhancement (UVE) aims to improve the visibility and frame quality of underwater videos.
Existing methods primarily focus on developing image enhancement algorithms to enhance each frame independently.
To our knowledge, this study represents the first comprehensive exploration of UVE.
arXiv Detail & Related papers (2024-03-18T06:24:46Z) - PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
Obtaining clear and visually pleasing underwater images is a widespread need, and the task of underwater image enhancement (UIE) has emerged in response.
In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN.
Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
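A dual-discriminator generator objective can be illustrated by averaging non-saturating adversarial losses from two discriminators; the sketch below assumes generic discriminator roles and is not PUGAN's exact formulation:

```python
import torch

def generator_loss_dual(d1, d2, fake: torch.Tensor, eps: float = 1e-6):
    """Non-saturating GAN generator loss averaged over two discriminators,
    each mapping an image batch to real/fake logits."""
    loss = 0.0
    for d in (d1, d2):
        p_real = torch.sigmoid(d(fake))  # probability that `fake` fools d
        loss = loss - torch.log(p_real + eps).mean()
    return loss / 2
```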
arXiv Detail & Related papers (2023-06-15T07:41:12Z) - Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning [88.71867887257274]
We show that spatial augmentations such as cropping also work well for videos, but that previous implementations could not apply them at a scale sufficient to be effective.
To address this issue, we first introduce Feature Crop, a method to simulate such augmentations much more efficiently directly in feature space.
Second, we show that, compared to naive average pooling, transformer-based attention improves performance significantly.
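Feature Crop simulates a spatial crop directly on encoder feature maps, avoiding the cost of cropping pixels and re-encoding. A minimal sketch of that idea (the paper's exact sampling strategy is not reproduced):

```python
import torch

def feature_crop(feats: torch.Tensor, crop_h: int, crop_w: int) -> torch.Tensor:
    """Take a random spatial window from a (B, C, H, W) feature map,
    simulating a pixel-space crop without re-running the encoder.
    Requires crop_h <= H and crop_w <= W."""
    _, _, h, w = feats.shape
    top = int(torch.randint(0, h - crop_h + 1, (1,)))
    left = int(torch.randint(0, w - crop_w + 1, (1,)))
    return feats[:, :, top:top + crop_h, left:left + crop_w]
```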
arXiv Detail & Related papers (2021-03-18T12:32:24Z) - Exploiting Raw Images for Real-Scene Super-Resolution [105.18021110372133]
We study the problem of real-scene single image super-resolution to bridge the gap between synthetic data and real captured images.
We propose a method to generate more realistic training data by mimicking the imaging process of digital cameras.
We also develop a two-branch convolutional neural network to exploit the radiance information originally recorded in raw images.
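Mimicking a digital camera's imaging process typically means degrading linear RGB with signal-dependent noise and a Bayer mosaic to synthesize realistic raw training data. A rough NumPy sketch under those assumptions (the paper's exact pipeline may differ):

```python
import numpy as np

def synthesize_raw(linear_rgb, read_noise=0.005, shot_noise=0.01):
    """Degrade a linear RGB image (H, W, 3) into a noisy Bayer mosaic,
    approximating a camera sensor with heteroscedastic Gaussian noise."""
    sigma = np.sqrt(shot_noise * np.clip(linear_rgb, 0, None) + read_noise ** 2)
    noisy = linear_rgb + sigma * np.random.standard_normal(linear_rgb.shape)
    mosaic = np.empty(noisy.shape[:2])
    mosaic[0::2, 0::2] = noisy[0::2, 0::2, 0]   # R
    mosaic[0::2, 1::2] = noisy[0::2, 1::2, 1]   # G
    mosaic[1::2, 0::2] = noisy[1::2, 0::2, 1]   # G
    mosaic[1::2, 1::2] = noisy[1::2, 1::2, 2]   # B
    return np.clip(mosaic, 0.0, 1.0)
```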
arXiv Detail & Related papers (2021-02-02T16:10:15Z) - Perceptual underwater image enhancement with deep learning and physical priors [35.37760003463292]
We propose two perceptual enhancement models, each of which uses a deep enhancement model with a detection perceptor.
To address the lack of training data, we propose a hybrid underwater image synthesis model that fuses physical priors and data-driven cues to synthesize training data.
Experimental results show the superiority of our proposed method over several state-of-the-art methods on both real-world and synthetic underwater datasets.
arXiv Detail & Related papers (2020-08-21T22:11:34Z)