Underwater Image Restoration Through a Prior Guided Hybrid Sense Approach and Extensive Benchmark Analysis
- URL: http://arxiv.org/abs/2501.02701v1
- Date: Mon, 06 Jan 2025 01:06:37 GMT
- Title: Underwater Image Restoration Through a Prior Guided Hybrid Sense Approach and Extensive Benchmark Analysis
- Authors: Xiaojiao Guo, Xuhang Chen, Shuqiang Wang, Chi-Man Pun
- Abstract summary: The framework operates on multiple scales, employing the proposed Detail Restorer module to restore low-level detailed features. We construct a benchmark using paired training data from three real-world underwater datasets. We tested 14 traditional methods and retrained 23 existing deep learning underwater image restoration methods on this benchmark, obtaining metric results for each approach.
- Score: 37.544713547176855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Underwater imaging grapples with challenges from light-water interactions, leading to color distortions and reduced clarity. To address these challenges, we propose a novel Color Balance Prior \textbf{Guided} \textbf{Hyb}rid \textbf{Sens}e \textbf{U}nderwater \textbf{I}mage \textbf{R}estoration framework (\textbf{GuidedHybSensUIR}). This framework operates on multiple scales, employing the proposed \textbf{Detail Restorer} module to restore low-level detailed features at finer scales and the proposed \textbf{Feature Contextualizer} module to capture long-range contextual relations of high-level general features at a broader scale. Hybridizing these different sensing scales effectively addresses color casts and restores blurry details. To steer the model's optimization, we propose a novel \textbf{Color Balance Prior} as a strong guide in the feature contextualization step and as a weak guide in the final decoding phase. We construct a comprehensive benchmark using paired training data from three real-world underwater datasets and evaluate on six test sets, three paired and three unpaired, sourced from four real-world underwater datasets. We then tested 14 traditional methods and retrained 23 existing deep learning underwater image restoration methods on this benchmark, obtaining metric results for each approach and furnishing a standard basis for comparison. Extensive experimental results demonstrate that our method outperforms 37 other state-of-the-art methods overall across the benchmark datasets and metrics, despite not achieving the best results in certain individual cases. The code and dataset are available at \href{https://github.com/CXH-Research/GuidedHybSensUIR}{https://github.com/CXH-Research/GuidedHybSensUIR}.
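The abstract does not spell out how the Color Balance Prior is computed. As an illustrative stand-in, a classical gray-world color balance (a prior of the same family, not necessarily the paper's exact formulation) can be sketched as follows; the function name and the toy image are hypothetical:

```python
import numpy as np

def color_balance_prior(img):
    """Gray-world style color balance: rescale each channel so its mean
    matches the global mean intensity, countering the blue-green cast
    typical of underwater images.

    img: float array of shape (H, W, 3), values in [0, 1].
    """
    channel_means = img.reshape(-1, 3).mean(axis=0)      # per-channel means
    gray_mean = channel_means.mean()                     # target mean
    gains = gray_mean / np.maximum(channel_means, 1e-6)  # per-channel gains
    return np.clip(img * gains, 0.0, 1.0)

# Toy image with a suppressed red channel, as in underwater scenes
img = np.array([[[0.1, 0.6, 0.7],
                 [0.3, 0.4, 0.5]]])
balanced = color_balance_prior(img)
```

After balancing, all three channel means coincide, which is what makes such a prior a useful guide signal for a restoration network.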
Related papers
- A Generative Data Framework with Authentic Supervision for Underwater Image Restoration and Enhancement [51.382274157144714]
We develop a generative data framework based on unpaired image-to-image translation. The framework constructs synthetic datasets with precise ground-truth labels. Experiments show that models trained on our synthetic data achieve comparable or superior color restoration and generalization performance to those trained on existing benchmarks.
arXiv Detail & Related papers (2025-11-18T14:20:17Z) - SWAGSplatting: Semantic-guided Water-scene Augmented Gaussian Splatting [9.070464075411472]
We propose a novel framework that leverages multimodal cross-knowledge to create semantic-guided 3D Gaussian Splatting for robust and high-fidelity deep-sea scene reconstruction. Our approach consistently outperforms state-of-the-art methods on the SeaThru-NeRF and Submerged3D datasets.
arXiv Detail & Related papers (2025-08-31T11:20:02Z) - Dataset Condensation with Color Compensation [1.8962690634270805]
Existing methods struggle with two issues: image-level selection methods (Coreset Selection, Dataset Quantization) suffer from condensation inefficiency. We find that a critical problem in dataset condensation is the oversight of color's dual role as an information carrier and a basic semantic representation unit. We propose DC3: a dataset condensation framework with Color Compensation.
arXiv Detail & Related papers (2025-08-02T01:44:23Z) - Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer [16.040335263873278]
We introduce a 3D Gaussian Splatting (3DGS)-based pipeline for stereo dataset generation, offering an efficient alternative to Neural Radiance Fields (NeRF)-based methods. Fine-tuning stereo models on 3DGS-generated datasets yields competitive performance on zero-shot generalization benchmarks.
arXiv Detail & Related papers (2025-06-05T11:41:09Z) - Plenodium: UnderWater 3D Scene Reconstruction with Plenoptic Medium Representation [31.47797579690604]
We present Plenodium, a 3D representation framework capable of jointly modeling both objects and participating media. In contrast to existing medium representations that rely solely on view-dependent modeling, our novel plenoptic medium representation incorporates both directional and positional information. Experiments on real-world underwater datasets demonstrate that our method achieves significant improvements in 3D reconstruction.
arXiv Detail & Related papers (2025-05-27T14:37:58Z) - RUSplatting: Robust 3D Gaussian Splatting for Sparse-View Underwater Scene Reconstruction [9.070464075411472]
This paper presents an enhanced Gaussian Splatting-based framework that improves both the visual quality and accuracy of deep underwater rendering. We propose decoupled learning for RGB channels, guided by the physics of underwater attenuation, to enable more accurate colour restoration. We release a newly collected dataset, Submerged3D, captured specifically in deep-sea environments.
arXiv Detail & Related papers (2025-05-21T16:42:15Z) - TextSplat: Text-Guided Semantic Fusion for Generalizable Gaussian Splatting [46.753153357441505]
Generalizable Gaussian Splatting has enabled robust 3D reconstruction from sparse input views.
We propose TextSplat--the first text-driven Generalizable Gaussian Splatting framework.
arXiv Detail & Related papers (2025-04-13T14:14:10Z) - DEPTHOR: Depth Enhancement from a Practical Light-Weight dToF Sensor and RGB Image [8.588871458005114]
We propose a novel completion-based method, named DEPTHOR, for depth enhancement in computer vision.
First, we simulate real-world dToF data from the accurate ground truth in synthetic datasets to enable noise-robust training.
Second, we design a novel network that incorporates monocular depth estimation (MDE), leveraging global depth relationships and contextual information to improve prediction in challenging regions.
arXiv Detail & Related papers (2025-04-02T11:02:21Z) - MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation [9.639797094021988]
MetricGold is a novel approach that harnesses generative diffusion model's rich priors to improve metric depth estimation.
Our experiments demonstrate robust generalization across diverse datasets, producing sharper and higher quality metric depth estimates.
arXiv Detail & Related papers (2024-11-16T20:59:01Z) - DepthSplat: Connecting Gaussian Splatting and Depth [90.06180236292866]
We present DepthSplat to connect Gaussian splatting and depth estimation.
We first contribute a robust multi-view depth model by leveraging pre-trained monocular depth features.
We also show that Gaussian splatting can serve as an unsupervised pre-training objective.
arXiv Detail & Related papers (2024-10-17T17:59:58Z) - TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers [14.708092244093665]
We develop a strategy that utilizes a predicted depth confidence map to guide accurate local feature matching.
We present a novel G-3DGS method named TranSplat, which obtains the best performance on both the RealEstate10K and ACID benchmarks.
arXiv Detail & Related papers (2024-08-25T08:37:57Z) - Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots [0.0]
We formulate a deep learning model that fuses sparse depth measurements from triangulated features to improve the depth predictions.
The network is trained in a supervised fashion on the forward-looking underwater dataset, FLSea.
The method achieves real-time performance, running at 160 FPS on a laptop GPU and 7 FPS on a single CPU core.
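The listing does not detail how the sparse triangulated depths are fused into the network. A common, simpler way to use such sparse metric measurements is to fit a global scale and shift that aligns a relative depth prediction to them; the sketch below (function name and toy data are hypothetical, not from the paper) illustrates that idea:

```python
import numpy as np

def align_depth(pred, sparse_idx, sparse_depth):
    """Least-squares alignment of a relative depth map to sparse metric
    measurements: solve min_{s,t} ||s * pred[idx] + t - sparse||^2.

    pred: (H, W) relative depth prediction.
    sparse_idx: tuple of (rows, cols) arrays of measurement pixels.
    sparse_depth: (N,) metric depths at those pixels.
    """
    p = pred[sparse_idx]
    A = np.stack([p, np.ones_like(p)], axis=1)       # (N, 2) design matrix
    (s, t), *_ = np.linalg.lstsq(A, sparse_depth, rcond=None)
    return s * pred + t

# Toy example: the prediction is metric depth up to scale 2 and offset 1
true = np.array([[1.0, 2.0], [3.0, 4.0]])
pred = (true - 1.0) / 2.0
rows, cols = np.array([0, 1]), np.array([0, 1])
aligned = align_depth(pred, (rows, cols), true[rows, cols])
```

With two exact measurements the fit recovers the scale and offset exactly; learned fusion, as in the paper, can additionally correct spatially varying errors.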
arXiv Detail & Related papers (2023-10-25T16:32:31Z) - MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training [58.07391711548269]
We propose the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z) - Dataset Distillation via Factorization [58.8114016318593]
We introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing dataset distillation (DD) baseline. HaBa explores decomposing a dataset into two components: data Hallucination networks and Bases. Our method yields significant improvement on downstream classification tasks compared with previous state-of-the-art methods, while reducing the total number of compressed parameters by up to 65%.
arXiv Detail & Related papers (2022-10-30T08:36:19Z) - DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification [56.817386699291305]
This paper proposes a simple yet effective data augmentation approach termed DoubleMix.
DoubleMix first generates several perturbed samples for each training data.
It then uses the perturbed data and original data to carry out a two-step interpolation in the hidden space of neural models.
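The two-step hidden-space interpolation can be sketched as follows. This is a simplified illustration of the idea (the actual DoubleMix recipe, including its regularization, is not given in this listing), and the function name and toy vectors are hypothetical:

```python
import numpy as np

def double_mix(hidden, perturbed, alpha=0.8, rng=None):
    """Two-step interpolation in hidden space (DoubleMix-style sketch).

    Step 1: take a convex combination of the perturbed variants.
    Step 2: interpolate that mixture with the original hidden state,
    keeping the original dominant (alpha close to 1).

    hidden: (D,) hidden representation of the original sample.
    perturbed: (K, D) hidden representations of K perturbed variants.
    """
    if rng is None:
        rng = np.random.default_rng()
    weights = rng.dirichlet(np.ones(len(perturbed)))      # convex mixing weights
    mixed_variants = weights @ perturbed                  # step 1
    return alpha * hidden + (1 - alpha) * mixed_variants  # step 2

h = np.array([1.0, 0.0])
p = np.array([[0.0, 1.0], [2.0, 1.0]])
out = double_mix(h, p, alpha=0.8, rng=np.random.default_rng(0))
```

Because both steps are convex combinations, the augmented state stays close to the original, which is what keeps the augmentation label-preserving.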
arXiv Detail & Related papers (2022-09-12T15:01:04Z) - Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework [17.17479625646699]
We propose a unified framework to craft textual adversarial samples.
In this paper, we instantiate our framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD).
arXiv Detail & Related papers (2021-10-28T17:31:51Z) - Revisiting Deep Local Descriptor for Improved Few-Shot Classification [56.74552164206737]
We show how one can improve the quality of embeddings by leveraging Dense Classification and Attentive Pooling.
We suggest to pool feature maps by applying attentive pooling instead of the widely used global average pooling (GAP) to prepare embeddings for few-shot classification.
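The contrast between GAP and attentive pooling can be made concrete with a minimal sketch. This is a generic attentive-pooling formulation, not necessarily the paper's exact module, and the query vector here is fixed rather than learned:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool(feat, query):
    """Attentive pooling over a dense feature map: weight spatial
    locations by their similarity to a query vector instead of
    averaging them uniformly as GAP does.

    feat: (H*W, C) flattened feature map.
    query: (C,) attention query (learned in practice; fixed here).
    """
    scores = feat @ query        # similarity per spatial location
    weights = softmax(scores)    # attention distribution over locations
    return weights @ feat        # (C,) weighted embedding

feat = np.array([[1.0, 0.0],    # background-like location
                 [0.0, 1.0],    # two foreground-like locations
                 [0.0, 1.0]])
query = np.array([0.0, 5.0])    # favors the second channel
gap = feat.mean(axis=0)         # GAP baseline: uniform weights
att = attentive_pool(feat, query)
```

The attentive embedding concentrates on the locations matching the query, whereas GAP dilutes them with the background, which is the paper's motivation for replacing GAP in few-shot classification.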
arXiv Detail & Related papers (2021-03-30T00:48:28Z) - Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
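High-frequency noise extraction, at a single scale, amounts to a high-pass filter. The sketch below uses a simple box-blur residual (the paper's extractor is multi-scale and likely uses learned or SRM-style filters; this is only the underlying idea, with hypothetical names):

```python
import numpy as np

def high_frequency(img):
    """High-pass residual: subtract a 3x3 box-blurred copy from the
    image. Smooth regions vanish; edges and noise, where forgery
    traces tend to live, survive.

    img: (H, W) grayscale image.
    """
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # 3x3 box blur built from the nine shifted windows
    blur = sum(
        padded[i:i + h, j:j + w]
        for i in range(3) for j in range(3)
    ) / 9.0
    return img - blur

flat = np.full((5, 5), 0.5)
residual = high_frequency(flat)       # ~0 everywhere: no high frequencies

edge = np.zeros((5, 5))
edge[:, 2:] = 1.0
edge_residual = high_frequency(edge)  # nonzero along the vertical edge
```

Running the same filter on progressively downsampled copies of the image would give the multi-scale extraction the module describes.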
arXiv Detail & Related papers (2021-03-23T08:19:21Z) - 3D Dense Geometry-Guided Facial Expression Synthesis by Adversarial Learning [54.24887282693925]
We propose a novel framework to exploit 3D dense (depth and surface normals) information for expression manipulation.
We use an off-the-shelf state-of-the-art 3D reconstruction model to estimate the depth and create a large-scale RGB-Depth dataset.
Our experiments demonstrate that the proposed method outperforms the competitive baseline and existing arts by a large margin.
arXiv Detail & Related papers (2020-09-30T17:12:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.