Capsule Enhanced Variational AutoEncoder for Underwater Image Reconstruction
- URL: http://arxiv.org/abs/2406.01294v1
- Date: Mon, 3 Jun 2024 13:04:42 GMT
- Title: Capsule Enhanced Variational AutoEncoder for Underwater Image Reconstruction
- Authors: Rita Pucci, Niki Martinel
- Abstract summary: We introduce a novel architecture that jointly tackles both issues by drawing inspiration from the discrete feature quantization approach of the Vector Quantized Variational Autoencoder (VQ-VAE).
Our model combines an encoding network that compresses the input into a latent representation with two independent decoding networks that enhance/reconstruct images using only the latent representation.
By using capsule layers, we also overcome the differentiability issues of VQ-VAE, making our solution trainable in an end-to-end fashion without the need for particular optimization tricks.
- Score: 8.16306466526838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Underwater image analysis is crucial for marine monitoring. However, it presents two major challenges: (i) the visual quality of the images is often degraded due to wavelength-dependent light attenuation, scattering, and water types; (ii) capturing and storing high-resolution images is limited by hardware, which hinders long-term environmental analyses. Recently, deep neural networks have been introduced for underwater enhancement, but they neglect the challenge posed by the limitations of autonomous underwater image acquisition systems. We introduce a novel architecture that jointly tackles both issues by drawing inspiration from the discrete feature quantization approach of the Vector Quantized Variational Autoencoder (VQ-VAE). Our model combines an encoding network that compresses the input into a latent representation with two independent decoding networks that enhance/reconstruct images using only the latent representation. One decoder focuses on the spatial information while the other captures information about the entities in the image by leveraging the concept of capsules. By using capsule layers, we also overcome the differentiability issues of VQ-VAE, making our solution trainable in an end-to-end fashion without the need for particular optimization tricks. Capsules perform feature quantization in a fully differentiable manner. We conducted thorough quantitative and qualitative evaluations on 6 benchmark datasets to assess the effectiveness of our contributions. Results demonstrate that we perform better than existing methods (e.g., about $+1.4$ dB gain on the challenging LSUI Test-L400 dataset), while significantly reducing the amount of space needed for data storage (i.e., $3\times$ more efficient).
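The differentiability issue mentioned in the abstract can be illustrated with a minimal sketch. A VQ-VAE-style lookup picks the nearest codebook entry with an argmin, which has no useful gradient with respect to the input, whereas a soft assignment blends all entries with smooth weights. The soft variant below is a generic softmax-weighted stand-in chosen for illustration only; it is not the capsule routing used in the paper, and the function names, the toy codebook, and the `temperature` parameter are all hypothetical.

```python
import math


def squared_distances(z, codebook):
    """Squared Euclidean distance from vector z to every codebook entry."""
    return [sum((a - b) ** 2 for a, b in zip(z, entry)) for entry in codebook]


def hard_quantize(z, codebook):
    """VQ-VAE-style lookup: index and value of the nearest codebook entry.

    The argmin is piecewise constant in z, so it provides no useful gradient.
    """
    dists = squared_distances(z, codebook)
    idx = min(range(len(codebook)), key=dists.__getitem__)
    return idx, codebook[idx]


def soft_quantize(z, codebook, temperature=1.0):
    """Illustrative differentiable alternative (an assumption, not the paper's method).

    Entries are blended with softmax weights over negative distances, so the
    output varies smoothly with z.
    """
    dists = squared_distances(z, codebook)
    shift = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - shift) / temperature) for d in dists]
    total = sum(weights)
    weights = [w / total for w in weights]
    mixed = [
        sum(w * entry[i] for w, entry in zip(weights, codebook))
        for i in range(len(z))
    ]
    return weights, mixed


# Toy data, purely illustrative.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
z = [0.9, 1.2]
idx, nearest = hard_quantize(z, codebook)
weights, mixed = soft_quantize(z, codebook, temperature=1.0)
```

As the temperature shrinks, the soft output approaches the hard lookup, which is one way to see the two operations as ends of the same spectrum.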
Related papers
- DRACO-DehazeNet: An Efficient Image Dehazing Network Combining Detail Recovery and a Novel Contrastive Learning Paradigm [3.649619954898362]
Detail Recovery And Contrastive DehazeNet is a detailed image recovery network that tailors enhancements to specific dehazed scene contexts.
A major innovation is its ability to train effectively with limited data, achieved through a novel quadruplet loss-based contrastive dehazing paradigm.
arXiv Detail & Related papers (2024-10-18T16:48:31Z)
- Harnessing Multi-resolution and Multi-scale Attention for Underwater Image Restoration [3.686808512438363]
Recent underwater image restoration methods either analyze the input image at full resolution, or progressively from high to low resolution, yielding reliable semantic information but reduced spatial accuracy.
Here, we propose a lightweight multi-stage network called Lit-Net that focuses on multi-resolution and multi-scale image analysis.
Our novel encoder block utilizes parallel $1\times1$ convolution layers to capture local information and speed up operations.
arXiv Detail & Related papers (2024-08-19T11:36:48Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- UWFormer: Underwater Image Enhancement via a Semi-Supervised Multi-Scale Transformer [26.15238399758745]
Underwater images often exhibit poor quality, distorted color balance and low contrast.
Current deep learning methods rely on Convolutional Neural Networks (CNNs) that lack multi-scale enhancement.
We propose a Multi-scale Transformer-based Network for enhancing images at multiple frequencies via semi-supervised learning.
arXiv Detail & Related papers (2023-10-31T06:19:09Z)
- PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators [120.06891448820447]
Obtaining clear and visually pleasing images has become a common concern.
The task of underwater image enhancement (UIE) has emerged in response to this need.
In this paper, we propose a physical model-guided GAN model for UIE, referred to as PUGAN.
Our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-06-15T07:41:12Z)
- UW-CVGAN: UnderWater Image Enhancement with Capsules Vectors Quantization [25.23797117677732]
We introduce Underwater Capsules Vectors GAN (UWCVGAN), based on the discrete features quantization paradigm from VQGAN, for this task.
The proposed UWCVGAN combines an encoding network, which compresses the image into its latent representation, with a decoding network able to reconstruct the enhanced image from the latent representation alone.
arXiv Detail & Related papers (2023-02-02T15:00:03Z)
- Adaptive Uncertainty Distribution in Deep Learning for Unsupervised Underwater Image Enhancement [1.9249287163937976]
One of the main challenges in deep learning-based underwater image enhancement is the limited availability of high-quality training data.
We propose a novel unsupervised underwater image enhancement framework that employs a conditional variational autoencoder (cVAE) to train a deep learning model.
We show that our proposed framework yields competitive performance compared to other state-of-the-art approaches in quantitative as well as qualitative metrics.
arXiv Detail & Related papers (2022-12-18T01:07:20Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
- Interpretable Detail-Fidelity Attention Network for Single Image Super-Resolution [89.1947690981471]
We propose a purposeful and interpretable detail-fidelity attention network to progressively process smoothes and details in a divide-and-conquer manner.
Particularly, we propose a Hessian filtering for interpretable feature representation which is high-profile for detail inference.
Experiments demonstrate that the proposed methods achieve superior performances over the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-28T08:31:23Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.