Forward-Looking Sonar Patch Matching: Modern CNNs, Ensembling, and
Uncertainty
- URL: http://arxiv.org/abs/2108.01066v1
- Date: Mon, 2 Aug 2021 17:49:56 GMT
- Title: Forward-Looking Sonar Patch Matching: Modern CNNs, Ensembling, and
Uncertainty
- Authors: Arka Mallick and Paul Plöger and Matias Valdenegro-Toro
- Abstract summary: A Convolutional Neural Network (CNN) learns a similarity function and predicts whether two input sonar images are similar or not.
The best-performing models are the DenseNet Two-Channel network with 0.955 AUC, VGG-Siamese with contrastive loss at 0.949 AUC, and DenseNet-Siamese with 0.921 AUC.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Applications of underwater robots are on the rise; most of them depend
on sonar for underwater vision, but the lack of strong perception capabilities
limits them in this task. An important issue in sonar perception is matching
image patches, which can enable other techniques like localization, change
detection, and mapping. There is a rich literature for this problem in color
images, but for acoustic images it is lacking, due to the physics that produce
these images. In this paper we improve on our previous results for this problem
(Valdenegro-Toro et al., 2017): instead of modeling features manually, a
Convolutional Neural Network (CNN) learns a similarity function and predicts
whether two input sonar images are similar or not. With the objective of
further improving sonar image matching, three state-of-the-art CNN
configurations, based on the DenseNet and VGG architectures with a siamese or
two-channel design and contrastive loss, are evaluated on the Marine Debris
dataset. To ensure a fair evaluation of each network, thorough hyper-parameter
optimization is performed. We find that the best-performing models are the
DenseNet Two-Channel network with 0.955 AUC, VGG-Siamese with contrastive loss
at 0.949 AUC, and DenseNet-Siamese with 0.921 AUC. By ensembling the
top-performing DenseNet Two-Channel and DenseNet-Siamese models, the overall
highest prediction accuracy obtained is 0.978 AUC, a large improvement over
the 0.91 AUC of the previous state of the art.
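The matching setup above can be sketched minimally: a contrastive loss over siamese embedding distances, plus score averaging for the two-model ensemble. This is an illustrative NumPy sketch under assumed conventions (function names and the margin value are not from the paper), not the authors' implementation.

```python
import numpy as np

def contrastive_loss(d, y, margin=1.0):
    """Contrastive loss over embedding distances d with pair labels y
    (1 = similar pair, 0 = dissimilar). Similar pairs are pulled
    together; dissimilar pairs are pushed past the margin."""
    return np.mean(y * d**2 + (1 - y) * np.maximum(margin - d, 0.0)**2)

def ensemble_scores(scores_a, scores_b):
    """Average the similarity scores of two models, as in ensembling
    the DenseNet two-channel and DenseNet-Siamese predictions."""
    return 0.5 * (np.asarray(scores_a) + np.asarray(scores_b))

# Toy usage: distances between siamese embeddings and their pair labels.
d = np.array([0.2, 0.9, 1.5, 0.1])
y = np.array([1, 0, 0, 1])
loss = contrastive_loss(d, y)
```

A two-channel network instead stacks both patches as input channels and outputs the similarity score directly, so only the siamese variants use a distance-based loss like this one.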
Related papers
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- Yin Yang Convolutional Nets: Image Manifold Extraction by the Analysis of
Opposites [1.1560177966221703]
Yin Yang Convolutional Network is an architecture that extracts the visual manifold.
Our first model reached 93.32% test accuracy, 0.8% above the previous SOTA in this category.
We also performed an analysis on ImageNet, where we reached 66.49% validation accuracy with 1.6M parameters.
arXiv Detail & Related papers (2023-10-24T19:48:07Z)
- Combining UPerNet and ConvNeXt for Contrails Identification to reduce
Global Warming [0.0]
This study focuses on aircraft contrail detection in global satellite images to improve contrail models and mitigate their impact on climate change.
An innovative data preprocessing technique for NOAA GOES-16 satellite images is developed, using temperature data from the infrared channel to create false-color images, enhancing model perception.
To tackle class imbalance, the training dataset exclusively includes images with positive contrail labels.
arXiv Detail & Related papers (2023-10-07T13:59:05Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile
Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- Unsupervised Denoising of Optical Coherence Tomography Images with
Dual-Merged CycleWGAN [3.3909577600092122]
We propose a new Cycle-Consistent Generative Adversarial Network called Dual-Merged Cycle-WGAN for retinal OCT image denoising.
Our model consists of two Cycle-GAN networks with an improved generator, discriminator, and Wasserstein loss to achieve good training stability and better performance.
arXiv Detail & Related papers (2022-05-02T07:38:19Z)
- Attentions Help CNNs See Better: Attention-based Hybrid Image Quality
Assessment Network [20.835800149919145]
Image quality assessment (IQA) algorithms aim to quantify the human perception of image quality.
There is a performance drop when assessing distorted images generated by generative adversarial networks (GANs) with seemingly realistic textures.
We propose an Attention-based Hybrid Image Quality Assessment Network (AHIQ) to deal with the challenge and get better performance on the GAN-based IQA task.
arXiv Detail & Related papers (2022-04-22T03:59:18Z)
- Image Quality Assessment using Contrastive Learning [50.265638572116984]
We train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem.
We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models.
Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets.
arXiv Detail & Related papers (2021-10-25T21:01:00Z)
- Contemplating real-world object classification [53.10151901863263]
We reanalyze the ObjectNet dataset recently proposed by Barbu et al. containing objects in daily life situations.
We find that applying deep models to the isolated objects, rather than the entire scene as is done in the original paper, results in around 20-30% performance improvement.
arXiv Detail & Related papers (2021-03-08T23:29:59Z)
- Shape-Texture Debiased Neural Network Training [50.6178024087048]
Convolutional Neural Networks are often biased towards either texture or shape, depending on the training dataset.
We develop an algorithm for shape-texture debiased learning.
Experiments show that our method successfully improves model performance on several image recognition benchmarks.
arXiv Detail & Related papers (2020-10-12T19:16:12Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- Adversarial Perturbations Prevail in the Y-Channel of the YCbCr Color
Space [43.49959098842923]
In a white-box attack, adversarial perturbations are generally learned for deep models that operate on RGB images.
In this paper, we show that the adversarial perturbations prevail in the Y-channel of the YCbCr space.
Based on our finding, we propose a defense against adversarial images.
arXiv Detail & Related papers (2020-02-25T02:41:42Z)
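The Y-channel finding above can be made concrete with a small sketch: converting RGB to YCbCr and measuring a perturbation's per-channel energy. The standard BT.601 full-range coefficients are used here as an assumption about the color transform; the function names are illustrative, not from the paper.

```python
import numpy as np

# BT.601 full-range RGB -> YCbCr transform matrix (illustrative choice).
M = np.array([[ 0.299,     0.587,     0.114    ],
              [-0.168736, -0.331264,  0.5      ],
              [ 0.5,      -0.418688, -0.081312 ]])

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB image in [0, 1] to YCbCr planes."""
    ycc = rgb @ M.T
    ycc[..., 1:] += 0.5  # center the chroma channels
    return ycc

def channel_energy(clean_rgb, adv_rgb):
    """Per-channel L2 energy of an adversarial perturbation in YCbCr:
    the quantity the paper's Y-channel finding concerns."""
    delta = rgb_to_ycbcr(adv_rgb) - rgb_to_ycbcr(clean_rgb)
    return np.sqrt((delta**2).sum(axis=(0, 1)))
```

Comparing the three returned energies for a given attack shows how much of the perturbation lands in Y versus Cb/Cr, which motivates a defense that denoises the Y channel specifically.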
This list is automatically generated from the titles and abstracts of the papers on this site.