Structural Prior Driven Regularized Deep Learning for Sonar Image
Classification
- URL: http://arxiv.org/abs/2010.13317v1
- Date: Mon, 26 Oct 2020 04:00:46 GMT
- Title: Structural Prior Driven Regularized Deep Learning for Sonar Image
Classification
- Authors: Isaac D. Gerg and Vishal Monga
- Abstract summary: Deep learning has been shown to improve performance in the domain of synthetic aperture sonar (SAS) image classification.
Despite deep learning's recent success, there are still compelling open challenges in reducing the high false alarm rate.
We introduce a new deep learning architecture which incorporates priors with the goal of improving automatic target recognition.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has been recently shown to improve performance in the domain of
synthetic aperture sonar (SAS) image classification. Given the constant
resolution with range of a SAS, it is no surprise that deep learning techniques
perform so well. Despite deep learning's recent success, there are still
compelling open challenges in reducing the high false alarm rate and enabling
success when training imagery is limited, which is a practical challenge that
distinguishes the SAS classification problem from standard image classification
set-ups where training imagery may be abundant. We address these challenges by
exploiting prior knowledge that humans use to grasp the scene. These include
unconscious elimination of the image speckle and localization of objects in the
scene. We introduce a new deep learning architecture which incorporates these
priors with the goal of improving automatic target recognition (ATR) from SAS
imagery. Our proposal -- called SPDRDL, Structural Prior Driven Regularized
Deep Learning -- incorporates the previously mentioned priors in a multi-task
convolutional neural network (CNN) and requires no additional training data
when compared to traditional SAS ATR methods. Two structural priors are
enforced via regularization terms in the learning of the network: (1)
structural similarity prior -- enhanced imagery (often through despeckling)
aids human interpretation and is semantically similar to the original imagery
and (2) structural scene context priors -- learned features ideally encapsulate
target centering information; hence learning may be enhanced via a
regularization that encourages fidelity against known ground truth target
shifts (relative target position from scene center). Experiments on a
challenging real-world dataset reveal that SPDRDL outperforms state-of-the-art
deep learning and other competing methods for SAS image classification.
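To make the two regularizers concrete, here is an illustrative sketch only; SPDRDL's actual network, windowed SSIM, and loss weights are defined in the paper, and every name below (and the simplified single-window SSIM) is an assumption. A multi-task objective combining classification with the structural similarity prior and the target-shift prior might look like:

```python
import numpy as np

def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """Simplified single-window SSIM between two images in [0, 1]
    (the paper's SSIM is computed over local windows)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -np.log(probs[label] + 1e-12)

def spdrdl_style_loss(class_probs, label, enhanced, original,
                      pred_shift, true_shift, lam_ssim=1.0, lam_shift=1.0):
    """Classification loss plus the two structural-prior regularizers."""
    l_cls = cross_entropy(class_probs, label)
    # (1) structural similarity prior: the enhanced (despeckled) image
    #     should remain semantically similar to the original imagery.
    l_ssim = 1.0 - global_ssim(enhanced, original)
    # (2) scene context prior: regress the known target shift
    #     relative to the scene center.
    l_shift = np.mean((np.asarray(pred_shift) - np.asarray(true_shift)) ** 2)
    return l_cls + lam_ssim * l_ssim + lam_shift * l_shift
```

Because both priors enter as regularization terms rather than extra supervision heads with new labels, no additional training data is required, which matches the abstract's claim.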
Related papers
- Advances in Self-Supervised Learning for Synthetic Aperture Sonar Data
Processing, Classification, and Pattern Recognition
This paper proposes MoCo-SAS that leverages self-supervised learning for SAS data processing, classification, and pattern recognition.
The experimental results demonstrate that MoCo-SAS significantly outperforms traditional supervised learning methods.
These findings highlight the potential of SSL in advancing the state-of-the-art in SAS data processing, offering promising avenues for enhanced underwater object detection and classification.
arXiv Detail & Related papers (2023-08-12T20:59:39Z) - Self-Supervised Learning for Improved Synthetic Aperture Sonar Target
Recognition
This study explores the application of self-supervised learning (SSL) for improved target recognition in synthetic aperture sonar (SAS) imagery.
The voluminous high-resolution SAS data presents a significant challenge for labeling, a crucial step in training deep neural networks (DNNs).
The study evaluates the performance of two prominent SSL algorithms, MoCov2 and BYOL, against the well-regarded supervised learning model, ResNet18, for binary image classification tasks.
arXiv Detail & Related papers (2023-07-27T14:17:24Z) - Location-Aware Self-Supervised Transformers
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
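Purely as an illustration of the pretext-task idea (the paper's patch extraction, masking, and transformer architecture are not reproduced here, and all names below are assumptions), generating a relative-location training triple from a gridded image could be sketched as:

```python
import numpy as np

def make_relative_location_example(image, grid=3, rng=None):
    """Cut an image into a grid x grid set of patches and emit a
    (reference patch, query patch, relative-location label) triple."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = image.shape[:2]
    ph, pw = h // grid, w // grid
    patches = [image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
               for r in range(grid) for c in range(grid)]
    ref, qry = rng.choice(grid * grid, size=2, replace=False)
    # label encodes the query patch's row/column offset from the reference
    dr = qry // grid - ref // grid
    dc = qry % grid - ref % grid
    label = (dr + grid - 1) * (2 * grid - 1) + (dc + grid - 1)
    return patches[ref], patches[qry], label
```

A network pretrained to predict `label` from the two patches must learn spatial layout, which is the property that transfers to segmentation.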
arXiv Detail & Related papers (2022-12-05T16:24:29Z) - Iterative, Deep Synthetic Aperture Sonar Image Segmentation
We propose an unsupervised learning framework called Iterative Deep Unsupervised (IDUS) for SAS image segmentation.
IDUS can be divided into four main steps:
1) A deep network estimates class assignments.
2) Low-level image features from the deep network are clustered into superpixels.
3) Superpixels are clustered into class assignments.
4) The resulting pseudo-labels are used for loss backpropagation of the deep network prediction.
A comparison of IDUS to current state-of-the-art methods on a realistic benchmark dataset for SAS image segmentation demonstrates the benefits of our proposal.
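The clustering half of one IDUS-style round (steps 2-4) can be sketched in miniature; this is an assumption-laden toy, with a tiny k-means standing in for both the superpixel and the class clustering, and the deep network replaced by precomputed per-pixel features:

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Tiny k-means used as a stand-in for both clustering steps."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), k, replace=False)].astype(float)
    for _ in range(iters):
        labels = np.argmin(((x[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(axis=0)
    return labels

def idus_iteration(pixel_features, n_superpixels=8, n_classes=3):
    """One round: cluster features into superpixels (step 2), cluster
    superpixels into classes (step 3), return per-pixel pseudo-labels
    that would drive the network's loss (step 4)."""
    sp = kmeans(pixel_features, n_superpixels)
    occupied = np.unique(sp)
    sp_means = np.stack([pixel_features[sp == j].mean(axis=0)
                         for j in occupied])
    sp_class = kmeans(sp_means, min(n_classes, len(occupied)), seed=1)
    remap = np.zeros(sp.max() + 1, dtype=int)
    remap[occupied] = sp_class
    return remap[sp]
```

In the real method these pseudo-labels supervise the next training pass of the deep network, closing the iterative loop.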
arXiv Detail & Related papers (2022-03-28T20:41:24Z) - Semantic-Aware Generation for Self-Supervised Visual Representation
Learning
We advocate for Semantic-aware Generation (SaGe) to facilitate richer semantics rather than details to be preserved in the generated image.
SaGe complements the target network with view-specific features and thus alleviates the semantic degradation brought by intensive data augmentations.
We execute SaGe on ImageNet-1K and evaluate the pre-trained models on five downstream tasks including nearest neighbor test, linear classification, and fine-scaled image recognition.
arXiv Detail & Related papers (2021-11-25T16:46:13Z) - Rectifying the Shortcut Learning of Background: Shared Object
Concentration for Few-Shot Image Recognition
Few-Shot image classification aims to utilize pretrained knowledge learned from a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel Few-Shot Learning framework, to automatically figure out foreground objects at both pretraining and evaluation stage.
arXiv Detail & Related papers (2021-07-16T07:46:41Z) - SCARF: Self-Supervised Contrastive Learning using Random Feature
Corruption
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features.
We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
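SCARF's view construction is simple enough to sketch directly; the function below is an illustrative reading of it (the name and defaults are assumptions), in which each corrupted feature is resampled from that feature's empirical marginal distribution over the batch:

```python
import numpy as np

def scarf_corrupt(batch, corruption_rate=0.6, rng=None):
    """Corrupt a random subset of features per row by resampling each
    corrupted feature from its column's empirical distribution."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, d = batch.shape
    mask = rng.random((n, d)) < corruption_rate
    # element [i, j] of the replacement is drawn from column j of the batch
    replacement = batch[rng.integers(0, n, size=(n, d)), np.arange(d)]
    return np.where(mask, replacement, batch)
```

A contrastive loss would then pull each row's embedding toward the embedding of its corrupted view and away from other rows'.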
arXiv Detail & Related papers (2021-06-29T08:08:33Z) - Speckle2Void: Deep Self-Supervised SAR Despeckling with Blind-Spot
Convolutional Neural Networks
Despeckling is a crucial preliminary step in scene analysis algorithms.
Recent success of deep learning envisions a new generation of despeckling techniques.
We propose a self-supervised Bayesian despeckling method.
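The blind-spot idea itself is easy to demonstrate outside any deep network; below is a deliberately simplified stand-in (a single fixed averaging kernel, not the paper's learned CNN or its Bayesian estimation) showing the defining property that a pixel is predicted only from its neighbors, never from its own speckle-corrupted value:

```python
import numpy as np

def blind_spot_predict(img, kernel_size=3):
    """Predict each pixel from its neighbors only: the kernel's center
    weight is zero, so a pixel never sees its own value."""
    k = np.ones((kernel_size, kernel_size))
    k[kernel_size // 2, kernel_size // 2] = 0.0   # the blind spot
    k /= k.sum()
    pad = kernel_size // 2
    padded = np.pad(img, pad, mode="reflect")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = (padded[i:i + kernel_size, j:j + kernel_size] * k).sum()
    return out
```

Because the prediction at a pixel is independent of that pixel's own noisy observation, the network can be trained self-supervised on noisy data alone without ever learning the identity mapping.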
arXiv Detail & Related papers (2020-07-04T11:38:48Z) - Joint Deep Learning of Facial Expression Synthesis and Recognition
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z) - Towards Deep Unsupervised SAR Despeckling with Blind-Spot Convolutional
Neural Networks
Deep learning techniques have outperformed classical model-based despeckling algorithms.
In this paper, we propose a self-supervised Bayesian despeckling method.
We show that the performance of the proposed network is very close to the supervised training approach on synthetic data and competitive on real data.
arXiv Detail & Related papers (2020-01-15T12:21:12Z) - Scene Text Synthesis for Efficient and Effective Deep Network Training
We develop an innovative image synthesis technique that composes annotated training images by embedding foreground objects of interest into background images.
The proposed technique consists of two key components that in principle boost the usefulness of the synthesized images in deep network training.
Experiments over a number of public datasets demonstrate the effectiveness of our proposed image synthesis technique.
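The core compositing step of such a synthesis pipeline can be sketched in a few lines; this is a generic illustration, not the paper's technique (its two key components, e.g. realistic placement and appearance adaptation, are omitted, and all names below are assumptions):

```python
import numpy as np

def composite(background, foreground, mask, top, left):
    """Paste a masked foreground object into a background image and
    return the synthetic image plus its bounding-box annotation."""
    out = background.copy()
    h, w = foreground.shape[:2]
    region = out[top:top + h, left:left + w]
    # keep foreground pixels where the mask is set, background elsewhere
    out[top:top + h, left:left + w] = np.where(mask[..., None] > 0,
                                               foreground, region)
    return out, (top, left, top + h, left + w)
```

Since the paste location is chosen by the generator, the bounding-box annotation comes for free, which is what makes such synthesized images usable as labeled training data.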
arXiv Detail & Related papers (2019-01-26T10:15:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.