Image-to-Height Domain Translation for Synthetic Aperture Sonar
- URL: http://arxiv.org/abs/2112.06307v1
- Date: Sun, 12 Dec 2021 19:53:14 GMT
- Title: Image-to-Height Domain Translation for Synthetic Aperture Sonar
- Authors: Dylan Stewart, Shawn Johnson, and Alina Zare
- Abstract summary: In this work, we focus on collection geometry with respect to isotropic and anisotropic textures.
The low grazing angle of the collection geometry, combined with orientation of the sonar path relative to anisotropic texture, poses a significant challenge for image-alignment and other multi-view scene understanding frameworks.
- Score: 3.2662392450935416
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Observations of seabed texture with synthetic aperture sonar are dependent
upon several factors. In this work, we focus on collection geometry with
respect to isotropic and anisotropic textures. The low grazing angle of the
collection geometry, combined with orientation of the sonar path relative to
anisotropic texture, poses a significant challenge for image-alignment and
other multi-view scene understanding frameworks. We previously proposed using
features captured from estimated seabed relief to improve scene understanding.
While several methods have been developed to estimate seabed relief from
intensity, no large-scale study exists in the literature. Furthermore, no
dataset of coregistered seabed relief maps and sonar imagery exists from which
to learn this domain translation. We address these problems by producing a large
simulated dataset containing coregistered pairs of seabed relief and intensity
maps from two unique sonar data simulation techniques. We apply three types of
models, with varying complexity, to translate intensity imagery to seabed
relief: a Gaussian Markov Random Field approach (GMRF), a conditional
Generative Adversarial Network (cGAN), and UNet architectures. Methods are
compared against the coregistered simulated datasets using L1 error.
Additionally, predictions on simulated and real SAS imagery are shown. Finally,
models are compared on two datasets of hand-aligned SAS imagery and evaluated
in terms of L1 error across multiple aspects, relative to using intensity directly.
Our comprehensive experiments show that the proposed UNet architectures
outperform the GMRF and pix2pix cGAN models on seabed relief estimation for
simulated and real SAS imagery.
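To make the modeling setup concrete, here is a minimal sketch of intensity-to-relief translation with a small UNet-style network trained under L1 loss, the same criterion the paper uses for comparison. The architecture, channel widths, and tile sizes are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: translate SAS intensity tiles to seabed relief maps
# with a tiny UNet-style network and an L1 objective. Shapes, widths,
# and hyperparameters are assumptions for illustration only.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(
            nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(
            nn.ConvTranspose2d(ch * 2, ch, 2, stride=2), nn.ReLU())
        # Skip connection: decoder output is concatenated with enc1 features.
        self.out = nn.Conv2d(ch * 2, 1, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)   # full-resolution features
        e2 = self.enc2(e1)  # downsampled features
        d1 = self.dec1(e2)  # upsample back to input resolution
        return self.out(torch.cat([d1, e1], dim=1))  # predicted relief map

model = TinyUNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
l1 = nn.L1Loss()  # the same criterion the paper evaluates with

# One illustrative training step on a fake coregistered pair:
intensity = torch.rand(4, 1, 128, 128)  # simulated SAS intensity tiles
relief = torch.rand(4, 1, 128, 128)     # coregistered seabed relief maps
optimizer.zero_grad()
loss = l1(model(intensity), relief)
loss.backward()
optimizer.step()
```

A pix2pix-style cGAN baseline would add an adversarial discriminator term on top of this L1 objective; the GMRF baseline is model-based and has no learned parameters of this kind.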
Related papers
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- RANRAC: Robust Neural Scene Representations via Random Ray Consensus [12.161889666145127]
RANdom RAy Consensus (RANRAC) is an efficient approach for eliminating the effects of inconsistent data.
We formulate a fuzzy adaptation of the RANSAC paradigm, enabling its application to large-scale models (see the RANSAC sketch after this entry).
Results indicate significant improvements compared to state-of-the-art robust methods for novel-view synthesis.
arXiv Detail & Related papers (2023-12-15T13:33:09Z)
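RANRAC above adapts the classic RANSAC paradigm to neural scene representations. As a reference point, here is plain RANSAC line fitting in a minimal form; the fuzzy, large-scale-model variant from the paper is not reproduced, and the function name and tolerances are assumptions.

```python
# Classic RANSAC for 2D line fitting: repeatedly sample a minimal set,
# hypothesize a model, and keep the hypothesis with the largest consensus.
import numpy as np

def ransac_line(points, n_iter=200, inlier_tol=0.05, seed=None):
    """Fit a 2D line to `points` (N, 2) by random sample consensus."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        p, q = points[rng.choice(len(points), size=2, replace=False)]
        direction = q - p
        norm = np.linalg.norm(direction)
        if norm == 0:
            continue  # degenerate sample: both points coincide
        normal = np.array([-direction[1], direction[0]]) / norm
        # Perpendicular distance of every point to the candidate line.
        dist = np.abs((points - p) @ normal)
        inliers = dist < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers  # consensus set; refit on it for the final model
```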
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network connected to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Underwater object classification combining SAS and transferred optical-to-SAS Imagery [12.607649347048442]
We propose a multi-modal combination to discriminate between man-made targets and objects such as rocks or litter.
We offer a novel classification algorithm that overcomes the problem of intensity and object formation differences between the two modalities.
Results from 7,052 pairs of SAS and optical images collected during sea experiments show improved classification performance.
arXiv Detail & Related papers (2023-04-24T07:42:16Z)
- Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study of the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
- DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models.
We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features.
We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
arXiv Detail & Related papers (2022-11-09T14:57:27Z)
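DeepDC above scores perceptual quality with the distance correlation between deep features. Below is a generic (biased) sample estimator of distance correlation between two paired feature sets; the pre-trained DNN backbone and any DeepDC-specific details are omitted, and the interface is an assumption.

```python
# Distance correlation between paired samples x (n, p) and y (n, q):
# double-center the pairwise distance matrices, then normalize the
# distance covariance by the geometric mean of the distance variances.
import numpy as np

def distance_correlation(x, y):
    """Returns the sample distance correlation, a value in [0, 1]."""
    def centered_dist(z):
        d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
        return d - d.mean(0, keepdims=True) - d.mean(1, keepdims=True) + d.mean()
    A, B = centered_dist(x), centered_dist(y)
    dcov2 = (A * B).mean()                              # squared distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())    # product of distance variances
    return float(np.sqrt(max(dcov2, 0.0) / denom)) if denom > 0 else 0.0
```

Unlike plain correlation, this statistic is zero only under independence, which is what makes it attractive as a full-reference similarity measure between feature sets.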
- Dual-Scale Single Image Dehazing Via Neural Augmentation [29.019279446792623]
A novel single image dehazing algorithm is introduced by combining model-based and data-driven approaches.
Results indicate that the proposed algorithm can remove haze well from real-world and synthetic hazy images.
arXiv Detail & Related papers (2022-09-13T11:56:03Z)
- Sci-Net: a Scale Invariant Model for Building Detection from Aerial Images [0.0]
We propose a Scale-invariant neural network (Sci-Net) that is able to segment buildings present in aerial images at different spatial resolutions.
Specifically, we modified the U-Net architecture and fused it with dense Atrous Spatial Pyramid Pooling (ASPP) to extract fine-grained multi-scale representations.
arXiv Detail & Related papers (2021-11-12T16:45:20Z)
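Sci-Net above fuses a U-Net with dense Atrous Spatial Pyramid Pooling (ASPP). The block below is a generic ASPP module (parallel dilated 3x3 convolutions fused by a 1x1 projection); the dilation rates and channel widths are illustrative assumptions, not Sci-Net's exact values.

```python
# Generic ASPP block: each branch applies a 3x3 convolution at a different
# dilation rate, so the module sees context at several scales while keeping
# spatial resolution; a 1x1 convolution fuses the branches.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(),
            )
            for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

# e.g. ASPP(256, 64)(torch.rand(1, 256, 32, 32)).shape -> (1, 64, 32, 32)
```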
- Deep Two-View Structure-from-Motion Revisited [83.93809929963969]
Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.
We propose to revisit the problem of deep two-view SfM by leveraging the well-posedness of the classic pipeline.
Our method consists of 1) an optical flow estimation network that predicts dense correspondences between two frames; 2) a normalized pose estimation module that computes relative camera poses from the 2D optical flow correspondences; and 3) a scale-invariant depth estimation network that leverages epipolar geometry to reduce the search space, refine the dense correspondences, and estimate relative depth maps.
arXiv Detail & Related papers (2021-04-01T15:31:20Z)
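The two-view SfM entry above recovers relative camera pose from 2D correspondences (step 2 of its pipeline). The sketch below shows that classic step with OpenCV's essential-matrix routines; the paper's learned optical-flow and depth networks are not reproduced, and the helper name is hypothetical.

```python
# Classic relative-pose recovery: estimate the essential matrix from
# matched points with RANSAC, then decompose it, keeping the (R, t)
# that places triangulated points in front of both cameras.
import cv2
import numpy as np

def relative_pose(pts1, pts2, K):
    """pts1, pts2: (N, 2) matched pixel coordinates; K: 3x3 intrinsics."""
    E, inliers = cv2.findEssentialMat(
        pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
    # Translation is recovered only up to scale from two views.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t

# Usage with flow-derived matches (float32 arrays):
# R, t = relative_pose(flow_pts1.astype(np.float32),
#                      flow_pts2.astype(np.float32), K)
```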
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
- Leveraging Photogrammetric Mesh Models for Aerial-Ground Feature Point Matching Toward Integrated 3D Reconstruction [19.551088857830944]
Integration of aerial and ground images has been shown to be an efficient way to enhance surface reconstruction in urban environments.
Previous studies based on geometry-aware image rectification have alleviated the aerial-ground matching problem.
We propose a novel approach: leveraging photogrammetric mesh models for aerial-ground image matching.
arXiv Detail & Related papers (2020-02-21T01:47:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.