Cross-domain Sound Recognition for Efficient Underwater Data Analysis
- URL: http://arxiv.org/abs/2309.03451v2
- Date: Wed, 21 Feb 2024 05:02:34 GMT
- Title: Cross-domain Sound Recognition for Efficient Underwater Data Analysis
- Authors: Jeongsoo Park, Dong-Gyun Han, Hyoung Sul La, Sangmin Lee, Yoonchang Han, and Eun-Jin Yang
- Abstract summary: This paper presents a novel deep learning approach for analyzing massive underwater acoustic data by leveraging a model trained on a broad spectrum of non-underwater (aerial) sounds.
We use PCA and UMAP visualization to cluster the data in a two-dimensional space and listen to points within these clusters to understand their defining characteristics.
In the second part, we train a neural network model using both the selected underwater data and the non-underwater dataset.
- Score: 4.373836150479923
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel deep learning approach for analyzing massive underwater acoustic data by leveraging a model trained on a broad spectrum of non-underwater (aerial) sounds. Recognizing the challenge in labeling vast amounts of underwater data, we propose a two-fold methodology to accelerate this labor-intensive procedure.
The first part of our approach involves PCA and UMAP visualization of the underwater data using the feature vectors of an aerial sound recognition model. This enables us to cluster the data in a two-dimensional space and listen to points within these clusters to understand their defining characteristics. This innovative method simplifies the process of selecting candidate labels for further training.
In the second part, we train a neural network model using both the selected underwater data and the non-underwater dataset. We conducted a quantitative analysis to measure the precision, recall, and F1 score of our model for recognizing airgun sounds, a common type of underwater sound. The F1 score achieved by our model exceeded 84.3%, demonstrating the effectiveness of our approach in analyzing underwater acoustic data.
The methodology presented in this paper holds significant potential to reduce the amount of labor required in underwater data analysis and opens up new possibilities for further research in the field of cross-domain data analysis.
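The abstract reports precision, recall, and F1 score for airgun recognition. As a reminder of how these three metrics relate, here is a minimal Python sketch for a binary detector. The function name `precision_recall_f1`, the 0/1 label encoding, and the toy labels are illustrative assumptions; they are not taken from the paper and do not reproduce its evaluation.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary labels.

    Assumed encoding: 1 = target sound present (e.g. airgun), 0 = absent.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when there are no positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels, for illustration only (not data from the paper).
toy_true = [1, 1, 1, 0, 0, 1, 0, 0]
toy_pred = [1, 1, 0, 0, 1, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(toy_true, toy_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it is high only when both are high, which makes it a common single-number summary for detection tasks where the target sound is rare.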
Related papers
- Underwater Object Detection in the Era of Artificial Intelligence: Current, Challenge, and Future [119.88454942558485]
Underwater object detection (UOD) aims to identify and localise objects in underwater images or videos.
In recent years, artificial intelligence (AI) based methods, especially deep learning methods, have shown promising performance in UOD.
arXiv Detail & Related papers (2024-10-08T00:25:33Z)
- Learning from the Giants: A Practical Approach to Underwater Depth and Surface Normals Estimation [3.0516727053033392]
This paper presents a novel deep learning model for Monocular Depth and Surface Normals Estimation (MDSNE).
It is specifically tailored for underwater environments, using a hybrid architecture that integrates CNNs with Transformers.
Our model reduces parameters by 90% and training costs by 80%, allowing real-time 3D perception on resource-constrained devices.
arXiv Detail & Related papers (2024-10-02T22:41:12Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Histogram Layer Time Delay Neural Networks for Passive Sonar Classification [58.720142291102135]
A novel method combines a time delay neural network and histogram layer to incorporate statistical contexts for improved feature learning and underwater acoustic target classification.
The proposed method outperforms the baseline model, demonstrating the utility in incorporating statistical contexts for passive sonar target recognition.
arXiv Detail & Related papers (2023-07-25T19:47:26Z)
- Learning Visual Representation of Underwater Acoustic Imagery Using Transformer-Based Style Transfer Method [4.885034271315195]
This letter proposes a framework for learning the visual representation of underwater acoustic imageries.
It could replace the low-level texture features of optical images with the visual features of underwater acoustic imageries.
The proposed framework could fully use the rich optical image dataset to generate a pseudo-acoustic image dataset.
arXiv Detail & Related papers (2022-11-10T07:54:46Z)
- Learning-based estimation of in-situ wind speed from underwater acoustics [58.293528982012255]
We introduce a deep learning approach for the retrieval of wind speed time series from underwater acoustics.
Our approach bridges data assimilation and learning-based frameworks to benefit both from prior physical knowledge and computational efficiency.
arXiv Detail & Related papers (2022-08-18T15:27:40Z)
- Fast accuracy estimation of deep learning based multi-class musical source separation [79.10962538141445]
We propose a method to evaluate the separability of instruments in any dataset without training and tuning a neural network.
Based on the oracle principle with an ideal ratio mask, our approach is an excellent proxy to estimate the separation performances of state-of-the-art deep learning approaches.
arXiv Detail & Related papers (2020-10-19T13:05:08Z)
- Deep Learning based Segmentation of Fish in Noisy Forward Looking MBES Images [1.5469452301122177]
We build on recent advances in Deep Learning (DL) and Convolutional Neural Networks (CNNs) for semantic segmentation.
We demonstrate an end-to-end approach for a fish/non-fish probability prediction for all range-azimuth positions projected by an imaging sonar.
We show that our model proves the desired performance and has learned to harness the importance of semantic context.
arXiv Detail & Related papers (2020-06-16T09:57:38Z)
- Attention-based Neural Bag-of-Features Learning for Sequence Data [143.62294358378128]
2D-Attention (2DA) is a generic attention formulation for sequence data.
The proposed attention module is incorporated into the recently proposed Neural Bag of Feature (NBoF) model to enhance its learning capacity.
Our empirical analysis shows that the proposed attention formulations can not only improve performances of NBoF models but also make them resilient to noisy data.
arXiv Detail & Related papers (2020-05-25T17:51:54Z)
- Utilizing Mask R-CNN for Waterline Detection in Canoe Sprint Video Analysis [5.735035463793008]
We propose an approach for the automated waterline detection.
We developed a multi-stage approach to estimate a waterline from the outline of the segments.
We conducted a study among several experts to estimate the ground truth waterlines.
arXiv Detail & Related papers (2020-04-20T19:00:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.