Related papers: Acoustic Classification of Maritime Vessels using Learnable Filterbanks

URL: http://arxiv.org/abs/2505.23964v1
Date: Thu, 29 May 2025 19:41:15 GMT
Title: Acoustic Classification of Maritime Vessels using Learnable Filterbanks
Authors: Jonas Elsborg, Tejs Vegge, Arghya Bhowmik
Abstract summary: We present a deep learning model with robust performance across different recording scenarios. Trained on the VTUAD hydrophone recordings from the Strait of Georgia, our model, CATFISH, achieves a state-of-the-art 96.63% test accuracy.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reliably monitoring and recognizing maritime vessels based on acoustic signatures is complicated by the variability of different recording scenarios. A robust classification framework must be able to generalize across diverse acoustic environments and variable source-sensor distances. To this end, we present a deep learning model with robust performance across different recording scenarios. Using a trainable spectral front-end and temporal feature encoder to learn a Gabor filterbank, the model can dynamically emphasize different frequency components. Trained on the VTUAD hydrophone recordings from the Strait of Georgia, our model, CATFISH, achieves a state-of-the-art 96.63% test accuracy across varying source-sensor distances, surpassing the previous benchmark by over 12 percentage points. We present the model, justify our architectural choices, analyze the learned Gabor filters, and perform ablation studies on sensor data fusion and attention-based pooling.
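
As a rough illustration of what a Gabor filterbank front-end computes, the following is a minimal NumPy sketch, not the CATFISH implementation. It assumes each filter is a complex sinusoid under a Gaussian envelope, parameterized by a center frequency and a time-domain width; in a learnable filterbank these two parameters would be updated by gradient descent, whereas here they are fixed. The function names, sample rate, and parameter values are hypothetical and chosen only for the example.

```python
import numpy as np

def gabor_filters(center_freqs_hz, sigmas_s, sample_rate, kernel_size):
    """Build complex Gabor kernels of shape (n_filters, kernel_size).

    Each kernel is a Gaussian envelope (width sigma, in seconds) multiplied
    by a complex sinusoid at the filter's center frequency (in Hz).
    """
    # Time axis in seconds, centered on the middle tap of the kernel.
    t = (np.arange(kernel_size) - kernel_size // 2) / sample_rate
    f = np.asarray(center_freqs_hz, dtype=float)[:, None]
    s = np.asarray(sigmas_s, dtype=float)[:, None]
    envelope = np.exp(-0.5 * (t[None, :] / s) ** 2)
    carrier = np.exp(2j * np.pi * f * t[None, :])
    kernels = envelope * carrier
    # Normalize each kernel to unit L2 norm so filter outputs are comparable.
    return kernels / np.linalg.norm(kernels, axis=1, keepdims=True)

def filterbank_energy(signal, kernels):
    """Mean squared modulus of each filter's output for a 1-D signal."""
    return np.array([
        np.mean(np.abs(np.convolve(signal, k, mode="same")) ** 2)
        for k in kernels
    ])

# Hypothetical setup: a one-second 100 Hz tone sampled at 4 kHz,
# analyzed by two filters centered at 100 Hz and 400 Hz.
sample_rate = 4000
time_axis = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 100.0 * time_axis)

kernels = gabor_filters([100.0, 400.0], [0.01, 0.01], sample_rate, kernel_size=401)
energies = filterbank_energy(tone, kernels)
print(energies)  # the 100 Hz filter should carry most of the energy
```

Because the center frequencies and widths enter the kernels through differentiable operations, the same construction can be written in an automatic-differentiation framework so that the filters adapt to the training data.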

Related papers

Adaptive Control Attention Network for Underwater Acoustic Localization and Domain Adaptation [8.017203108408973]
Localizing acoustic sound sources in the ocean is a challenging task due to the complex and dynamic nature of the environment. We propose a multi-branch network architecture designed to accurately predict the distance between a moving acoustic source and a receiver. Our proposed method outperforms state-of-the-art (SOTA) approaches in similar settings.
arXiv Detail & Related papers (2025-06-20T18:13:30Z)
AquaSignal: An Integrated Framework for Robust Underwater Acoustic Analysis [0.0]
AquaSignal is a modular and scalable pipeline for preprocessing, denoising, classification, and novelty detection of underwater acoustic signals. The system is evaluated on a combined dataset from the Deepship and Ocean Networks Canada (ONC) benchmarks.
arXiv Detail & Related papers (2025-05-20T12:35:43Z)
Image-Based Relocalization and Alignment for Long-Term Monitoring of Dynamic Underwater Environments [57.59857784298534]
We propose an integrated pipeline that combines Visual Place Recognition (VPR), feature matching, and image segmentation on video-derived images. This method enables robust identification of revisited areas, estimation of rigid transformations, and downstream analysis of ecosystem changes.
arXiv Detail & Related papers (2025-03-06T05:13:19Z)
A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth [0.0]
This paper develops a novel Score-CAM based denoiser to extract an object's signature from noisy spectrographic data. In particular, this paper proposes a novel generative adversarial network architecture for learning and producing spectrographic training data.
arXiv Detail & Related papers (2024-10-28T21:40:46Z)
WhaleNet: a Novel Deep Learning Architecture for Marine Mammals Vocalizations on Watkins Marine Mammal Sound Database [49.1574468325115]
We introduce WhaleNet (Wavelet Highly Adaptive Learning Ensemble Network), a sophisticated deep ensemble architecture for the classification of marine mammal vocalizations. We achieve an improvement in classification accuracy by 8-10% over existing architectures, corresponding to a classification accuracy of 97.61%.
arXiv Detail & Related papers (2024-02-20T11:36:23Z)
Histogram Layer Time Delay Neural Networks for Passive Sonar Classification [58.720142291102135]
A novel method combines a time delay neural network and histogram layer to incorporate statistical contexts for improved feature learning and underwater acoustic target classification. The proposed method outperforms the baseline model, demonstrating the utility in incorporating statistical contexts for passive sonar target recognition.
arXiv Detail & Related papers (2023-07-25T19:47:26Z)
Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform [25.887932248706218]
This work proposes an adaptive generalized recognition system - AGNet. By converting fixed wavelet parameters into fine-grained learnable parameters, AGNet learns the characteristics of underwater sound at different frequencies. Experiments reveal that our AGNet outperforms all baseline methods on several underwater acoustic datasets.
arXiv Detail & Related papers (2023-05-31T06:56:01Z)
Interpretable Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection [37.01066509527848]
We describe an approach for representation learning of audio signals for the task of COVID-19 detection. The raw audio samples are processed with a bank of 1-D convolutional filters that are parameterized as cosine modulated Gaussian functions. The filtered outputs are pooled, log-compressed and used in a self-attention based relevance weighting mechanism.
arXiv Detail & Related papers (2022-06-27T15:20:51Z)
Discriminative Singular Spectrum Classifier with Applications on Bioacoustic Signal Recognition [67.4171845020675]
We present a bioacoustic signal classifier equipped with a discriminative mechanism to extract useful features for analysis and classification efficiently. Unlike current bioacoustic recognition methods, which are task-oriented, the proposed model relies on transforming the input signals into vector subspaces. The validity of the proposed method is verified using three challenging bioacoustic datasets containing anuran, bee, and mosquito species.
arXiv Detail & Related papers (2021-03-18T11:01:21Z)
Conditioning Trick for Training Stable GANs [70.15099665710336]
We propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training. We force the generator to get closer to the departure from normality function of real samples computed in the spectral domain of Schur decomposition.
arXiv Detail & Related papers (2020-10-12T16:50:22Z)
From Sound Representation to Model Robustness [82.21746840893658]
We investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. Averaged over various experiments on three environmental sound datasets, we found the ResNet-18 model outperforms other deep learning architectures.
arXiv Detail & Related papers (2020-07-27T17:30:49Z)
Capturing scattered discriminative information using a deep architecture in acoustic scene classification [49.86640645460706]
In this study, we investigate various methods to capture discriminative information and simultaneously mitigate the overfitting problem. We adopt a max feature map method to replace conventional non-linear activations in a deep neural network. Two data augmentation methods and two deep architecture modules are further explored to reduce overfitting and sustain the system's discriminative power.
arXiv Detail & Related papers (2020-07-09T08:32:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.