Adaptive ship-radiated noise recognition with learnable fine-grained
wavelet transform
- URL: http://arxiv.org/abs/2306.01002v2
- Date: Mon, 19 Feb 2024 09:28:15 GMT
- Title: Adaptive ship-radiated noise recognition with learnable fine-grained
wavelet transform
- Authors: Yuan Xie, Jiawei Ren, Ji Xu
- Abstract summary: This work proposes an adaptive generalized recognition system - AGNet.
By converting fixed wavelet parameters into fine-grained learnable parameters, AGNet learns the characteristics of underwater sound at different frequencies.
Experiments reveal that our AGNet outperforms all baseline methods on several underwater acoustic datasets.
- Score: 25.887932248706218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing the ocean acoustic environment is a challenging task. Background
noise and variable channel transmission conditions make accurate
ship-radiated noise recognition difficult to achieve. Existing recognition systems
cope poorly with the variable underwater environment, leading to
disappointing performance in practical applications. In order to keep the
recognition system robust in various underwater environments, this work
proposes an adaptive generalized recognition system - AGNet (Adaptive
Generalized Network). By converting fixed wavelet parameters into fine-grained
learnable parameters, AGNet learns the characteristics of underwater sound at
different frequencies. Its flexible and fine-grained design is conducive to
capturing more background acoustic information (e.g., background noise,
underwater transmission channel). To utilize the implicit information in
wavelet spectrograms, AGNet adopts the convolutional neural network with
parallel convolution attention modules as the classifier. Experiments reveal
that our AGNet outperforms all baseline methods on several underwater acoustic
datasets, and AGNet could benefit more from transfer learning. Moreover, AGNet
shows robust performance against various interference factors.
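The abstract's central idea is that the fixed parameters of a wavelet filterbank (e.g., per-filter center frequency and bandwidth) become fine-grained learnable parameters. The sketch below illustrates that idea in plain numpy: a complex Morlet filterbank whose per-filter parameters are the values a network such as AGNet would update by gradient descent. The function names, filter shape, and parameter ranges are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def morlet_filter(center_freq, bandwidth, length, fs):
    """Complex Morlet band-pass filter. In an AGNet-style model,
    center_freq and bandwidth would be the fine-grained learnable
    parameters (one pair per filter) instead of fixed constants."""
    t = (np.arange(length) - length // 2) / fs
    envelope = np.exp(-(t ** 2) / (2 * bandwidth ** 2))   # Gaussian window
    carrier = np.exp(2j * np.pi * center_freq * t)        # complex tone
    return envelope * carrier

def wavelet_spectrogram(signal, fs, center_freqs, bandwidths, filt_len=256):
    """Convolve the signal with one filter per (frequency, bandwidth)
    pair and return a log-magnitude time-frequency representation,
    analogous to the wavelet spectrograms fed to the classifier."""
    rows = []
    for f0, bw in zip(center_freqs, bandwidths):
        h = morlet_filter(f0, bw, filt_len, fs)
        rows.append(np.abs(np.convolve(signal, h, mode="same")))
    return np.log1p(np.stack(rows))  # shape: (n_filters, n_samples)

# Toy example: 1 s of noise at 8 kHz, 32 filters spanning 50-3500 Hz.
fs = 8000
x = np.random.default_rng(0).standard_normal(fs)
freqs = np.linspace(50, 3500, 32)  # initial values a network would refine
bws = np.full(32, 0.01)            # per-filter bandwidths (seconds)
spec = wavelet_spectrogram(x, fs, freqs, bws)
print(spec.shape)  # (32, 8000)
```

In a learnable version, `freqs` and `bws` would be registered as trainable tensors and the convolution done in an autodiff framework, so the filterbank adapts per frequency band to the background noise and channel characteristics the abstract describes.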
Related papers
- A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth [0.0]
This paper develops a novel Score-CAM based denoiser to extract an object's signature from noisy spectrographic data.
In particular, this paper proposes a novel generative adversarial network architecture for learning and producing spectrographic training data.
arXiv Detail & Related papers (2024-10-28T21:40:46Z) - DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images [20.11145540094807]
We propose a network aided by explicit frequency domain transform to calibrate convolutional biases and pay more attention to high-frequencies.
We design TransDeno, a dynamic frequency domain attention module that performs as a transform domain soft thresholding operation.
Our plug-and-play TransDeno sets state-of-the-art scores on multiple SAR target detection datasets.
arXiv Detail & Related papers (2024-06-05T01:05:26Z) - Histogram Layer Time Delay Neural Networks for Passive Sonar
Classification [58.720142291102135]
A novel method combines a time delay neural network and histogram layer to incorporate statistical contexts for improved feature learning and underwater acoustic target classification.
The proposed method outperforms the baseline model, demonstrating the utility in incorporating statistical contexts for passive sonar target recognition.
arXiv Detail & Related papers (2023-07-25T19:47:26Z) - Timbre Transfer with Variational Auto Encoding and Cycle-Consistent
Adversarial Networks [0.6445605125467573]
This research project investigates the application of deep learning to timbre transfer, where the timbre of a source audio can be converted to the timbre of a target audio with minimal loss in quality.
The adopted approach combines Variational Autoencoders with Generative Adversarial Networks to construct meaningful representations of the source audio and produce realistic generations of the target audio.
arXiv Detail & Related papers (2021-09-05T15:06:53Z) - PILOT: Introducing Transformers for Probabilistic Sound Event
Localization [107.78964411642401]
This paper introduces a novel transformer-based sound event localization framework, where temporal dependencies in the received multi-channel audio signals are captured via self-attention mechanisms.
The framework is evaluated on three publicly available multi-source sound event localization datasets and compared against state-of-the-art methods in terms of localization error and event detection accuracy.
arXiv Detail & Related papers (2021-06-07T18:29:19Z) - Discriminative Singular Spectrum Classifier with Applications on
Bioacoustic Signal Recognition [67.4171845020675]
We present a bioacoustic signal classifier equipped with a discriminative mechanism to extract useful features for analysis and classification efficiently.
Unlike current bioacoustic recognition methods, which are task-oriented, the proposed model relies on transforming the input signals into vector subspaces.
The validity of the proposed method is verified using three challenging bioacoustic datasets containing anuran, bee, and mosquito species.
arXiv Detail & Related papers (2021-03-18T11:01:21Z) - Conditioning Trick for Training Stable GANs [70.15099665710336]
We propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training.
We force the generator to get closer to the departure from normality function of real samples computed in the spectral domain of Schur decomposition.
arXiv Detail & Related papers (2020-10-12T16:50:22Z) - A Multi-view CNN-based Acoustic Classification System for Automatic
Animal Species Identification [42.119250432849505]
We propose a deep learning based acoustic classification framework for Wireless Acoustic Sensor Networks (WASN).
The proposed framework is based on cloud architecture which relaxes the computational burden on the wireless sensor node.
To improve the recognition accuracy, we design a multi-view Convolutional Neural Network (CNN) to extract the short-, middle-, and long-term dependencies in parallel.
arXiv Detail & Related papers (2020-02-23T03:51:08Z) - Deep Speaker Embeddings for Far-Field Speaker Recognition on Short
Utterances [53.063441357826484]
Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions.
Speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks.
This paper presents approaches aimed at two goals: a) improving the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reducing system quality degradation for short utterances.
arXiv Detail & Related papers (2020-02-14T13:34:33Z) - Temporal-Spatial Neural Filter: Direction Informed End-to-End
Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals.
Two main challenges are the complex acoustic environment and the real-time processing requirement.
We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from multi-speaker mixture.
arXiv Detail & Related papers (2020-01-02T11:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.