Graph Embedding with Mel-spectrograms for Underwater Acoustic Target Recognition
- URL: http://arxiv.org/abs/2512.11545v1
- Date: Fri, 12 Dec 2025 13:25:54 GMT
- Title: Graph Embedding with Mel-spectrograms for Underwater Acoustic Target Recognition
- Authors: Sheng Feng, Shuqing Ma, Xiaoqian Zhu,
- Abstract summary: The UATR-GTransformer is a non-Euclidean deep learning model that integrates Transformer architectures with graph neural networks (GNNs)<n>The proposed model effectively extracts rich frequency-domain information, highlighting its potential for applications in ocean engineering.<n> Experiments results based on two widely used benchmark datasets demonstrate that the UATR-GTransformer achieves performance competitive with state-of-the-art methods.
- Score: 3.4185611249587278
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Underwater acoustic target recognition (UATR) is extremely challenging due to the complexity of ship-radiated noise and the variability of ocean environments. Although deep learning (DL) approaches have achieved promising results, most existing models implicitly assume that underwater acoustic data lie in a Euclidean space. This assumption, however, is unsuitable for the inherently complex topology of underwater acoustic signals, which exhibit non-stationary, non-Gaussian, and nonlinear characteristics. To overcome this limitation, this paper proposes the UATR-GTransformer, a non-Euclidean DL model that integrates Transformer architectures with graph neural networks (GNNs). The model comprises three key components: a Mel patchify block, a GTransformer block, and a classification head. The Mel patchify block partitions the Mel-spectrogram into overlapping patches, while the GTransformer block employs a Transformer Encoder to capture mutual information between split patches to generate Mel-graph embeddings. Subsequently, a GNN enhances these embeddings by modeling local neighborhood relationships, and a feed-forward network (FFN) further performs feature transformation. Experiments results based on two widely used benchmark datasets demonstrate that the UATR-GTransformer achieves performance competitive with state-of-the-art methods. In addition, interpretability analysis reveals that the proposed model effectively extracts rich frequency-domain information, highlighting its potential for applications in ocean engineering.
Related papers
- WaveMAE: Wavelet decomposition Masked Auto-Encoder for Remote Sensing [5.65492058135409]
WaveMAE is a masked autoencoding framework tailored for multispectral satellite imagery.<n>To ensure fairness in evaluation, all methods are pretrained on the same dataset (fMoW-S2)<n>WaveMAE achieves consistent improvements over prior state-of-the-art approaches.
arXiv Detail & Related papers (2025-10-26T14:45:30Z) - Wavelet-Guided Dual-Frequency Encoding for Remote Sensing Change Detection [67.84730634802204]
Change detection in remote sensing imagery plays a vital role in various engineering applications, such as natural disaster monitoring, urban expansion tracking, and infrastructure management.<n>Most existing methods still rely on spatial-domain modeling, where the limited diversity of feature representations hinders the detection of subtle change regions.<n>We observe that frequency-domain feature modeling particularly in the wavelet domain amplify fine-grained differences in frequency components, enhancing the perception of edge changes that are challenging to capture in the spatial domain.
arXiv Detail & Related papers (2025-08-07T11:14:16Z) - Adaptive Control Attention Network for Underwater Acoustic Localization and Domain Adaptation [8.017203108408973]
Localizing acoustic sound sources in the ocean is a challenging task due to the complex and dynamic nature of the environment.<n>We propose a multi-branch network architecture designed to accurately predict the distance between a moving acoustic source and a receiver.<n>Our proposed method outperforms state-of-the-art (SOTA) approaches in similar settings.
arXiv Detail & Related papers (2025-06-20T18:13:30Z) - Point Cloud Denoising With Fine-Granularity Dynamic Graph Convolutional Networks [58.050130177241186]
Noise perturbations often corrupt 3-D point clouds, hindering downstream tasks such as surface reconstruction, rendering, and further processing.
This paper introduces finegranularity dynamic graph convolutional networks called GDGCN, a novel approach to denoising in 3-D point clouds.
arXiv Detail & Related papers (2024-11-21T14:19:32Z) - WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration [68.25711405944239]
Deep image registration has demonstrated exceptional accuracy and fast inference.
Recent advances have adopted either multiple cascades or pyramid architectures to estimate dense deformation fields in a coarse-to-fine manner.
We introduce a model-driven WiNet that incrementally estimates scale-wise wavelet coefficients for the displacement/velocity field across various scales.
arXiv Detail & Related papers (2024-07-18T11:51:01Z) - Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection [53.842568573251214]
Experimental results on three SAR datasets demonstrate that our WBANet significantly outperforms contemporary state-of-the-art methods.
Our WBANet achieves 98.33%, 96.65%, and 96.62% of percentage of correct classification (PCC) on the respective datasets.
arXiv Detail & Related papers (2024-07-18T04:36:10Z) - ASWT-SGNN: Adaptive Spectral Wavelet Transform-based Self-Supervised
Graph Neural Network [20.924559944655392]
This paper proposes an Adaptive Spectral Wavelet Transform-based Self-Supervised Graph Neural Network (ASWT-SGNN)
ASWT-SGNN accurately approximates the filter function in high-density spectral regions, avoiding costly eigen-decomposition.
It achieves comparable performance to state-of-the-art models in node classification tasks.
arXiv Detail & Related papers (2023-12-10T03:07:42Z) - Degradation-Noise-Aware Deep Unfolding Transformer for Hyperspectral
Image Denoising [9.119226249676501]
Hyperspectral images (HSIs) are often quite noisy because of narrow band spectral filtering.
To reduce the noise in HSI data cubes, both model-driven and learning-based denoising algorithms have been proposed.
This paper proposes a Degradation-Noise-Aware Unfolding Network (DNA-Net) that addresses these issues.
arXiv Detail & Related papers (2023-05-06T13:28:20Z) - Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for
Pooling and Unpooling [101.72318949104627]
We propose a novel framework of hierarchical convolutional neural networks (HS-CNNs) with a lifting structure to learn adaptive spherical wavelets for pooling and unpooling.
LiftHS-CNN ensures a more efficient hierarchical feature learning for both image- and pixel-level tasks.
arXiv Detail & Related papers (2022-05-31T07:23:42Z) - Conditioning Trick for Training Stable GANs [70.15099665710336]
We propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training.
We force the generator to get closer to the departure from normality function of real samples computed in the spectral domain of Schur decomposition.
arXiv Detail & Related papers (2020-10-12T16:50:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.