Advancing Robust Underwater Acoustic Target Recognition through Multi-task Learning and Multi-Gate Mixture-of-Experts
- URL: http://arxiv.org/abs/2411.02787v1
- Date: Tue, 05 Nov 2024 03:52:36 GMT
- Title: Advancing Robust Underwater Acoustic Target Recognition through Multi-task Learning and Multi-Gate Mixture-of-Experts
- Authors: Yuan Xie, Jiawei Ren, Junfeng Li, Ji Xu,
- Abstract summary: This study proposes a recognition framework called M3 to enhance the model's ability to capture robust patterns.
In this framework, an auxiliary task that focuses on target properties, such as estimating target size, is designed.
M3 incorporates multi-expert and multi-gate mechanisms, allowing for the allocation of distinct parameter spaces to various underwater signals.
- Score: 25.187507472845944
- License:
- Abstract: Underwater acoustic target recognition has emerged as a prominent research area within the field of underwater acoustics. However, the current availability of authentic underwater acoustic signal recordings remains limited, which hinders data-driven acoustic recognition models from learning robust patterns of targets from a limited set of intricate underwater signals, thereby compromising their stability in practical applications. To overcome these limitations, this study proposes a recognition framework called M3 (Multi-task, Multi-gate, Multi-expert) to enhance the model's ability to capture robust patterns by making it aware of the inherent properties of targets. In this framework, an auxiliary task that focuses on target properties, such as estimating target size, is designed. The auxiliary task then shares parameters with the recognition task to realize multi-task learning. This paradigm allows the model to concentrate on shared information across tasks and identify robust patterns of targets in a regularized manner, thereby enhancing the model's generalization ability. Moreover, M3 incorporates multi-expert and multi-gate mechanisms, allowing for the allocation of distinct parameter spaces to various underwater signals. This enables the model to process intricate signal patterns in a fine-grained and differentiated manner. To evaluate the effectiveness of M3, extensive experiments were implemented on the ShipsEar underwater ship-radiated noise dataset. The results substantiate that M3 has the ability to outperform the most advanced single-task recognition models, thereby achieving the state-of-the-art performance.
Related papers
- Adversarial multi-task underwater acoustic target recognition: towards robustness against various influential factors [25.187507472845944]
Underwater acoustic target recognition based on passive sonar faces numerous challenges in practical maritime applications.
One of the main challenges lies in the susceptibility of signal characteristics to diverse environmental conditions.
Influential factors are often neglected in the field of underwater acoustic target recognition.
arXiv Detail & Related papers (2024-11-05T06:42:51Z) - DEMONet: Underwater Acoustic Target Recognition based on Multi-Expert Network and Cross-Temporal Variational Autoencoder [22.271499386492533]
Building a robust underwater acoustic recognition system in real-world scenarios is challenging due to the complex underwater environment.
We propose DEMONet, which utilizes the detection of envelope modulation on noise (DEMON) to provide robust insights into the shaft frequency or blade counts of targets.
To mitigate noise and spurious modulation spectra in DEMON features, we introduce a cross-temporal alignment strategy and employ a variational autoencoder (VAE) to reconstruct noise-resistant DEMON spectra to replace the raw DEMON features.
arXiv Detail & Related papers (2024-11-05T03:04:51Z) - A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z) - Concrete Subspace Learning based Interference Elimination for Multi-task
Model Fusion [86.6191592951269]
Merging models fine-tuned from common extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multitask model that performs well across diverse tasks.
We propose the CONtinuous relaxation dis (Concrete) subspace learning method to identify a common lowdimensional subspace and utilize its shared information track interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z) - Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual
Downstream Tasks [55.36987468073152]
This paper proposes a novel Dual-Guided Spatial-Channel-Temporal (DG-SCT) attention mechanism.
The DG-SCT module incorporates trainable cross-modal interaction layers into pre-trained audio-visual encoders.
Our proposed model achieves state-of-the-art results across multiple downstream tasks, including AVE, AVVP, AVS, and AVQA.
arXiv Detail & Related papers (2023-11-09T05:24:20Z) - Histogram Layer Time Delay Neural Networks for Passive Sonar
Classification [58.720142291102135]
A novel method combines a time delay neural network and histogram layer to incorporate statistical contexts for improved feature learning and underwater acoustic target classification.
The proposed method outperforms the baseline model, demonstrating the utility in incorporating statistical contexts for passive sonar target recognition.
arXiv Detail & Related papers (2023-07-25T19:47:26Z) - Edge-guided Representation Learning for Underwater Object Detection [15.832646455660278]
Underwater object detection is crucial for marine economic development, environmental protection, and the planet's sustainable development.
Main challenges of this task arise from low-contrast, small objects, and mimicry of aquatic organisms.
We propose an Edge-guided Representation Learning Network, termed ERL-Net, that aims to achieve discriminative representation learning and aggregation.
arXiv Detail & Related papers (2023-06-01T08:29:44Z) - Unifying Flow, Stereo and Depth Estimation [121.54066319299261]
We present a unified formulation and model for three motion and 3D perception tasks.
We formulate all three tasks as a unified dense correspondence matching problem.
Our model naturally enables cross-task transfer since the model architecture and parameters are shared across tasks.
arXiv Detail & Related papers (2022-11-10T18:59:54Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
gait recognition in the wild is a more practical problem that has attracted the attention of the community of multimedia and computer vision.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition [86.31412529187243]
Few-shot video recognition aims at learning new actions with only very few labeled samples.
We propose a depth guided Adaptive Meta-Fusion Network for few-shot video recognition which is termed as AMeFu-Net.
arXiv Detail & Related papers (2020-10-20T03:06:20Z) - Deep Auto-Encoders with Sequential Learning for Multimodal Dimensional
Emotion Recognition [38.350188118975616]
We propose a novel deep neural network architecture consisting of a two-stream auto-encoder and a long short term memory for emotion recognition.
We carry out extensive experiments on the multimodal emotion in the wild dataset: RECOLA.
Experimental results show that the proposed method achieves state-of-the-art recognition performance and surpasses existing schemes by a significant margin.
arXiv Detail & Related papers (2020-04-28T01:25:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.